An Ensemble of Deep Learning Models for Bengali Cyberbullying Detection and Interpretability

Foysal, Sazzad Hossain; Ahad, Abdul; Hossain, Md Mithun; Nahin, Sk Munzurul Islam; Nayem, Sayada Jannatun


dc.contributor.author	Foysal, Sazzad Hossain
dc.contributor.author	Ahad, Abdul
dc.contributor.author	Hossain, Md Mithun
dc.contributor.author	Nahin, Sk Munzurul Islam
dc.contributor.author	Nayem, Sayada Jannatun
dc.date.accessioned	2023-12-24T03:34:24Z
dc.date.available	2023-12-24T03:34:24Z
dc.date.issued	2023-11
dc.identifier.uri	http://103.15.140.189/handle/123456789/266
dc.description	Internship Report	en_US
dc.description.abstract	The internet has given people unprecedented freedom to communicate, and that freedom is frequently misused to harass others. The internet is now an essential part of daily life, with almost 4.9 billion active internet users and 4.66 billion active social media users. Because people can easily reach each other and freely share their thoughts, many abuse, harass, or threaten others on social media. Despite the large number of Bangla speakers and the high risk of cyberbullying, very few studies address the identification of bullying messages or comments in the Bengali language. Building on recent advances in artificial intelligence, this study develops an ensemble of deep learning models to identify bullying comments online so that they can be removed and the rate of cyberbullying reduced. A Kaggle dataset of 44,001 Bangla comments was used to train and test the ensemble model. The ensemble, based on GRU, LSTM, and CNN, achieved 97.4% accuracy. Before training and testing, several data pre-processing steps were applied, including data cleaning, stop-word removal, and tokenization. BERT tokenization was used to tokenize the texts, and Explainable AI (XAI) was used to understand how the model arrives at its predictions. The results of the single models were compared with those of the ensemble to assess the efficiency of the model, which can be deployed to help reduce cyberbullying.	en_US
dc.language.iso	en_US	en_US
dc.publisher	Department of Computer Science & Engineering (CSE), BUBT	en_US
dc.subject	CSE	en_US
dc.subject	Interpretability	en_US
dc.subject	Cyberbullying Detection	en_US
dc.subject	Ensemble	en_US
dc.subject	Deep Learning	en_US
dc.subject	Bengali	en_US
dc.title	An Ensemble of Deep Learning Models for Bengali Cyberbullying Detection and Interpretability	en_US
dc.type	Technical Report	en_US
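
The abstract describes combining GRU, LSTM, and CNN classifiers into one ensemble but does not say how their outputs are merged. The sketch below illustrates one common option, soft voting, in which each model's predicted probability that a comment is bullying is averaged and then thresholded. The function name `soft_vote`, the 0.5 threshold, and the probability values are all assumptions made for illustration. They are not taken from the report and may differ from the authors' actual method.

```python
from typing import List, Sequence


def soft_vote(prob_sets: Sequence[Sequence[float]], threshold: float = 0.5) -> List[int]:
    """Average per-model bullying probabilities and threshold the mean.

    prob_sets holds one sequence per model; position i in each sequence is
    that model's probability that comment i is bullying.
    Returns 1 (bullying) or 0 (not bullying) for each comment.
    """
    if not prob_sets:
        raise ValueError("need predictions from at least one model")
    n_comments = len(prob_sets[0])
    if any(len(p) != n_comments for p in prob_sets):
        raise ValueError("all models must score the same comments")

    labels = []
    # zip(*prob_sets) groups the scores that every model gave to one comment.
    for scores in zip(*prob_sets):
        mean = sum(scores) / len(scores)
        labels.append(1 if mean >= threshold else 0)
    return labels


# Hypothetical probabilities for three comments (illustrative values only,
# not results from the report).
gru_probs = [0.9, 0.2, 0.6]
lstm_probs = [0.8, 0.4, 0.3]
cnn_probs = [0.7, 0.1, 0.5]

ensemble_labels = soft_vote([gru_probs, lstm_probs, cnn_probs])
```

Averaging lets a confident model outweigh a hesitant one, which a simple majority vote would not do. Other combination schemes, such as weighted averaging or a stacked meta-classifier, are equally plausible readings of "ensemble" here.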