COSTECH Integrated Repository

A framework for automated detection of offensive messages in social networks in Kiswahili


dc.creator Barongo, Everyjustus
dc.date 2018-10-15T17:55:53Z
dc.date 2017
dc.date.accessioned 2019-12-06T12:19:52Z
dc.date.available 2019-12-06T12:19:52Z
dc.identifier Barongo, E. (2017). A framework for automated detection of offensive messages in social networks in Kiswahili. Dodoma: The University of Dodoma.
dc.identifier http://hdl.handle.net/123456789/507
dc.identifier.uri http://hdl.handle.net/123456789/15187
dc.description Dissertation (MSc Computer Science)
dc.description The diffusion of information generated on social networking sites is the result of more people being connected. Connected people chat and comment by posting content such as images, videos, and messages. Social networks have been, and remain, useful to communities in that they bring relatives together, especially in sharing experiences and feelings. Although social networks have been beneficial to users, some of the shared messages and comments contain sexual and political harassment. This is particularly true in Kiswahili-speaking countries such as Tanzania. On most, if not all, Kiswahili social networking sites, offensive messages have been and are still publicly posted. These messages harass, embarrass, and even assault users, and to some extent lead to psychological effects. This study proposes a framework for automating the detection of offensive messages on social networks in Kiswahili settings by applying selected machine learning techniques. Specifically, the study created a Kiswahili dataset containing sexually and politically offensive messages and normal messages. All of these messages were collected from Facebook, YouTube, and JamiiForum, and they were used to evaluate the performance of the selected text classification algorithms. The collected messages were preprocessed using the Bag-of-Words (BoW) model, Term Frequency-Inverse Document Frequency (TF-IDF), and N-gram techniques to generate feature vectors. The experimental findings using the generated feature vectors showed that the Random Forest classifier assigned messages to the correct class label with an accuracy of 95.0259%, an F1-measure of 0.950 (95.0%), and a false positive rate of 2.8% when applied to the three-category dataset. The SVM-Linear classifier, on the other hand, showed better results when applied to the two-category dataset.
The study suggests that the REST API-based framework, with the Random Forest classifier and the Kiswahili dataset, be deployed in real social net
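The feature-generation step named in the abstract (Bag-of-Words counts, word N-grams, and TF-IDF weighting) can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function names `word_ngrams` and `tfidf_vectors`, the whitespace tokenizer, the bigram limit, and the smoothed IDF formula are all assumptions chosen for illustration, and the two example strings are made-up neutral greetings, not items from the study's dataset.

```python
import math
from collections import Counter


def word_ngrams(text, n_max=2):
    """Return lowercase word n-grams for n = 1..n_max (hypothetical helper)."""
    tokens = text.lower().split()  # assumption: simple whitespace tokenization
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf_vectors(docs, n_max=2):
    """Build a sorted vocabulary and one TF-IDF vector per document."""
    # Bag-of-Words step: count each n-gram per document.
    counts = [Counter(word_ngrams(d, n_max)) for d in docs]
    vocab = sorted(set().union(*counts))
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = {t: sum(1 for c in counts if t in c) for t in vocab}
    # Assumed smoothed IDF: ln((1 + N) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for c in counts:
        total = sum(c.values()) or 1  # guard against empty documents
        vectors.append([c[t] / total * idf[t] for t in vocab])
    return vocab, vectors


# Made-up placeholder messages (neutral greetings), not study data.
docs = ["habari za asubuhi", "habari za jioni"]
vocab, vectors = tfidf_vectors(docs)
```

Each resulting vector could then be passed, together with its class label, to a classifier such as Random Forest or a linear SVM. Terms that occur in only one message (here "asubuhi") receive a higher weight than terms shared by both messages (here "habari").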
dc.language en
dc.publisher The University of Dodoma
dc.subject Social Media
dc.subject Kiswahili
dc.subject Jamii Forum
dc.subject Social networks
dc.subject Offensive messages
dc.subject Sexual offensive messages
dc.subject Political offensive messages
dc.subject Sexual harassment
dc.subject Political harassment
dc.subject Harassment
dc.title A framework for automated detection of offensive messages in social networks in Kiswahili
dc.type Dissertation

