Applying feature transformation using relative frequency with power transformation and lemmatization in automatic spam filtering

Publisher

International Journal of Computer Science & Network Solutions (IJCSNS)

Description

Abstract. The full-text article is available at https://doi.org/10.5897/IJBC2019.126
Advances in information and communication technology have paved the way for electronic mail, commonly referred to as email, to become a principal medium of communication. In recent years this medium has become a target of abuse through spamming. One approach to combating spam is automatic spam filtering through machine learning. The conventional features used in automatic spam filtering are based on Term Frequency with Inverse Document Frequency (TFIDF). In this paper, an alternative approach is presented that uses Relative Frequency with Power Transformation (RFPT) coupled with lemmatization. The techniques used show considerable improvements over the conventional TFIDF approach.
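
The abstract does not define RFPT precisely, so the following is only a minimal Python sketch under stated assumptions: relative frequency is taken to be a term's count divided by the number of tokens in the message, and the power transformation is taken to be raising that value to an exponent (0.5 here). The function name `rfpt_features`, the `power` parameter, and the toy `LEMMAS` table are hypothetical and are not taken from the paper; a real system would use a proper lemmatizer.

```python
from collections import Counter

# Hypothetical toy lemma table (an assumption for illustration only);
# a real pipeline would rely on a full lemmatizer.
LEMMAS = {"offers": "offer", "winning": "win", "won": "win", "prizes": "prize"}


def lemmatize(token):
    """Map a token to its lemma, falling back to the token itself."""
    return LEMMAS.get(token, token)


def rfpt_features(text, power=0.5):
    """Assumed RFPT form: (term count / total tokens) ** power."""
    tokens = [lemmatize(t) for t in text.lower().split()]
    if not tokens:
        return {}
    counts = Counter(tokens)
    total = len(tokens)
    return {term: (n / total) ** power for term, n in counts.items()}


features = rfpt_features("Winning prizes won prizes")
```

In this sketch, lemmatization merges "winning" and "won" into a single feature before frequencies are computed, and an exponent below 1 compresses the gap between frequent and rare terms. Whether the paper uses this exact form or exponent is not stated in the abstract.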

Keywords

Spam filtering, Machine learning, Lemmatization, Power transformation, Term frequency, Information Communication Technology, ICT, Electronic mail

Citation