Investigating the Use of Machine Learning Algorithms in Detecting Gender of the Arabic Tweet Author

Abstract

Twitter is one of the most popular social networking sites on the Internet for sharing opinions and knowledge. Many advertisers use Tweets to collect features and attributes of Tweeters in order to target specific groups of highly engaged people. Gender detection is a sub-field of sentiment analysis concerned with extracting and predicting the gender of a Tweet author. In this paper, we investigate detecting the gender of Arabic Tweet authors using several classification techniques, including Naïve Bayes (NB), Support Vector Machine (SVM), Naïve Bayes Multinomial (NBM), J48 decision tree, and k-nearest neighbors (KNN). The results show that the NBM, SVM, and J48 classifiers can achieve accuracy above 98% when the name of the Tweet author is added as a feature. The results also show that the preprocessing approach has a negative effect on the accuracy of gender detection. In a nutshell, this study demonstrates that machine learning classifiers can detect the gender of Arabic Tweet authors.
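
To illustrate the idea of adding the author's name as a feature, here is a minimal sketch of a multinomial Naïve Bayes classifier in plain Python. It is not the authors' implementation: the class MultinomialNB, the helper featurize, the "NAME=" token prefix, and the toy training rows (with placeholder words w1 to w8 standing in for Tweet tokens) are all assumptions made up for this example, and none of the data comes from the paper.

```python
import math
from collections import Counter, defaultdict


def featurize(name, text, use_name=True):
    """Split text into bag-of-words tokens; optionally add the author
    name as one extra prefixed token (hypothetical encoding)."""
    tokens = text.split()
    if use_name:
        tokens.append("NAME=" + name)
    return tokens


class MultinomialNB:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.token_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for tokens, label in zip(docs, labels):
            self.token_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokens:
                if tok in self.vocab:  # tokens never seen in training are skipped
                    score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy rows: (author name, tweet text, label). Not the paper's data.
train = [
    ("ahmad", "w1 w2 w3", "male"),
    ("omar", "w1 w3 w4", "male"),
    ("fatima", "w5 w6 w7", "female"),
    ("layla", "w5 w7 w8", "female"),
]

model = MultinomialNB().fit(
    [featurize(name, text) for name, text, _ in train],
    [label for _, _, label in train],
)

# The text here contains only an unseen word, so the name token decides.
print(model.predict(featurize("fatima", "unseen")))
```

In this toy setup, the name token is the only evidence available when the Tweet text shares no vocabulary with the training data, which hints at why a name feature could raise accuracy. Real Arabic Tweets would need proper tokenization and a far larger labeled set.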

Authors and Affiliations

Emad AlSukhni, Qasem Alequr

  • EP ID EP112537
  • DOI 10.14569/IJACSA.2016.070746

How To Cite

Emad AlSukhni, Qasem Alequr (2016). Investigating the Use of Machine Learning Algorithms in Detecting Gender of the Arabic Tweet Author. International Journal of Advanced Computer Science & Applications, 7(7), 319-328. https://europub.co.uk/articles/-A-112537