Investigating the Use of Machine Learning Algorithms in Detecting Gender of the Arabic Tweet Author

Abstract

Twitter is one of the most popular social networking sites on the Internet for sharing opinions and knowledge. Many advertisers use Tweets to collect features and attributes of Tweeters in order to target specific groups of highly engaged people. Gender detection is a sub-field of sentiment analysis concerned with extracting and predicting the gender of a Tweet author. In this paper, we investigate detecting the gender of Tweet authors by applying different classification techniques to Arabic-language Tweets, such as Naïve Bayes (NB), Support Vector Machine (SVM), Naïve Bayes Multinomial (NBM), J48 decision tree, and k-nearest neighbors (KNN). The results show that the NBM, SVM, and J48 classifiers can achieve accuracy above 98% when the name of the Tweet author is added as a feature. The results also show that the preprocessing approach has a negative effect on the accuracy of gender detection. In a nutshell, this study demonstrates that machine learning classifiers are able to detect the gender of Arabic Tweet authors.
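
To make the idea of adding the author's name as a feature concrete, the following is a minimal sketch of a multinomial Naïve Bayes text classifier with add-one smoothing. It is not the authors' implementation and does not reproduce the reported results. The `featurize` helper, the `MultinomialNB` class, the `name=` token prefix, and the toy training examples are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def featurize(text, author_name=None):
    """Turn a tweet into bag-of-words tokens.

    If an author name is given, it is appended as one extra token
    (hypothetical 'name=' prefix) so the classifier can learn from it.
    """
    tokens = text.lower().split()
    if author_name:
        tokens.append("name=" + author_name.lower())
    return tokens


class MultinomialNB:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.token_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for tokens, label in zip(docs, labels):
            self.token_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, float("-inf")
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_tokens = sum(self.token_counts[label].values())
            for tok in tokens:
                if tok not in self.vocab:
                    continue  # ignore tokens never seen in training
                count = self.token_counts[label][tok]
                score += math.log((count + 1) / (total_tokens + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up training data: (tweet text, author name, label).
train = [
    ("good morning everyone", "omar", "male"),
    ("good evening everyone", "layla", "female"),
    ("traffic is heavy today", "omar", "male"),
    ("the weather is nice today", "layla", "female"),
]
docs = [featurize(text, name) for text, name, _ in train]
labels = [label for _, _, label in train]
model = MultinomialNB().fit(docs, labels)

# The tweet text alone is nearly uninformative here; the name token decides.
print(model.predict(featurize("good day", "layla")))
print(model.predict(featurize("good day", "omar")))
```

In this toy setup the tweet text carries almost no signal, so the prediction is driven by the name token. That mirrors, in a very simplified way, why a name feature could raise accuracy.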

Authors and Affiliations

Emad AlSukhni, Qasem Alequr

  • EP ID EP112537
  • DOI 10.14569/IJACSA.2016.070746

How To Cite

Emad AlSukhni, Qasem Alequr (2016). Investigating the Use of Machine Learning Algorithms in Detecting Gender of the Arabic Tweet Author. International Journal of Advanced Computer Science & Applications, 7(7), 319-328. https://europub.co.uk/articles/-A-112537