Social Media Text Classification by Enhancing Well-Formed Text Trained Model

Journal Title: Journal of ICT Research and Applications - Year 2016, Vol 10, Issue 2

Abstract

Social media are a powerful communication tool in our era of digital information. The large amount of user-generated data is a valuable new source of data, although it is not easy to extract useful information from this vast and noisy trove. Since classification is an important part of text mining, many techniques have been proposed to classify this kind of information. We developed an effective technique for social media text classification based on semi-supervised learning, utilizing an online news source consisting of well-formed text. The system first automatically extracts news categories, already well categorized by publishers, as classes for topic classification. A bag of words taken from news articles provides the initial keywords related to each category in the form of word vectors. The principal task is to retrieve a set of new productive keywords. Term Frequency-Inverse Document Frequency (TF-IDF) weighting and the Word Article Matrix (WAM) are used as the main methods. A modified WAM is recomputed until it becomes the most effective model for social media text classification. The key success factor was enhancing our model with effective keywords from social media. A promising result of 99.50% accuracy was achieved, with Precision, Recall, and F-measure each above 98.5%, after updating the model three times.
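
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how the WAM is built or modified, so the sketch substitutes a plain per-category TF-IDF keyword vector and a single update round. The function names (`tfidf_by_category`, `classify`, `enhance`), the `new_weight` parameter, and the toy data are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase, split on whitespace, keep purely alphabetic tokens."""
    return [w for w in text.lower().split() if w.isalpha()]


def tfidf_by_category(articles):
    """Build {category: {word: weight}} from (category, text) news articles.

    Each category is treated as one document. The IDF is log(n / df), so a
    word that occurs in every category gets weight 0.
    """
    tf = defaultdict(Counter)
    for category, text in articles:
        tf[category].update(tokenize(text))
    n = len(tf)
    df = Counter(word for counts in tf.values() for word in counts)
    model = {}
    for category, counts in tf.items():
        total = sum(counts.values())
        model[category] = {
            word: (count / total) * math.log(n / df[word])
            for word, count in counts.items()
        }
    return model


def classify(model, text):
    """Return the best-scoring category, or None if no keyword matches."""
    tokens = tokenize(text)
    scores = {
        category: sum(weights.get(w, 0.0) for w in tokens)
        for category, weights in model.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def enhance(model, posts, new_weight=0.01):
    """One update round (an assumed stand-in for the WAM modification).

    Words from classified posts that are unknown to every category are added
    to the predicted category with a small fixed weight.
    """
    known = {w for weights in model.values() for w in weights}
    updated = {category: dict(weights) for category, weights in model.items()}
    for post in posts:
        label = classify(model, post)
        if label is None:
            continue
        for w in tokenize(post):
            if w not in known:
                updated[label][w] = new_weight
    return updated


# Toy data, invented for this example.
articles = [
    ("sports", "the team won the football match after a late goal"),
    ("politics", "the parliament passed the election law after a long debate"),
]
posts = ["great goal by messi tonight", "messi scores again"]

model = tfidf_by_category(articles)
enhanced = enhance(model, posts)

print(classify(model, "messi scores again"))
print(classify(enhanced, "messi scores again"))
```

In this toy run the second post matches no news keyword until the first post contributes new social media vocabulary to the model, which mirrors the idea of enhancing a news-trained model with keywords from social media. The reported accuracy figures belong to the authors' method and data, not to this sketch.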

Authors and Affiliations

Phat Jotikabukkana

  • EP ID EP331715
  • DOI 10.5614/itbj.ict.res.appl.2016.10.2.6

How To Cite

Phat Jotikabukkana (2016). Social Media Text Classification by Enhancing Well-Formed Text Trained Model. Journal of ICT Research and Applications, 10(2), 177-196. https://europub.co.uk/articles/-A-331715