Classification of Radical Web Content in Indonesia using Web Content Mining and k-Nearest Neighbor Algorithm

Journal Title: EMITTER International Journal of Engineering Technology - Year 2017, Vol 5, Issue 2

Abstract

Radical content, in a procedural sense, is content that provokes violence and spreads hatred and anti-nationalism. The definition of radical content differs from country to country. In Indonesia, it is most closely associated with provocation and with ethnic and religious hatred, known in Indonesian as SARA. SARA content is very difficult to detect because of its large volume, its unstructured form, and the amount of noise, which can lead to multiple interpretations. This problem can threaten unity and religious harmony. A system is therefore needed that can distinguish radical content from non-radical content. We propose a text mining approach that uses a DF threshold and Human Brain for feature extraction. The system consists of several steps: data collection, which is part of the preprocessing stage; text mining; feature selection; classification, which groups the data by class label; similarity calculation against the training data; and visualization of radical and non-radical content. The experimental results show that k-Nearest Neighbor (kNN) classification, evaluated with 10-fold cross-validation, achieves 66.37% accuracy with k = 7 [1].
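The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "DF" means document frequency, that similarity is cosine similarity over term counts, and that whitespace tokenization is adequate. The toy English documents, the labels, the `min_df` value, and all function names are hypothetical. The sketch uses k = 3 because the toy set has only six documents (the abstract reports k = 7), and it omits the Human Brain step and the cross-validation.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization stands in for real preprocessing.
    return text.lower().split()


def select_features(docs, min_df=2):
    """Keep terms that appear in at least min_df documents (a DF threshold)."""
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))  # count each term once per document
    return sorted(term for term, n in df.items() if n >= min_df)


def vectorize(doc, vocab):
    """Term-count vector restricted to the selected vocabulary."""
    counts = Counter(tokenize(doc))
    return [counts[term] for term in vocab]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(query_vec, train_vecs, train_labels, k=7):
    """Majority vote among the k training vectors most similar to the query."""
    ranked = sorted(
        zip(train_vecs, train_labels),
        key=lambda pair: cosine(query_vec, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus; real input would be labeled web pages.
train_docs = [
    "attack them burn them now",
    "hate them attack now",
    "burn and attack the enemy",
    "market prices rise today",
    "rain expected today in the city",
    "city market opens today",
]
train_labels = ["radical"] * 3 + ["non-radical"] * 3

vocab = select_features(train_docs, min_df=2)
train_vecs = [vectorize(doc, vocab) for doc in train_docs]


def classify(text, k=3):
    return knn_predict(vectorize(text, vocab), train_vecs, train_labels, k=k)


print(classify("burn them now"))
print(classify("market today"))
```

Raising `min_df` shrinks the vocabulary and removes rare, noisy terms, at the cost of discarding some discriminative ones. In a full evaluation the same functions would be run inside each cross-validation fold, with the vocabulary selected from the training portion only.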

Authors and Affiliations

Muh. Subhan, Amang Sudarsono, Ali Ridho Barakbah

  • EP ID EP320663
  • DOI 10.24003/emitter.v5i2.214

How To Cite

Muh. Subhan, Amang Sudarsono, Ali Ridho Barakbah (2017). Classification of Radical Web Content in Indonesia using Web Content Mining and k-Nearest Neighbor Algorithm. EMITTER International Journal of Engineering Technology, 5(2), 328-348. https://europub.co.uk/articles/-A-320663