Efficient Document Clustering System Based on Probability Distribution of K-Means (PD K-Means) Model
Journal Title: International Journal of the Computer, the Internet and Management - Year 2018, Vol 26, Issue 1
Abstract
In a document clustering system, documents with the same similarity scores may fall into different clusters rather than the same cluster, because the similarity distance between pairs of documents is calculated from geometric measurements. To address this, a probability distribution of K-Means (PD K-Means) algorithm is proposed. In this system, documents are clustered using a proposed probability distribution equation instead of a similarity measure between objects. The system also addresses the initial centroid problem of K-Means by using the Systematic Selection of Initial Centroid (SSIC) approach. As a result, it generates compact and stable results and eliminates the initial cluster problem of K-Means. In the experiments, F-measure values increase by about 0.28 on the 20 Newsgroups dataset, and by 0.26 on R8 and 0.14 on R52 from the Reuters-21578 dataset. The evaluations demonstrate that the proposed solution outperforms the original method and can be applied to various standard and unsupervised datasets.
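The abstract reports its gains as F-measure values but does not state which variant of the measure is used. As an illustration only, the sketch below computes one common clustering F-measure: each true class is matched with the cluster that gives it the highest F-score, and the per-class scores are averaged with weights proportional to class size. The function name clustering_f_measure and the toy labels are hypothetical and do not come from the paper, and the paper's exact evaluation protocol may differ.

```python
from collections import Counter


def clustering_f_measure(true_labels, cluster_ids):
    """Class-weighted best-match F-measure for a flat clustering.

    Assumption: this is a commonly used variant, not necessarily
    the one used in the paper.
    """
    if len(true_labels) != len(cluster_ids):
        raise ValueError("label sequences must have the same length")
    n = len(true_labels)
    if n == 0:
        raise ValueError("at least one document is required")

    class_sizes = Counter(true_labels)
    cluster_sizes = Counter(cluster_ids)
    # Number of documents shared by each (class, cluster) pair.
    overlap = Counter(zip(true_labels, cluster_ids))

    total = 0.0
    for cls, n_cls in class_sizes.items():
        best = 0.0
        for clu, n_clu in cluster_sizes.items():
            n_both = overlap[(cls, clu)]
            if n_both == 0:
                continue
            precision = n_both / n_clu
            recall = n_both / n_cls
            f_score = 2 * precision * recall / (precision + recall)
            best = max(best, f_score)
        # Weight each class by its share of the collection.
        total += (n_cls / n) * best
    return total


# Toy labels (hypothetical): four documents, two true topics.
topics = ["sport", "sport", "trade", "trade"]
perfect = clustering_f_measure(topics, [0, 0, 1, 1])
merged = clustering_f_measure(topics, [0, 0, 0, 0])
print(perfect, merged)
```

A clustering that recovers the topics exactly scores 1.0, while merging everything into one cluster scores lower because precision drops for every class.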
Authors and Affiliations
Tin Thu Zar Win, Nang Aye Aye Htwe, Moe Moe Aye
A Conceptual Model of Big Data for Electrical Energy Management in Smart City
Smart City is a new-age technological system for managing and administering facility control systems with information technology. However, general information technology alone is inadequate and does not cover the requirement of...
A Conceptual Framework of Collaborative-Based Recommender System in Risk Management for Curriculum Management Under the Thai Qualification Framework for Higher Education
The aim of this research is to develop a conceptual framework of collaborative-based recommender system in risk management for curriculum management under the Thai Qualification Framework for Higher Education (TFQ:HEd)....
The Development of Schools and the Quality of the Teachers’ Lives under Municipalities in Thailand
The purpose of this research was to study and analyze factors affecting the development of schools and how this development is reflected in the Quality of the teachers’ lives under municipalities in Thailand. The sample...
Curriculum Development of Building Inspectors According to Building Control Act (2522) Using Delphi Technique
The research and data collection comprised three steps. The first step was open-ended interviews with expert building inspectors, covering issues of Curriculum Development of Building Inspectors and opinion collection of eac...
Text Neck Epidemic: a Growing Problem for Smart Phone Users in Thailand
This research study examined the rapid changes in computer user behavior among Thai internet users, and analyzed differences in the computer health risk between desktop users and mobile device users. The emphasis is on “...