Document Grouping by Using Meronyms and Type-2 Fuzzy Association Rule Mining
Journal Title: Journal of ICT Research and Applications - Year 2017, Vol 11, Issue 3
Abstract
The number of textual documents in the digital world, especially on the World Wide Web, is growing extremely fast. This causes information to accumulate, so efficient organization is needed to manage textual documents. One way to cluster documents accurately is to use fuzzy association rules. The quality of document clustering is affected by the key term extraction phase and by the type of fuzzy logic system (FLS) used for clustering. Using meronyms in key term extraction helps produce meaningful cluster labels. In addition, the ambiguities and uncertainties that occur in the rules of type-1 fuzzy logic systems can be overcome by using type-2 fuzzy sets. This study proposes a method of key term extraction based on meronyms, with cluster initialization using fuzzy association rule mining, for document clustering. The method consists of four stages: document preprocessing, extraction of key terms with meronyms, extraction of candidate clusters, and cluster tree construction. The method was tested on three datasets: classic, Reuters, and 20 Newsgroup. Testing compared the overall F-measure of the method without meronyms and with meronyms. With meronyms in the key term extraction, the method produced an overall F-measure of 0.5753 for the classic dataset, 0.3984 for the Reuters dataset, and 0.6285 for the 20 Newsgroup dataset.
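The abstract reports results as an overall F-measure but does not define it. The sketch below assumes the definition commonly used for document clustering: for each true class, take the best F-score over all clusters, then average those best scores weighted by class size. The function name `overall_f_measure` and the toy labels are hypothetical and only illustrate the calculation; the paper's exact evaluation procedure may differ.

```python
from collections import Counter


def overall_f_measure(true_labels, cluster_ids):
    """Class-size-weighted average of the best per-class F-score.

    Assumed definition (not confirmed by the abstract):
        F = sum_i (n_i / n) * max_j F(i, j)
    where F(i, j) is the harmonic mean of precision and recall
    of cluster j with respect to class i.
    """
    n = len(true_labels)
    class_sizes = Counter(true_labels)
    cluster_sizes = Counter(cluster_ids)
    # Number of documents shared by each (class, cluster) pair.
    overlap = Counter(zip(true_labels, cluster_ids))

    total = 0.0
    for cls, n_cls in class_sizes.items():
        best = 0.0
        for clu, n_clu in cluster_sizes.items():
            n_both = overlap[(cls, clu)]
            if n_both == 0:
                continue  # F-score is zero for disjoint class and cluster
            precision = n_both / n_clu
            recall = n_both / n_cls
            best = max(best, 2 * precision * recall / (precision + recall))
        total += (n_cls / n) * best
    return total


# Toy example: six documents, two true classes, two clusters.
true_labels = ["a", "a", "a", "b", "b", "b"]
cluster_ids = [0, 0, 1, 1, 1, 1]
print(round(overall_f_measure(true_labels, cluster_ids), 4))
```

Under this definition a score of 1.0 means every class is matched exactly by some cluster, so the reported values can be read as how closely the best-matching clusters recover the labeled classes in each dataset.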
Authors and Affiliations
Fahrur Rozi, Farid Sukmana