An Effective Approach for Web Document Classification using FP-Growth and Naïve Bayes Techniques
Journal Title: International Journal of Computer Science & Engineering Technology - Year 2012, Vol 3, Issue 10
Abstract
Exponential growth of the web has increased the importance of web document classification and data mining. Obtaining exact information, in the form of knowing which classes a web document belongs to, is expensive. Automatic classification of web documents is of great use to search engines, as it provides this information at low cost. In this paper, we propose an approach for classifying web documents using the frequent item word sets generated by the Frequent Pattern (FP) Growth technique. These sets of associated words act as the feature set. The final classification is obtained by applying a Naïve Bayes classifier to the feature set. For the experimental work, we use the Gensim package, as it is simple and robust. Results show that our approach can effectively classify web documents.
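The pipeline described in the abstract (frequent word sets as features, then Naïve Bayes) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the Python standard library rather than Gensim, it mines frequent word sets by brute-force enumeration as a stand-in for FP-Growth (both return the same frequent sets for a given support threshold, here limited to sets of at most `max_size` words), and it assumes a Bernoulli Naïve Bayes variant with Laplace smoothing. The function names, the toy documents, and the parameters `min_support` and `max_size` are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def frequent_word_sets(docs, min_support=2, max_size=2):
    """Return word sets that occur in at least min_support documents.

    Brute-force enumeration, used as a simple stand-in for FP-Growth.
    """
    counts = Counter()
    for words in docs:
        unique = sorted(set(words))
        for size in range(1, max_size + 1):
            counts.update(combinations(unique, size))
    return [frozenset(s) for s, c in counts.items() if c >= min_support]


def featurize(words, word_sets):
    """Binary vector: 1 if every word of a frequent set is in the document."""
    present = set(words)
    return [1 if s <= present else 0 for s in word_sets]


def train_naive_bayes(X, y):
    """Fit a Bernoulli Naive Bayes model with Laplace smoothing."""
    priors, likelihoods = {}, {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        priors[c] = len(rows) / len(X)
        likelihoods[c] = [
            (sum(r[j] for r in rows) + 1) / (len(rows) + 2)
            for j in range(len(X[0]))
        ]
    return priors, likelihoods


def predict(x, model):
    """Return the class with the highest log-posterior score."""
    priors, likelihoods = model
    best, best_score = None, -math.inf
    for c, prior in priors.items():
        score = math.log(prior)
        for xj, p in zip(x, likelihoods[c]):
            score += math.log(p if xj else 1 - p)
        if score > best_score:
            best, best_score = c, score
    return best


# Toy, hand-made documents (already tokenized); purely illustrative.
docs = [
    ["cricket", "match", "score"],
    ["football", "match", "score"],
    ["cricket", "score", "team"],
    ["stock", "market", "price"],
    ["market", "price", "trade"],
    ["stock", "price", "trade"],
]
labels = ["sports"] * 3 + ["finance"] * 3

word_sets = frequent_word_sets(docs, min_support=2)
model = train_naive_bayes([featurize(d, word_sets) for d in docs], labels)
print(predict(featurize(["cricket", "team", "score", "win"], word_sets), model))
```

In practice the enumeration step would be replaced by a real FP-Growth implementation, which avoids generating candidate sets explicitly and scales far better to large vocabularies.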
Authors and Affiliations
Rajendra Kumar Roul, Dr. Sanjay Kumar Sahay