Frequent Pattern Mining using CATSIM Tree
Journal Title: International Journal on Computer Science and Engineering - Year 2012, Vol 4, Issue 9
Abstract
Efficient algorithms for discovering frequent patterns are essential in data mining research. Frequent pattern mining is emerging as a powerful tool for many business applications, such as e-commerce, recommender systems, supply chain management and group decision support systems, to name a few. Several effective data structures, such as two-dimensional arrays, graphs, trees and tries, have been proposed to collect candidate and frequent itemsets. The tree structure appears to be the most attractive for storing itemsets. The most prominent tree proposed so far is the FP-tree, which is a prefix tree structure. The CATS Tree has been proposed as an advancement on the FP-tree structure. It extends the idea of the FP-tree to improve storage compression and allows frequent pattern mining without generating candidate itemsets. It requires only a single pass over the database. The efficiency of Apriori, FP-Growth and CATS Tree for incremental mining is very poor, because all of these algorithms must regenerate the tree repeatedly to support incremental mining. The implemented CATSIM Tree uses more memory than Apriori, FP-Growth and CATS Tree, but with advances in technology this is not a major concern. In this work, the CATSIM Tree, a modification of the CATS Tree, is implemented to support incremental mining with better results.
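The abstract does not describe the internals of the CATSIM Tree, so the following is only a minimal sketch of the general idea it builds on: a prefix tree that stores itemsets with counts, is built in a single pass, and accepts new transactions without being rebuilt. The names `PrefixTree`, `add` and `support` are hypothetical, and ordering items lexicographically is an assumption made for simplicity. It is not the CATS or CATSIM algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Number of transactions whose sorted item sequence passes through this node.
    count: int = 0
    children: dict = field(default_factory=dict)


class PrefixTree:
    """Illustrative prefix tree over itemsets (not the CATSIM Tree itself)."""

    def __init__(self):
        self.root = Node()

    def add(self, transaction):
        # Insert one transaction; existing nodes are updated in place,
        # so later transactions never force a rebuild of the tree.
        node = self.root
        node.count += 1
        for item in sorted(set(transaction)):
            node = node.children.setdefault(item, Node())
            node.count += 1

    def support(self, itemset):
        # Count the transactions that contain every item in `itemset`.
        return self._support(self.root, sorted(set(itemset)))

    def _support(self, node, items):
        if not items:
            return node.count
        total = 0
        for item, child in node.children.items():
            if item == items[0]:
                # Matched the next wanted item; look for the rest below.
                total += self._support(child, items[1:])
            elif item < items[0]:
                # A smaller item may still precede the wanted one on this path.
                total += self._support(child, items)
            # Larger items cannot be followed by items[0] in sorted order.
        return total


tree = PrefixTree()
for t in (["a", "b", "c"], ["a", "c"], ["b", "c"]):
    tree.add(t)

before = tree.support(["a", "b"])
tree.add(["a", "b"])  # incremental update, no rebuild
after = tree.support(["a", "b"])
```

In this sketch an itemset would be reported as frequent when `support` meets a chosen minimum threshold; the threshold and the mining procedure over the tree are left out because the abstract does not specify them.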
Authors and Affiliations
Ketan Modi, B. L. Pal