Outlier Analysis of Categorical Data Using Infrequency
Journal Title: INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY - Year 2013, Vol 8, Issue 3
Abstract
Anomalies are objects that behave differently from, and do not conform to, the remaining records in a database. Detecting anomalies is an important problem in many fields. Although many methods are available for detecting anomalies in numerical datasets, only a few exist for categorical datasets. This work proposes a new method that finds anomalies based on the infrequent itemsets in each record. These itemsets are generated by applying the Apriori property to the values of each record in the dataset. Previous methods may fail to distinguish different records that have the same frequency, and so assign them the same score. Here, a score is generated for each record from its infrequent itemsets; this paper calls it the MAD score. The algorithm uses the frequency of each value in the dataset. The FPOF method uses the concept of frequent itemsets and Otey's method uses infrequent itemsets, but neither distinguishes records perfectly. The proposed algorithm has been applied to the Nursery and Bank datasets taken from the "UCI Machine Learning Repository". Numerical attributes were excluded from the datasets for this analysis. The experimental results show that the method is efficient for outlier detection in categorical datasets.
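As a rough illustration of scoring records by their infrequent itemsets, the sketch below sums 1/|itemset| over every infrequent itemset a record contains, which is the general style of scoring used in Otey's method. It is not the MAD score, whose definition the abstract does not give. The function name, the `min_support` and `max_len` parameters, and the toy rows are assumptions made for this example, and itemsets are enumerated by brute force rather than pruned with the Apriori property.

```python
from collections import Counter
from itertools import combinations


def infrequent_itemset_scores(records, min_support=0.3, max_len=2):
    """Score each record by the infrequent itemsets it contains.

    Hypothetical helper for illustration only: an itemset is a tuple of
    (attribute, value) pairs, it is infrequent when its relative support
    is below min_support, and each one adds 1/len(itemset) to the score.
    """
    n = len(records)

    def itemsets(rec):
        # All itemsets of size 1..max_len drawn from one record.
        items = sorted(rec.items())
        for k in range(1, max_len + 1):
            yield from combinations(items, k)

    # Pass 1: count how many records contain each itemset.
    counts = Counter()
    for rec in records:
        counts.update(itemsets(rec))

    # Pass 2: accumulate the score of each record.
    scores = []
    for rec in records:
        score = 0.0
        for itemset in itemsets(rec):
            if counts[itemset] / n < min_support:
                score += 1.0 / len(itemset)
        scores.append(score)
    return scores


# Made-up toy rows (not taken from the paper's experiments).
records = [
    {"parents": "usual", "health": "recommended"},
    {"parents": "usual", "health": "recommended"},
    {"parents": "usual", "health": "recommended"},
    {"parents": "usual", "health": "priority"},
    {"parents": "great_pret", "health": "not_recom"},
]
scores = infrequent_itemset_scores(records)
print(scores)  # [0.0, 0.0, 0.0, 1.5, 2.5]
```

The last record scores highest because both of its values and their pair are rare. The fourth record shares one common value with the majority and so scores lower.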
Authors and Affiliations
Lakshmi Sreenivasareddy Dirisinapu, Krishna Murthy Mudumbi, Govardhan Aliseri