Outlier Analysis of Categorical Data Using Infrequency

Journal Title: INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY - Year 2013, Vol 8, Issue 3

Abstract

Anomalies are objects that behave differently from, and do not conform to, the remaining records in a database. Detecting anomalies is an important problem in many fields. Although many methods exist for detecting anomalies in numerical datasets, only a few are available for categorical datasets. This work proposes a new method that finds anomalies based on the infrequent itemsets in each record. The outliers are identified by applying the Apriori property to the values of each record in the dataset. Previous methods may fail to distinguish different records that have the same frequency, and so assign those records the same score. The FPOF method uses the concept of frequent itemsets and Otey's method uses infrequent itemsets, but neither can distinguish records perfectly. In the proposed algorithm, a score is generated for each record based on its infrequent itemsets; this is called the MAD score in this paper. The algorithm also uses the frequency of each value in the dataset. It has been applied to the Nursery and Bank datasets from the UCI Machine Learning Repository, with numerical attributes excluded from the analysis. The experimental results show that it is efficient for outlier detection in categorical datasets.
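
The abstract does not define the MAD score, so the following Python sketch only illustrates the general idea of scoring records by the infrequent itemsets they contain. The function name `infrequency_scores`, the parameters `min_support` and `max_len`, and the weighting `(1 - support) / k` are all assumptions made for illustration; they are not taken from the paper. The sketch also enumerates itemsets exhaustively rather than pruning with the Apriori property.

```python
from collections import Counter
from itertools import combinations


def infrequency_scores(records, min_support=0.3, max_len=2):
    """Score each categorical record by the infrequent itemsets it contains.

    Illustrative stand-in only; this is NOT the paper's MAD score.
    An item is an (attribute index, value) pair, and an itemset is a
    combination of up to max_len items taken from one record.
    """
    n = len(records)
    itemized = [tuple(enumerate(record)) for record in records]

    # Count how many records contain each itemset of size 1..max_len.
    counts = Counter()
    for items in itemized:
        for k in range(1, max_len + 1):
            counts.update(combinations(items, k))

    scores = []
    for items in itemized:
        score = 0.0
        for k in range(1, max_len + 1):
            for itemset in combinations(items, k):
                support = counts[itemset] / n
                if support < min_support:
                    # Assumed weighting: rarer and shorter itemsets
                    # contribute more to the outlier score.
                    score += (1.0 - support) / k
        scores.append(score)
    return scores


# Hypothetical toy data: three identical records and one rare record.
sample = [("a", "x"), ("a", "x"), ("a", "x"), ("b", "y")]
scores = infrequency_scores(sample)
```

Because the weight depends on the support of each itemset and not only on how many infrequent itemsets a record has, two records with the same number of infrequent itemsets can still receive different scores, which is the kind of distinction the abstract says earlier methods lack.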

Authors and Affiliations

Lakshmi Sreenivasareddy Dirisinapu, Krishna Murthy Mudumbi, Govardhan Aliseri

  • EP ID EP650133
  • DOI 10.24297/ijct.v8i3.3397

How To Cite

Lakshmi Sreenivasareddy Dirisinapu, Krishna Murthy Mudumbi, Govardhan Aliseri (2013). Outlier Analysis of Categorical Data Using Infrequency. INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY, 8(3), 868-873. https://europub.co.uk/articles/-A-650133