Outlier Analysis of Categorical Data Using Infrequency

Journal Title: INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY - Year 2013, Vol 8, Issue 3

Abstract

Anomalies are objects that behave differently from, and do not conform to, the remaining records in a database. Detecting anomalies is an important problem in many fields. Although many methods exist for detecting anomalies in numerical datasets, only a few are available for categorical datasets. This work proposes a new method that finds anomalies based on the infrequent itemsets in each record. These itemsets are generated by applying the Apriori property to the values of each record in the dataset. Previous methods may not distinguish between different records with the same frequency, and so assign them the same score. Here, a score based on infrequent itemsets, called the MAD score in this paper, is generated for each record. The algorithm uses the frequency of each value in the dataset. The FPOF method uses the concept of frequent itemsets and Otey's method uses infrequent itemsets, but neither can distinguish records perfectly. The proposed algorithm has been applied to the Nursery and Bank datasets taken from the “UCI Machine Learning Repository”. Numerical attributes were excluded from the datasets for this analysis. The experimental results show that the method is efficient for outlier detection in categorical datasets.
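
The abstract does not define the MAD score, so the following is only a minimal sketch of the general idea of scoring records by their infrequent itemsets. It assumes an Otey-style weighting in which each infrequent itemset contributes 1/(itemset length); the function name `infrequency_scores` and the parameters `min_support` and `max_len` are hypothetical, and itemsets are enumerated by brute force rather than pruned with the Apriori property.

```python
from collections import Counter
from itertools import combinations


def infrequency_scores(records, min_support=0.5, max_len=2):
    """Score each categorical record by its infrequent itemsets.

    Illustrative only: this is NOT the paper's MAD score. Each itemset
    whose relative support is below min_support adds 1/len(itemset).
    """
    n = len(records)

    def itemsets(rec):
        # An item is an (attribute index, value) pair, so equal values
        # in different attributes are kept distinct.
        items = list(enumerate(rec))
        for k in range(1, max_len + 1):
            yield from combinations(items, k)

    # First pass: count how many records contain each itemset.
    support = Counter()
    for rec in records:
        support.update(itemsets(rec))

    # Second pass: sum the weights of each record's infrequent itemsets.
    scores = []
    for rec in records:
        score = 0.0
        for itemset in itemsets(rec):
            if support[itemset] / n < min_support:
                score += 1.0 / len(itemset)
        scores.append(score)
    return scores


data = [("a", "x"), ("a", "x"), ("a", "x"), ("b", "y")]
print(infrequency_scores(data))
```

Under these assumptions, a higher score marks a record as more anomalous: the record `("b", "y")` is the only one containing infrequent itemsets, so it is the only one with a nonzero score.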

Authors and Affiliations

Lakshmi Sreenivasareddy Dirisinapu, Krishna Murthy Mudumbi, Govardhan Aliseri

  • EP ID EP650133
  • DOI 10.24297/ijct.v8i3.3397

How To Cite

Lakshmi Sreenivasareddy Dirisinapu, Krishna Murthy Mudumbi, Govardhan Aliseri (2013). Outlier Analysis of Categorical Data Using Infrequency. INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY, 8(3), 868-873. https://europub.co.uk/articles/-A-650133