A Model for Improving Classifier Accuracy using Outlier Analysis

Journal Title: INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY - Year 2013, Vol 7, Issue 1

Abstract

Anomalies are records that behave differently from, and do not conform to, the remaining records in a dataset. Outlier analysis is the task of finding such anomalies. Detecting outliers efficiently is an important problem in many fields of science, medicine and technology. Many methods are available for detecting anomalies in numerical datasets, but only a limited number are available for categorical datasets. In this work, a novel entropy-based method for detecting outliers in categorical data is proposed. The algorithm finds anomalies from a score computed for each record, called the BAD score, and has great intuitive appeal. It uses the frequency of each value in the dataset. The Greedy method needs k scans of the dataset to find k outliers, whereas the proposed method needs only one scan and calculates the BAD score of each record directly. It avoids the need to supply k as an input and can find any number of outliers directly from the dataset. The AVF method has lower time complexity than other methods such as Greedy, FPOF and FDOD. Greedy has good accuracy compared with other methods such as AVF, FPOF and FDOD (which are based on frequency patterns of all combinations of values in each record). Our algorithm shows better accuracy than both AVF and Greedy, and its time complexity comes close to that of AVF. The algorithm has been applied to the Nursery and Bank datasets taken from the UCI Machine Learning Repository. This work also proposes extending the Normal distribution [11] and Fuzzy [12] concepts to the BAD score [13]; that is, NAVF combined with Fuzzy AVF is applied to the BAD score. Numerical attributes are excluded from the datasets for our analysis. The experimental results show that the method is efficient for outlier detection in categorical datasets.
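
The abstract does not define the BAD score, so it is not reproduced here. As a point of comparison, the frequency-based AVF baseline it refers to can be sketched as follows, assuming the common definition of the AVF score as the mean frequency of a record's attribute values (lower scores suggest outliers). The function names `avf_scores` and `flag_outliers`, the toy data, and the mean-minus-k-standard-deviations cutoff are illustrative assumptions, loosely in the spirit of the normal-distribution idea mentioned above, and are not taken from the paper.

```python
from collections import Counter
from statistics import mean, pstdev


def avf_scores(records):
    """AVF score per record: mean frequency of its attribute values.

    Assumes every record is a tuple of categorical values of equal length.
    """
    if not records:
        return []
    n_attrs = len(records[0])
    # Counting pass: frequency of each value within each attribute (column).
    counts = [Counter(row[j] for row in records) for j in range(n_attrs)]
    # Scoring pass: average the frequencies of the record's own values.
    return [
        sum(counts[j][row[j]] for j in range(n_attrs)) / n_attrs
        for row in records
    ]


def flag_outliers(scores, k=3.0):
    """Indices whose score falls below mean - k * std (hypothetical cutoff)."""
    cutoff = mean(scores) - k * pstdev(scores)
    return [i for i, s in enumerate(scores) if s < cutoff]


# Toy data (made up): nine identical records and one rare record.
records = [("a", "x")] * 9 + [("b", "y")]
scores = avf_scores(records)
suspects = flag_outliers(scores, k=2.0)
```

With this kind of cutoff no outlier count needs to be supplied in advance; the number of flagged records follows from the score distribution and the chosen k.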

Authors and Affiliations

Lakshmi Sreenivasa Reddy. D, Dr B. Raveendrababu, Dr A. Govardhan

  • EP ID EP650088
  • DOI 10.24297/ijct.v7i1.3480

How To Cite

Lakshmi Sreenivasa Reddy. D, Dr B. Raveendrababu, Dr A. Govardhan (2013). A Model for Improving Classifier Accuracy using Outlier Analysis. INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY, 7(1), 500-509. https://europub.co.uk/articles/-A-650088