Anomaly Detection in Data with Extremely High Dimensional Space via Online Oversampling Principal Component Analysis
Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2014, Vol 16, Issue 3
Abstract
Anomaly detection is an important research topic in data mining and machine learning. Many real-world applications, such as intrusion detection and credit card fraud detection, need an effective and efficient framework to identify deviating data instances. A good anomaly detection method must accurately identify many types of anomalies, be robust, require comparatively few resources, and perform detection in a timely manner. In this paper we propose combining two algorithms, Median Based Outlier Detection and Online Oversampling PCA, for effective anomaly detection in an online updating mode. Median Based Outlier Detection uses the interquartile range, a measure of statistical dispersion equal to the difference between the upper and lower quartiles. Oversampling PCA, in contrast, does not need to store the entire covariance matrix or data matrix, which makes the approach more useful for online or large-scale problems. Experimental results, compared against other anomaly detection algorithms, verify the feasibility of the proposed method.
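As an illustration of the interquartile range mentioned in the abstract, the following is a minimal sketch of IQR-based outlier flagging in Python. It is not the authors' implementation: the function name `iqr_outliers`, the fence multiplier `k = 1.5` (the common Tukey convention), the quartile interpolation method, and the sample data are all assumptions made for the example.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR].

    k = 1.5 is an assumed, conventional fence width; the abstract
    does not state which threshold the paper uses.
    """
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # difference between the upper and lower quartiles
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical one-dimensional readings with one obvious deviation.
readings = [10, 12, 11, 13, 12, 11, 10, 95]
flagged = iqr_outliers(readings)
print(flagged)  # [95]
```

Because quartiles are based on ranks, the single extreme value barely shifts the fences, which is why dispersion measures of this kind are often preferred over the mean and standard deviation for a first screening pass.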
Authors and Affiliations
Swapnil S. Raut, Sachin N. Deshmukh