Anomaly Detection in Data with Extremely High Dimensional Space via Online Oversampling Principal Component Analysis
Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2014, Vol 16, Issue 3
Abstract
Anomaly detection is an important research topic in data mining and machine learning. Many real-world applications, such as intrusion detection and credit card fraud detection, need an effective and efficient framework to identify deviating data instances. A good anomaly detection method must accurately identify many types of anomalies, be robust, require relatively few resources, and perform detection in real time. In this paper we propose combining two algorithms, Median Based Outlier Detection and Online Oversampling PCA, to detect anomalies effectively in an online updating mode. Median Based Outlier Detection uses the interquartile range, a measure of statistical dispersion equal to the difference between the upper and lower quartiles. Oversampling PCA does not need to store the entire covariance matrix or data matrix, which makes the approach better suited to online and large-scale problems. Experimental comparison with other anomaly detection algorithms verifies the feasibility of the proposed method.
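The interquartile range rule mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `iqr_outliers`, the fence multiplier `k = 1.5` (the common Tukey convention), and the sample data are assumptions chosen for illustration only.

```python
import numpy as np

def iqr_outliers(values, k=1.5):
    """Return a boolean mask marking values that fall outside the IQR fences.

    k is an assumed fence multiplier (1.5 is the usual Tukey convention);
    the abstract does not state which value the authors use.
    """
    x = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])   # lower and upper quartiles
    iqr = q3 - q1                         # interquartile range
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return (x < lower) | (x > upper)

# Hypothetical one-dimensional sample with a single extreme value.
data = [10, 12, 11, 13, 12, 11, 95]
mask = iqr_outliers(data)
print([v for v, is_out in zip(data, mask) if is_out])
```

Because the quartiles depend on the ranks of the data rather than on the magnitude of extreme values, the fences stay stable even when a few instances deviate strongly.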
Authors and Affiliations
Swapnil S. Raut, Sachin N. Deshmukh
Selective Encryption of Plaintext Using Multiple Indexing
Abstract: Selective encryption is one of the encryption techniques used in the field of multimedia security. It is used to hide image, video, or audio files. The main feature of selective encryption i...
Grid Computing- An Emerging Technology that enables large-scale resource sharing
Abstract: In the last few years there has been a rapid exponential increase in computer processing power, data storage and communication. But still there are many complex and computation intensive problems, which c...
Predicting software aging related bugs from imbalanced datasets by using data mining techniques
Abstract: Software aging bugs are related to the life span of the software. Rebooting is one solution to this problem; however, it is time-consuming and causes resource loss. It is difficult to detect these bu...
Design and Development of an Automatic Online Newspaper Archiving System
Abstract: News archives have always been a great source of information. To date, several printed and manual information retrieval systems have served this purpose. But these systems are gradually becoming out of date, sinceth...
Improving Web Image Search Re-ranking
Abstract: Nowadays, web-scale image search engines (e.g. Google Image Search, Microsoft Live Image Search) rely almost purely on surrounding text features. This leads to ambiguous and noisy results. We propose an...