Big Data Analytics for Net Flow Analysis in Distributed Environment using Hadoop
Journal Title: International Journal of Research in Computer and Communication Technology - Year 2015, Vol 4, Issue 7
Abstract
Network traffic measurement and analysis have traditionally been performed on a single high-performance server that collects and analyzes packet flows. When monitoring a large-scale network for detailed statistics, the volume of traffic data is large: terabytes or petabytes of data are not easy to handle with a single server, and thousands of machines may be needed. Distributed parallel processing schemes built on cluster file systems have recently been developed and can be applied beneficially to analyzing big network traffic data. Hadoop is a popular parallel processing framework that is widely used for working with large datasets. We analyze NetFlow data on Hadoop clusters ranging from a single node to multiple nodes, and provide an algorithm that calculates the packet count and packet size of each source IP address for every fixed interval of time, in order to detect malicious activity with a low rate of false positives. Finally, we highlight the performance and benefits of a distributed Hadoop cluster on large as well as small data sets.
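The per-source aggregation described in the abstract can be illustrated with a minimal map/reduce-style sketch in plain Python. This is not the authors' algorithm or Hadoop code: the record layout (timestamp, source IP, packet count, byte size), the 60-second interval, and the function names map_flow, reduce_flows, and aggregate are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumed window length in seconds; the abstract does not state the interval.
INTERVAL_SECONDS = 60


def map_flow(record):
    """Map step: key one flow record by (source IP, start of its time window).

    The record layout (timestamp, src_ip, packets, size_bytes) is hypothetical.
    """
    timestamp, src_ip, packets, size_bytes = record
    window_start = int(timestamp) - int(timestamp) % INTERVAL_SECONDS
    return (src_ip, window_start), (int(packets), int(size_bytes))


def reduce_flows(mapped):
    """Reduce step: sum packet counts and byte sizes for each key."""
    totals = defaultdict(lambda: [0, 0])
    for key, (packets, size_bytes) in mapped:
        totals[key][0] += packets
        totals[key][1] += size_bytes
    return {key: tuple(value) for key, value in totals.items()}


def aggregate(records):
    """Run the map and reduce steps over an iterable of flow records."""
    return reduce_flows(map_flow(record) for record in records)


if __name__ == "__main__":
    sample = [
        (0, "10.0.0.1", 5, 500),
        (30, "10.0.0.1", 3, 300),
        (65, "10.0.0.1", 2, 120),
        (10, "10.0.0.2", 1, 40),
    ]
    for key, value in sorted(aggregate(sample).items()):
        print(key, value)
```

In a real Hadoop job the map and reduce steps would run on separate nodes and the framework would group records by key between them; here both steps run in one process only to show the shape of the computation.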
Authors and Affiliations
Amreesh Kumar Patel, D. S. Bhilare, Sushil Buriya, Satyendra Singh Yadav
Recognition besides Adjustment of Inaccurate Fingerprints Matching
Biometric distinguishing proof has included noticeably for people with unique mark developing as the prevailing one. The strength of unique mark is been set up by the persistent rise of diverse types of Automated Fin...
An algorithm for normal profile generation and for attack detection in terms of detection accuracy
We present a DoS attack detection system that uses Multivariate Correlation Analysis (MCA) for precise network traffic description by extracting the geometrical correlations between network traffic features. Our MCA-ba...
Comparison Study Among Various Anomaly Detection Techniques
Many approaches have been implemented for detecting anomalies on a system. Anomaly-based approaches are considered efficient; among them, the user-intention-based approach is preferred for the implementation of anom...
Pseudo Random Number Generator Using Reseeding Module
In cryptography, encoding is the process of encrypting messages or knowledge in such a way that hackers cannot read them. In an encoding scheme, the message or knowledge is encoded using an encryption method, which t...
DHT - Reducing Energy Cost and Identifying Duplicate Nodes in Wireless Sensor Networks
A wireless sensor network is a collection of sensor nodes scattered over an area for data collection. The main problem is the danger of node clones, which arises due to the low-cost, resource-constrained, distrib...