A New Architecture for Real Time Data Stream Processing

Abstract

Processing a data stream in real time is crucial for many applications, but processing large volumes of data from different sources, such as sensor networks, web traffic, social media, and video streams, is a major challenge. The main problem is that big data systems are generally based on Hadoop technology, in particular MapReduce for processing. MapReduce is a highly scalable, fault-tolerant framework that processes large amounts of data in batches and provides insight into older data, but it can only process a limited set of data. It is therefore not appropriate for real-time stream processing, where data must be processed the moment they arrive in order to obtain fast responses and support good decision making. Hence the need for a new architecture that allows real-time data processing at high speed and with low latency. The main aim of this paper is to give a clear survey of the open-source technologies that exist for real-time data stream processing, including their system architectures. We also propose a new architecture, based mainly on the preceding comparison of real-time processing technologies, that is powered by machine learning and Storm technology.

Authors and Affiliations

Soumaya Ounacer, Mohamed Amine TALHAOUI, Soufiane Ardchir, Abderrahmane Daif, Mohamed Azouazi

DOI: 10.14569/IJACSA.2017.081106

How To Cite

Soumaya Ounacer, Mohamed Amine TALHAOUI, Soufiane Ardchir, Abderrahmane Daif, Mohamed Azouazi (2017). A New Architecture for Real Time Data Stream Processing. International Journal of Advanced Computer Science & Applications, 8(11), 44-51. https://europub.co.uk/articles/-A-240422