MapReduce Programs Simplification using a Query Criteria API

Abstract

The Hadoop Distributed File System (HDFS) stores files as an organized, distributed collection. It is designed to hold very large volumes of data and to support efficient retrieval and analysis of that data. To retrieve and analyze data stored in HDFS, MapReduce jobs must be written either directly in a programming language such as Java or indirectly in a high-level language such as HiveQL or Pig Latin. Writing MapReduce programs directly in a programming language is difficult and demands considerable effort, both to create the programs and to maintain them. Hand-written MapReduce code takes a long time to produce, tends to introduce bugs, harms readability, and impedes optimization. Practitioners working in big data therefore try to avoid long, complex programs and look for simpler alternatives such as graphical interfaces, compact scripts in Pig Latin, or SQL queries. This article proposes a MapReduce Query API, inspired by Hibernate Criteria, to simplify the code of MapReduce programs. The API provides a set of predefined methods for expressing restrictions, projections, logical conditions, and so on. The paper illustrates the approach with an implementation of the Word Count example using the Query Criteria API.
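To give a rough sense of the criteria style described in the abstract, the following is a minimal in-memory sketch of a word count written as a fluent query. It is not the API proposed in the paper and it does not run on Hadoop: the class `WordCountCriteria` and its methods `on`, `add`, and `count` are hypothetical names chosen for illustration, and plain Java streams stand in for the map and reduce phases.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical illustration only: these names are assumptions and do not
// come from the paper's Query Criteria API or from Hadoop.
public class WordCountCriteria {
    private final List<String> lines;
    // Starts as "accept everything"; each add() narrows it with a logical AND.
    private Predicate<String> restriction = w -> true;

    private WordCountCriteria(List<String> lines) {
        this.lines = lines;
    }

    // Entry point: build a query over a list of text lines.
    public static WordCountCriteria on(List<String> lines) {
        return new WordCountCriteria(lines);
    }

    // Restriction: keep only the words that satisfy the given predicate.
    public WordCountCriteria add(Predicate<String> p) {
        restriction = restriction.and(p);
        return this;
    }

    // "Map" step: split each line into lowercase words.
    // "Reduce" step: count occurrences per word, sorted by word.
    public Map<String, Long> count() {
        return lines.stream()
            .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\s+")))
            .filter(w -> !w.isEmpty())
            .filter(restriction)
            .collect(Collectors.groupingBy(
                Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<String, Long> counts = WordCountCriteria
            .on(Arrays.asList("the quick brown fox", "the lazy dog"))
            .add(w -> w.length() > 2)   // restriction on word length
            .count();
        System.out.println(counts);
    }
}
```

The point of the sketch is the shape of the calling code: the restriction and the aggregation are declared through predefined methods, so the caller does not write mapper and reducer classes by hand.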

Authors and Affiliations

Boulchahoub Hassan, Khalil Namir, Amina Rachiq, Labriji Elhoussin, Benabbou Fouzia

  • EP ID EP319654
  • DOI 10.14569/IJACSA.2018.090607

How To Cite

Boulchahoub Hassan, Khalil Namir, Amina Rachiq, Labriji Elhoussin, Benabbou Fouzia (2018). MapReduce Programs Simplification using a Query Criteria API. International Journal of Advanced Computer Science & Applications, 9(6), 50-54. https://europub.co.uk/articles/-A-319654