MapReduce Programs Simplification using a Query Criteria API
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2018, Vol 9, Issue 6
Abstract
The Hadoop Distributed File System (HDFS) is an organized, distributed collection of files. It is designed to store large volumes of data and to retrieve and analyze that data efficiently, in less time. To retrieve and analyze data stored in HDFS, MapReduce jobs must be created either directly, using a programming language such as Java, or indirectly, using a high-level language such as HiveQL or Pig Latin. Writing MapReduce programs directly in a programming language is widely regarded as a difficult task that demands considerable effort, both to create the programs and to maintain them. Writing MapReduce code by hand takes a lot of time, introduces bugs, harms readability, and impedes optimizations. Practitioners working in the field of big data try to avoid long, complicated programs in their work. They look for much simpler alternatives, such as graphical interfaces, short scripts in languages like Pig Latin, or even SQL queries. This article proposes a MapReduce Query API, inspired by Hibernate Criteria, to simplify the code of MapReduce programs. The API provides a set of predefined methods for expressing restrictions, projections, logical conditions, and so on. An implementation of the Word Count example using the Query Criteria API is illustrated in this paper.
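To make the idea of a criteria-style query concrete, the following is a minimal sketch of Word Count written against a small fluent builder. It is not the API proposed in the paper: the names `CriteriaSketch`, `Criteria`, `add`, `setProjection`, `count`, and `wordCount` are hypothetical, chosen only to echo the Hibernate Criteria style, and the sketch runs in memory on a list of lines rather than on Hadoop.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Illustrative only: an in-memory stand-in for a criteria-style query builder.
// All class and method names here are assumptions, not the paper's API.
final class CriteriaSketch {

    /** Hypothetical criteria object: collects restrictions and a projection, then runs them. */
    static final class Criteria {
        private Predicate<String> restriction = token -> true;   // no restriction by default
        private Function<String, String> projection = Function.identity();

        /** Adds a restriction; successive restrictions are combined with logical AND. */
        Criteria add(Predicate<String> condition) {
            restriction = restriction.and(condition);
            return this;
        }

        /** Sets the projection that turns a token into its grouping key. */
        Criteria setProjection(Function<String, String> keyExtractor) {
            projection = keyExtractor;
            return this;
        }

        /** "Map": split lines into tokens and filter them. "Reduce": count tokens per key. */
        Map<String, Long> count(List<String> lines) {
            return lines.stream()
                    .flatMap(line -> Arrays.stream(line.trim().split("\\s+")))
                    .filter(token -> !token.isEmpty())
                    .filter(restriction)
                    .collect(Collectors.groupingBy(projection, TreeMap::new, Collectors.counting()));
        }
    }

    /** Word Count expressed as a query: keep tokens of at least minLength, group case-insensitively. */
    static Map<String, Long> wordCount(List<String> lines, int minLength) {
        return new Criteria()
                .add(token -> token.length() >= minLength)
                .setProjection(String::toLowerCase)
                .count(lines);
    }

    public static void main(String[] args) {
        List<String> lines = List.of("Hadoop stores data", "hadoop analyzes data");
        System.out.println(wordCount(lines, 1)); // prints {analyzes=1, data=2, hadoop=2, stores=1}
    }
}
```

The point of the sketch is the shape of the calling code: `wordCount` states what to keep and how to group, while the splitting, filtering, and counting that a hand-written mapper and reducer would spell out stay hidden behind the builder.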
Authors and Affiliations
Boulchahoub Hassan, Khalil Namir, Amina Rachiq, Labriji Elhoussin, Benabbou Fouzia