Efficient Distributed SPARQL Queries on Apache Spark
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2019, Vol 10, Issue 8
Abstract
RDF is a widely accepted framework for describing metadata on the web, owing to its simplicity and its universal graph-like data model. Because of the abundance of RDF data, existing query techniques are no longer adequate. To this end, we use the processing power of Apache Spark to load and query a large dataset much more quickly than classical approaches. In this paper, we design experiments to evaluate the performance of several queries, ranging from single-attribute selection to selection, filtering, and sorting over multiple attributes of the dataset. We further measure the performance of distributed SPARQL queries on Apache Spark GraphX and study the stages involved in this pipeline, which gives insight into which stages can be improved. The query pipeline comprises three stages: graph loading, Basic Graph Pattern matching, and result calculation. Our goal is to minimize the time spent in the graph loading stage in order to improve overall performance and cut the cost of data loading.
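The three stages named in the abstract can be illustrated with a small single-machine sketch. This is not the paper's implementation and does not use Apache Spark or GraphX; it is a minimal pure-Python analogue, assuming a toy whitespace-separated triple format and hypothetical helper names (`load_graph`, `match_bgp`, `compute_results`, `run_pipeline`), that times each stage separately so the cost of graph loading can be compared with the other two.

```python
import time

# Hypothetical toy data: one "subject predicate object" triple per line.
TRIPLES_TEXT = """\
alice name Alice
alice age 34
bob name Bob
bob age 27
carol name Carol
carol age 41
"""


def load_graph(text):
    """Stage 1 (graph loading): parse the text into (s, p, o) tuples."""
    return [tuple(line.split(None, 2)) for line in text.splitlines() if line.strip()]


def match_bgp(triples, patterns):
    """Stage 2 (basic graph pattern): terms starting with '?' are variables.

    Each pattern extends the current set of variable bindings; a binding
    survives only if it is consistent with some triple (a nested-loop join).
    """
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for binding in solutions:
            for triple in triples:
                candidate = dict(binding)
                consistent = True
                for term, value in zip(pattern, triple):
                    if term.startswith("?"):
                        # Bind the variable, or check an existing binding.
                        if candidate.setdefault(term, value) != value:
                            consistent = False
                            break
                    elif term != value:
                        consistent = False
                        break
                if consistent:
                    extended.append(candidate)
        solutions = extended
    return solutions


def compute_results(solutions, select, where=None, order_by=None):
    """Stage 3 (result calculation): filter, sort, and project the bindings."""
    rows = [s for s in solutions if where is None or where(s)]
    if order_by is not None:
        rows.sort(key=order_by)
    return [tuple(row[var] for var in select) for row in rows]


def run_pipeline(text, patterns, select, where=None, order_by=None):
    """Run all three stages and record the wall-clock time of each."""
    timings = {}

    start = time.perf_counter()
    triples = load_graph(text)
    timings["graph_loading"] = time.perf_counter() - start

    start = time.perf_counter()
    solutions = match_bgp(triples, patterns)
    timings["basic_graph_pattern"] = time.perf_counter() - start

    start = time.perf_counter()
    results = compute_results(solutions, select, where, order_by)
    timings["result_calculation"] = time.perf_counter() - start

    return results, timings


# Roughly: SELECT ?n ?a WHERE { ?p name ?n . ?p age ?a . FILTER(?a > 30) } ORDER BY ?a
results, timings = run_pipeline(
    TRIPLES_TEXT,
    patterns=[("?p", "name", "?n"), ("?p", "age", "?a")],
    select=("?n", "?a"),
    where=lambda s: int(s["?a"]) > 30,
    order_by=lambda s: int(s["?a"]),
)
print(results)
print(timings)
```

The example query covers the range the abstract describes: selection of multiple attributes, a filter, and a sort. On a dataset this small the timings are not meaningful; the sketch only shows where per-stage measurements would be taken.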
Authors and Affiliations
Saleh Albahli