Automatic RDF-ization of big data semi-structured datasets

Journal Title: MASKANA - Year 2016, Vol 7, Issue 3

Abstract

Linked data adoption continues to grow at a considerable pace in many fields. However, some of the most important datasets remain underexploited for two main reasons: their huge volume and the lack of methods for automatic conversion to RDF. This paper presents an automatic approach that tackles both problems by leveraging recent Big Data tools and a program for automatic conversion from a relational model to RDF. The process can be summarized in three steps: 1) bulk transfer of data from different sources to Hive/HDFS; 2) transformation of the data on Hive to RDF using D2RQ; and 3) storage of the resulting RDF in CumulusRDF. By using these Big Data tools, the platform can cope with the large amounts of data available in different sources, which can include structured or semi-structured data. Moreover, since the RDF data are stored in CumulusRDF in the final step, users or applications can consume the resulting data by means of web services or SPARQL queries. Finally, an evaluation in the hydro-meteorological domain demonstrates the soundness of our approach.
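
To make the relational-to-RDF step more concrete, the following is a minimal sketch of the general idea behind such a conversion: each row becomes a subject URI, the table becomes a class, and each non-null column becomes a property with a literal value. It is not D2RQ's API and not the authors' implementation. The base URI, the `observation` table, its column names, and the helper functions `literal` and `row_to_ntriples` are all hypothetical and chosen only for illustration.

```python
from urllib.parse import quote

# Hypothetical base URI; the paper does not specify one.
BASE = "http://example.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD = "http://www.w3.org/2001/XMLSchema#"


def literal(value):
    """Serialize a Python value as an N-Triples literal."""
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return f'"{str(value).lower()}"^^<{XSD}boolean>'
    if isinstance(value, int):
        return f'"{value}"^^<{XSD}integer>'
    if isinstance(value, float):
        return f'"{value}"^^<{XSD}double>'
    # Plain string: escape backslashes, quotes and newlines.
    escaped = (
        str(value)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    return f'"{escaped}"'


def row_to_ntriples(table, key_column, row):
    """Map one relational row (a dict) to a list of N-Triples lines.

    Illustrative mapping only: subject = BASE/table/key,
    class = BASE/vocab/table, property = BASE/vocab/table_column.
    """
    key = quote(str(row[key_column]), safe="")
    subject = f"<{BASE}{table}/{key}>"
    triples = [f"{subject} <{RDF_TYPE}> <{BASE}vocab/{table}> ."]
    for column, value in row.items():
        if value is None:
            continue  # a NULL column produces no triple
        predicate = f"<{BASE}vocab/{table}_{column}>"
        triples.append(f"{subject} {predicate} {literal(value)} .")
    return triples


if __name__ == "__main__":
    # Hypothetical hydro-meteorological observation row.
    sample = {"id": 7, "station": "Cuenca", "rainfall_mm": 12.5, "note": None}
    for line in row_to_ntriples("observation", "id", sample):
        print(line)
```

In the pipeline described above, the rows would come from Hive and the resulting triples would be loaded into CumulusRDF. This sketch covers only the row-to-triple mapping and leaves both ends out.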

Authors and Affiliations

Ronald Gualán, Renán Freire, Andrés Tello, Mauricio Espinoza, Víctor Saquicela

How To Cite

Ronald Gualán, Renán Freire, Andrés Tello, Mauricio Espinoza, Víctor Saquicela (2016). Automatic RDF-ization of big data semi-structured datasets. MASKANA, 7(3), -. https://europub.co.uk/articles/-A-42148