LabelFlow Framework for Annotating Workflow Provenance
Journal Title: Informatics - Year 2018, Vol 5, Issue 1
Abstract
Scientists routinely analyse and share data for others to use. Successful data (re)use relies on metadata describing the context in which the data were analysed. In many disciplines the creation of such contextual metadata is referred to as reporting. One method of implementing analyses is with workflows. A stand-out feature of workflows is their ability to record provenance from executions. Provenance is useful when analyses are executed with changing parameters (changing contexts) and results need to be traced back to their respective parameters. In this paper we investigate whether provenance can be exploited to support reporting. Specifically, we outline a case study based on a real-world workflow and a set of reporting queries. We observe that provenance, as collected from workflow executions, is of limited use for reporting, as it supports the queries only partially. We identify that this is due to the generic nature of provenance, that is, its lack of domain-specific contextual metadata. We observe that the required information is available in implicit form, embedded in data. We describe LabelFlow, a framework comprising four Labelling Operators for decorating provenance with domain-specific Labels. LabelFlow can be instantiated for a domain by plugging in domain-specific metadata extractors. We provide a tool that takes a workflow as input and produces as output a Labelling Pipeline for that workflow, composed of Labelling Operators. We revisit the case study and show how Labels provide a more complete implementation of the reporting queries.
Authors and Affiliations
Pinar Alper, Khalid Belhajjame, Vasa Curcin and Carole A. Goble