Aggregating textual and video data from movies

Journal Title: Romanian Journal of Human - Computer Interaction - Year 2016, Vol 9, Issue 3

Abstract

In this paper, we present an automatically annotated corpus built from movie screenplays (scripts) and subtitles. We extract the relevant textual information from both sources using regular expressions. We then synchronize screenplays with subtitles using a matching algorithm, thereby bounding each script sentence between two timestamps. We also developed an application on top of the corpus to test our approach and to illustrate practical situations where the corpus is useful. The application performs topic detection: it searches the movie text for a specified topic and labels that topic as non-existent, episodic, or primary for the analyzed text. The major problem we faced was the unpredictable structure of screenplay files, which are not consistently written in a standardized format that can be parsed and structured automatically. Some types of errors can be handled with regular expressions, but others require a machine learning approach.
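
As a rough illustration of the extraction and synchronization steps described in the abstract, the sketch below parses subtitle cues with a regular expression and gives each script line the time bounds of its most similar cue. It is not the authors' algorithm, which the abstract does not specify. The SubRip (SRT) subtitle format, the use of Python's difflib for similarity, the 0.6 threshold, and the names parse_srt, normalize and align are all assumptions made for this example.

```python
import re
from difflib import SequenceMatcher

# One SRT cue: an index line, a "start --> end" line, then text up to a blank line.
SRT_CUE = re.compile(
    r"(\d+)\s*\n"
    r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})\s*\n"
    r"(.*?)(?:\n\s*\n|\Z)",
    re.DOTALL,
)


def parse_srt(text):
    """Return a list of (start, end, text) tuples from SRT-formatted text."""
    cues = []
    for _, start, end, body in SRT_CUE.findall(text):
        # Collapse multi-line cue text into a single line.
        cues.append((start, end, " ".join(body.split())))
    return cues


def normalize(s):
    """Lowercase and drop punctuation so small formatting differences do not matter."""
    return re.sub(r"[^a-z0-9 ]", "", s.lower()).strip()


def align(script_lines, cues, threshold=0.6):
    """Pair each script line with the (start, end) of its most similar cue.

    Lines whose best similarity falls below the (assumed) threshold get None,
    for example scene descriptions that never appear in the subtitles.
    """
    result = []
    for line in script_lines:
        best, best_score = None, 0.0
        for start, end, text in cues:
            score = SequenceMatcher(None, normalize(line), normalize(text)).ratio()
            if score > best_score:
                best, best_score = (start, end), score
        result.append((line, best if best_score >= threshold else None))
    return result
```

This brute-force comparison is quadratic in the number of lines and ignores ordering; a practical aligner would likely exploit the fact that script and subtitles follow the same sequence.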

Authors and Affiliations

Alexandru Hulea, Traian Rebedea

How To Cite

Alexandru Hulea, Traian Rebedea (2016). Aggregating textual and video data from movies. Romanian Journal of Human - Computer Interaction, 9(3), -. https://europub.co.uk/articles/-A-28990