Aggregating textual and video data from movies

Journal Title: Romanian Journal of Human - Computer Interaction - Year 2016, Vol 9, Issue 3

Abstract

In this paper, we present an automatically annotated corpus built from movie screenplays (scripts) and subtitles. We extract the relevant textual information from both sources using regular expressions. We then synchronize screenplays with subtitles using a matching algorithm, so that each sentence in a script is bounded by two timestamps. We also developed an application that uses the corpus, both to test our approach and to illustrate practical situations in which the corpus is useful. The application performs topic detection: it searches the movie text for a specified topic and marks that topic as non-existent, episodic, or primary for the analyzed text. The major problem we faced was the unpredictable structure of screenplay files, which are not consistently written in a standardized format that can be parsed and structured automatically. Some types of errors can be handled with regular expressions, but others require a machine learning approach to overcome.
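
The abstract does not specify the regular expressions or the matching algorithm, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes subtitles in the common SRT layout and uses a character-level similarity ratio from Python's `difflib` as a stand-in for the matching step. The names `parse_srt`, `align`, and the `threshold` value of 0.6 are hypothetical choices made for illustration.

```python
import re
from difflib import SequenceMatcher

# Assumed SRT layout: cue index, "start --> end" timing line, then text lines,
# with cues separated by a blank line.
CUE_RE = re.compile(
    r"(\d+)\s*\n"
    r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})\s*\n"
    r"(.*?)(?:\n\s*\n|\Z)",
    re.DOTALL,
)


def to_seconds(stamp):
    """Convert an SRT timestamp such as 00:01:02,500 to seconds."""
    hours, minutes, rest = stamp.split(":")
    seconds, millis = rest.split(",")
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_srt(text):
    """Return a list of (start_seconds, end_seconds, cue_text) tuples."""
    cues = []
    for match in CUE_RE.finditer(text):
        _, start, end, body = match.groups()
        # Collapse line breaks inside a cue into single spaces.
        cues.append((to_seconds(start), to_seconds(end), " ".join(body.split())))
    return cues


def normalize(sentence):
    """Lowercase and drop punctuation so small differences do not block a match."""
    return re.sub(r"[^a-z0-9 ]", "", sentence.lower()).strip()


def align(script_sentences, cues, threshold=0.6):
    """Bound each script sentence by the times of its most similar subtitle cue.

    Sentences with no cue at or above the (hypothetical) threshold get None bounds.
    """
    aligned = []
    for sentence in script_sentences:
        best_score, best_cue = 0.0, None
        for cue in cues:
            score = SequenceMatcher(None, normalize(sentence), normalize(cue[2])).ratio()
            if score > best_score:
                best_score, best_cue = score, cue
        if best_cue is not None and best_score >= threshold:
            aligned.append((sentence, best_cue[0], best_cue[1]))
        else:
            aligned.append((sentence, None, None))
    return aligned
```

Calling `align(sentences, parse_srt(srt_text))` returns one `(sentence, start, end)` tuple per script sentence. This brute-force comparison is quadratic in the number of sentences and cues, and it ignores the ordering of the dialogue, so a practical system would likely restrict the search to nearby cues.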

Authors and Affiliations

Alexandru Hulea, Traian Rebedea

How To Cite

Alexandru Hulea, Traian Rebedea (2016). Aggregating textual and video data from movies. Romanian Journal of Human - Computer Interaction, 9(3), -. https://europub.co.uk/articles/-A-28990