Aggregating textual and video data from movies

Journal Title: Revista Romana de Interactiune Om-Calculator - Year 2016, Vol 9, Issue 3

Abstract

In this paper, we present an automatically annotated corpus based on movie screenplays (scripts) and subtitles. We extract the relevant textual information from screenplays and subtitles using a regular expression approach. We then synchronize screenplays with subtitles using a matching algorithm, thus bounding each script sentence between two temporal limits. We also developed an application that uses the corpus to test our approach and to show practical situations where the corpus is useful. The application employs topic detection: it searches for a specified topic in the movie text and marks it as a non-existent, episodic, or primary topic for the analyzed text. The major problem we faced while working on this system was the inconsistent structure of the screenplay files, as they are not written in a fully standardized format that can be easily parsed and structured automatically. Some types of errors can be overcome with regular expressions, but others require a machine learning approach.
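
The extraction and synchronization steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the matching algorithm, so fuzzy string similarity from Python's standard `difflib` module stands in for it here. The sketch also assumes SubRip (SRT) formatted subtitles, and the names `CUE_RE`, `parse_srt`, `normalize`, `align`, and the `threshold` value are hypothetical.

```python
import re
from difflib import SequenceMatcher

# Assumed SRT cue layout: index line, "start --> end" line, then the text.
CUE_RE = re.compile(
    r"(\d+)\s*\n"
    r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})\s*\n"
    r"(.*?)(?=\n\s*\n|\Z)",
    re.DOTALL,
)


def parse_srt(text):
    """Return a list of (start, end, text) tuples from SRT-formatted text."""
    cues = []
    for _, start, end, body in CUE_RE.findall(text):
        # Collapse multi-line cue text into a single line.
        cues.append((start, end, " ".join(body.split())))
    return cues


def normalize(s):
    """Lowercase and drop punctuation so case and apostrophes do not matter."""
    return re.sub(r"[^a-z0-9 ]", "", s.lower()).strip()


def align(script_lines, cues, threshold=0.6):
    """Pair each script line with the time bounds of its most similar cue.

    Lines whose best similarity falls below `threshold` (an assumed value)
    get None instead of a (start, end) pair.
    """
    aligned = []
    for line in script_lines:
        best, best_score = None, 0.0
        for start, end, text in cues:
            score = SequenceMatcher(None, normalize(line), normalize(text)).ratio()
            if score > best_score:
                best, best_score = (start, end), score
        aligned.append((line, best if best_score >= threshold else None))
    return aligned


if __name__ == "__main__":
    srt = (
        "1\n00:00:01,000 --> 00:00:03,500\nWhere are you going?\n\n"
        "2\n00:00:04,000 --> 00:00:06,000\nI'm going home.\n"
    )
    script = ["WHERE ARE YOU GOING?", "I am going home."]
    for line, bounds in align(script, parse_srt(srt)):
        print(line, bounds)
```

This sketch compares every script line against every cue, which is quadratic in the input size; a practical system would likely exploit the fact that both sources follow the same chronological order. Script lines with no close subtitle counterpart, such as scene headings, are left without time bounds.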

Authors and Affiliations

Alexandru Hulea, Traian Rebedea

How To Cite

Alexandru Hulea, Traian Rebedea (2016). Aggregating textual and video data from movies. Revista Romana de Interactiune Om-Calculator, 9(3), 233-254. https://europub.co.uk/articles/-A-241846