Aggregating textual and video data from movies

Journal Title: Revista Romana de Interactiune Om-Calculator - Year 2016, Vol 9, Issue 3

Abstract

In this paper, we present an automatically annotated corpus based on movie screenplays (scripts) and subtitles. We extract the relevant textual information from both sources using regular expressions. We then synchronize screenplays with subtitles using a matching algorithm, thus bounding each sentence of a script between two temporal limits. We also developed an application that uses the corpus, both to test our approach and to show practical situations where the corpus is useful. The application performs topic detection: it searches the movie text for a specified topic and marks that topic as non-existent, episodic, or primary for the analyzed text. The major problem we faced while building this system was the unpredictable structure of screenplay files, which are not always written in a standardized format that can be easily parsed and structured automatically. Some types of errors can be overcome with regular expressions, but others require a machine learning approach.
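The synchronization step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the subtitle format or the matching algorithm, so the sketch assumes SubRip (.srt) subtitles and substitutes a simple string-similarity match from Python's standard difflib module. The names parse_srt, align, and the 0.6 threshold are hypothetical choices made for illustration.

```python
import re
from difflib import SequenceMatcher

# Assumption: subtitles use the SubRip (.srt) layout, i.e. an index line,
# a "start --> end" line, then one or more text lines, ended by a blank line.
SRT_BLOCK = re.compile(
    r"(\d+)\s*\n"
    r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})\s*\n"
    r"(.*?)(?=\n\s*\n|\Z)",
    re.DOTALL,
)


def to_seconds(stamp):
    """Convert an 'HH:MM:SS,mmm' timestamp to seconds."""
    hours, minutes, rest = stamp.split(":")
    seconds, millis = rest.split(",")
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_srt(text):
    """Return a list of (start_seconds, end_seconds, text) cues."""
    cues = []
    for _, start, end, body in SRT_BLOCK.findall(text):
        cues.append((to_seconds(start), to_seconds(end), " ".join(body.split())))
    return cues


def normalize(sentence):
    """Lowercase and drop punctuation so small differences do not block a match."""
    return re.sub(r"[^a-z0-9 ]", "", sentence.lower()).strip()


def align(script_sentences, cues, threshold=0.6):
    """Pair each script sentence with the time bounds of its most similar cue.

    Sentences whose best similarity stays below the (hypothetical) threshold
    are paired with None. Similarity matching is a stand-in for the paper's
    matching algorithm, which the abstract does not describe.
    """
    aligned = []
    for sentence in script_sentences:
        best_bounds, best_score = None, threshold
        for start, end, text in cues:
            score = SequenceMatcher(None, normalize(sentence), normalize(text)).ratio()
            if score >= best_score:
                best_bounds, best_score = (start, end), score
        aligned.append((sentence, best_bounds))
    return aligned


if __name__ == "__main__":
    sample = (
        "1\n00:00:01,000 --> 00:00:03,500\nWhere are we going?\n\n"
        "2\n00:00:04,000 --> 00:00:06,000\nTo the station.\n"
    )
    for sentence, bounds in align(["Where are we going", "Xyzzy."], parse_srt(sample)):
        print(sentence, bounds)
```

In this sketch, a script sentence with no sufficiently similar subtitle line is left without temporal limits rather than forced onto the nearest cue. This mirrors the observation that screenplays and subtitles do not always agree, but the threshold value is an arbitrary assumption.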

Authors and Affiliations

Alexandru Hulea, Traian Rebedea

How To Cite

Alexandru Hulea, Traian Rebedea (2016). Aggregating textual and video data from movies. Revista Romana de Interactiune Om-Calculator, 9(3), 233-254. https://europub.co.uk/articles/-A-241846