Aggregating textual and video data from movies
Journal Title: Romanian Journal of Human - Computer Interaction - Year 2016, Vol 9, Issue 3
Abstract
In this paper, we present an automatically annotated corpus based on movie screenplays (scripts) and subtitles. We extract the relevant textual information from both sources using regular expressions. We then synchronize screenplays with subtitles using a matching algorithm, thus bounding each sentence of a script between two temporal limits. We also developed an application that uses the corpus to test our approach and to show practical situations where the corpus is useful. The application employs topic detection: it searches for a specified topic in the movie text and marks it as a non-existent, episodic, or primary topic for the analyzed text. The major problem we faced while working on this system was the unpredictable structure of the screenplay files, as they are not written in a fully standardized format that can be easily parsed and structured automatically. Some types of errors can be overcome with regular expressions, but others require a machine learning approach.
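The synchronization step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the matching algorithm, so this example substitutes a simple string-similarity match from Python's standard `difflib` module; the SRT subtitle format, the function names (`parse_srt`, `align`), and the `threshold` value are all assumptions made for illustration, not the authors' implementation.

```python
import re
from difflib import SequenceMatcher

# Assumed subtitle format: SRT blocks (index, "start --> end", text lines).
SRT_BLOCK = re.compile(
    r"(\d+)\s*\n"
    r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})\s*\n"
    r"(.*?)(?:\n\s*\n|\Z)",
    re.DOTALL,
)


def to_seconds(stamp):
    """Convert an 'HH:MM:SS,mmm' timestamp to seconds."""
    hours, minutes, rest = stamp.split(":")
    seconds, millis = rest.split(",")
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_srt(text):
    """Return a list of (start_seconds, end_seconds, subtitle_text) cues."""
    return [
        (to_seconds(start), to_seconds(end), " ".join(body.split()))
        for _, start, end, body in SRT_BLOCK.findall(text)
    ]


def normalize(sentence):
    """Lowercase and drop punctuation so small differences do not block a match."""
    return re.sub(r"[^a-z0-9 ]", "", sentence.lower()).strip()


def similarity(a, b):
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def align(script_sentences, cues, threshold=0.6):
    """Bound each script sentence between the times of its most similar cue.

    Sentences with no cue above the (hypothetical) threshold get None limits.
    """
    aligned = []
    for sentence in script_sentences:
        best = max(cues, key=lambda cue: similarity(sentence, cue[2]), default=None)
        if best is not None and similarity(sentence, best[2]) >= threshold:
            aligned.append((sentence, best[0], best[1]))
        else:
            aligned.append((sentence, None, None))
    return aligned
```

Calling `align(sentences, parse_srt(srt_text))` returns one `(sentence, start, end)` tuple per script sentence. Screenplay text without a spoken counterpart, such as stage directions, would typically fall below the threshold and keep `None` limits.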
Authors and Affiliations
Alexandru Hulea, Traian Rebedea
Evaluation of Collaborative Learning in Chats, Based on the Analysis of Repetitions and Altruism
The paper presents a system for evaluating students who participate in a computer-supported collaborative learning session using chat in English. Each participant was assigned a different concept to...
Modeling the repurposing of teaching resources in medical domain through social networks and semantic web
The development of teaching materials has become one of the main concerns of specialists in various domains, including medicine. The continuous development of e-Learning applications demands the creation...
Psychopathological symptoms of the cell phone addiction
Excessive use of mobile telephony and the Internet has become a public health problem in many developed countries. Psychologists and psychiatrists have been strongly involved in solving this problem. Worldwide, the research...
An Ontology-Based Competence Management Interactive System For IT Companies
The paper presents an interactive information system for competence management based on ontologies for information technology companies. The advantage of using an ontology-based system is the possibility of the identific...
Improving parsing using morpho-syntactic and semantic information
In this paper we present our efforts to create a syntactic parser with very good performance on Romanian sentences. Instead of creating a parser from scratch, we decided to test the freely available existing ones and...