COMPARABLE EVALUATION OF CONTEMPORARY CORPUS-BASED AND KNOWLEDGE-BASED SEMANTIC SIMILARITY MEASURES OF SHORT TEXTS
Journal Title: Journal of Information Technology and Application (JITA) - Year 2011, Vol 1, Issue 1
Abstract
This paper presents methods for measuring the semantic similarity of texts and evaluates different approaches based on existing similarity measures. On one side, word similarity was calculated by processing large text corpora; on the other, a commonsense knowledge base was used. Given that a large fraction of the information available today, on the Web and elsewhere, consists of short text snippets (e.g. abstracts of scientific documents, image captions or product descriptions), where commonsense knowledge plays an important role, we focus on computing the similarity between two sentences or two short paragraphs by extending existing measures with information from the ConceptNet knowledge base. Extensive research has also been done in the field of corpus-based semantic similarity, so we also evaluated existing solutions with some modifications. Through experiments performed on a paraphrase data set, we demonstrate that some of the proposed approaches can improve the semantic similarity measurement of short texts.
Authors and Affiliations
Bojan Furlan, Vladimir Sivački, Davor Jovanović, Boško Nikolić