Resolving Ambiguous Entity through Context Knowledge and Fuzzy Approach
Journal Title: International Journal on Computer Science and Engineering - Year 2011, Vol 3, Issue 1
Abstract
Entity extraction is considered a fundamental step in many text mining applications, such as machine translation, text summarization, and text categorization. However, the major challenge in extracting an entity from a sentence is ambiguity, in particular lexical ambiguity. A human can usually resolve the intended meaning from his or her own knowledge, but this is very difficult for a machine. This paper proposes a new technique for resolving the ambiguity problem through a fuzzy approach and context knowledge. The technique integrates subject knowledge, lexical knowledge, possibility theory, and fuzzy sets into natural language processing. Lexical knowledge is obtained from WordNet, and subject and lexical knowledge together are deployed as context knowledge. Possibility theory and fuzzy sets are applied to select the most possible meaning of an ambiguous entity based on its context. The work covers the noun part of speech only. The technique was implemented and tested on 1110 sentences, with precision and recall as the evaluation metrics. It achieved a precision of 85.7% and a recall of 80.3%. The results indicate that the proposed technique is successful.
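The abstract does not give the scoring details, so the following is only a minimal sketch of how a possibility-based choice among noun senses might look, together with the standard precision and recall formulas. The sense labels, the membership degrees, and the function names `sense_possibility`, `select_sense`, and `precision_recall` are all hypothetical and are not taken from the paper. The sketch assumes a crisp context (every context word is fully possible), so the possibility of a sense reduces to the largest membership degree among the context words.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical fuzzy membership degrees: how strongly each context word
# supports each candidate sense of the ambiguous noun "bank".
# These numbers are invented for illustration only.
SENSE_CONTEXT_MEMBERSHIP: Dict[str, Dict[str, float]] = {
    "bank#financial_institution": {"money": 0.9, "loan": 0.8, "river": 0.1},
    "bank#river_side": {"money": 0.1, "river": 0.9, "water": 0.8},
}


def sense_possibility(membership: Dict[str, float], context: Iterable[str]) -> float:
    """Possibility of a sense given a crisp set of context words.

    With a crisp context, sup-min composition reduces to the maximum
    membership degree over the context words; unknown words count as 0.
    """
    return max((membership.get(word, 0.0) for word in context), default=0.0)


def select_sense(
    candidates: Dict[str, Dict[str, float]], context: Iterable[str]
) -> Tuple[str, Dict[str, float]]:
    """Return the sense with the highest possibility, plus all scores."""
    words = list(context)
    scores = {sense: sense_possibility(m, words) for sense, m in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Standard precision and recall from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    best, scores = select_sense(SENSE_CONTEXT_MEMBERSHIP, ["deposit", "money", "loan"])
    print(best, scores)
    # Toy counts, not the paper's data.
    print(precision_recall(tp=8, fp=2, fn=2))
```

In a fuller system the membership degrees would presumably be derived from WordNet and the subject knowledge rather than written by hand, and the context itself could carry graded possibility values instead of being crisp.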
Authors and Affiliations
Hejab M. Alfawareh, Shaidah Jusoh