Using Word Embeddings for Ontology Enrichment
Journal Title: International Journal of Intelligent Systems and Applications in Engineering - Year 2016, Vol 4, Issue 3
Abstract
Word embeddings, distributed word representations in a reduced linear space, show a lot of promise for accomplishing Natural Language Processing (NLP) tasks in an unsupervised manner. In this study, we investigate whether the success of word2vec, a neural network based word embedding algorithm, can be replicated in an agglutinative language like Turkish. Turkish is more challenging than languages like English for complex NLP tasks because of its rich morphology. We picked ontology enrichment, itself a relatively hard NLP task, as our test application. First, we show how ontological relations can be extracted automatically from Turkish Wikipedia to construct a gold standard. Then, by running experiments, we show that the word vector representations produced by word2vec are useful for detecting ontological relations encoded in Wikipedia. We propose a simple yet effective weakly supervised ontology enrichment algorithm in which, for a given word, a few known ontologically related concepts coupled with similarity scores computed via word2vec models can lead to the discovery of other related concepts. We discuss how our algorithm can be improved and augmented to make it a viable component of an ontology learning and population framework.
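The weakly supervised idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring rule, so the version below is an assumption: each candidate is ranked by its mean cosine similarity to the known related concepts (the seeds). The function names `cosine` and `rank_candidates` are hypothetical, and the three-dimensional vectors are made-up toy values, not output of a trained word2vec model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_candidates(word, seeds, vectors, top_n=3):
    """Rank unseen words by mean cosine similarity to the seed concepts.

    Assumption: averaging over seeds is one plausible way to combine the
    similarity scores; the paper's actual scoring may differ.
    """
    known = {word, *seeds}
    scores = {}
    for candidate, vec in vectors.items():
        if candidate in known:
            continue  # skip the query word and the seeds themselves
        sims = [cosine(vec, vectors[s]) for s in seeds]
        scores[candidate] = sum(sims) / len(sims)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Toy vectors (invented for illustration only).
toy_vectors = {
    "meyve": [0.9, 0.1, 0.0],    # "fruit", the given word
    "elma": [0.8, 0.2, 0.1],     # "apple", a known related concept
    "armut": [0.85, 0.15, 0.05], # "pear", a known related concept
    "kiraz": [0.8, 0.1, 0.1],    # "cherry", a candidate
    "araba": [0.0, 0.9, 0.3],    # "car", a candidate
    "tren": [0.1, 0.8, 0.4],     # "train", a candidate
}

ranked = rank_candidates("meyve", ["elma", "armut"], toy_vectors, top_n=2)
print(ranked)
```

In practice the toy dictionary would be replaced by vectors from a word2vec model trained on a Turkish corpus, and the top-ranked candidates would be checked against a gold standard such as the one extracted from Turkish Wikipedia.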
Authors and Affiliations
İzzet Pembeci* | Muğla Sıtkı Koçman University, Department of Computer Engineering