Using Word Embeddings for Ontology Enrichment
Journal Title: International Journal of Intelligent Systems and Applications in Engineering - Year 2016, Vol 4, Issue 3
Abstract
Word embeddings, distributed word representations in a reduced linear space, show a lot of promise for accomplishing Natural Language Processing (NLP) tasks in an unsupervised manner. In this study, we investigate whether the success of word2vec, a neural-network-based word embedding algorithm, can be replicated in an agglutinative language like Turkish. Turkish is more challenging than languages like English for complex NLP tasks because of its rich morphology. We picked ontology enrichment, itself a relatively hard NLP task, as our test application. First, we show how ontological relations can be extracted automatically from Turkish Wikipedia to construct a gold standard. Then, by running experiments, we show that the word vector representations produced by word2vec are useful for detecting ontological relations encoded in Wikipedia. We propose a simple yet effective weakly supervised ontology enrichment algorithm in which, for a given word, a few known ontologically related concepts, coupled with similarity scores computed via word2vec models, can lead to the discovery of other related concepts. We discuss how our algorithm can be improved and augmented to make it a viable component of an ontology learning and population framework.
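The weakly supervised idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm, whose details are not given here. It assumes that candidates are scored by their mean cosine similarity to a few known related concepts (the seeds). The function names, the scoring rule, and the three-dimensional toy vectors are all hypothetical; in practice the vectors would come from a word2vec model trained on a Turkish corpus.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_candidates(vectors, seeds, top_n=3):
    """Rank non-seed words by mean cosine similarity to the seed words.

    Assumption: averaging over seeds is one plausible scoring rule,
    chosen here for illustration only.
    """
    seed_set = set(seeds)
    scores = {}
    for word, vec in vectors.items():
        if word in seed_set:
            continue  # seeds are already known; do not propose them again
        scores[word] = sum(cosine(vec, vectors[s]) for s in seeds) / len(seeds)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]

# Hand-made toy vectors (hypothetical, not produced by word2vec).
TOY_VECTORS = {
    "kedi":     [0.90, 0.10, 0.00],  # cat
    "aslan":    [0.80, 0.20, 0.10],  # lion
    "kaplan":   [0.85, 0.15, 0.05],  # tiger
    "masa":     [0.10, 0.90, 0.20],  # table
    "sandalye": [0.00, 0.80, 0.30],  # chair
}

# Known related concepts for some target concept (e.g. an animal category).
ranked = rank_candidates(TOY_VECTORS, seeds=["kedi", "aslan"])
for word, score in ranked:
    print(f"{word}\t{score:.3f}")
```

With these toy vectors, "kaplan" ranks above "masa" and "sandalye", because its vector points in nearly the same direction as both seeds. A real system would also need a similarity threshold or a validation step before adding a candidate to the ontology.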
Authors and Affiliations
İzzet Pembeci*, Muğla Sıtkı Koçman University, Department of Computer Engineering