A Semantic Approach to Person Profile Extraction from Farsi Web Documents
Journal Title: Journal of Information Systems and Telecommunication - Year 2016, Vol 4, Issue 4
Abstract
Entity profiling (EP), an important task in Web mining and information extraction (IE), is the process of extracting the entities in question and their related information from given text resources. From a computational viewpoint, Farsi is one of the less-studied and less-resourced languages, and it suffers from a lack of high-quality language processing tools. This problem underlines the need to develop Farsi text processing systems. As a contribution to EP research, we present a semantic approach for extracting the profiles of person entities from Farsi Web documents. Our approach includes three major components: (i) pre-processing, (ii) semantic analysis and (iii) attribute extraction. First, our system takes raw text as input and annotates it using existing pre-processing tools. In the semantic analysis stage, we analyze the pre-processed text syntactically and semantically, and enrich the locally processed information with semantic information obtained from a distant knowledge base. We then use a semantic rule-based approach to extract the information related to the persons in question. We show the effectiveness of our approach by testing it on a small Farsi corpus. The experimental results are encouraging and show that the proposed method outperforms baseline methods.
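The three components named in the abstract can be illustrated with a minimal sketch. It is a toy illustration under stated assumptions, not the authors' system: the names `TOY_KB`, `RULES`, `Profile`, `preprocess`, `semantic_analysis` and `extract_attributes` are hypothetical, the knowledge base is a hard-coded dictionary, the rules are simple regular expressions, and the sample text is English rather than Farsi so that the example stays self-contained.

```python
import re
from dataclasses import dataclass, field

# Hypothetical stand-in for a distant knowledge base: surface form -> semantic type.
TOY_KB = {
    "Tehran": "City",
    "University of Tehran": "Organization",
}

# Hypothetical semantic rules: (attribute name, surface pattern, required KB type).
RULES = [
    ("birthplace", re.compile(r"was born in (?P<value>[A-Z][\w ]*?)[.,]"), "City"),
    ("affiliation", re.compile(r"works at (?:the )?(?P<value>[A-Z][\w ]*?)[.,]"), "Organization"),
]


@dataclass
class Profile:
    name: str
    attributes: dict = field(default_factory=dict)


def preprocess(text):
    """Stage (i): split raw text into sentences (stand-in for real pre-processing tools)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def semantic_analysis(sentences):
    """Stage (ii): enrich each sentence with the KB types of the mentions it contains."""
    return [(s, {m: t for m, t in TOY_KB.items() if m in s}) for s in sentences]


def extract_attributes(person, annotated):
    """Stage (iii): apply rules; keep a value only if its KB type matches the rule."""
    profile = Profile(person)
    for sentence, types in annotated:
        if person not in sentence:
            continue  # only sentences mentioning the target person are considered
        for attr, pattern, required_type in RULES:
            match = pattern.search(sentence)
            if match and types.get(match.group("value")) == required_type:
                profile.attributes[attr] = match.group("value")
    return profile


text = (
    "Sara Ahmadi was born in Tehran. "
    "Sara Ahmadi works at the University of Tehran. "
    "The weather was fine."
)
profile = extract_attributes("Sara Ahmadi", semantic_analysis(preprocess(text)))
print(profile.name, profile.attributes)
```

In this sketch the knowledge-base type acts as a semantic filter on each surface match, which is one plausible reading of "semantic rule-based"; the paper's actual rules and knowledge base are not described in the abstract.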
Authors and Affiliations
Hojjat Emami, Hossein Shirazi, Ahmad Abdollahzadeh Barforoush
Low Complexity Median Filter Hardware for Image Impulsive Noise Reduction
Median filters are commonly used for removal of the impulse noise from images. De-noising is a preliminary step in online processing of images, thus hardware implementation of median filters is of great interest. Hence,...
An Improved Method for TOA Estimation in TH-UWB System considering Multipath Effects and Interference
UWB ranging is usually based on the time-of-arrival (TOA) estimation of the first path. There are two major challenges in TOA estimation. One challenge is to deal with multipath channel, especially in indoor environments...
Node to Node Watermarking in Wireless Sensor Networks for Authentication of Self Nodes
To solve some security issues in Wireless Sensor Networks (WSNs), a node-to-node authentication method based on a digital watermarking technique is proposed for verifying relative nodes. In the proposed method...
Privacy Preserving Big Data Mining: Association Rule Hiding
Data repositories contain sensitive information which must be protected from unauthorized access. Existing data mining techniques can be considered as a privacy threat to sensitive data. Association rule mining is one of...
Data Aggregation Tree Structure in Wireless Sensor Networks Using Cuckoo Optimization Algorithm
Wireless sensor networks (WSNs) consist of numerous tiny sensors which can be regarded as a robust tool for collecting and aggregating data in different data environments. The energy of these small sensors is supplied by...