A Semantic Approach to Person Profile Extraction from Farsi Web Documents

Journal Title: Journal of Information Systems and Telecommunication - Year 2016, Vol 4, Issue 4

Abstract

Entity profiling (EP), an important task in Web mining and information extraction (IE), is the process of extracting the entities in question and their related information from given text resources. From a computational viewpoint, Farsi is one of the less-studied and less-resourced languages and suffers from a lack of high-quality language processing tools, which underlines the need to develop Farsi text processing systems. As a contribution to EP research, we present a semantic approach for extracting the profiles of person entities from Farsi Web documents. Our approach includes three major components: (i) pre-processing, (ii) semantic analysis, and (iii) attribute extraction. First, our system takes raw text as input and annotates it using existing pre-processing tools. In the semantic analysis stage, we analyze the pre-processed text syntactically and semantically and enrich the locally processed information with semantic information obtained from a distant knowledge base. We then use a semantic rule-based approach to extract the information related to the persons in question. We show the effectiveness of our approach by testing it on a small Farsi corpus. The experimental results are encouraging and show that the proposed method outperforms baseline methods.
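
The three components named in the abstract can be pictured with a minimal Python sketch. It is not the authors' implementation: the knowledge-base entries, the rule triggers, the person name, and every function name here are hypothetical, and English text stands in for Farsi purely for readability. It only illustrates how pre-processing, knowledge-base enrichment, and rule-based attribute extraction might be chained.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Sentence:
    text: str
    tokens: list
    entity_types: dict = field(default_factory=dict)


# Toy stand-in for a distant knowledge base (assumption): surface form -> semantic type.
KNOWLEDGE_BASE = {"Tehran": "City", "Sharif University": "Organization"}

# Toy semantic rules (assumption): attribute name, trigger phrase, required type of the value.
RULES = [
    ("birth_place", "was born in", "City"),
    ("affiliation", "works at", "Organization"),
]


def preprocess(raw_text):
    """Stage (i): split raw text into sentences and whitespace tokens."""
    parts = [p.strip() for p in re.split(r"[.!?]", raw_text) if p.strip()]
    return [Sentence(text=p, tokens=p.split()) for p in parts]


def semantic_analysis(sentences):
    """Stage (ii): attach knowledge-base types to mentions found in each sentence."""
    for s in sentences:
        for mention, sem_type in KNOWLEDGE_BASE.items():
            if mention in s.text:
                s.entity_types[mention] = sem_type
    return sentences


def extract_attributes(sentences, person):
    """Stage (iii): a rule fires when the sentence mentions the person, contains
    the trigger, and a mention of the required type appears after the trigger."""
    profile = {"name": person}
    for s in sentences:
        if person not in s.text:
            continue
        for attribute, trigger, required_type in RULES:
            if trigger not in s.text:
                continue
            tail = s.text.split(trigger, 1)[1]
            for mention, sem_type in s.entity_types.items():
                if sem_type == required_type and mention in tail:
                    profile[attribute] = mention
    return profile


def build_profile(raw_text, person):
    """Chain the three stages for one person of interest."""
    return extract_attributes(semantic_analysis(preprocess(raw_text)), person)
```

Calling `build_profile("Sara Ahmadi was born in Tehran. Sara Ahmadi works at Sharif University.", "Sara Ahmadi")` on this invented input yields a profile with `birth_place` and `affiliation` filled in. A real system would replace the substring matching with the pre-processing tools and knowledge base the abstract refers to.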

Authors and Affiliations

Hojjat Emami, Hossein Shirazi, Ahmad Abdollahzadeh Barforoush

  • EP ID EP183955
  • DOI 10.7508/jist.2016.04.004

How To Cite

Hojjat Emami, Hossein Shirazi, Ahmad Abdollahzadeh Barforoush (2016). A Semantic Approach to Person Profile Extraction from Farsi Web Documents. Journal of Information Systems and Telecommunication, 4(4), 232-243. https://europub.co.uk/articles/-A-183955