A Hybrid Multi-Word Terms Extraction System Applied to Topic Detection

Journal Title: INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY - Year 2014, Vol 13, Issue 10

Abstract

Multi-word term extraction plays an important role in many Natural Language Processing (NLP) tasks. Despite its importance, few works have been dedicated to Arabic multi-word term extraction. This paper proposes an automatic Arabic multi-word term (MWT) extraction system based on two major filtering steps: a linguistic filter that uses a part-of-speech tagger along with morphological patterns, and a statistical filter based on probabilistic methods, namely the Log-Likelihood Ratio (LLR) and C-value. We evaluate the performance of the resulting system on Wattan, a topic-oriented Arabic newspaper corpus. Our system achieves 90.23% precision in multi-word term extraction. We also study the use of MWTs as features in Arabic topic detection. The conducted experiments show good results.
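The two statistical measures named in the abstract can be illustrated with a minimal Python sketch. It assumes the standard textbook formulations (Dunning's log-likelihood ratio over a 2x2 contingency table, and the C-value of Frantzi et al.); the abstract does not give the authors' exact formulas, so the function names `llr`, `c_value`, and `contains`, as well as the toy counts, are hypothetical and not taken from the paper.

```python
import math


def llr(k11, k12, k21, k22):
    """Log-likelihood ratio (G^2) for a 2x2 contingency table of a word pair.

    k11: count of (w1, w2) together; k12: w1 without w2;
    k21: w2 without w1; k22: neither. Standard formulation, assumed here.
    """
    n = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    cells = ((k11, 0, 0), (k12, 0, 1), (k21, 1, 0), (k22, 1, 1))
    total = 0.0
    for k, i, j in cells:
        if k > 0:  # a zero cell contributes nothing to the sum
            total += k * math.log(k * n / (rows[i] * cols[j]))
    return 2.0 * total


def contains(longer, shorter):
    """True if the word tuple `shorter` occurs contiguously inside `longer`."""
    n = len(shorter)
    return any(longer[i:i + n] == shorter for i in range(len(longer) - n + 1))


def c_value(freqs):
    """C-value for each candidate term.

    freqs maps a candidate (tuple of words, length >= 2) to its corpus frequency.
    Candidates nested in longer candidates are discounted by the mean
    frequency of those longer candidates.
    """
    scores = {}
    for term, f in freqs.items():
        longer = [g for other, g in freqs.items()
                  if len(other) > len(term) and contains(other, term)]
        adjusted = f - sum(longer) / len(longer) if longer else f
        scores[term] = math.log2(len(term)) * adjusted
    return scores


# Toy counts, invented purely for illustration.
toy = {("topic", "detection"): 5, ("arabic", "topic", "detection"): 2}
scores = c_value(toy)
```

In a hybrid pipeline of the kind the abstract describes, candidates that pass the part-of-speech pattern filter would then be ranked by scores such as these; how the paper combines or thresholds the two measures is not stated in the abstract.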

Authors and Affiliations

Rim Koulali, Abdelouafi Meziane

  • EP ID EP650603
  • DOI 10.24297/ijct.v13i10.2333

How To Cite

Rim Koulali, Abdelouafi Meziane (2014). A Hybrid Multi-Word Terms Extraction System Applied to Topic Detection. INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY, 13(10), 5105-5112. https://europub.co.uk/articles/-A-650603