Review of issues in automatic labelling of formatted document 

Abstract

The labelling framework proposed for topic models essentially consists of a multinomial word distribution, a set of candidate labels, and a context collection. It can therefore be applied to any text mining problem that involves a multinomial distribution over words. To generate labels that are understandable, semantically relevant, discriminative across topics, and of high coverage for each topic, the framework first extracts a set of understandable candidate labels in a pre-processing step, then uses a relevance scoring function to measure the semantic similarity between a label and a topic, and finally applies label selection methods. This paper presents the issues involved in this problem of knowledge discovery using text mining and reviews how various researchers in this area have described them.
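The three steps above (candidate labels, relevance scoring, label selection) can be sketched in a few lines. This is only an illustration under stated assumptions, not the method of any particular paper reviewed here: the abstract does not fix a scoring function, so the sketch assumes a co-occurrence score, the expected pointwise mutual information between the topic's words and a label, estimated at document level from the context collection. The function names `cooccurrence_pmi` and `rank_labels`, the whitespace tokenisation, and the requirement that candidate labels are already extracted and lowercase are all hypothetical choices made for the example.

```python
import math
from collections import Counter


def contains(tokens, phrase):
    """True if the phrase occurs as a whole-word sequence in the token list."""
    return f" {phrase} " in f" {' '.join(tokens)} "


def cooccurrence_pmi(docs, labels):
    """Document-level PMI between each word and each candidate label.

    Assumption: a word and a label co-occur when both appear in the same
    document of the context collection.
    """
    n = len(docs)
    word_df, label_df, joint_df = Counter(), Counter(), Counter()
    for doc in docs:
        tokens = doc.lower().split()
        words = set(tokens)
        present = [lab for lab in labels if contains(tokens, lab)]
        word_df.update(words)
        label_df.update(present)
        joint_df.update((w, lab) for w in words for lab in present)
    # PMI = log(p(w, l) / (p(w) * p(l))); pairs that never co-occur are omitted.
    return {
        (w, lab): math.log(c * n / (word_df[w] * label_df[lab]))
        for (w, lab), c in joint_df.items()
    }


def rank_labels(topic, labels, docs, k=3):
    """Score each label by the topic-weighted PMI and return the top k.

    `topic` maps words to probabilities (the multinomial word distribution).
    Missing (word, label) pairs contribute 0 to the score.
    """
    pmi = cooccurrence_pmi(docs, labels)
    scores = {
        lab: sum(p * pmi.get((w, lab), 0.0) for w, p in topic.items())
        for lab in labels
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Toy context collection, candidate labels, and topic (all invented for illustration).
docs = [
    "support vector machine classifier training",
    "support vector machine kernel margin",
    "neural network training gradient",
    "database query index",
]
labels = ["support vector machine", "neural network", "database query"]
topic = {"kernel": 0.5, "margin": 0.3, "classifier": 0.2}

best_label, best_score = rank_labels(topic, labels, docs, k=1)[0]
```

The sketch covers relevance only. Making labels discriminative across topics or improving coverage would need an extra selection step, for example penalising a label's score on the other topics, which is left out here.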

Authors and Affiliations

Pallavi Galgale, Priyanka Ahire, Snehal Ingavale, Dr. R. S. Prasad

How To Cite

Pallavi Galgale, Priyanka Ahire, Snehal Ingavale, Dr. R. S. Prasad (2012). Review of issues in automatic labelling of formatted document. International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), 1(10), 301-304. https://europub.co.uk/articles/-A-125830