Review of issues in automatic labelling of formatted document 

Abstract

The labelling framework proposed for topic models consists essentially of a multinomial word distribution, a set of candidate labels, and a context collection. It can therefore be applied to any text mining problem that involves a multinomial distribution over words. To generate labels that are understandable, semantically relevant, discriminative across topics, and of high coverage for each topic, the framework first extracts a set of understandable candidate labels in a pre-processing step, then applies a relevance scoring function that measures the semantic similarity between a label and a topic, and finally applies label selection methods. This paper reviews the issues that arise in this approach to knowledge discovery through text mining, as described by researchers in this area.
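The three steps named in the abstract (candidate labels, relevance scoring, label selection) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration, not the method of the paper: relevance is approximated by an expected pointwise mutual information between topic words and a label, estimated from smoothed document co-occurrence counts in the context collection. Discrimination is approximated by subtracting a fraction `mu` of the label's average relevance to the other topics. The function names, the smoothing constant, and the toy data are all hypothetical.

```python
import math


def relevance(label, topic, docs, smooth=0.5):
    """Expected PMI between the topic's words and a candidate label.

    topic: dict mapping word -> probability (the multinomial word distribution).
    docs:  list of token sets (the context collection).
    The additive smoothing is an illustrative choice that keeps the
    logarithm defined when a word and a label never co-occur.
    """
    n = len(docs)
    df_label = sum(1 for d in docs if label in d)
    score = 0.0
    for word, prob in topic.items():
        df_word = sum(1 for d in docs if word in d)
        df_both = sum(1 for d in docs if word in d and label in d)
        pmi = math.log(
            (df_both + smooth) * n / ((df_word + smooth) * (df_label + smooth))
        )
        score += prob * pmi
    return score


def select_labels(topics, candidates, docs, mu=0.5):
    """Pick one label per topic, penalising labels that also fit other topics."""
    rel = {
        (name, lab): relevance(lab, dist, docs)
        for name, dist in topics.items()
        for lab in candidates
    }
    chosen = {}
    for name in topics:
        others = [o for o in topics if o != name]

        def adjusted(lab):
            # Average relevance of this label to every other topic.
            penalty = (
                sum(rel[(o, lab)] for o in others) / len(others) if others else 0.0
            )
            return rel[(name, lab)] - mu * penalty

        chosen[name] = max(candidates, key=adjusted)
    return chosen


# Toy context collection: each document is a set of tokens.
docs = [
    {"topic", "model", "lda", "word", "distribution"},
    {"topic", "model", "label", "word"},
    {"image", "pixel", "sensor", "remote"},
    {"image", "sensor", "pixel", "classification"},
]
# Two hypothetical topics, each a multinomial distribution over words.
topics = {
    "text": {"word": 0.5, "model": 0.3, "lda": 0.2},
    "vision": {"pixel": 0.6, "sensor": 0.4},
}
# Candidate labels, assumed to come from a pre-processing step.
candidates = ["topic", "image"]

labels = select_labels(topics, candidates, docs)
```

On this toy data the sketch assigns the label "topic" to the first distribution and "image" to the second. A realistic setting would extract multi-word candidate phrases from the collection and might also account for coverage, which this sketch leaves out.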

Authors and Affiliations

Pallavi Galgale, Priyanka Ahire, Snehal Ingavale, Dr. R. S. Prasad

How To Cite

Pallavi Galgale, Priyanka Ahire, Snehal Ingavale, Dr. R. S. Prasad (2012). Review of issues in automatic labelling of formatted document. International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), 1(10), 301-304. https://europub.co.uk/articles/-A-125830