Ontology Based Automatic Text Mining Using TF and IDF Algorithms for Summarization of Multiple Files
Journal Title: Saudi Journal of Engineering and Technology - Year 2018, Vol 3, Issue 6
Abstract
Rapid advances in technology have made a huge amount of information available everywhere, so reading an entire document to understand its main content takes users a great deal of time. In this work we use extractive text summarization to produce a summary of one or more files or documents. Our approach maps sentences to the nodes of a hierarchical ontology. An ontology describes what exists in a particular domain. To create the ontology, vocabularies are collected; the ontology serves as background knowledge and helps to find the related meanings of terms that occur in the source documents. Text mining is the technique by which high-quality information is derived from text. Clustering, a significant task in this process, groups similar or related terms into a single group. The first stage is data collection. The pre-processing stage includes stemming and stop-word removal. TF-IDF weighting follows, after which clustering takes place. Ontology creation begins by determining the main subtopics of the article of interest. We classify sentences to the nodes of a predefined hierarchical ontology, and each ontology node has a bag of words obtained from a web search. We represent sentences by subtrees, which makes it possible to apply similarity measures and find relations between sentences. The ontology used in this work is not domain-specific and does not require labelled data. This work can be extended to topic-focused summarization of news articles or blogs, and also to various machine learning approaches.
Keywords: Ontology, Text Summarization, TF-IDF, Files, Documents, Extract, Summary.
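The stages named in the abstract (pre-processing, TF-IDF weighting, and assigning sentences to ontology nodes by their bags of words) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the stop-word list, the crude suffix-stripping `stem` function, the choice to treat each sentence as one "document" for IDF, and the overlap-based `nearest_node` helper are all assumptions made for illustration, and the clustering stage is omitted.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; the abstract does not give one).
STOP_WORDS = {"a", "an", "and", "are", "is", "in", "of", "the", "to", "it", "from", "on"}


def stem(word):
    """Crude suffix stripping, a hypothetical stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, tokenize, drop stop words, and stem the remaining tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [stem(t) for t in tokens if t not in STOP_WORDS]


def tf_idf_scores(sentences):
    """Score each sentence by the sum of TF-IDF weights of its terms.

    Assumption: each sentence is treated as one 'document' when computing IDF.
    """
    docs = [preprocess(s) for s in sentences]
    n = len(docs)
    # Document frequency: number of sentences that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        if not doc:
            scores.append(0.0)
            continue
        tf = Counter(doc)
        scores.append(
            sum((count / len(doc)) * math.log(n / df[term]) for term, count in tf.items())
        )
    return scores


def summarize(sentences, k=2):
    """Return the k highest-scoring sentences in their original order."""
    scores = tf_idf_scores(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


def nearest_node(sentence, ontology):
    """Pick the ontology node whose bag of words overlaps the sentence most.

    `ontology` is a hypothetical flat mapping: node name -> set of words.
    """
    terms = set(preprocess(sentence))
    return max(
        ontology,
        key=lambda node: len(terms & set(preprocess(" ".join(ontology[node])))),
    )
```

For example, `summarize(sentences, k=2)` keeps the two sentences with the highest scores, and `nearest_node(sentence, {"weather": {"rain", "storm"}, "sports": {"match", "goal"}})` returns the name of the node sharing the most stemmed terms with the sentence. A hierarchical ontology, as described in the abstract, would need a tree structure rather than this flat dictionary.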
Authors and Affiliations
Chinmayee C, Meenakshi Sundaram
Optimization of Badarawa/Malali Water Distribution Network Using Genetics Algorithm
Abstract: In this study EPANET, a widely used water distribution package, was linked to OptiGa, a Visual Basic ActiveX control for implementation of genetic algorithm, through Visual Basic programming technique, to modify the computer software call...
Survey of Semi Automatic Viscous Fluid Filling Machine
Abstract: Changes in today’s manufacturing environment allow tedious, fatiguing and repetitive tasks to be mechanically performed by robots, as manually controlled work is transitioned to auto-cycle control equipment. Such...
Stability of Micronutrients (Vitamin A, Iron And Iodine) Content in Fortified Rice
Abstract: More than 2 billion people in the world today suffer from micronutrient deficiencies caused largely by a dietary deficiency of vitamins and minerals. The public health importance of these deficiencies lies upon t...
Weighted Exponential - G Family of Probability Distributions
Abstract: Many generalized families of distributions have been proposed and studied over the last two decades for modeling data in many applied areas. In this paper, depending on the idea of Bourguignon et al., a new fami...
Effect of Temperature on The Bioleaching of Iron from Silica Sand with Aspergillus niger Correlated with Shrinking Core and Mixed Kinetic Models
Abstract: The influence of temperature on the bioleaching of iron from silica sand with Aspergillus niger has been studied with three size fractions of silica sand: +120-212µm, +212-300µm, and +300-425µm over the temperat...