Ontology Based Automatic Text Mining Using TF and IDF Algorithms for Summarization of Multiple Files

Journal Title: Saudi Journal of Engineering and Technology - Year 2018, Vol 3, Issue 6

Abstract

In the present world, rapid technological development has made a huge amount of information available everywhere. Reading an entire document to understand its main content takes a lot of time, which makes this difficult for users. In this work we use extractive text summarization to produce a summary of one or more files or documents. We present an approach that maps sentences to the nodes of a hierarchical ontology. An ontology describes what exists in a particular domain. To create the ontology, vocabularies are collected; the ontology serves as background knowledge and helps to find the related meanings of terms that occur in the source documents. Text mining is the technique by which high-quality information is derived from text. Clustering is a significant task: the clustering method groups similar or related terms into a single group. In the first stage, data collection takes place. The pre-processing stage includes stemming and stop-word removal. TF-IDF weighting follows, after which clustering takes place. Ontology creation begins by determining the main subtopics of the article of interest. We classify sentences to the nodes of a predefined hierarchical ontology. Each ontology node has a bag-of-words obtained from a web search. We represent sentences by subtrees, which allows us to apply similarity measures and find relations between sentences. The ontology used in this work is not domain-specific and does not require labelled data. This work can be extended to topic-focused summarization frameworks for news articles or blogs, and also to various machine learning approaches.

Keywords: Ontology, Text Summarization, TF-IDF, Files, Documents, Extract, Summary.
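The stages named in the abstract (pre-processing, TF-IDF weighting, extractive selection, and assigning sentences to ontology nodes by bag-of-words) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The stop-word list, the crude `stem` suffix stripper (standing in for a real stemmer), the choice to treat each sentence as a document for IDF, and the flat `node_bags` dictionary (standing in for the hierarchical ontology) are all assumptions made for illustration, and the clustering stage is omitted.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real systems use a fuller list).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "in", "to", "and", "it", "on", "for"}


def stem(word):
    """Crude suffix stripper standing in for a real stemmer (hypothetical helper)."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(sentence):
    """Lowercase, tokenize, drop stop words, then stem the remaining tokens."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return [stem(t) for t in tokens if t not in STOP_WORDS]


def tf_idf_scores(sentences):
    """Score each sentence by the sum of its terms' TF-IDF weights.

    Each sentence is treated as one document; TF is normalized by sentence length.
    """
    docs = [preprocess(s) for s in sentences]
    n = len(docs)
    # Document frequency: number of sentences in which each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        if not doc:
            scores.append(0.0)
            continue
        tf = Counter(doc)
        scores.append(
            sum((count / len(doc)) * math.log(n / df[term]) for term, count in tf.items())
        )
    return scores


def summarize(sentences, k=2):
    """Return the k highest-scoring sentences, kept in their original order."""
    scores = tf_idf_scores(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


def assign_node(sentence, node_bags):
    """Return the node whose bag-of-words shares the most stemmed terms, or None."""
    terms = set(preprocess(sentence))
    best, best_overlap = None, 0
    for node, bag in node_bags.items():
        overlap = len(terms & {stem(w) for w in bag})
        if overlap > best_overlap:
            best, best_overlap = node, overlap
    return best
```

For multiple files, the sentences of all input documents could be pooled into one list before calling `summarize`. A hierarchical ontology would replace the flat dictionary with a tree, so that each sentence maps to a subtree rather than a single node.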

Authors and Affiliations

Chinmayee C, Meenakshi Sundaram

How To Cite

Chinmayee C, Meenakshi Sundaram (2018). Ontology Based Automatic Text Mining Using TF and IDF Algorithms for Summarization of Multiple Files. Saudi Journal of Engineering and Technology, 3(6), 410-419. https://europub.co.uk/articles/-A-402282