A FRAME WORK FOR WEB INFORMATION EXTRACTION AND ANALYSIS
Journal Title: INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY - Year 2013, Vol 7, Issue 2
Abstract
The volume of information available on the web is growing significantly day by day. Information on the web takes several forms: structured, semi-structured and unstructured. The majority of it is presented in web pages, and the information presented in web pages is semi-structured. However, the information required for a given context is scattered across different web documents. It is difficult to analyze the large volumes of semi-structured information presented in web pages and to make decisions based on that analysis. This research work proposes a framework for a system that extracts information from various sources and prepares reports based on the knowledge built from the analysis. This simplifies data extraction, data consolidation, data analysis and decision making based on the information presented in web pages. The proposed framework integrates web crawling, information extraction and data mining technologies for better information analysis that supports effective decision making. It enables people and organizations to extract information from various web sources and to analyze the extracted data. The proposed framework is applicable to any application domain; manufacturing, sales, tourism and e-learning are a few examples. The framework was implemented and tested for effectiveness, and the results are promising.
Authors and Affiliations
Dr Sunitha Abburu, G. Suresh Babu
Simulation Tool for Assignment Models: SIMASI
In this paper, an integrated simulation optimization model for the assignment problems is developed. An effective algorithm is developed to evaluate and analyze the back-end stored simulation results. This paper proposes...
Diabetic Exudate Detection in Color Retinal Images
Diabetic retinopathy is a vascular complication of long-term diabetes. It causes damage to the small blood vessels positioned in the retina. These damaged blood vessels affect the macula and lead to vision loss. Exudates...
Performance Analysis of Planar Nanocavities-Based Wavelength Demultiplexer for Optical Communication Systems
This paper presents the design and analysis of planar plasmonic wavelength demultiplexer for optical communication systems. The demultiplexer is based on silver-air-silver plasmonic waveguide supported by two nanocavitie...
A Comparative Analysis of Feed-Forward and Generalized Regression Neural Networks for Face Recognition Using Principal Component Analysis
In this paper we give a comparative analysis of performance of feed forward neural network and generalized regression neural network based face recognition. We use different inner epoch for different input pattern accor...
GRASPING SPATIAL SOLUTIONS IN DISTRIBUTED DYNAMIC WORLDS
A high-level ideology and technology will be revealed that can effectively convert any distributed system (manned, unmanned or mixed) into a globally programmable spatial machine capable of operating without central reso...