Advanced Annotation Creator for Search Results from Web Databases

Abstract

A large portion of the deep web is database based: for many search engines, the data encoded in the returned result pages come from underlying structured databases. Such search engines are often referred to as Web databases (WDBs). An increasing number of databases have become web accessible through HTML form-based search interfaces. The data units returned from the underlying database are usually encoded into the result pages dynamically for human browsing. For the encoded data units to be machine processable, which is essential for many applications such as deep web data collection and Internet comparison shopping, they need to be extracted and assigned meaningful labels. In this paper, we present an automatic annotation approach that first aligns the data units on a result page into groups such that the data in the same group have the same semantics. Then, each group is annotated from different aspects, and the different annotations are aggregated to predict a final annotation label for it. An annotation wrapper for the search site is automatically constructed and can be used to annotate new result pages from the same web database. Our experiments indicate that the proposed approach is highly effective. The application is designed using Microsoft Visual Studio .NET 2005 as the front end. The coding language is Visual C# .NET, and MS SQL Server 2000 is used as the back-end database.
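
The annotate-then-aggregate step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation (which the abstract says is written in Visual C# .NET): the three toy annotators, their confidence values, and the noisy-OR aggregation rule are all assumptions made here for illustration, and the data units are assumed to be already aligned into groups.

```python
import re
from collections import defaultdict

PRICE = re.compile(r"\$\d+(\.\d+)?")
YEAR = re.compile(r"\d{4}")

# Hypothetical basic annotators: each inspects one aligned group of data
# units and returns (label, confidence), or None if it has no opinion.
def price_annotator(units):
    if units and all(PRICE.fullmatch(u) for u in units):
        return ("price", 0.9)  # confidence value is an assumption
    return None

def year_annotator(units):
    if units and all(YEAR.fullmatch(u) for u in units):
        return ("year", 0.8)  # confidence value is an assumption
    return None

def prefix_annotator(units):
    # Uses a shared "Name: value" prefix as the label when every unit has one.
    prefixes = {u.split(":", 1)[0].strip().lower() for u in units if ":" in u}
    if len(prefixes) == 1 and all(":" in u for u in units):
        return (prefixes.pop(), 0.7)  # confidence value is an assumption
    return None

def aggregate_labels(candidates):
    """Merge (label, confidence) pairs and return the best-supported label.

    Confidences for the same label are combined as 1 - prod(1 - p),
    a noisy-OR rule assumed for this sketch.
    """
    miss = defaultdict(lambda: 1.0)  # chance that all supporters of a label are wrong
    for label, confidence in candidates:
        miss[label] *= 1.0 - confidence
    if not miss:
        return None
    scores = {label: 1.0 - m for label, m in miss.items()}
    return max(scores, key=scores.get)

def annotate_groups(groups, annotators):
    """Annotate each aligned group from several aspects, then aggregate."""
    result = {}
    for group_id, units in groups.items():
        candidates = [g for g in (a(units) for a in annotators) if g is not None]
        result[group_id] = aggregate_labels(candidates)
    return result

# Toy aligned groups, as might come from two result records.
groups = {
    0: ["$12.99", "$8.50"],
    1: ["2009", "2011"],
    2: ["Publisher: Wiley", "Publisher: Springer"],
}
labels = annotate_groups(groups, [price_annotator, year_annotator, prefix_annotator])
```

Under these assumptions, two weak votes for one label can outweigh a single stronger vote for another, which is the point of aggregating annotations rather than trusting any one annotator.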

Authors and Affiliations

Gayathri Thangavel, Menaka Chinnasamy

How To Cite

Gayathri Thangavel, Menaka Chinnasamy (2015). Advanced Annotation Creator for Search Results from Web Databases. International Journal for Research in Applied Science and Engineering Technology (IJRASET), 3(5), -. https://europub.co.uk/articles/-A-20856