Pattern Recognition for Finding Similarity of Webpages
Journal Title: International Journal of Computer & Organization Trends (IJCOT) - Year 2013, Vol 3, Issue 4
Abstract
We propose a functional technique for identifying similar Web pages that is based on measuring tree similarity. In this paper we describe an experiment with two methods for evaluating the similarity of Web pages. The results of these methods can be used in different ways for reordering and clustering a set of Web pages. Both methods belong to the field of Web content mining. The first method is concerned purely with the similarity of Web pages: it segments the pages and compares their layouts using image processing and graph matching. The second method is based on detecting the objects that a user perceives on a Web page; the similarity of two pages is then measured as the match between the objects found on them. The key idea behind the method is to transform each Web page into a compressed, normalized tree that effectively represents its visual structure.
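To make the idea of comparing pages by their tree structure concrete, the following is a minimal sketch and not the authors' algorithm. It assumes that each page can be reduced to the set of root-to-node tag paths in its HTML tree (so repeated siblings collapse into one path, a crude stand-in for the compression and normalization the abstract mentions) and that similarity is the Jaccard overlap of those sets. The names `TagPathCollector`, `tag_paths`, and `structural_similarity` are hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class TagPathCollector(HTMLParser):
    """Collects the set of root-to-node tag paths of an HTML document."""

    def __init__(self):
        super().__init__()
        self.stack = []      # tags of the currently open elements
        self.paths = set()   # e.g. {"html", "html/body", "html/body/div"}

    def handle_starttag(self, tag, attrs):
        self.paths.add("/".join(self.stack + [tag]))
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <br/>: record the path, open nothing.
        self.paths.add("/".join(self.stack + [tag]))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass


def tag_paths(html):
    """Return the set of tag paths for one page (assumed normalization)."""
    collector = TagPathCollector()
    collector.feed(html)
    collector.close()
    return collector.paths


def structural_similarity(html_a, html_b):
    """Jaccard similarity of the two pages' tag-path sets, in [0, 1]."""
    paths_a, paths_b = tag_paths(html_a), tag_paths(html_b)
    if not paths_a and not paths_b:
        return 1.0
    return len(paths_a & paths_b) / len(paths_a | paths_b)


page_a = "<html><body><div><p>Hi</p><p>There</p></div></body></html>"
page_b = "<html><body><div><p>Other text</p></div></body></html>"
page_c = "<html><body><table><tr><td>x</td></tr></table></body></html>"

print(structural_similarity(page_a, page_b))
print(structural_similarity(page_a, page_c))
```

In this sketch, `page_a` and `page_b` differ only in text and in the number of repeated paragraphs, so they reduce to the same path set, while `page_c` shares only the outer `html/body` skeleton with them. A fuller treatment along the lines of the abstract would also need layout segmentation or object detection, which this example does not attempt.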
Authors and Affiliations
N. Pughazendi, G. Pattusamy
Security requirements in Software Requirements Engineering
In the last few decades, software projects have encountered major difficulties. Most software engineering projects tend to be late and over budget. Several of the causes of these failures are related to requirement...
LFSR Based Watermark and Address Generator for Digital Image Watermarking SRAM
In digital image watermarking authentication methods and techniques, the original image will be watermarked with a text, image, audio or any signature. To overcome the uneven and enormous distribution of multi...
Mathematical Analysis of the Control of the Spread of Infectious Disease in a Prey-Predator Ecosystem
We present a model for the mathematical analysis of the control of the spread of an infectious disease in a predator-prey ecosystem. In this work, we present a compartmental mathematical model expressed by a sy...
A Brief Survey On Document Clustering Techniques Using MATLAB
Document clustering is a more specific technique for unsupervised document organization, it is generally considered to be a centralized process. Clustering methods can be used to automatically group the retrieved documen...
Modified Ant Colony Based Routing Algorithm in Manet
In MANET, without the aid of any established infrastructure or centralized administration, a temporary network needs to be established whenever a node tries to send data to another node. Each node in MANET acts as an end...