An Improved Approach to perform Crawling and avoid Duplicate Web Pages
Journal Title: International Journal of Computer Science and Management Studies (IJCSMS) www.ijcsms.com - Year 2012, Vol 12, Issue 0
Abstract
A web search often returns many duplicate web pages or websites, because similar pages can be hosted on different web servers. We propose a web crawling approach to detect and avoid duplicate or near-duplicate web pages. The proposed work presents a keyword-prioritization-based approach to identify web pages across the web. Identifying such pages optimizes web search.
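The abstract does not spell out the algorithm, so the following is only a minimal sketch of how a keyword-priority check might look, not the authors' method. It assumes that each page is reduced to its most frequent non-stopword terms and that two pages count as near-duplicates when those term sets overlap heavily. The function names, the stopword list, `top_k`, and the 0.8 threshold are all hypothetical choices made for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = frozenset({"a", "an", "and", "the", "of", "to", "in", "is", "it", "on", "for", "with"})


def keyword_signature(text, top_k=5):
    """Return the top_k most frequent non-stopword terms of a page as a set."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    counts = Counter(words)
    # Rank by descending frequency, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return frozenset(word for word, _ in ranked[:top_k])


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets are treated as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def filter_near_duplicates(pages, threshold=0.8, top_k=5):
    """Keep the URL of each page whose signature is not too close to one already kept.

    pages is an iterable of (url, text) pairs, in crawl order.
    """
    kept_urls = []
    kept_signatures = []
    for url, text in pages:
        signature = keyword_signature(text, top_k)
        if any(jaccard(signature, seen) >= threshold for seen in kept_signatures):
            continue  # near-duplicate of an earlier page: skip it
        kept_signatures.append(signature)
        kept_urls.append(url)
    return kept_urls
```

In a crawler, a check like `filter_near_duplicates` would run before a fetched page is indexed. Comparing every new signature against all stored ones is quadratic in the number of pages, so a real system would likely need an index over signatures instead of the linear scan used here.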
Authors and Affiliations
Dhiraj Khurana, Satish Kumar
Experiential Marketing: Changing Experiential Paradigm for Marketers
With the changing economics of customer relationships, there is a need to implement new solutions and strategies that address these changes. Experiential marketing helps in the extraction of hidden predictive information fro...
ANALYSIS OF A SINGLE UNIT SHOCK MODEL BY USING REGENERATIVE POINT GRAPHICAL TECHNIQUE
To analyze a stochastic system rapidly, the key reliability characteristics should be easily and quickly evaluated. Gupta [4] introduced a new technique called Regenerative Point Graphical Te...
“A STUDY ON SECURE SHELL (SSH) PROTOCOL”
Secure Shell (SSH) provides an open protocol for securing network communications that is less complex and expensive than hardware-based VPN solutions. Secure Shell client/server solutions provide command shell, file tran...
Performance Analysis of FMCW Sub Surface Penetrating Radar
This paper explains an approach towards the implementation of a frequency modulated continuous wave (FMCW) standard system in Advanced Design System (ADS) software for sub-surface penetrating radar (SSPR). For performing...
Auction Oriented Approach for Resource Management in Grid Computing
Grid computing, emerging as a new paradigm for next-generation computing, enables the sharing, selection, and aggregation of geographically distributed heterogeneous resources for solving large-scale problems in science...