Efficient Term Frequency and Optimal Similarity Measure of Snippet for Web Search Results
Journal Title: Engineering and Scientific International Journal - Year 2015, Vol 2, Issue 1
Abstract
All clustering methods have to assume some cluster relationship among the data objects to which they are applied. Similarity between a pair of objects can be defined either explicitly or implicitly. In this paper, we introduce a novel multi-viewpoint based similarity measure and two related clustering methods. The major difference between a traditional similarity measure and ours is that the former uses only a single viewpoint, which is the origin, while the latter utilizes many different viewpoints, which are objects assumed not to be in the same cluster as the two objects being measured. Using multiple viewpoints, a more informative assessment of similarity can be achieved. The approach combines the neighbourhood-preserving capability of multidimensional content with the familiar optimal snippet-based representation: multidimensional content is used to derive two-dimensional layouts of the query search results that preserve text similarity relations, or neighbourhoods. Theoretical analysis and an empirical study are conducted to support this claim. Two criterion functions for document clustering are proposed based on this new measure. We compare them with several well-known clustering algorithms that use other popular similarity measures on various document collections to verify the advantages of our proposal.
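The contrast between a single viewpoint and multiple viewpoints can be illustrated with a minimal Python sketch. The abstract does not give the formula, so the definition used here is an assumption: similarity is taken as the average dot product of the two document vectors after each is shifted to a viewpoint, with viewpoints drawn from documents outside the cluster of the pair. The function names are hypothetical, and all vectors are assumed to be non-zero term-frequency vectors.

```python
from math import sqrt


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = sqrt(dot(v, v))
    return [a / norm for a in v]


def cosine_similarity(a, b):
    """Single-viewpoint similarity: both vectors are seen from the origin."""
    return dot(normalize(a), normalize(b))


def multi_viewpoint_similarity(a, b, viewpoints):
    """Assumed multi-viewpoint form: average of (a - h) . (b - h) over
    viewpoints h, which should be documents outside the cluster of a and b."""
    a, b = normalize(a), normalize(b)
    total = 0.0
    for h in viewpoints:
        h = normalize(h)
        shifted_a = [x - y for x, y in zip(a, h)]
        shifted_b = [x - y for x, y in zip(b, h)]
        total += dot(shifted_a, shifted_b)
    return total / len(viewpoints)


if __name__ == "__main__":
    doc_a = [2.0, 1.0, 0.0]   # toy term-frequency vectors
    doc_b = [1.0, 2.0, 0.0]
    outside = [[0.0, 0.0, 3.0], [0.0, 1.0, 4.0]]  # documents from other clusters
    print(cosine_similarity(doc_a, doc_b))
    print(multi_viewpoint_similarity(doc_a, doc_b, outside))
```

In this sketch the cosine measure depends only on the angle between the two documents as seen from the origin, whereas the multi-viewpoint value also changes with the choice of the outside documents.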
Authors and Affiliations
Rohini D