Decentralized Probabilistic Text Clustering by using Distributed Hierarchical peer to peer Clustering
Journal Title: International Journal of Research in Computer and Communication Technology - Year 2014, Vol 3, Issue 11
Abstract
Document clustering (or text clustering) is the application of cluster analysis to textual documents. It has applications in automatic document organization, topic extraction, and fast information retrieval or filtering. Traditional centralized approaches to analyzing massive distributed data make it extremely difficult to draw conclusions based on the collective characteristics of disparate data. For text clustering we therefore use a decentralized probabilistic text clustering algorithm to mine the data, with the goal of achieving modularity and scalability. However, the decentralized probabilistic text clustering algorithm is less scalable in distributed clustering, whereas the Distributed Hierarchical Peer-to-Peer Clustering (DHP2PC) algorithm is scalable and efficient. In this approach, a subset of the document collection is centrally partitioned into clusters, for which cluster signatures are created. The DHP2PC algorithm has its roots in a parallel implementation. Using the cluster signatures, we can mine massive distributed data. The algorithm offers probabilistic guarantees for the correctness of each document assignment to a cluster.
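The signature-based assignment step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how cluster signatures are built or compared, so everything here is an assumption made for illustration: a signature is taken to be the summed term frequencies of a small, centrally clustered seed set, and a new document is assigned to the signature with the highest cosine similarity. The names `build_signature`, `assign`, and `seed_clusters` are hypothetical, and the sketch does not model the probabilistic correctness guarantees or the peer-to-peer distribution.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term-frequency vector (lowercased, whitespace-split)."""
    return Counter(text.lower().split())


def build_signature(docs):
    """Hypothetical cluster signature: summed term frequencies of the
    documents in one centrally built cluster (an assumption, not the
    paper's definition)."""
    signature = Counter()
    for doc in docs:
        signature.update(term_vector(doc))
    return signature


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def assign(doc, signatures):
    """Assign a document to the cluster whose signature is most similar."""
    vec = term_vector(doc)
    return max(signatures, key=lambda name: cosine(vec, signatures[name]))


# Illustrative seed subset, assumed to be already partitioned centrally.
seed_clusters = {
    "sports": ["football match goal", "tennis match score"],
    "finance": ["stock market price", "bank interest rate market"],
}
signatures = {name: build_signature(docs) for name, docs in seed_clusters.items()}

print(assign("goal in the football match", signatures))
print(assign("market price of bank stock", signatures))
```

In a distributed setting, only the compact signatures would need to be shared with each peer, which could then assign its local documents without sending them to a central node; this is one plausible reading of the abstract rather than a description of the authors' implementation.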
Authors and Affiliations
Attuluri Uday Kiran, Rakesh Nayak