Decentralized Probabilistic Text Clustering by using Distributed Hierarchical peer to peer Clustering

Abstract

Document clustering (or text clustering) is the application of cluster analysis to textual documents. It has applications in automatic document organization, topic extraction, and fast information retrieval or filtering. For text clustering we use a decentralized probabilistic text clustering algorithm to mine the data. With a traditional centralized approach to analyzing massive distributed data, it is extremely difficult to draw conclusions based on the collective characteristics of disparate data. The goal is to achieve modularity and scalability. The decentralized probabilistic text clustering algorithm is less scalable in distributed clustering. The Distributed Hierarchical Peer-to-Peer Clustering (DHP2PC) algorithm is a scalable and efficient algorithm. In this approach, a subset of the document collection is centrally partitioned into clusters, for which cluster signatures are created. The DHP2PC algorithm finds its roots in a parallel implementation. By using cluster signatures we can mine the massive distributed data. The algorithm offers probabilistic guarantees for the correctness of each document assignment to a cluster.
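
The signature-based assignment step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' DHP2PC implementation: the abstract does not define how signatures are built, so the sketch assumes a signature is a normalized vector of a cluster's most frequent terms and that a peer assigns each document to the signature with the highest cosine similarity. The function names, the `top_terms` parameter, and the sample data are all hypothetical, and the probabilistic correctness guarantee is not modeled here.

```python
from collections import Counter
import math


def tokenize(text):
    # Naive whitespace tokenizer; a real system would also strip stop words.
    return text.lower().split()


def build_signature(docs, top_terms=5):
    # Assumed signature form: unit-length weights over the cluster's most
    # frequent terms (the paper's actual signature format is not given).
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc))
    top = counts.most_common(top_terms)
    norm = math.sqrt(sum(c * c for _, c in top))
    if norm == 0:
        return {}
    return {term: c / norm for term, c in top}


def similarity(doc, signature):
    # Cosine similarity between a document's term counts and a signature.
    counts = Counter(tokenize(doc))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0:
        return 0.0
    return sum(signature.get(t, 0.0) * c for t, c in counts.items()) / norm


def assign(doc, signatures):
    # A peer only needs the signatures, not the centrally held sample.
    return max(signatures, key=lambda name: similarity(doc, signatures[name]))


# Hypothetical centrally partitioned sample of the collection.
sample = {
    "sports": ["football match goal", "tennis match score"],
    "finance": ["stock market price", "bank interest price"],
}
signatures = {name: build_signature(docs) for name, docs in sample.items()}

# Documents held by a remote peer, assigned locally using the signatures.
peer_docs = ["goal in the football match", "market price of bank stock"]
assignments = [assign(d, signatures) for d in peer_docs]
```

In this sketch only the compact signatures travel to the peers, which is the property that would let assignment scale without moving the full document collection to one site.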

Authors and Affiliations

Attuluri Uday Kiran, Rakesh Nayak

How To Cite

Attuluri Uday Kiran, Rakesh Nayak (2014). Decentralized Probabilistic Text Clustering by using Distributed Hierarchical peer to peer Clustering. International Journal of Research in Computer and Communication Technology, 3(11), -. https://europub.co.uk/articles/-A-28112