CluSandra: A Framework and Algorithm for Data Stream Cluster Analysis
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2011, Vol 2, Issue 11
Abstract
The clustering or partitioning of a dataset’s records into groups of similar records is an important aspect of knowledge discovery from datasets. A considerable amount of research has been applied to the identification of clusters in very large, multi-dimensional, static datasets. However, the traditional clustering and/or pattern recognition algorithms that have resulted from this research are inefficient for clustering data streams. A data stream is a dynamic dataset characterized by a sequence of data records that evolves over time, arrives at extremely fast rates, and is unbounded. Today, the world abounds with processes that generate high-speed evolving data streams. Examples include click streams, credit card transactions, and sensor networks. The data stream’s inherent characteristics present an interesting set of time- and space-related challenges for clustering algorithms. In particular, processing time is severely constrained, and clustering must be performed in a single pass over the incoming data. This paper presents both a clustering framework and an algorithm that, combined, address these challenges and allow end-users to explore and gain knowledge from evolving data streams. Our approach includes the integration of open source products that are used to control the data stream and facilitate the harnessing of knowledge from it. Experimental results of testing the framework with various data streams are also discussed.
Authors and Affiliations
Jose R. Fernandez, Eman M. El-Sheikh