Optimization of Horizontal Aggregation in SQL by using C4.5 Algorithm and K-Means Clustering

Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2014, Vol 16, Issue 5

Abstract

Most data mining and machine learning algorithms prefer datasets in a horizontal aggregated layout, but computing data in that format requires major effort. SQL provides several built-in aggregation functions, namely minimum, maximum, average, sum and count. These aggregation functions are used with a query evaluation method to retrieve data in the horizontal aggregation format. Optimization techniques used for vertical aggregation are not appropriate for horizontal aggregation. Standard aggregations are hard to interpret when there are many result rows, especially when the grouping attributes have high cardinalities. We therefore propose combining the C4.5 classification algorithm and the K-means clustering algorithm with a query evaluation method and aggregation functions to optimize horizontal aggregation. Horizontal aggregation is a method that generates SQL code to return aggregated columns in a horizontal tabular layout: it returns a set of numbers instead of one number per row. Horizontal aggregation is used in applications such as electrical billing, banking, hospital management systems, pharmacies, and online libraries [6].
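To make the idea of a horizontal layout concrete, the following is a minimal sketch of one common way to generate such a query: a pivot built from `SUM(CASE WHEN ...)` expressions, one per distinct value of the transposed column. The abstract does not say which query evaluation method the authors use, so this should not be read as their implementation. The `billing` table, its columns, and the helper name `horizontal_sum` are hypothetical and chosen only for illustration, and the C4.5 and K-means steps are not shown.

```python
import sqlite3


def horizontal_sum(conn, table, group_col, pivot_col, value_col):
    """Return (pivot_values, rows) with one SUM column per pivot value.

    Assumption: table and column names come from trusted code, because
    SQL identifiers cannot be bound as query parameters.
    """
    cur = conn.cursor()
    # Each distinct pivot value becomes one output column.
    pivots = [r[0] for r in cur.execute(
        f"SELECT DISTINCT {pivot_col} FROM {table} ORDER BY {pivot_col}")]
    # Generate one CASE-based aggregate per pivot value.
    cols = ", ".join(
        f"SUM(CASE WHEN {pivot_col} = ? THEN {value_col} ELSE 0 END)"
        for _ in pivots)
    sql = (f"SELECT {group_col}, {cols} FROM {table} "
           f"GROUP BY {group_col} ORDER BY {group_col}")
    return pivots, cur.execute(sql, pivots).fetchall()


def make_demo_db():
    """Build a small in-memory table of hypothetical billing records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE billing (customer TEXT, month TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO billing VALUES (?, ?, ?)",
        [("A", "Jan", 100), ("A", "Feb", 120), ("A", "Jan", 50),
         ("B", "Jan", 80), ("B", "Feb", 90)])
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    # Vertical aggregation: one number per (customer, month) row.
    for row in conn.execute(
            "SELECT customer, month, SUM(amount) FROM billing "
            "GROUP BY customer, month ORDER BY customer, month"):
        print(row)
    # Horizontal aggregation: one row per customer, one column per month.
    pivots, rows = horizontal_sum(
        conn, "billing", "customer", "month", "amount")
    print(["customer"] + pivots)
    for row in rows:
        print(row)
```

In this sketch the vertical query returns four rows, while the horizontal query returns one row per customer with a column for each month, which is the tabular shape that most data mining algorithms expect as input.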

Authors and Affiliations

Ms. Priti Phalak, Dr. Rekha Sharma

How To Cite

Ms. Priti Phalak, Dr. Rekha Sharma (2014). Optimization of Horizontal Aggregation in SQL by using C4.5 Algorithm and K-Means Clustering. IOSR Journals (IOSR Journal of Computer Engineering), 16(5), 6-13. https://europub.co.uk/articles/-A-105338