Horizontal Aggregation in SQL for Data Mining Analysis to Prepare Data Sets

Journal Title: International Journal of Modern Engineering Research (IJMER) - Year 2013, Vol 3, Issue 4

Abstract

Preparing a data set for analysis is generally the most time-consuming task in a data mining project, requiring many complex SQL queries, joining tables and aggregating columns. Existing SQL aggregations are limited for preparing data sets because they return one column per aggregated group. In general, significant manual effort is required to build data sets where a horizontal layout is required. We propose simple, yet powerful, methods to generate SQL code that returns aggregated columns in a horizontal tabular layout, returning a set of numbers instead of one number per row. This new class of functions is called horizontal aggregations. Horizontal aggregations build data sets with a horizontal denormalized layout (e.g. point-dimension, observation-variable, instance-feature), which is the standard layout required by most data mining algorithms. We propose three fundamental methods to evaluate horizontal aggregations: CASE, which exploits the programming CASE construct; SPJ, which is based on standard relational algebra operators (SPJ queries); and PIVOT, which uses the PIVOT operator offered by some DBMSs. Experiments with large tables compare the proposed query evaluation methods. Our CASE method has similar speed to the PIVOT operator and is much faster than the SPJ method. In general, the CASE and PIVOT methods exhibit linear scalability, whereas the SPJ method does not.
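As a rough illustration of the CASE and SPJ methods named in the abstract, the sketch below generates and runs both kinds of query against a small in-memory SQLite database using Python's standard `sqlite3` module. The table `sales`, its columns `store_id`, `dweek` and `amount`, and all function names are hypothetical and chosen only for this example; it is not the authors' generated code. The PIVOT method is left out because SQLite has no PIVOT operator.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory table of hypothetical sales facts (one row per sale)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (store_id INTEGER, dweek TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [(1, "Mon", 10.0), (1, "Tue", 5.0), (1, "Mon", 2.5),
         (2, "Tue", 7.0), (2, "Wed", 1.0)],
    )
    return conn


def pivot_values(conn):
    """Distinct values of the transposing column; each becomes one output column."""
    return [r[0] for r in conn.execute("SELECT DISTINCT dweek FROM sales ORDER BY dweek")]


def quote_ident(name):
    """Quote a string so it can be used safely as a SQL column alias."""
    return '"' + name.replace('"', '""') + '"'


def horizontal_sum_case(conn):
    """CASE-style sketch: one SUM(CASE ...) term per distinct value, single scan."""
    values = pivot_values(conn)
    cols = ", ".join(
        f"SUM(CASE WHEN dweek = ? THEN amount END) AS {quote_ident('sum_' + v)}"
        for v in values
    )
    sql = f"SELECT store_id, {cols} FROM sales GROUP BY store_id ORDER BY store_id"
    return conn.execute(sql, values).fetchall()


def horizontal_sum_spj(conn):
    """SPJ-style sketch: one aggregated subquery per value, left-joined on the key."""
    values = pivot_values(conn)
    select_cols = ["f0.store_id"]
    joins = []
    for i, v in enumerate(values, start=1):
        select_cols.append(f"f{i}.total AS {quote_ident('sum_' + v)}")
        joins.append(
            f"LEFT OUTER JOIN (SELECT store_id, SUM(amount) AS total "
            f"FROM sales WHERE dweek = ? GROUP BY store_id) AS f{i} "
            f"ON f0.store_id = f{i}.store_id"
        )
    sql = (
        f"SELECT {', '.join(select_cols)} "
        f"FROM (SELECT DISTINCT store_id FROM sales) AS f0 "
        + " ".join(joins)
        + " ORDER BY f0.store_id"
    )
    return conn.execute(sql, values).fetchall()


if __name__ == "__main__":
    db = make_demo_db()
    for row in horizontal_sum_case(db):
        print(row)  # one row per store, one column per day of week
```

Both functions return one row per `store_id` with one aggregated column per distinct `dweek` value, and NULL (`None` in Python) where a store has no rows for that value. The CASE variant reads the table once, while the SPJ variant issues one aggregated subquery per output column, which gives some intuition for the speed difference reported in the abstract.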

Authors and Affiliations

B Susrutha, J. Vamsi Nath

How To Cite

B Susrutha, J. Vamsi Nath (2013). Horizontal Aggregation in SQL for Data Mining Analysis to Prepare Data Sets. International Journal of Modern Engineering Research (IJMER), 3(4), 1861-1871. https://europub.co.uk/articles/-A-141140