A Novel Hybrid Quicksort Algorithm Vectorized using AVX-512 on Intel Skylake
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2017, Vol 8, Issue 10
Abstract
The modern CPU’s design, which is composed of hierarchical memory and SIMD/vectorization capability, governs the potential for algorithms to be transformed into efficient implementations. The release of the AVX-512 changed things radically, and motivated us to search for an efficient sorting algorithm that can take advantage of it. In this paper, we describe the best strategy we have found, which is a novel two parts hybrid sort, based on the well-known Quicksort algorithm. The central partitioning operation is performed by a new algorithm, and small partitions/arrays are sorted using a branch-free Bitonicbased sort. This study is also an illustration of how classical algorithms can be adapted and enhanced by the AVX-512 extension. We evaluate the performance of our approach on a modern Intel Xeon Skylake and assess the different layers of our implementation by sorting/partitioning integers, double floatingpoint numbers, and key/value pairs of integers. Our results demonstrate that our approach is faster than two libraries of reference: the GNU C++ sort algorithm by a speedup factor of 4, and the Intel IPP library by a speedup factor of 1.4.
Authors and Affiliations
Berenger Bramas
Green Technology, Cloud Computing and Data Centers: the Need for Integrated Energy Efficiency Framework and Effective Metric
Energy efficiency (EE), energy consumption cost and environmental impact are vibrant challenges to cloud computing and data centers. Reducing energy consumption and emissions of carbon dioxide (CO2) in data centers repre...
Empirical Evaluation of Modified Agile Models
Empirical evaluation is one of the widely accepted validation method in the domain of software engineering which investigates the proposed technique via practical experience and reflects its benefits and limitations. Due...
Modeling and Simulation Multi Motors Web Winding System
Web winding systems allow the operations of unwinding and rewinding of various products including plastic films, sheets of paper, sheets, and fabrics. These operations are necessary for the development and the treatment...
Efficient Algorithm for Maximal Clique Size Evaluation
A large dataset network is considered for computation of maximal clique size (MC). Additionally, its link with popular centrality metrics to decrease uncertainty and complexity and for finding influential points of any n...
Strength of Crypto-Semantic System of Tabular Data Protection
The strength of the crypto-semantic method (CSM) of text data protection based on the use of lexicographical systems in the form of applied linguistic corpora within the formally defined restrictions of selected spheres...