VECTORIZATION OF OPERATIONS ON SMALL-DIMENSIONAL MATRICES FOR INTEL XEON PHI KNIGHTS LANDING PROCESSOR
Journal Title: Современные информационные технологии и ИТ-образование - Year 2018, Vol 14, Issue 1
Abstract
The article addresses the vectorization of calculations for the Intel Xeon Phi Knights Landing (KNL) processor. Operations on small-dimensional matrices are the target of optimization. Such operations are common in computational codes across many fields of research, for example in computational fluid dynamics. KNL is the latest Intel Xeon Phi processor; it contains up to 72 cores and can run massively parallel applications. These processors offer a wide range of features for efficient supercomputer calculations; in particular, they support different memory and cluster modes. In many cases the compiler is unable to generate high-performance parallel vectorized code, which leads to performance losses. One way to recover this performance is to manually vectorize the hot blocks of the code, which speeds up the entire application. An important step in optimizing a program for KNL processors is the use of special 512-bit vector instructions, which can significantly increase execution speed. A 512-bit vector instruction processes a vector of 16 single-precision floating-point values. Special fused multiply-add instructions combine componentwise multiplication and addition of such vectors in a single operation. To simplify manual vectorization, special intrinsic functions are used; these functions are essentially wrappers over the processor instructions. Vectorizing matrix operations with intrinsic functions reduced their execution time by 23% to 70% compared with the version compiled by the Intel compiler at the maximum optimization level. The results reveal additional hidden performance reserves that can be unlocked by manual optimization of the source code.
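As an illustration of the fused multiply-add intrinsics mentioned in the abstract, the following is a minimal sketch and not the authors' code. It assumes a hypothetical 16x16 single-precision matrix stored column by column, so that each column fills exactly one 512-bit register; the names `matvec16` and `MAT_N` are invented for this example. When the compiler does not enable AVX-512F, the sketch falls back to an equivalent scalar loop.

```c
#include <stddef.h>
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

/* Hypothetical matrix size: 16 floats fill one 512-bit register. */
enum { MAT_N = 16 };

/* Computes y = A * x for a 16x16 single-precision matrix.
 * Assumed layout: column-major, a[j * MAT_N + i] holds element (i, j). */
void matvec16(const float *a, const float *x, float *y)
{
#if defined(__AVX512F__)
    /* Accumulate all 16 result components in one vector register. */
    __m512 acc = _mm512_setzero_ps();
    for (int j = 0; j < MAT_N; ++j) {
        __m512 col = _mm512_loadu_ps(a + (size_t)j * MAT_N); /* column j   */
        __m512 xj  = _mm512_set1_ps(x[j]);                   /* broadcast  */
        acc = _mm512_fmadd_ps(col, xj, acc);                 /* col*xj+acc */
    }
    _mm512_storeu_ps(y, acc);
#else
    /* Scalar fallback with the same result layout. */
    for (int i = 0; i < MAT_N; ++i)
        y[i] = 0.0f;
    for (int j = 0; j < MAT_N; ++j)
        for (int i = 0; i < MAT_N; ++i)
            y[i] += a[(size_t)j * MAT_N + i] * x[j];
#endif
}
```

In the vector path, each loop iteration issues a single fused multiply-add that updates all 16 components of the result, replacing 16 scalar multiplications and 16 scalar additions. The unaligned load and store are used so that the sketch places no alignment requirement on the caller's arrays.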
Authors and Affiliations
Leonid Benderskiy, Sergey Leshchev, Alexey Rybakov
COMPARISON OF SOLVING A STIFF EQUATION ON A SPHERE BY THE MULTI-LAYER METHOD AND METHOD OF CONTINUING AT THE BEST PARAMETER
Solving a stiff equation arising from singularly perturbed differential equations with standard numerical methods for ordinary differential equations often leads to major difficulties. First dif...
INTERNATIONAL CLUSTER MODEL OF TEACHING GEOMETRIC HERITAGE OF AL-FARABI
The paper considers the features of the international cluster model of teaching the geometric heritage of al-Farabi. It also describes experience in organizing and conducting international integrated megalessons on...
THE FORMATION OF THE COMPONENTS OF THE FUZZY KNOWLEDGE BASE FOR DIGITAL PLAN-SCHEMES OF THE RESULTS OF SATELLITE MONITORING OF AGRICULTURAL LANDS
The methods of forming the components of a fuzzy knowledge base in the form of a basic digital plan-scheme of territories determined by the morphology of satellite images, natural data and the results of subjective assessmen...
EVALUATION OF EXPERT JUDGEMENTS CONSISTENCY WHEN CONSTRUCTING A MEMBERSHIP FUNCTION OF FUZZY SET USING THE METHOD OF LEVEL SETS
The article deals with one of the expert methods for constructing the membership function of a fuzzy set – the method of level sets, developed by R. Yager. The feasibility of improving this method by adding a procedure fo...
FROM MILITARY COMMUNICATION SYSTEM TO MILITARY COGNITIVE INFORMATION AND TELECOMMUNICATION SYSTEM
The article analyzes the trajectory of technology development of military communications systems. It is noted that the acceleration of scientific and technological progress in the field of information, telecommunication...