Exploiting the Role of Hardware Prefetchers in Multicore Processors
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2013, Vol 4, Issue 6
Abstract
The processor-memory speed gap, referred to as the memory wall, has become much wider in multicore processors because a number of cores share the processor-memory interface. Alongside other cache optimization techniques, prefetching of instructions and data has been used effectively to narrow the processor-memory speed gap and lower the memory wall. However, a number of issues have emerged when prefetching is used aggressively in multicore processors. The results presented in this paper indicate the problems that need to be taken into consideration when prefetching is used as a default technique. The paper also quantifies the degradation that applications experience under aggressive prefetching. It further investigates the performance of multicore processors running a multiprogram workload, as compared to a single-program workload, while the configuration of the built-in hardware prefetchers is varied. Parallel workloads are also investigated to estimate the speedup and the effect of hardware prefetchers. This paper is the outcome of work that forms part of a PhD research project currently in progress at NED University of Engineering and Technology, Karachi.
Authors and Affiliations
Hasina Khatoon, Shahid Mirza, Talat Altaf