Thread affinity mapping for irregular data access on shared cache GPGPU

Hsien Kai Kuo*, Kuan Ting Chen, Bo-Cheng Lai, Jing Yang Jou

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

10 Scopus citations

Abstract

Memory coalescing and on-chip shared caches are two effective techniques for alleviating the memory bottleneck in modern GPGPUs. Both work well for applications with regular memory accesses. However, they become ineffective when concurrent threads issue large numbers of uncoordinated accesses, and the potential performance benefit can be significantly degraded. This paper proposes a thread affinity mapping methodology to coordinate irregular data accesses on shared-cache GPGPUs. Based on the proposed affinity metrics, threads are congregated into execution groups that fully exploit the memory coalescing and data sharing within an application. An average runtime speedup of 3.5x is achieved on a Fermi GPGPU. The speedup scales with the size of the test cases, which makes the proposed methodology an effective and promising solution for the continually increasing complexity of applications in future many-core systems.
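The grouping idea in the abstract can be illustrated with a minimal sketch. This is not the paper's method: the abstract does not define its affinity metrics, so the sketch assumes a hypothetical metric (Jaccard overlap of the cache lines each thread touches), a hypothetical greedy grouping routine, and an assumed 128-byte cache line. All function and parameter names are illustrative.

```python
from typing import Dict, List, Set

CACHE_LINE_BYTES = 128  # assumption: 128-byte cache line


def cache_lines(addresses: List[int], line_bytes: int = CACHE_LINE_BYTES) -> Set[int]:
    """Map byte addresses to the set of cache-line indices they fall in."""
    return {addr // line_bytes for addr in addresses}


def affinity(a: Set[int], b: Set[int]) -> float:
    """Hypothetical affinity metric: Jaccard overlap of two cache-line sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_threads(accesses: Dict[int, List[int]], group_size: int) -> List[List[int]]:
    """Greedily pack threads into groups that touch overlapping cache lines.

    accesses maps a thread id to the byte addresses that thread reads.
    """
    lines = {tid: cache_lines(addrs) for tid, addrs in accesses.items()}
    unassigned = sorted(lines)
    groups: List[List[int]] = []
    while unassigned:
        # Seed each group with the lowest-numbered unassigned thread.
        seed = unassigned.pop(0)
        group = [seed]
        footprint = set(lines[seed])
        while len(group) < group_size and unassigned:
            # Pick the thread whose lines overlap the group's footprint most.
            best = max(unassigned, key=lambda t: affinity(footprint, lines[t]))
            unassigned.remove(best)
            group.append(best)
            footprint |= lines[best]
        groups.append(group)
    return groups


if __name__ == "__main__":
    demo = {0: [0, 4], 1: [4096], 2: [8, 64], 3: [4100, 4200]}
    print(group_threads(demo, group_size=2))
```

In this toy input, threads 0 and 2 read from one cache line and threads 1 and 3 from another, so the sketch places each pair in the same group. A real mapping would also have to respect warp and thread-block sizes, which this sketch ignores.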

Original language: English
Title of host publication: ASP-DAC 2012 - 17th Asia and South Pacific Design Automation Conference
Pages: 659-664
Number of pages: 6
DOIs: https://doi.org/10.1109/ASPDAC.2012.6165038
State: Published - 26 Apr 2012
Event: 17th Asia and South Pacific Design Automation Conference, ASP-DAC 2012 - Sydney, NSW, Australia
Duration: 30 Jan 2012 → 2 Feb 2012

Publication series

Name: Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC

Conference

Conference: 17th Asia and South Pacific Design Automation Conference, ASP-DAC 2012
Country: Australia
City: Sydney, NSW
Period: 30/01/12 → 2/02/12

  • Cite this

    Kuo, H. K., Chen, K. T., Lai, B-C., & Jou, J. Y. (2012). Thread affinity mapping for irregular data access on shared cache GPGPU. In ASP-DAC 2012 - 17th Asia and South Pacific Design Automation Conference (pp. 659-664). [6165038] (Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC). https://doi.org/10.1109/ASPDAC.2012.6165038