Early load: Hiding load latency in deep pipeline processor

Shun Chieh Chang*, Walter Yuan Hwa Li, Yuan Jung Kuo, Chung-Ping Chung

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

3 Scopus citations

Abstract

Load instructions usually have long execution latency in a deep processor pipeline and have a significant impact on overall performance. Hiding load latency is therefore a serious problem in processor design. The latency of a memory load can be separated into two parts: cache-miss latency and load-to-use latency. Previous work that tried to hide load latency in a deep processor pipeline has some limitations. In this paper, we propose a hardware-based method, called early load, that hides the load-to-use latency with little hardware overhead. The early load scheme allows a load instruction to load data from the cache system before it enters the execution stage. Meanwhile, a detection method verifies the correctness of the early operation before the load instruction enters the execution stage. Our experimental results show that our approach achieves an 11.64% performance improvement on the Dhrystone benchmark and 4.97% on average for the MiBench benchmark suite.
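The load-to-use idea in the abstract can be illustrated with a toy in-order scoreboard model. This is only a sketch under stated assumptions, not the mechanism evaluated in the paper: the constants `LOAD_TO_USE` and `EARLY_WINDOW`, the `Instr` type, and the "base register not written by recent instructions" check are all hypothetical stand-ins, since the abstract does not describe how the detection method works.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    op: str                 # "load" or "alu" (hypothetical two-op toy ISA)
    dst: Optional[str]      # destination register, or None
    srcs: Tuple[str, ...]   # source registers (for a load: the base register)


LOAD_TO_USE = 2   # assumed extra cycles before a load's data can be consumed
EARLY_WINDOW = 2  # assumed number of older instructions that may still change the base


def total_cycles(program: List[Instr], early_load: bool) -> int:
    """Cycles to issue every instruction in a single-issue, in-order toy pipeline."""
    ready: Dict[str, int] = {}  # register -> first cycle its value can be consumed
    issue = -1
    for i, ins in enumerate(program):
        # Issue one cycle after the previous instruction, or later if a source is not ready.
        issue = max([issue + 1] + [ready.get(r, 0) for r in ins.srcs])
        latency = 1  # ALU results are forwarded to the next cycle
        if ins.op == "load":
            # Assumed detection rule: the early access is valid only if no recent
            # instruction writes the base register.
            recent = program[max(0, i - EARLY_WINDOW):i]
            base_stable = all(p.dst not in ins.srcs for p in recent)
            if not (early_load and base_stable):
                latency += LOAD_TO_USE  # fall back to the normal load path
        if ins.dst is not None:
            ready[ins.dst] = issue + latency
    return issue + 1


# Base register r1 is written well before the load, so the early access can succeed.
STABLE_BASE = [
    Instr("alu", "r1", ("r0",)),
    Instr("alu", "r5", ("r0",)),
    Instr("alu", "r6", ("r0",)),
    Instr("load", "r2", ("r1",)),
    Instr("alu", "r3", ("r2",)),
]

# Base register r1 is written immediately before the load, so detection rejects it.
UNSTABLE_BASE = [
    Instr("alu", "r1", ("r0",)),
    Instr("load", "r2", ("r1",)),
    Instr("alu", "r3", ("r2",)),
]
```

In this model, `total_cycles(STABLE_BASE, early_load=False)` is 7 and drops to 5 with `early_load=True`, while `UNSTABLE_BASE` takes 5 cycles either way because the assumed check forces the normal load path. The numbers only illustrate the shape of the effect and say nothing about the gains reported above.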

Original language: English
Title of host publication: 13th IEEE Asia-Pacific Computer Systems Architecture Conference, ACSAC 2008
State: Published - 17 Nov 2008
Event: 13th IEEE Asia-Pacific Computer Systems Architecture Conference, ACSAC 2008 - Hsinchu, Taiwan
Duration: 4 Aug 2008 to 6 Aug 2008

Publication series

Name: 13th IEEE Asia-Pacific Computer Systems Architecture Conference, ACSAC 2008

Conference

Conference: 13th IEEE Asia-Pacific Computer Systems Architecture Conference, ACSAC 2008
Country: Taiwan
City: Hsinchu
Period: 4/08/08 to 6/08/08
