TY - GEN
T1 - Blocked-z test for reducing rasterization, z test and shading workloads
AU - Chung, Chung-Ping
AU - Chen, Hong Wei
AU - Yang, Hui Chin
PY - 2009/12/3
Y1 - 2009/12/3
N2 - We propose a blocked-Z test to effectively eliminate unnecessary data traffic between triangle setup and rasterization. This method works seamlessly with the existing rendering pipeline, with or without the existing fragment-based hierarchical Z/early Z/Z tests, and it performs much better than a primitive-based Z test in terms of data structuring and coverage. In this method, primitives are partitioned into blocks of proper sizes and blocked-Z tested to filter out most of the hidden blocks, easing the storage and workloads of subsequent rendering tasks. The advantage of this method comes from two features: the blocked test, in which a single test may be sufficient to filter out a group (of the block size) of fragments; and the placement of the test, which saves even unnecessary rasterization. Block sizes are determined statically, with neither hardware nor runtime overhead; an additional blocked-Z buffer, of size [Z buffer/(# fragments in block)], plus blocking and Z-test circuitry, is required. This design lengthens the rendering pipeline but will not affect the throughput; in fact, it may even increase throughput, since it is commonly held that the fragment-based pipeline stages are the graphics rendering bottlenecks, and our proposal effectively relieves these bottlenecks. Design methods and circuits are given in this paper. Experimental results using Doom3 and Quake4 with various screen sizes show that the rasterization and Z test workloads can be reduced by up to 70%.
AB - We propose a blocked-Z test to effectively eliminate unnecessary data traffic between triangle setup and rasterization. This method works seamlessly with the existing rendering pipeline, with or without the existing fragment-based hierarchical Z/early Z/Z tests, and it performs much better than a primitive-based Z test in terms of data structuring and coverage. In this method, primitives are partitioned into blocks of proper sizes and blocked-Z tested to filter out most of the hidden blocks, easing the storage and workloads of subsequent rendering tasks. The advantage of this method comes from two features: the blocked test, in which a single test may be sufficient to filter out a group (of the block size) of fragments; and the placement of the test, which saves even unnecessary rasterization. Block sizes are determined statically, with neither hardware nor runtime overhead; an additional blocked-Z buffer, of size [Z buffer/(# fragments in block)], plus blocking and Z-test circuitry, is required. This design lengthens the rendering pipeline but will not affect the throughput; in fact, it may even increase throughput, since it is commonly held that the fragment-based pipeline stages are the graphics rendering bottlenecks, and our proposal effectively relieves these bottlenecks. Design methods and circuits are given in this paper. Experimental results using Doom3 and Quake4 with various screen sizes show that the rasterization and Z test workloads can be reduced by up to 70%.
UR - http://www.scopus.com/inward/record.url?scp=70749160689&partnerID=8YFLogxK
U2 - 10.1109/CSE.2009.464
DO - 10.1109/CSE.2009.464
M3 - Conference contribution
AN - SCOPUS:70749160689
SN - 9780769538235
T3 - Proceedings - 12th IEEE International Conference on Computational Science and Engineering, CSE 2009
SP - 402
EP - 407
BT - Proceedings - 12th IEEE International Conference on Computational Science and Engineering, CSE 2009 - 7th IEEE/IFIP International Conference on Embedded and Ubiquitous Computing, EUC 2009
Y2 - 29 August 2009 through 31 August 2009
ER -