VisoMT: A collaborative multithreading multicore processor for multimedia applications with a fast data switching mechanism

Wei Chun Ku*, Shu Hsuan Chou, Jui Chin Chu, Chi Lin Liu, Tien-Fu Chen, Jiun-In Guo, Jinn Shyan Wang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

Multithreading and multicore processing are powerful ways to exploit parallelism in applications and boost system performance. However, extracting sufficient parallelism and achieving data locality with low communication overhead remain important research issues in embedded multithreading/multicore design. This paper introduces the design of a fast data switching mechanism between multilevel storage structures in a new multicore architecture, and makes several contributions to the development of sophisticated contemporary multimedia applications based on advanced standards such as H.264. The first contribution, collaborative multithreading, tightly unifies a reduced instruction set computer (RISC) and a collaborative multithreading digital signal processor (DSP) to exploit high parallelism and provide sufficient computing power to applications. Each collaborative thread of our DSP is built from a heterogeneous simultaneous multithreading single instruction, multiple data (SIMD) structure and four media processing cores, which are connected by a fast switch that provides fast data exchange among correlated streams on a thread-level basis. Our second contribution is one-stop streaming processing, which keeps data in the system until it is no longer needed, making data access more efficient. Our third contribution is a chunk threading programming model, including a thread management library and threading communication directives that reduce data communication and synchronization overhead. By combining coarse-grained and fine-grained threading, programmers can choose among threading levels based on the amount of data exchange in a program.
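The trade-off between coarse-grained and fine-grained threading mentioned above can be illustrated with a minimal, generic sketch. It does not use the paper's thread management library or communication directives, which the abstract does not describe. The names `sad` and `process_in_chunks`, the workload, and the use of the chunk count as a stand-in for synchronization points are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b) for a, b in zip(block_a, block_b))


def process_in_chunks(pairs, chunk_size, workers=4):
    """Process block pairs in chunks; each chunk is one unit of thread work.

    Returns (total, number_of_chunks). The chunk count is a hypothetical
    proxy for the number of hand-offs between threads.
    """
    chunks = [pairs[i:i + chunk_size] for i in range(0, len(pairs), chunk_size)]

    def run_chunk(chunk):
        return sum(sad(a, b) for a, b in chunk)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(run_chunk, chunks))
    return sum(partials), len(chunks)


# Hypothetical workload: 64 pairs of 16-sample blocks.
pairs = [([i + j for j in range(16)], [i] * 16) for i in range(64)]

# Coarse-grained: few large chunks, few hand-offs.
coarse_total, coarse_syncs = process_in_chunks(pairs, chunk_size=16)
# Fine-grained: one block pair per task, many hand-offs.
fine_total, fine_syncs = process_in_chunks(pairs, chunk_size=1)
```

Both settings compute the same result. Only the number of units handed between threads changes, which is the quantity a programmer would weigh against the amount of data exchanged.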
With our proposed techniques and an appropriate programming model, we reduce processing time by 54.9% in H.264 video encoding (common intermediate format (CIF) video at 16.574 f/s) with the 1-virtual independent and streaming processing by open collaborative multithreading configuration, compared to the Texas Instruments C62 core, which has eight functional units. We implemented the design as a prototype chip fabricated in the Taiwan Semiconductor Manufacturing Company Ltd. 0.13 μm process. The die size of the processor core is 16.12 mm², including 414 k logic transistors and 34.4 kB of on-chip static random access memory. According to postsimulation results, the processor runs at 180 MHz at 1.2 V and consumes 245 mW.
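The reported figures can be combined into a few derived quantities. The sketch below is back-of-the-envelope arithmetic, not results from the paper. It assumes that 16.574 f/s is the frame rate achieved by the proposed configuration at 180 MHz, that the 54.9% reduction applies to per-frame processing time on the same workload, and that the 245 mW figure holds during encoding. All variable names are illustrative.

```python
# Figures quoted in the abstract.
fps = 16.574          # CIF H.264 encoding rate, frames per second
reduction = 0.549     # reported reduction in processing time
clock_hz = 180e6      # 180 MHz
power_w = 0.245       # 245 mW

# Per-frame processing time of the proposed configuration (assumed).
frame_time_ms = 1000.0 / fps

# A 54.9% reduction leaves 45.1% of the baseline time,
# which implies this speedup over the baseline.
speedup = 1.0 / (1.0 - reduction)

# Clock cycles available per frame at the reported clock rate.
cycles_per_frame = clock_hz / fps

# Energy per encoded frame, assuming constant power during encoding.
energy_per_frame_mj = 1000.0 * power_w / fps

print(f"frame time: {frame_time_ms:.2f} ms")
print(f"implied speedup: {speedup:.2f}x")
print(f"cycles per frame: {cycles_per_frame:.0f}")
print(f"energy per frame: {energy_per_frame_mj:.2f} mJ")
```

Under these assumptions, the 54.9% reduction corresponds to a speedup of roughly 2.2x over the baseline.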

Original language: English
Article number: 5229356
Pages (from-to): 1633-1645
Number of pages: 13
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Volume: 19
Issue number: 11
State: Published - 1 Nov 2009

Keywords

  • Computer architecture
  • Digital signal processors
  • Multiprocessor interconnection
  • Programming
