H.264/AVC motion estimation implementation on compute unified device architecture (CUDA)

Wei Nien Chen*, Hsueh-Ming Hang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

116 Scopus citations

Abstract

Due to the rapid growth of graphics processing unit (GPU) processing capability, using the GPU as a coprocessor to assist the central processing unit (CPU) in computing massive data has become essential. In this paper, we present an efficient block-level parallel algorithm for the variable block size motion estimation (ME) in H.264/AVC with fractional pixel refinement on a compute unified device architecture (CUDA) platform, developed by NVIDIA in 2007. CUDA enhances the programmability and flexibility of general-purpose computation on the GPU. We decompose the H.264 ME algorithm into 5 steps so that we can achieve highly parallel computation with a low external memory transfer rate. Experimental results show that, with the assistance of the GPU, the processing is 12 times faster than using the CPU alone.
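The five-step decomposition itself is not spelled out in this abstract, but the core mapping it relies on, namely evaluating many small sum-of-absolute-differences (SAD) candidates in parallel on the GPU, can be sketched in CUDA. The kernel below is a minimal, hypothetical illustration rather than the authors' implementation: the frame size, search range, and all identifiers are assumptions, and it covers only integer-pel SAD for 4x4 blocks, not the full variable-block-size search or fractional-pel refinement described in the paper.

```cuda
// Minimal sketch (not the authors' five-step scheme): one CUDA thread block per
// 4x4 block of the current frame, one thread per integer-pel MV candidate.
// Frame size, search range, and all names below are illustrative assumptions.

#include <climits>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define WIDTH   64      // assumed frame width  (pixels)
#define HEIGHT  64      // assumed frame height (pixels)
#define SEARCH   8      // assumed search range: candidates in [-8, 7]
#define BLK      4      // smallest H.264 block size (4x4)

// Each thread evaluates one candidate MV; the (SAD, MV index) pair is packed
// into a single int so atomicMin selects the minimum-SAD candidate race-free.
__global__ void sad4x4Kernel(const unsigned char* cur, const unsigned char* ref,
                             int* best /* one packed entry per 4x4 block */)
{
    int bx  = blockIdx.x * BLK;            // top-left x of this 4x4 block
    int by  = blockIdx.y * BLK;            // top-left y of this 4x4 block
    int mvx = (int)threadIdx.x - SEARCH;   // candidate motion vector, x
    int mvy = (int)threadIdx.y - SEARCH;   // candidate motion vector, y

    int sad = 0;
    for (int y = 0; y < BLK; ++y)
        for (int x = 0; x < BLK; ++x) {
            int cx = bx + x, cy = by + y;
            int rx = min(max(cx + mvx, 0), WIDTH - 1);   // clamp at frame border
            int ry = min(max(cy + mvy, 0), HEIGHT - 1);
            sad += abs((int)cur[cy * WIDTH + cx] - (int)ref[ry * WIDTH + rx]);
        }

    int mvIdx  = (int)(threadIdx.y * blockDim.x + threadIdx.x); // fits in 8 bits
    int packed = (sad << 8) | mvIdx;       // SAD in the high bits drives the min
    atomicMin(&best[blockIdx.y * gridDim.x + blockIdx.x], packed);
}

int main()
{
    const int nPix = WIDTH * HEIGHT;
    const int nBlk = (WIDTH / BLK) * (HEIGHT / BLK);

    unsigned char *dCur, *dRef;
    int *dBest;
    cudaMalloc(&dCur, nPix);
    cudaMalloc(&dRef, nPix);
    cudaMalloc(&dBest, nBlk * sizeof(int));

    // Fill both frames with a synthetic pattern just to exercise the kernel.
    unsigned char* host = (unsigned char*)malloc(nPix);
    for (int i = 0; i < nPix; ++i) host[i] = (unsigned char)(i % 251);
    cudaMemcpy(dCur, host, nPix, cudaMemcpyHostToDevice);
    cudaMemcpy(dRef, host, nPix, cudaMemcpyHostToDevice);

    // Initialize packed results to "infinity" so atomicMin works.
    int* init = (int*)malloc(nBlk * sizeof(int));
    for (int i = 0; i < nBlk; ++i) init[i] = INT_MAX;
    cudaMemcpy(dBest, init, nBlk * sizeof(int), cudaMemcpyHostToDevice);

    dim3 grid(WIDTH / BLK, HEIGHT / BLK);
    dim3 threads(2 * SEARCH, 2 * SEARCH);  // one thread per candidate MV
    sad4x4Kernel<<<grid, threads>>>(dCur, dRef, dBest);

    int first;
    cudaMemcpy(&first, dBest, sizeof(int), cudaMemcpyDeviceToHost);
    printf("block 0: best SAD = %d, MV index = %d\n", first >> 8, first & 0xFF);

    cudaFree(dCur); cudaFree(dRef); cudaFree(dBest);
    free(host); free(init);
    return 0;
}
```

Larger H.264 partitions (8x8, 16x8, 16x16, and so on) could then be scored by summing the stored 4x4 SADs, which keeps the per-thread workload uniform; a fractional-pel refinement stage would follow on the interpolated reference.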

Original language: English
Title of host publication: 2008 IEEE International Conference on Multimedia and Expo, ICME 2008 - Proceedings
Pages: 697-700
Number of pages: 4
DOIs
State: Published - Jun 2008
Event: 2008 IEEE International Conference on Multimedia and Expo, ICME 2008 - Hannover, Germany
Duration: 23 Jun 2008 - 26 Jun 2008

Publication series

Name: 2008 IEEE International Conference on Multimedia and Expo, ICME 2008 - Proceedings

Conference

Conference: 2008 IEEE International Conference on Multimedia and Expo, ICME 2008
Country: Germany
City: Hannover
Period: 23/06/08 - 26/06/08

Keywords

  • Compute Unified Device Architecture (CUDA)
  • Graphics Processing Unit (GPU)
  • H.264/AVC
  • Motion estimation
  • Parallel processing
