This paper presents a motion estimation (ME) design with an interleaved scheduling structure that relieves the high dependency penalty and improves hardware utilization. The interleaved structure uses fine-grained hardware scheduling: the whole ME process is decomposed into sum of absolute differences (SAD), sum of absolute transformed differences (SATD), and interpolation filter units, so that multiple prediction units with no dependency on one another can be executed at the same time. This fine-grained scheduling helps reduce both the overall execution time and the hardware cost. The proposed design costs 422.9K logic gates and 22.736 Kbytes of on-chip memory in a TSMC 90 nm CMOS process and supports 4K×2K 30 fps video at an operating frequency of 270 MHz.
- Hardware design
- Motion estimation
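
The benefit of interleaving independent prediction units (PUs) across the SAD, interpolation, and SATD units can be illustrated with a toy cycle-count model. This is only a sketch under stated assumptions, not the paper's scheduler: the stage order, the per-stage cycle counts, and the function names `sequential_cycles` and `interleaved_cycles` are all hypothetical, and every PU is assumed to be independent of the others.

```python
# Toy model: each prediction unit (PU) passes through three hardware units.
# The unit names come from the abstract; the order and cycle counts are assumptions.
STAGES = ("SAD", "interpolation", "SATD")


def sequential_cycles(pu_costs):
    """Coarse-grained schedule: one PU holds the whole ME engine until it is done."""
    return sum(sum(costs) for costs in pu_costs)


def interleaved_cycles(pu_costs):
    """Fine-grained schedule: independent PUs overlap, one PU per unit at a time."""
    unit_free = [0] * len(STAGES)  # cycle at which each unit becomes idle
    for costs in pu_costs:
        ready = 0  # cycle at which this PU finished its previous stage
        for stage, cost in enumerate(costs):
            start = max(ready, unit_free[stage])  # wait for both PU and unit
            ready = start + cost
            unit_free[stage] = ready
    return max(unit_free)


# Hypothetical cycles per stage (SAD, interpolation, SATD) for four independent PUs.
pu_costs = [(4, 2, 3)] * 4

print(sequential_cycles(pu_costs))   # 36
print(interleaved_cycles(pu_costs))  # 21
```

In this toy setting the interleaved schedule finishes in 21 cycles instead of 36 because the SAD unit starts the next PU while the interpolation and SATD units are still working on earlier ones. PUs that do depend on each other would have to be serialized, which the sketch does not model.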