In this paper, we propose a low-complexity, high-quality motion estimation architecture for MPEG-4 AVC/H.264 video coding applications. The design is based on a low-complexity algorithm that reduces complexity by over 90% at the cost of a PSNR drop of 0.06968 dB and 0.08296 dB for the CIF and D1 formats, respectively, compared with the JM9.3 full search with a ±32 search range. The algorithm also supports a scalable search range. In addition, we exploit an on-chip memory rotation scheme and a configurable sum of absolute differences (SAD) processor to reduce the on-chip memory bandwidth and the hardware cost. Implemented in TSMC 0.18 µm CMOS technology, the proposed design costs 47.9K gates, 4 Kbits of current/reference pixel buffer, and 22 Kbits of SRAM, with a maximum working frequency of 125 MHz. The design achieves real-time motion estimation on D1 video and HD720 video when operating at 40 MHz and 105 MHz, respectively.
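For readers unfamiliar with the matching criterion, the following minimal sketch shows SAD-based block matching with an exhaustive (full) search, i.e. the kind of baseline the proposed algorithm is compared against. It is not the proposed low-complexity algorithm or the hardware architecture. The function names `sad` and `full_search`, the list-of-rows frame layout, and the policy of skipping candidates outside the reference frame are assumptions made for illustration only.

```python
def sad(cur, ref, bx, by, dx, dy, n=16):
    """Sum of absolute differences between the n x n block of `cur` whose
    top-left corner is (bx, by) and the block of `ref` displaced by (dx, dy).
    Frames are lists of rows of integer luma samples (assumed layout)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, search_range, n=16):
    """Exhaustive search over displacements in [-search_range, +search_range]
    on both axes. Returns (best_sad, dx, dy), or None if no candidate fits.
    Candidates that fall outside the reference frame are skipped (assumption)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            inside = (0 <= bx + dx and bx + dx + n <= width
                      and 0 <= by + dy and by + dy + n <= height)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

With a ±32 search range, this baseline evaluates up to 65 × 65 = 4225 candidate positions per block, which is the cost that fast algorithms aim to cut.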