In this paper, we propose a low-complexity, high-quality fractional motion estimation (FME) design for H.264/AVC. A mode reduction algorithm for sub-macroblock partitions reduces the hardware cost of FME block matching by about 30%. The algorithm provides continuous search points in a modified search area, which boosts hardware utilization and makes the design highly feasible for VLSI array processing. Simulation results show that, compared with JM9.3, the proposed FME is 0.01196 dB worse at CIF format and 0.0115 dB better at D1 format. Moreover, an associated FME architecture with configurable flexibility is also proposed. It supports flexible mode selection among several sets of macroblock partitions, providing a trade-off between computational complexity and video quality. In TSMC 0.13 µm CMOS technology, the proposed design costs 112.7K gates and has a maximum working frequency of 158 MHz. The design can realize real-time H.264/AVC encoding of D1 video and HD720 video at operating frequencies of 40 MHz and 108 MHz, respectively.
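As a rough illustration of what the stated operating frequencies imply, the following sketch estimates the cycle budget available per macroblock at each resolution. It assumes 16x16 macroblocks, a D1 frame of 720x480, an HD720 frame of 1280x720, and 30 frames per second; the frame rate and frame sizes are assumptions rather than figures from this paper, and the function names are hypothetical.

```python
import math

MB_SIZE = 16  # an H.264/AVC macroblock covers 16x16 luma samples


def macroblocks_per_second(width, height, fps):
    """Number of macroblocks that must be processed each second."""
    mbs_per_frame = math.ceil(width / MB_SIZE) * math.ceil(height / MB_SIZE)
    return mbs_per_frame * fps


def cycles_per_macroblock(width, height, fps, clock_hz):
    """Average clock cycles available per macroblock for real-time encoding."""
    return clock_hz / macroblocks_per_second(width, height, fps)


# Assumed formats: D1 = 720x480, HD720 = 1280x720, both at 30 fps.
d1_budget = cycles_per_macroblock(720, 480, 30, 40_000_000)
hd720_budget = cycles_per_macroblock(1280, 720, 30, 108_000_000)

print(f"D1    @ 40 MHz:  {d1_budget:.1f} cycles per macroblock")
print(f"HD720 @ 108 MHz: {hd720_budget:.1f} cycles per macroblock")
```

Under these assumptions both operating points correspond to a budget of roughly 1000 cycles per macroblock, which suggests that the two quoted frequencies scale with the macroblock rate. A 720x576 D1 frame at 25 frames per second gives the same macroblock rate as 720x480 at 30 frames per second, so the D1 estimate does not depend on which of the two variants is meant.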