This paper presents a high-throughput VLSI architecture for the H.264/AVC in-loop de-blocking filter (ILF) that supports baseline, main, and high profile (BP/MP/HP) video decoding and targets HDTV applications. We develop a 4x4/8x8 filter and a buffer management scheme so that the de-blocking filter supports the H.264 coding tools of picture adaptive frame/field (PAFF) coding, macroblock adaptive frame/field (MBAFF) coding, and 8x8 transform coding. In particular, we adopt two local buffers to store the reference MB pair data, and we reschedule the internal pixels when switching between filtering operations on the horizontal and vertical edges, without writing them out to external memory. Implemented in TSMC 0.13 μm CMOS technology, the proposed design costs 36.9K gates and 672 bytes of local memory when operating at 225 MHz. Moreover, the proposed design achieves an average throughput of 260 cycles per MB, which meets the real-time processing requirement for H.264 16VGA (2560×1920)@30fps video decoding.
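The real-time claim can be checked with simple arithmetic from the figures quoted above. The following is a minimal sketch, not part of the proposed design: it assumes 16x16 macroblocks, that the average of 260 cycles per MB holds over a whole frame, and that no cycles are lost to stalls outside the filter. The names `required_clock_hz`, `MB_SIZE`, and the constants are hypothetical and chosen only for illustration.

```python
# Figures quoted in the abstract.
FRAME_WIDTH = 2560          # 16VGA luma width in pixels
FRAME_HEIGHT = 1920         # 16VGA luma height in pixels
FRAME_RATE = 30             # frames per second
CYCLES_PER_MB = 260         # average filter cycles per macroblock
CLOCK_HZ = 225_000_000      # operating frequency of the design

MB_SIZE = 16                # assumption: 16x16 luma samples per macroblock


def required_clock_hz(width, height, fps, cycles_per_mb):
    """Minimum clock rate needed to filter every MB of every frame in real time."""
    mbs_per_frame = (width // MB_SIZE) * (height // MB_SIZE)
    return mbs_per_frame * fps * cycles_per_mb


required = required_clock_hz(FRAME_WIDTH, FRAME_HEIGHT, FRAME_RATE, CYCLES_PER_MB)
print(f"required clock: {required / 1e6:.2f} MHz")
print(f"available clock: {CLOCK_HZ / 1e6:.2f} MHz")
print(f"meets real time: {required <= CLOCK_HZ}")
```

Under these assumptions a 2560×1920 frame holds 160 × 120 = 19,200 MBs, so 30 fps requires 576,000 MBs per second, or about 149.76 MHz at 260 cycles per MB, which is below the 225 MHz operating frequency.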