This paper presents a variable-length code (VLC) decoder that supports both the block coefficient data of MPEG-2 and context-adaptive VLC (CAVLC) in H.264. To make the decoder programmable, a memory-based architecture with improved memory efficiency is proposed. The group-based look-up table (LUT) algorithm is extended to multi-table merging (MTM), which further exploits the redundancy among groups. With the MTM algorithm, all coding tables are stored in memory more efficiently. Because memory accesses can consume considerable power, a low-power scheme is proposed to reduce them: a distributed cache is adopted to save power and also to improve decoding throughput. Simulation results show that the cache with a replacement method reduces memory accesses by about 60% to 95%.
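To illustrate why a small cache in front of the table memory can cut memory accesses, the following is a minimal software sketch and not the proposed hardware. It assumes a single cache with least-recently-used (LRU) replacement; the abstract does not state the replacement method or how the cache is distributed, and the names `CachedLUT`, `table`, and `stream` are hypothetical.

```python
from collections import OrderedDict


class CachedLUT:
    """Toy model: a small LRU cache in front of a look-up table memory.

    Assumption: LRU replacement and a single cache; the actual decoder
    may use a different policy and several distributed caches.
    """

    def __init__(self, table, cache_size):
        self.table = table              # stands in for the LUT memory
        self.cache = OrderedDict()      # address -> decoded entry
        self.cache_size = cache_size
        self.lookups = 0                # total table look-ups requested
        self.memory_accesses = 0        # look-ups that reached the memory

    def lookup(self, address):
        self.lookups += 1
        if address in self.cache:
            # Cache hit: no memory access, just refresh the LRU order.
            self.cache.move_to_end(address)
            return self.cache[address]
        # Cache miss: read the memory and keep the entry for later.
        self.memory_accesses += 1
        value = self.table[address]
        self.cache[address] = value
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict least recently used
        return value


# Hypothetical table and address stream; short codewords tend to recur,
# so a few addresses dominate the stream.
table = {0: "sym_a", 1: "sym_b", 2: "sym_c"}
stream = [0, 0, 1, 0, 0, 2, 0, 1]

lut = CachedLUT(table, cache_size=2)
decoded = [lut.lookup(addr) for addr in stream]
print(lut.memory_accesses, "memory accesses for", lut.lookups, "look-ups")
```

In this toy model the saving depends entirely on how skewed the address stream is and on the cache size, so the counts it prints say nothing about the simulation results reported above.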