In this paper, we present a memory-efficient VLSI architecture for the 2-D Discrete Wavelet Transform (DWT) based on the lifting scheme. The advantages of the lifting scheme are lower computational complexity, the ability to transform a signal without boundary extension, and a reduced memory requirement. The scheme decomposes a wavelet transform with finite-tap filters into two sets of coefficients, called the predictor and the updater. Based on the lifting scheme, we analyze the data dependencies between its input and output signals and propose a programmable architecture that supports different filter banks with low memory usage. For an N×N 2-D DWT with the Daubechies 9-7 filter, our architecture requires 9N storage cells, and its memory bandwidth requirement is almost one-half that of the JPEG2000 proposal. The architecture is suitable for VLSI implementation and for various real-time image and video applications.
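To make the predictor/updater decomposition concrete, the following is a minimal software sketch of a 1-D lifting-based 9-7 transform. It is not the proposed architecture: the coefficient values are the commonly quoted lifting factorization of the 9-7 filter rather than values taken from this paper, the function names (`lift_step`, `forward_97`, `inverse_97`) are hypothetical, the scaling convention is one of several in use, and the edge handling (clamping the neighbour index) is an assumption made only to keep the example short. A 2-D transform would apply the same steps along rows and then along columns.

```python
# Commonly quoted lifting coefficients for the 9-7 filter (assumed, not from this paper).
ALPHA = -1.586134342    # first predict step
BETA = -0.05298011854   # first update step
GAMMA = 0.8829110762    # second predict step
DELTA = 0.4435068522    # second update step
K = 1.230174105         # scaling constant; normalization conventions vary


def lift_step(target, source, coeff, offset):
    """In place: target[i] += coeff * (source[i + offset] + source[i + offset + 1]).

    Out-of-range neighbour indices are clamped to the nearest valid index
    (an assumption of this sketch).
    """
    last = len(source) - 1
    for i in range(len(target)):
        left = source[min(max(i + offset, 0), last)]
        right = source[min(max(i + offset + 1, 0), last)]
        target[i] += coeff * (left + right)


def forward_97(x):
    """Split x (even length) into approximation s and detail d via lifting."""
    assert len(x) % 2 == 0, "this sketch handles even-length signals only"
    s = [float(v) for v in x[0::2]]   # even samples
    d = [float(v) for v in x[1::2]]   # odd samples
    lift_step(d, s, ALPHA, 0)         # predict odd samples from even neighbours
    lift_step(s, d, BETA, -1)         # update even samples from odd neighbours
    lift_step(d, s, GAMMA, 0)         # second predict
    lift_step(s, d, DELTA, -1)        # second update
    return [v / K for v in s], [v * K for v in d]


def inverse_97(s, d):
    """Undo forward_97 by running the same steps in reverse with negated coefficients."""
    s = [v * K for v in s]
    d = [v / K for v in d]
    lift_step(s, d, -DELTA, -1)
    lift_step(d, s, -GAMMA, 0)
    lift_step(s, d, -BETA, -1)
    lift_step(d, s, -ALPHA, 0)
    x = [0.0] * (len(s) + len(d))
    x[0::2] = s                       # re-interleave even and odd samples
    x[1::2] = d
    return x
```

Because every lifting step only adds a function of one half of the samples to the other half, each step can be inverted by subtracting the same quantity, so `inverse_97(*forward_97(x))` reproduces `x` up to floating-point rounding regardless of the coefficient values chosen. The in-place updates also hint at why lifting needs little intermediate storage.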