In this paper, a low-power embedded memory module is designed for a multi-threaded DSP processor. A circuit-architecture co-design technique is proposed that combines three circuit schemes: a controllable pre-charged bit-line, a low-voltage bit-line, and controllable data-retention power gating. Because the low-power control signals are generated by the DSP engine, the operating condition of the memory module can be adjusted at run time in software. The low-power dual-port 8 KB SRAM is integrated with the multi-threaded DSP engine and implemented in TSMC 130 nm CMOS technology. With these techniques, the overall access power of the DSP core is reduced by approximately 15.30% to 16.84%.
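
To illustrate how software running on the DSP engine might drive the three low-power control signals, the following is a minimal sketch in C. The register layout, bit positions, and all names (`mem_ctrl_reg`, `mem_ctrl_enable`, `mem_ctrl_disable`, the `MEM_CTRL_*` flags) are assumptions made for illustration only and are not taken from the design described here; the control register is modeled as an ordinary variable so that the sketch runs on a host machine.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit assignments: one enable bit per circuit scheme.
 * No register map is given in the text; these positions are assumed. */
enum {
    MEM_CTRL_PRECHARGE_EN = 1u << 0, /* controllable pre-charged bit-line */
    MEM_CTRL_LOW_VBL_EN   = 1u << 1, /* low-voltage bit-line */
    MEM_CTRL_RETENTION_EN = 1u << 2, /* data-retention power gating */
    MEM_CTRL_ALL          = 0x7u     /* mask of all defined bits */
};

/* Stand-in for a control register written by the DSP engine.
 * On real hardware this would likely be a memory-mapped or special
 * register; here it is a plain variable (assumption). */
static uint32_t mem_ctrl_reg = 0;

/* Set the requested low-power control bits; returns the new register value. */
static uint32_t mem_ctrl_enable(uint32_t flags)
{
    assert((flags & ~MEM_CTRL_ALL) == 0); /* reject undefined bits */
    mem_ctrl_reg |= flags;
    return mem_ctrl_reg;
}

/* Clear the requested low-power control bits; returns the new register value. */
static uint32_t mem_ctrl_disable(uint32_t flags)
{
    assert((flags & ~MEM_CTRL_ALL) == 0); /* reject undefined bits */
    mem_ctrl_reg &= ~flags;
    return mem_ctrl_reg;
}
```

Under these assumptions, a program could call `mem_ctrl_enable(MEM_CTRL_RETENTION_EN)` before an idle phase and `mem_ctrl_disable(MEM_CTRL_RETENTION_EN)` before resuming memory accesses, which is the kind of software-directed adjustment the abstract describes.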