Power consumption and throughput are the most important criteria in the VLSI implementation of modern, sophisticated wireless systems. In this work, we focus on the low-power design of high-performance wireless digital baseband implementations. In particular, we investigate how power consumption can be reduced by considering the algorithm and the architecture together. We tackle the design of the following power-hungry modules in a multiple-input multiple-output (MIMO) baseband system: the K-best lattice decoder for MIMO detection, and the channel decoder, including the Viterbi decoder and the LDPC decoder. For the K-best lattice decoder, we propose a threshold-based K-best algorithm that significantly reduces the computation, and hence the power consumption, while maintaining similar performance. We also develop the corresponding high-throughput, low-power VLSI architecture for the K-best lattice decoder by properly scheduling the computation. For the Viterbi decoder, we propose a novel low-complexity algorithm based on the limited search algorithm and the scarce state transition (SST) decoding algorithm to reduce the average number of add-compare-select (ACS) computations and accesses to the survivor path memory of the Viterbi algorithm. The proposed decoding scheme has very low overhead and facilitates low-power implementation for high-throughput applications. We also develop an unevenly partitioned memory architecture for the trace-back survivor memory unit to implement the proposed algorithm and reduce the power consumption of the Viterbi decoder. With these two schemes, the power consumption of the Viterbi decoder can be reduced by up to 80%. For the LDPC decoder, we investigate decoding algorithms and architectures that lead to a low-power design by reducing the power consumption of the memory. 
Taking the column overlap of the LDPC parity-check matrix into consideration, we reduce the number of memory accesses by bypassing accesses to the memory that stores the soft posterior reliability values. The amount of achievable memory bypassing depends on the decoding order of the layers. We formulate the search for the best decoding order as an optimization problem and propose an algorithm to obtain the optimal solution. With the proposed scheme, the number of accesses to the memory storing the posterior values, and hence the memory power consumption, is minimized, with reductions of 71.8% to 98.7% for the LDPC codes defined in IEEE 802.11n. In addition, we propose a memory partitioning method that significantly reduces the memory area of the architecture. We also develop a scheduling algorithm that increases the hardware utilization of the partial-parallel LDPC architecture using overlap memory processing.
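
To illustrate why the decoding order of the layers matters for memory bypassing, the following is a minimal sketch and not the algorithm proposed in this work. It assumes that each layer is described by the set of block columns in which it has nonzero entries, that a posterior value can bypass memory whenever two consecutively decoded layers share a column, and that the order is cyclic because decoding iterates over the layers repeatedly. The names `overlap`, `bypass_count`, `best_order`, and `example_layers` are hypothetical, and the exhaustive search is only practical for a small number of layers.

```python
from itertools import permutations


def overlap(a, b):
    """Number of block columns shared by two layers (assumed bypass opportunities)."""
    return len(a & b)


def bypass_count(order, layers):
    """Total column overlap between consecutive layers in a cyclic decoding order."""
    n = len(order)
    return sum(
        overlap(layers[order[i]], layers[order[(i + 1) % n]])
        for i in range(n)
    )


def best_order(layers):
    """Exhaustively search for the cyclic order with the largest bypass count.

    Layer 0 is fixed in the first position because rotations of a cyclic
    order give the same count. Ties keep the first order found.
    """
    best, best_score = None, -1
    for rest in permutations(range(1, len(layers))):
        order = (0,) + rest
        score = bypass_count(order, layers)
        if score > best_score:
            best, best_score = order, score
    return best, best_score


# Toy example (made-up data): four layers over six block columns.
example_layers = [{0, 1, 2}, {3, 4, 5}, {1, 2, 3}, {4, 5, 0}]

if __name__ == "__main__":
    natural = tuple(range(len(example_layers)))
    print("natural order:", natural, bypass_count(natural, example_layers))
    print("best order:   ", *best_order(example_layers))
```

In this toy case the natural order allows only two bypasses per iteration, while a reordered schedule allows six, which is the kind of gap the optimization over decoding orders is meant to exploit.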