We design and implement Mars, a MapReduce runtime system accelerated with graphics processing units (GPUs). MapReduce is a simple and flexible parallel programming paradigm, originally proposed by Google to ease large-scale data processing on thousands of CPUs. Compared with CPUs, GPUs offer an order of magnitude higher computation power and memory bandwidth. However, GPUs are designed as special-purpose co-processors, and their programming interfaces are less familiar to MapReduce programmers than those of CPUs. To harness the power of GPUs for MapReduce, we developed Mars to run on NVIDIA GPUs, AMD GPUs, and multi-core CPUs. Furthermore, we integrated Mars into Hadoop, an open-source CPU-based distributed MapReduce system. Mars hides the programming complexity of GPUs behind the simple and familiar MapReduce interface, and automatically manages task partitioning, data distribution, and parallelization on the processors. We implemented six representative applications on Mars and evaluated their performance on PCs equipped with GPUs as well as multi-core CPUs. GPU acceleration with an NVIDIA GTX280 achieved a speedup of an order of magnitude over a quad-core CPU. Using both the GPU and the CPU improved on GPU-only performance by 40% for some applications. Additionally, integrating Mars into Hadoop enabled GPU acceleration for a network of PCs.
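To make the programming model concrete, the following is a minimal sketch of the kind of interface a MapReduce runtime exposes: the programmer supplies only a map function and a reduce function, and the runtime handles partitioning and grouping. It is not the Mars API. The names `run_mapreduce`, `word_count_map`, and `word_count_reduce` are hypothetical, word counting is used only as a familiar example, and the sketch runs sequentially on the CPU where a real runtime would dispatch the independent chunks to GPU or CPU threads.

```python
from collections import defaultdict

# Hypothetical illustration of the MapReduce programming model; not the Mars API.
def run_mapreduce(records, map_fn, reduce_fn, num_partitions=4):
    # Task partitioning: split the input into chunks, one per (notional) worker.
    chunks = [records[i::num_partitions] for i in range(num_partitions)]

    # Map stage: chunks are independent, so a real runtime could process
    # them in parallel. Here they are processed one after another.
    intermediate = []
    for chunk in chunks:
        for record in chunk:
            intermediate.extend(map_fn(record))

    # Group stage: collect all values emitted under the same key.
    groups = defaultdict(list)
    for key, value in intermediate:
        groups[key].append(value)

    # Reduce stage: combine each key's values into a single result.
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def word_count_map(line):
    # Emit a (word, 1) pair for every word in the input line.
    return [(word, 1) for word in line.split()]


def word_count_reduce(key, values):
    # Sum the per-occurrence counts for one word.
    return sum(values)


counts = run_mapreduce(["a b a", "b c", "a"], word_count_map, word_count_reduce)
```

The user-written part is limited to the two small functions at the bottom. Everything inside `run_mapreduce` stands in for the work that the runtime is described as automating.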