Title: Accelerating genomic sequence compression with graphics processors
Authors: Tan, Yuwei
Issue Date: 2012
Abstract: A modern sequencing instrument can generate hundreds of millions of short reads of genomic data every day. As a result, there is an urgent need for fast algorithms that can efficiently handle, store, compress, access, and decompress genomic data. This thesis focuses on specialized compression schemes that can quickly compress and decompress large-scale genomic data. We develop lightweight compression schemes for data in FASTQ/FASTA format, as well as schemes designed specifically for sequence alignment output. Furthermore, we leverage the massively parallel architecture, high density of arithmetic logic units, and superior memory bandwidth of the Graphics Processing Unit (GPU) to significantly accelerate compression and decompression. We demonstrate that our GPU-powered custom compression schemes achieve a compression ratio similar to or better than that of general-purpose compression algorithms on sequence data, while compressing 20 times faster. Finally, we integrate our compression techniques into state-of-the-art alignment tools and improve their overall speed by an order of magnitude by reducing I/O cost.
Description: Thesis (M.Phil.)--Hong Kong University of Science and Technology, 2012
50 p. : ill. ; 30 cm
HKUST Call Number: Thesis CSED 2012 Tan
Appears in Collections: CSE Master Theses

All items in this Repository are protected by copyright, with all rights reserved.