Please use this identifier to cite or link to this item: http://hdl.handle.net/1783.1/64128

Piecewise-constant and low-rank approximation for identification of recurrent copy number variations

Authors Zhou, Xiaowei HKUST affiliated (currently or previously).
Liu, Jiming
Wan, Xiang
Yu, Weichuan View this author's profile
Issue Date 2014
Source Bioinformatics , v. 30, (14), July 2014, p. 1943-1949
Summary Motivation: The post-genome era sees urgent need for more novel approaches to extracting useful information from the huge amount of genetic data. The identification of recurrent copy number variations (CNVs) from array-based comparative genomic hybridization (aCGH) data can help understand complex diseases, such as cancer. Most of the previous computational methods focused on single-sample analysis or statistical testing based on the results of single-sample analysis. Finding recurrent CNVs from multi-sample data remains a challenging topic worth further study. Results: We present a general and robust method to identify recurrent CNVs from multi-sample aCGH profiles. We express the raw dataset as a matrix and demonstrate that recurrent CNVs will form a low-rank matrix. Hence, we formulate the problem as a matrix recovering problem, where we aim to find a piecewise-constant and low-rank approximation (PLA) to the input matrix. We propose a convex formulation for matrix recovery and an efficient algorithm to globally solve the problem. We demonstrate the advantages of PLA compared with alternative methods using synthesized datasets and two breast cancer datasets. The experimental results show that PLA can successfully reconstruct the recurrent CNV patterns from raw data and achieve better performance compared with alternative methods under a wide range of scenarios.
ISSN 1367-4803
Language English
Format Article
Access View full-text via DOI
View full-text via Web of Science
View full-text via Scopus
Find@HKUST