Title: Kernel-based clustering and low rank approximation
Authors: Zhang, Kai
Issue Date: 2008
Abstract: Clustering is an unsupervised data exploration task of fundamental importance to pattern recognition and machine learning. This thesis addresses two clustering paradigms, mixture models and graph-based clustering methods, with a primary focus on improving the scaling behavior of the related algorithms for large-scale applications. For mixture models, we are interested in reducing model complexity in terms of the number of components. We propose a unified algorithm that simultaneously solves “model simplification” and “component clustering”, and apply it successfully to a number of learning algorithms that use mixture models, such as density-based clustering and SVM testing. For graph-based clustering, we propose the density-weighted Nyström method for solving large-scale eigenvalue problems, which demonstrates encouraging performance in normalized cut and kernel principal component analysis. We further extend this to the low-rank approximation of kernel matrices, which is a key component in scaling up kernel machines. We provide an error analysis of the Nyström low-rank approximation, based on which a new sampling scheme is proposed. Our scheme is very efficient and numerically outperforms a number of state-of-the-art approaches such as incomplete Cholesky decomposition, the standard Nyström method, and probabilistic sampling approaches.
Description: Thesis (Ph.D.)--Hong Kong University of Science and Technology, 2008
x, 110 leaves : ill. ; 30 cm
HKUST Call Number: Thesis CSED 2008 Zhang
Appears in Collections: CSE Doctoral Theses
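
For readers unfamiliar with the baseline that the abstract compares against, the following is a minimal sketch of the standard Nyström low-rank approximation of a kernel matrix, K ≈ C W⁺ Cᵀ. It is not the thesis's density-weighted method or its proposed sampling scheme: landmarks are drawn uniformly at random, and the Gaussian (RBF) kernel, the function names, and the parameters `gamma`, `m`, and `seed` are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Y."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def nystrom_factors(X, m, gamma=1.0, seed=0):
    """Standard Nystrom factors (C, W_pinv) so that K is approximated by C @ W_pinv @ C.T.

    Assumption: landmarks are m rows of X sampled uniformly without
    replacement. This is the plain baseline, not a density-weighted variant.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.shape[0], size=m, replace=False)
    C = rbf_kernel(X, X[idx], gamma)  # n x m block of the full kernel matrix
    W = C[idx]                        # m x m kernel among the landmarks
    return C, np.linalg.pinv(W)


def relative_error(X, m, gamma=1.0, seed=0):
    """Relative Frobenius-norm error of the Nystrom approximation.

    Builds the full n x n kernel matrix, so it is only for small checks.
    """
    K = rbf_kernel(X, X, gamma)
    C, W_pinv = nystrom_factors(X, m, gamma, seed)
    K_approx = C @ W_pinv @ C.T
    return np.linalg.norm(K - K_approx, "fro") / np.linalg.norm(K, "fro")
```

In practice only the factors `C` (n × m) and `W_pinv` (m × m) are stored, which is where the savings over the full n × n matrix come from. The quality of the approximation depends on which landmarks are chosen, which is the question a sampling scheme addresses.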

