||Clustering is an unsupervised data exploration scenario that is of fundamental importance to pattern recognition and machine learning. This thesis involves two types of clustering paradigms, the mixture models and graph-based clustering methods, with the primary focus on how to improve the scaling behavior of related algorithms for large-scale application. With regard to mixture models, we are interested in reducing the model complexity in terms of number of components. We propose a unified algorithm to simultaneously solve “model simplification” and “component clustering”, and apply it with success in a number of learning algorithms using mixture models, such as density based clustering and SVM testing. For graph-based clustering, we propose the density weighted Nyström method for solving large scale eigenvalue problems, which demonstrates encouraging performance in the normalized-cut and kernel principal component analysis. We further extend this to the low rank approximation of kernel matrices, which is the key component to scaling up the kernel machines. We provide an error analysis on the Nyström low rank approximation, based on which a new sampling scheme is proposed. Our scheme is very efficient and numerically outperforms a number of state-of-the-art approaches such as incomplete Cholesky decomposition, the standard Nyström method, and probabilistic sampling approaches.