Please use this identifier to cite or link to this item:

Scaling up support vector data description by using core-sets

Authors Chu, Shun-Kwong
Issue Date 2004
Summary Support vector data description (SVDD) is a powerful kernel method that has been commonly used for novelty detection. While its quadratic programming formulation has the important computational advantage of avoiding the problem of local minimum, this has a runtime complexity of Ο(N3), where N is the number of training patterns. It thus becomes prohibitive when the data set is large. In the context of shape-fitting problems in computational geometry, core-sets have been commonly used to obtain approximations for the minimum enclosing ball problem. The runtime of core-set approxmation algorithms grows only linearly with N and the data dimensionality, and is relatively small compared to the other shape-fitting problems. Inspired from such use of the core-sets, we propose an approximation algorithm, called Core-set SVDD (CSVDD), that allows SVDD to scale better to large data sets. Most importantly, the proposed method has a running time that is only linear in N. Experimental results on two large real-world data sets demonstrate that the proposed method can handle data sets much larger than those that can be handled by standard SVDD packages, while its approximate solution still attains comparable, or sometimes even better, novelty detection performance. An ensemble approach of CSVDD, called ensemble CSVDD (ECSVDD), is also proposed. This improves CSVDD's novelty detection performance and reduces its fluctuations, while still maintaining a running time that is only linear in N.
Note Thesis (M.Phil.)--Hong Kong University of Science and Technology, 2004
Language English
Format Thesis
Access View full-text via DOI
Files in this item:
File Description Size Format
th_redirect.html 339 B HTML