HKUST Library Institutional Repository Banner

HKUST Institutional Repository >
Computer Science and Engineering >
CSE Master Theses  >

Please use this identifier to cite or link to this item:
Title: Scaling up support vector data description by using core-sets
Authors: Chu, Shun-Kwong
Issue Date: 2004
Abstract: Support vector data description (SVDD) is a powerful kernel method that has been commonly used for novelty detection. While its quadratic programming formulation has the important computational advantage of avoiding the problem of local minimum, this has a runtime complexity of Ο(N3), where N is the number of training patterns. It thus becomes prohibitive when the data set is large. In the context of shape-fitting problems in computational geometry, core-sets have been commonly used to obtain approximations for the minimum enclosing ball problem. The runtime of core-set approxmation algorithms grows only linearly with N and the data dimensionality, and is relatively small compared to the other shape-fitting problems. Inspired from such use of the core-sets, we propose an approximation algorithm, called Core-set SVDD (CSVDD), that allows SVDD to scale better to large data sets. Most importantly, the proposed method has a running time that is only linear in N. Experimental results on two large real-world data sets demonstrate that the proposed method can handle data sets much larger than those that can be handled by standard SVDD packages, while its approximate solution still attains comparable, or sometimes even better, novelty detection performance. An ensemble approach of CSVDD, called ensemble CSVDD (ECSVDD), is also proposed. This improves CSVDD's novelty detection performance and reduces its fluctuations, while still maintaining a running time that is only linear in N.
Description: Thesis (M.Phil.)--Hong Kong University of Science and Technology, 2004
x, 64 leaves : ill. ; 30 cm
HKUST Call Number: Thesis COMP 2004 Chu
Appears in Collections:CSE Master Theses

Files in This Item:

File Description SizeFormat

All items in this Repository are protected by copyright, with all rights reserved.