Please use this identifier to cite or link to this item: http://hdl.handle.net/1783.1/1832
Title: Use of bias term in projection pursuit learning improves approximation and convergence properties
Authors: Kwok, Tin-Yau
Yeung, Dit-Yan
Keywords: Projection pursuit; Regression; Smoother; Universal approximation; Convergence
Issue Date: 1996
Citation: IEEE Transactions on Neural Networks, v. 7, no. 5, Sept. 1996, p. 1168-1183
Abstract: In a regression problem, one is given a d-dimensional random vector X, whose components are called predictor variables, and a random variable Y, called the response. A regression surface describes the general relationship between X and Y. One nonparametric regression technique that has been successfully applied to high-dimensional data is projection pursuit regression (PPR), in which the regression surface is approximated by a sum of empirically determined univariate functions of linear combinations of the predictors. Projection pursuit learning (PPL), proposed by Hwang et al., formulates PPR as a two-layer feedforward neural network. One of the main differences between PPR and PPL is that the smoothers in PPR are nonparametric, whereas those in PPL are based on Hermite functions of some predefined highest order R. While the convergence property of PPR is already known, that of PPL has not been thoroughly studied. In this paper, we demonstrate that PPL networks in the original form proposed by Hwang et al. do not have the universal approximation property for any finite R, and thus cannot converge to the desired function even with an arbitrarily large number of hidden units. However, by including a bias term in each linear projection of the predictor variables, PPL networks regain these capabilities, independent of the exact choice of R. Experimentally, we show that this modification increases the rate of convergence with respect to the number of hidden units, improves generalization performance, and makes the network less sensitive to the setting of R. Finally, we apply PPL to chaotic time series prediction and obtain results superior to those of the cascade-correlation architecture.
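
Illustration (not part of the original record): the following minimal NumPy sketch shows the model form described in the abstract, where each hidden unit applies a Hermite-expansion smoother of highest order R to a linear projection of the predictors, with a bias term b_k added to each projection as proposed in the paper. The function names, the orthonormal Hermite normalization, and the randomly chosen parameters are illustrative assumptions; in PPL these parameters would be fitted by the training procedure of Hwang et al., which is not reproduced here.

    import numpy as np
    from numpy.polynomial.hermite import hermval  # physicists' Hermite polynomials H_r
    from math import factorial, pi, sqrt

    def hermite_functions(z, R):
        """Orthonormal Hermite functions h_0..h_R evaluated at the 1-D array z."""
        vals = []
        for r in range(R + 1):
            coef = np.zeros(r + 1)
            coef[r] = 1.0
            Hr = hermval(z, coef)                          # H_r(z)
            norm = 1.0 / sqrt(2.0**r * factorial(r) * sqrt(pi))
            vals.append(norm * np.exp(-z**2 / 2.0) * Hr)   # h_r(z)
        return np.stack(vals)                              # shape (R+1, len(z))

    def ppl_output(X, W, b, C):
        """
        PPL prediction y_hat(x) = sum_k f_k(w_k . x + b_k),
        where f_k(z) = sum_{r=0}^{R} c_{kr} h_r(z).

        X : (n, d) inputs, W : (K, d) projection directions,
        b : (K,) bias terms (set to zeros to recover the original bias-free form),
        C : (K, R+1) Hermite coefficients of the smoothers.
        """
        K, R_plus_1 = C.shape
        Z = X @ W.T + b                                    # (n, K) projected inputs
        y = np.zeros(X.shape[0])
        for k in range(K):
            H = hermite_functions(Z[:, k], R_plus_1 - 1)   # (R+1, n)
            y += C[k] @ H                                  # add this unit's smoother output
        return y

    # Hypothetical usage with random (untrained) parameters, purely for shape checking:
    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 3))   # 5 samples, d = 3 predictors
    W = rng.normal(size=(2, 3))   # K = 2 hidden units
    b = rng.normal(size=2)        # bias terms (the modification studied in the paper)
    C = rng.normal(size=(2, 4))   # highest order R = 3, so R + 1 = 4 coefficients per unit
    print(ppl_output(X, W, b, C))

Setting b to zeros gives the original bias-free form, which the paper shows lacks the universal approximation property for any finite R; the nonzero bias terms are the proposed remedy.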
Rights: © 1996 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.
URI: http://hdl.handle.net/1783.1/1832
Appears in Collections: CSE Journal/Magazine Articles

Files in This Item:

File: tnn96.pdf
Description: pre-published version
Size: 647 KB
Format: Adobe PDF

