Determining network size used to require various _ad hoc_ rules of thumb. In recent years, several researchers have proposed methods that handle this problem with as little human intervention as possible. Among these, the cascade-correlation learning architecture is probably the most popular. Despite its promising empirical performance, this heuristically derived method lacks strong theoretical support. In this paper, we analyze the problem of learning in constructive neural networks from a Hilbert space point of view. We derive a novel objective function for training new hidden units using a greedy approach. More importantly, we prove that a network constructed incrementally in this way still preserves the universal approximation property with respect to L^2 performance criteria. Whereas the theoretical results obtained so far on the universal approximation capabilities of multi-layer feed-forward networks provide only existence proofs, our results go one step further by providing a theoretically sound procedure for constructive approximation that preserves the universal approximation property.