Deformable models for detection and classification of radicals in multi-font printed Chinese characters

Authors Wong, Daniel Man Ho
Issue Date 1994
Summary Chinese character recognition is a very difficult pattern recognition problem because of the large variation in fonts and writing styles among printed and handwritten characters. Existing recognition techniques can give reasonably high accuracy, but they are not robust enough to match the high tolerance that humans show towards various sources of noise. This study is an attempt to look for more robust segmentation and recognition techniques. In particular, the problem of radical detection and classification in multi-font printed Chinese character recognition is chosen as our testing ground, although the approach studied here has been designed with more general applications in mind, including the very difficult offline handwritten Chinese character recognition problem. The approach studied here is called deformable template matching. It is related to earlier work by others, such as the elastic line matching proposed by Burr, the snake (active contour model) proposed by Kass et al., and the elastic net proposed by Durbin and Willshaw. Unlike traditional rigid template matching, this approach formulates matching as a process that tries to deform a template (deformable model) to match some patterns in the image. As with the snake and the elastic net, we define an energy function specific to our application to measure both the deformation of a template (an attributed graph representing a Chinese radical) and its similarity to some patterns (radicals) in the character image after the energy minimization process has settled down. Moreover, our study goes beyond the detection problem to study the classification of radicals as well. This can be done by relating the energy measure to a probability measure through the Boltzmann-Gibbs distribution, with which the Bayesian framework can be used. Extensive experiments have been performed on 1590 different Chinese characters consisting of 11 radical classes.
Each character is represented in 7 different fonts, so there are a total of 11130 (1590 x 7) character images in the data set. Our method achieves 93% accuracy in detecting and correctly classifying the radicals in these images. The classification rates increase to 99% and 99.7% if the correct answer is counted as being among the 3 and 5 best-matched classes, respectively.
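The step from energy to probability described in the summary can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the temperature parameter, the uniform default prior, the example energy values, and the assumption that every class shares the same normalizing constant are all hypothetical choices made for illustration. Each class's final energy E is mapped to a weight proportional to prior * exp(-E / T), the weights are normalized into posteriors, and the k best-matched classes are read off for a top-k evaluation.

```python
import math


def boltzmann_posteriors(energies, priors=None, temperature=1.0):
    """Map per-class energies to posterior probabilities.

    Assumes (for illustration only) that the likelihood of the image
    under each class is proportional to exp(-E / T) with a shared
    normalizing constant, so that P(class | image) is proportional to
    prior(class) * exp(-E(class) / T).
    """
    classes = list(energies)
    if priors is None:
        # Hypothetical default: every radical class is equally likely.
        priors = {c: 1.0 / len(classes) for c in classes}
    # Shift by the minimum energy so the exponentials cannot overflow;
    # the shift cancels out when the weights are normalized.
    e_min = min(energies.values())
    weights = {
        c: priors[c] * math.exp(-(energies[c] - e_min) / temperature)
        for c in classes
    }
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


def top_k(posteriors, k):
    """Return the k class labels with the highest posterior probability."""
    return sorted(posteriors, key=posteriors.get, reverse=True)[:k]


# Made-up final energies for three hypothetical radical templates
# (lower energy means a better match).
energies = {"radical_a": 2.0, "radical_b": 0.5, "radical_c": 1.2}
posteriors = boltzmann_posteriors(energies)
best3 = top_k(posteriors, 3)
```

With a uniform prior the ranking by posterior is simply the ranking by increasing energy; a non-uniform prior or a different temperature changes how strongly small energy differences separate the classes.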
Note Thesis (M.Phil.)--Hong Kong University of Science and Technology, 1994
Language English
Format Thesis
Access View full-text via DOI
Copyrighted to the author. Reproduction is prohibited without the author’s prior written consent.