Please use this identifier to cite or link to this item: http://hdl.handle.net/1783.1/5742
Title: Knowledge-based sense pruning using the HowNet : an alternative to word sense disambiguation
Authors: Wang, Chi-Yung
Issue Date: 2002
Abstract: In this thesis, we address the problem of word sense disambiguation (WSD) in natural language processing through Sense Pruning, using a knowledge-based approach. Traditional WSD methods provide only one meaning for each word in a passage. However, we believe that textual information alone may not be sufficient to determine the exact meaning of each word; that meaning may have to be resolved later, when higher-level knowledge becomes available. Thus, we propose that the objective of WSD is to reduce the number of plausible meanings of a word as much as possible through "Sense Pruning". After Sense Pruning, each word is associated with a list of plausible meanings. We would like to keep the truly correct sense of each word on its own meaning list and yet keep the number of possible meanings of a whole sentence as small as possible. We applied Sense Pruning to Chinese WSD, making use of HowNet. HowNet is a knowledge base that describes all entities in its database by a set of unambiguous sememes. It provides information about the relationships between concepts or their attributes, in which concepts are represented by the sememes. One of our contributions is integrating various kinds of knowledge from HowNet for Sense Pruning, such as relations between sememes, information structures in Chinese, relations of object and attribute, and characteristics of functional words. Based on HowNet, four additional databases were developed for Sense Pruning in this thesis. We evaluated our Sense Pruning algorithm on the Sinica Corpus from Taiwan. Two criteria were used for the evaluation: recall rate and reduction of the number of possible meanings of a sentence. The effects of the size of the analytical window and of the analytical unit, as well as the speed of the algorithm, were fully studied. In summary, Sense Pruning achieves a recall rate of 97% while reducing the number of possible meanings of a sentence by 48% when a whole sentence is taken as the analytical unit.
Description: Thesis (M.Phil.)--Hong Kong University of Science and Technology, 2002
v, 77 leaves : ill. ; 30 cm
HKUST Call Number: Thesis COMP 2002 Wang
URI: http://hdl.handle.net/1783.1/5742
Appears in Collections: CSE Master Theses
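
Illustrative note: the two evaluation criteria named in the abstract (recall rate, and reduction of the number of possible meanings of a sentence) can be sketched in a few lines of Python. This is a minimal sketch, not the thesis's implementation. The abstract does not give exact definitions, so two assumptions are made here: recall is the fraction of words whose correct sense survives pruning, and the number of possible meanings of a sentence is the product of the per-word sense counts. The function names and the toy sense labels are hypothetical.

```python
from math import prod


def recall_rate(pruned, gold):
    """Fraction of words whose gold sense is still on the pruned sense list."""
    kept = sum(1 for senses, g in zip(pruned, gold) if g in senses)
    return kept / len(gold)


def sentence_readings(sense_lists):
    """ASSUMPTION: a sentence's possible meanings = product of per-word sense counts."""
    return prod(len(senses) for senses in sense_lists)


def reduction(before, after):
    """Relative drop in the number of sentence readings after pruning."""
    return 1 - sentence_readings(after) / sentence_readings(before)


# Toy three-word sentence with made-up sense labels (not HowNet entries).
before = [{"a1", "a2"}, {"b1", "b2", "b3"}, {"c1"}]
after = [{"a1"}, {"b1", "b2"}, {"c1"}]
gold = ["a1", "b3", "c1"]  # the correct sense of the second word was pruned away

print(recall_rate(after, gold))
print(reduction(before, after))
```

In this toy case, pruning cuts the sentence from 6 readings to 2 but loses the correct sense of one word out of three, which shows the trade-off between the two criteria that the abstract describes.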

All items in this Repository are protected by copyright, with all rights reserved.