Please use this identifier to cite or link to this item: http://hdl.handle.net/1783.1/4605
Title: Extracting web design knowledge : the web de-compiler
Authors: Chan, Michael Hei Lung
Issue Date: 2001
Abstract: The World Wide Web (WWW) has grown dramatically during the last decade. More than 4 billion web pages are in operation all over the world today. WWW applications include news publishing, education, e-commerce, and yellow pages. However, there are no quantitative methods to measure the quality of web pages. We introduce a system, the Web de-compiler (WDC), that analyzes web pages quantitatively and extracts higher-level design objects. As a result, sharing among web pages will not be restricted to image, audio, and video materials but will also include web objects. In my thesis research, I focused on the extraction and classification of web objects. After a preliminary study, I found that it is highly feasible to analyze web pages syntactically using a structural approach. I have developed a parsing tool that parses HTML code, then analyzes and reports the web objects. After running a set of experiments, we demonstrate that the tool can identify and classify web objects to a significant degree. In this thesis, I present the background information and problems in Chapter 2. In Chapter 3, I present the system architecture and implementation, including the system design and the structural-analysis algorithms applied in WDC. Chapter 4 presents the experimental results and the visual results. In the last section, I present the conclusion and future work for WDC.
Description: Thesis (M.Phil.)--Hong Kong University of Science and Technology, 2001
xi, 78 leaves : ill. ; 30 cm
HKUST Call Number: Thesis ELEC 2001 ChanM
URI: http://hdl.handle.net/1783.1/4605
Appears in Collections: ECE Master Theses

All items in this Repository are protected by copyright, with all rights reserved.