|
Please use this identifier to cite or link to this item:
http://hdl.handle.net/1783.1/4605
|
| Title: | Extracting web design knowledge : the web de-compiler |
| Authors: | Chan, Michael Hei Lung |
| Issue Date: | 2001 |
| Abstract: | The WWW has grown dramatically during the last decade. More than 4 billion web pages are in operation all over the world today. WWW applications include news publishing, education, e-commerce, and yellow pages. However, there are no quantitative methods to measure the quality of web pages. We introduce a system, the Web de-compiler (WDC), that analyzes web pages quantitatively and extracts their higher-level design objects. As a result, sharing among web pages will not be restricted to image, audio, and video materials, but will also include web objects.
In my thesis research, I focused on the extraction and classification of web objects. After a preliminary study, I found that it is highly feasible to analyze web page syntax with a structural approach. I have developed a parsing tool that parses HTML code, then analyzes and reports the web objects. After running a set of experiments, we demonstrate that the tool can identify and classify web objects effectively.
In the thesis, I present the background information and problems in Chapter 2. In Chapter 3, I present the system architecture and implementation, including the system design and the algorithms for structural analysis applied in WDC. Chapter 4 presents the experimental results and the visual results. In the last section, I present the conclusion and future work for WDC. |
| Description: | Thesis (M.Phil.)--Hong Kong University of Science and Technology, 2001. xi, 78 leaves : ill. ; 30 cm. HKUST Call Number: Thesis ELEC 2001 ChanM |
| URI: | http://hdl.handle.net/1783.1/4605 |
| Appears in Collections: | ECE Master Theses
|
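The abstract describes a tool that parses HTML code and reports the web objects it finds. As a rough illustration of that idea only, the sketch below uses Python's standard `html.parser` module to count a few coarse object classes in a page. The `OBJECT_CLASSES` mapping, the `ObjectCounter` class, and the `report_objects` function are hypothetical names chosen for this example; they are not taken from the thesis and do not reflect WDC's actual algorithms or taxonomy.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical mapping from HTML tags to coarse "web object" classes.
# These categories are illustrative assumptions, not the thesis's taxonomy.
OBJECT_CLASSES = {
    "table": "table",
    "form": "form",
    "img": "image",
    "a": "link",
    "ul": "list",
    "ol": "list",
}


class ObjectCounter(HTMLParser):
    """Counts start tags that belong to one of the assumed object classes."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        object_class = OBJECT_CLASSES.get(tag)
        if object_class is not None:
            self.counts[object_class] += 1


def report_objects(html):
    """Parse an HTML string and return a dict of object class -> count."""
    parser = ObjectCounter()
    parser.feed(html)
    parser.close()
    return dict(parser.counts)


sample = '<ul><li><a href="/">Home</a></li></ul><img src="x.png"><form></form>'
print(report_objects(sample))
```

A structural analysis of the kind the abstract mentions would presumably go further, for example by examining how elements nest, but that is beyond what this record states.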
|