Summary

The World Wide Web (WWW) has grown dramatically during the last decade. More than 4 billion web pages are online worldwide today. WWW applications include news publishing, education, e-commerce, and yellow pages. However, there are no quantitative methods for measuring the quality of web pages. This thesis introduces the Web De-Compiler (WDC), a system that analyzes web pages quantitatively and extracts their higher-level design objects. As a result, sharing among web pages need not be restricted to image, audio, and video materials but can also include web objects.

My thesis research focused on the extraction and classification of web objects. A preliminary study showed that it is highly feasible to analyze web page syntax using a structural approach. I developed a parsing tool that parses HTML code, analyzes it, and reports the web objects it finds. The results of a set of experiments show that the tool identifies and classifies web objects effectively.

Chapter 2 presents the background information and the problems addressed. Chapter 3 presents the system architecture and implementation, including the system design and the structural analysis algorithms applied in WDC. Chapter 4 presents the experimental results and the visual results. The final section presents the conclusions and future work on WDC.
