Please use this identifier to cite or link to this item: http://hdl.handle.net/1783.1/2680

Regular languages and codes

Authors Han, Yo-Sub
Issue Date 2005
Summary Regular languages are one of the oldest, well-known topics in formal language theory. Indeed, it has been more than a half century since the introduction of regular languages. During this time period, many challenging and exciting problems have been solved. Because of recent applications in new areas such as XML and bioinformatics, many problems have arisen and some of them have created new areas to investigate with respect to regular languages. First, we survey finite-state automaton constructions and state elimination, which we then use to prove the Kleene theorem. In particular, we study the structural properties of finite-state automata for computing shorter regular ex-pressions using state elimination. We show that we should not eliminate certain states before others to obtain a shorter regular expression. Furthermore, we pro-pose a divide-and-conquer heuristic for state elimination based on the structural properties of given finite-state automata. Second, we look at a popular application of regular languages, the pattern matching problem. We notice that if a given pattern is prefix-free, then there are at most a linear number of matching substrings. Based on this observation, we establish an efficient algorithm for the prefix-free regular-expression matching problem. Furthermore, we vigorously examine subfamilies of regular languages and investigate the decision problems for these subfamilies. We discover that a finite-state automaton for each subfamily preserves certain structural properties. Based on these properties, we design efficient algorithms for decision problems. Moreover, we define a new subfamily of regular languages, simple-regular lan-guages, and study the complexity of this family. Lastly, we examine the prime decomposition of regular languages. We show that the prime infix-free decomposition is not unique whereas the prime outfix-free decomposition is unique. We also suggest algorithms for computing prime decompositions for each subfamily in polynomial time. Note that in general, the prime decomposition of regular languages is not unique and the primality test is conjectured to be NP-complete. We expect that we can extend results in this thesis to more general families such as context-free languages. We aim to develop applications based on this new research.
Note Thesis (Ph.D.)--Hong Kong University of Science and Technology, 2005
Subjects
Language English
Format Thesis
Access
Files in this item:
File Description Size Format
th_redirect.html 339 B HTML