Periodicity and Repetition in Combinatorics on Words

Date

2004

Authors

Wang, Ming-wei

Publisher

University of Waterloo

Abstract

This thesis concerns combinatorics on words. I present many results in this area, united by the common themes of periodicity and repetition. Most of these results have already appeared in journal or conference articles. Chapters 2 through 5 contain the most significant contributions of this thesis. Below is a brief synopsis of each chapter.

Chapter 1 introduces the subject area and gives some background information.

Chapter 2 and Chapter 3 grew out of attempts to prove the Decreasing Length Conjecture (DLC). The DLC states that if <em>&phi;</em> is a morphism over an alphabet of size <em>n</em>, then for any word <em>w</em> there exist 0 &le; <em>i</em> &lt; <em>j</em> &le; <em>n</em> such that |<em>&phi;</em><sup><em>i</em></sup>(<em>w</em>)| &le; |<em>&phi;</em><sup><em>j</em></sup>(<em>w</em>)|. The DLC was proved by S. Cautis and S. Yazdani in <em>Periodicity, morphisms, and matrices</em> in <em>Theoret. Comput. Sci.</em> (<strong>295</strong>) 2003, 107-121.

More specifically, Chapter 2 gives two generalizations of the classical Fine and Wilf theorem, which states the following. Let (<em>f</em><sub><em>n</em></sub>)<sub><em>n</em>&ge;0</sub> and (<em>g</em><sub><em>n</em></sub>)<sub><em>n</em>&ge;0</sub> be two periodic sequences of real numbers, of period lengths <em>h</em> and <em>k</em> respectively. (a) If <em>f</em><sub><em>n</em></sub> = <em>g</em><sub><em>n</em></sub> for 0 &le; <em>n</em> &lt; <em>h</em> + <em>k</em> &minus; gcd(<em>h</em>, <em>k</em>), then <em>f</em><sub><em>n</em></sub> = <em>g</em><sub><em>n</em></sub> for all <em>n</em> &ge; 0. (b) The conclusion in (a) would be false if <em>h</em> + <em>k</em> &minus; gcd(<em>h</em>, <em>k</em>) were replaced by any smaller number. We give similar results in which the equality in (a) is replaced by an inequality, and results that apply to more than two sequences. These generalizations can be used to prove weak versions of the DLC.

Chapter 3 gives an essentially optimal bound for the following matrix problem. Let <em>A</em> be an <em>n</em> &times; <em>n</em> matrix with non-negative integer entries.
Let <em>f</em>(<em>n</em>) be the smallest integer such that for all <em>A</em>, there exist <em>i</em> &lt; <em>j</em> &le; <em>f</em>(<em>n</em>) such that <em>A</em><sup><em>i</em></sup> &le; <em>A</em><sup><em>j</em></sup>, where <em>A</em> &le; <em>B</em> means that each entry of <em>A</em> is less than or equal to the corresponding entry of <em>B</em>. The question is to find good upper bounds on <em>f</em>(<em>n</em>). This problem has been attacked in two different ways. We give a method that proves an essentially optimal upper bound of <em>n</em> + <em>g</em>(<em>n</em>), where <em>g</em>(<em>n</em>) is the maximum order of an element of the symmetric group on <em>n</em> objects. A second approach yields a slightly worse upper bound, but it produces a result of independent interest concerning <em>irreducible matrices</em>. A non-negative <em>n</em> &times; <em>n</em> matrix <em>A</em> is <em>irreducible</em> if &sum;<sub><em>i</em>=0</sub><sup><em>n</em>&minus;1</sup> <em>A</em><sup><em>i</em></sup> has all entries strictly positive. We show in Chapter 3 that if <em>A</em> is an irreducible <em>n</em> &times; <em>n</em> matrix, then there exists an integer <em>e</em> &gt; 0 with <em>e</em> = <em>O</em>(<em>n</em> log <em>n</em>) such that the diagonal entries of <em>A</em><sup><em>e</em></sup> are all strictly positive. These results improve on results in my Master's thesis and give a version of the DLC in the matrix setting. They have direct applications to the growth rate of words in a D0L system.

Chapter 4 gives a complete characterization of two-sided fixed points of morphisms. A weak version of the DLC is used to prove a non-trivial case of the characterization. This characterization completes the previous work of Head and Lando on finite and one-sided fixed points of morphisms.

Chapters 5, 6 and 7 deal with avoiding different kinds of repetitions in infinite words. Chapter 5 deals with problems about simultaneously avoiding cubes and large squares in infinite binary words.
We use morphisms and fixed points to construct an infinite binary word that simultaneously avoids cubes and squares <em>xx</em> with |<em>x</em>| &ge; 4. M. Dekking was the first to show that such words exist. His construction used a non-uniform morphism; we use only uniform morphisms, and the construction in Chapter 5 is somewhat simpler than Dekking's.

Chapter 6 deals with problems of avoiding several patterns simultaneously. The patterns are generated by a simple arithmetic operation.

Chapter 7 proves a variant of a result of H. Friedman. We say a word <em>y</em> is a <em>subsequence</em> of a word <em>z</em> if <em>y</em> can be obtained by striking out zero or more symbols from <em>z</em>. Friedman proved that over any finite alphabet, there exists a longest finite word <em>x</em> = <em>x</em><sub>1</sub><em>x</em><sub>2</sub> &middot;&middot;&middot; <em>x</em><sub><em>n</em></sub> such that <em>x</em><sub><em>i</em></sub><em>x</em><sub><em>i</em>+1</sub> &middot;&middot;&middot; <em>x</em><sub>2<em>i</em></sub> is not a subsequence of <em>x</em><sub><em>j</em></sub><em>x</em><sub><em>j</em>+1</sub> &middot;&middot;&middot; <em>x</em><sub>2<em>j</em></sub> for 1 &le; <em>i</em> &lt; <em>j</em> &le; <em>n</em>/2. We call such words <em>self-avoiding</em>. We show that if &#8220;subsequence&#8221; is replaced by &#8220;subword&#8221; in the definition of self-avoiding, then there are infinite self-avoiding words over a 3-letter alphabet but not over binary or unary alphabets. This solves a question posed by Jean-Paul Allouche.

In Chapter 8 we give an application of the existence of infinitely many square-free words over a 3-letter alphabet. The <em>duplication language</em> generated by a word <em>w</em> is, roughly speaking, the set of words that can be obtained from <em>w</em> by repeatedly doubling the subwords of <em>w</em>. We use the existence of infinitely many square-free words over a 3-letter alphabet to prove that the duplication language generated by a word containing at least 3 distinct letters is not regular. This solves an open problem due to J. Dassow, V. Mitrana and Gh. Paun. It is known that the duplication language generated by a word over a binary alphabet is regular. It is not known whether such languages are context-free if the generator word contains at least 3 distinct letters. After the defence of my thesis, I noticed that essentially the same argument was given by Ehrenfeucht and Rozenberg in <em>Regularity of languages generated by copying systems</em> in <em>Discrete Appl. Math.</em> (<strong>8</strong>) 1984, 313-317.

Chapter 9 defines a new &#8220;descriptive&#8221; measure of complexity of a word <em>w</em>: the minimal size of a deterministic finite automaton that accepts <em>w</em> (and possibly other words) but no other words of length |<em>w</em>|. Lower and upper bounds for various classes of words are proved in Chapter 9. Many of the proofs make essential use of repetitions in words.
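
As an illustration of the Fine and Wilf bound quoted in the abstract, the following sketch checks parts (a) and (b) by brute force for small period lengths. It is not taken from the thesis: the function name `longest_agreement` is hypothetical, and the search is restricted to sequences over the binary alphabet {0, 1} rather than real numbers, an assumption made only to keep the enumeration finite.

```python
from itertools import product
from math import gcd


def longest_agreement(h, k, alphabet=(0, 1)):
    """Return the largest m such that two DISTINCT periodic sequences,
    with period lengths h and k over `alphabet`, agree on their first m terms.

    Hypothetical helper for illustration; exhaustive, so only for small h, k.
    """
    # Two sequences with periods h and k are equal everywhere
    # if and only if they agree on one common period, lcm(h, k).
    horizon = h * k // gcd(h, k)
    best = 0
    for u in product(alphabet, repeat=h):      # one period of f
        for v in product(alphabet, repeat=k):  # one period of g
            n = 0
            while n < horizon and u[n % h] == v[n % k]:
                n += 1
            if n < horizon:  # first disagreement found: f and g are distinct
                best = max(best, n)
    return best


if __name__ == "__main__":
    for h in range(1, 6):
        for k in range(h, 6):
            bound = h + k - gcd(h, k)
            print(h, k, longest_agreement(h, k), bound - 1)
```

For each pair of period lengths tried, the value returned should equal h + k - gcd(h, k) - 1: no two distinct sequences agree on h + k - gcd(h, k) initial terms, which is part (a), and some pair agrees on one term fewer, which is part (b).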

Keywords

Computer Science
