Automatic Complexity

1 First steps in automatic complexity

The Kolmogorov complexity of a finite word \(w\) is, roughly speaking, the length of the shortest description \(w^*\) of \(w\) in a fixed formal language. The description \(w^*\) can be thought of as an optimally compressed version of \(w\). Motivated by the non-computability of Kolmogorov complexity, Shallit and Wang  [ 15 ] studied a deterministic finite automaton analogue.

Their notion of automatic complexity is an automata-based and length-conditional analogue of Sipser’s distinguishing complexity \(CD\) ( [ 16 ] , [ 12 , Definition 7.1.4 ] ). This was pointed out by Mia Minnes in a review in the Bulletin of Symbolic Logic from 2012. Another precursor is the length-conditional Kolmogorov complexity [ 12 , Definition 2.2.2 ] .

The automatic complexity of Shallit and Wang is the minimal number of states of an automaton accepting only a given word among its equal-length peers. Finding such an automaton is analogous to the protein folding problem where one looks for a minimum-energy configuration. The protein folding problem may be NP-complete  [ 3 ] , depending on how one formalizes it as a mathematical problem. For automatic complexity, the computational complexity is not known, but a certain generalization to equivalence relations gives an NP-complete decision problem  [ 9 ] .

In this chapter we start to develop the properties of automatic complexity.

1.1 Words

The set of all natural numbers is \(\mathbb {N} =\{ 0,1,2,\dots \} \). Following the von Neumann convention, each natural number \(n\in \mathbb {N} \) is considered to be the set of its predecessors:

\[ 0=\emptyset ,\quad 1=\{ 0\} , \text{ and in general}\quad n=\{ 0,1,\dots ,n-1\} . \]

The power set \(\mathcal P(X)\) of a set \(X\) is the set of all the subsets of \(X\):

\[ \mathcal P(X) = \{ A \mid A \subseteq X\} . \]

If \(\Sigma \) is an alphabet (a set), a word (sometimes called string) is a sequence of elements of \(\Sigma \).

For computer implementations it is often convenient to use an alphabet that is an interval \([0,b)\) in \(\mathbb {N} \). When we want to emphasize that 0, 1, etc. are playing the role of symbols rather than numbers, we often typeset them as \(\mathtt{0}, \mathtt{1}\), etc., respectively.

We denote concatenation of words by \(x\mathbin {{+}{+}}y\) or by juxtaposition \(xy\). In the word \(w=xyz\), \(y\) is called a subword or factor of \(w\). Infinite words are denoted in boldface. For example, there is a unique infinite word \(\mathbf{w}\) such that \(\mathbf w = 0 \mathbf w\), and we write \(\mathbf w = 0^{\infty }\).

Let \(\preceq \) denote the prefix relation, so that \(a\preceq b\) iff \(a\) is a prefix of \(b\), iff there is a word \(c\) such that \(b=ac\).

The concatenation of a word \(x\) and a symbol \(a\) is written \(x\mathbin {{:}{;}}a\) if \(a\) is appended on the right, and \(a\mathbin {{:}{:}}x\) if \(a\) is appended on the left. It may seem most natural to define \(\Sigma ^*\) by induction using \(x\mathbin {{:}{;}}a\), at least for speakers of languages where one reads from left to right. The Lean proof assistant  [ 4 ] (version 3) uses \(a \mathbin {{:}{:}}x\). That approach fits well with co-induction, if infinite words are ordered in order type \(\mathbb {N} \)  [ 10 ] .

For \(n\in \mathbb {N} \), \(\Sigma ^n\) is the set of words of length \(n\) over \(\Sigma \). We may view \(\sigma \in \Sigma ^n\) as a function with domain \(n\) and range \(\Sigma \).

We view functions \(f:A\to B\) as subsets of the cartesian product \(A\times B\),

\[ f=\{ (x,y)\mid y=f(x)\} . \]

The set \(\Sigma ^n\) is both the set of functions from \(n\) to \(\Sigma \) and the cartesian product \((\Sigma ^{n-1})\times \Sigma \) if \(n{\gt}0\). The empty word is denoted \(\varepsilon \) and the first symbol in a word \(x\) is denoted \(x(0)\).

We can define \(\Sigma ^*=\bigcup _{n\in \mathbb {N} }\Sigma ^n\). More properly, the set \(\Sigma ^*\) is defined recursively by the rule that \(\varepsilon \in \Sigma ^*\), and whenever \(s\in \Sigma ^*\) and \(a\in \Sigma \), then \(s\mathbin {{:}{;}}a\in \Sigma ^*\). We define concatenation by structural induction: for \(s,t\in \Sigma ^*\),

\begin{eqnarray*} t\mathbin {{+}{+}}\varepsilon & =& t,\\ t\mathbin {{+}{+}}(s \mathbin {{:}{;}}a) & =& (t\mathbin {{+}{+}}s) \mathbin {{:}{;}}a. \end{eqnarray*}
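
To make the inductive definition concrete, here is a minimal Python sketch (our own illustration; the names `snoc` and `concat` are not from the text): a word is modelled as a tuple of symbols, `snoc` plays the role of \(x\mathbin {{:}{;}}a\), and `concat` recurses on the structure of its second argument exactly as in the two clauses above.

```python
# A word over an alphabet is modelled as a tuple of symbols; () is the empty word.

def snoc(s, a):
    """Append the symbol a on the right of the word s (written s :; a in the text)."""
    return s + (a,)

def concat(t, s):
    """Concatenation t ++ s, defined by structural induction on s."""
    if s == ():                                 # t ++ epsilon = t
        return t
    *init, a = s                                # s = s' :; a
    return snoc(concat(t, tuple(init)), a)      # t ++ (s' :; a) = (t ++ s') :; a

assert concat((0,), (1, 0)) == (0, 1, 0)        # compare Example 2 below
```
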
Definition 1

The length of a word \(s\in \Sigma ^*\) is defined by induction:

\begin{eqnarray*} \lvert \varepsilon \rvert & =& 0\\ \lvert s \mathbin {{:}{;}}a\rvert & =& \lvert s\rvert + 1. \end{eqnarray*}

If \(A\subseteq B\) and \(f\subseteq B\times C\) then \(f\upharpoonright A = \{ (x,y)\in f\mid x\in A\} \) is the restriction of \(f\) to \(A\).

The word \(\sigma \) is also denoted \(\langle \sigma (0),\sigma (1),\dots ,\sigma (\lvert \sigma \rvert -1)\rangle \). By convention, instead of \(\langle \mathtt0, \mathtt1, \mathtt0\rangle \) we write simply \(\mathtt0\mathtt1\mathtt0\).

Example 2

We have

\begin{eqnarray*} \langle \mathtt0\rangle \mathbin {{+}{+}}\langle \mathtt1, \mathtt0\rangle & =& \langle \mathtt0, \mathtt1, \mathtt0\rangle \\ & =& \mathtt0 \mathbin {{:}{:}}\langle \mathtt1, \mathtt0\rangle \\ & =& \langle \mathtt0, \mathtt1\rangle \mathbin {{:}{;}}\mathtt0. \end{eqnarray*}

1.1.1 Occurrences and powers

In this subsection we state some results from the subject “combinatorics on words” that will be used frequently.

The statement \(\mathrm{occurs}(x,k,y)\), that \(x\) occurs in position \(k\) within \(y\), may be defined by induction on \(k\):

\begin{eqnarray*} \mathrm{occurs}(x,0,y) & \iff & \exists z,\, x \mathbin {{+}{+}}z =y,\\ \mathrm{occurs}(x,k+1,y)& \iff & \exists a\in \Sigma ,\, \mathrm{occurs}(a\mathbin {{:}{:}}x,k,y). \end{eqnarray*}
Example 3

We have \(\mathrm{occurs}(\mathrm{na}, 2, \mathrm{banana})\) and \(\mathrm{occurs}(\mathrm{na},4,\mathrm{banana})\).
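
A direct Python transcription of this inductive definition might look as follows (a sketch; words are ordinary strings, and the quantifier over \(a\in \Sigma \) ranges over the symbols occurring in \(y\), which suffices here).

```python
def occurs(x, k, y):
    """Does x occur at position k in y?  Transcribes the inductive definition above."""
    if k == 0:
        return y.startswith(x)                  # exists z with x ++ z = y
    return any(occurs(a + x, k - 1, y)          # exists a with a :: x at position k - 1
               for a in set(y))

assert occurs("na", 2, "banana")
assert occurs("na", 4, "banana")
assert not occurs("na", 3, "banana")
```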

The number of occurrences can be defined, without defining a notion of “occurrence”, as the cardinality of \(\{ k\in \mathbb {N} : \mathrm{occurs}(x,k,y)\} \). To define disjoint occurrences, however, we do need a notion of “occurrence”.

Definition 4

Two occurrences of words \(a\) (starting at position \(i\)) and \(b\) (starting at position \(j\)) in a word \(x\) are disjoint if \(x=uavbw\) where \(u,v,w\) are words and \(\lvert u\rvert =i\), \(\lvert uav\rvert =j\).

Type-theoretically  [ 2 ] we may say that an occurrence of \(x\) in \(y\) is a pair \((k,h)\) where \(h\) is a proof that \(\mathrm{occurs}(x,k,y)\). Of course, we could also say that the occurrence is simply the number \(k\), or even the triple \((x,k,y)\) but in that case the object does not have its defining property within it, so to speak: the number \(k\), and the triple \((x,k,y)\), exist even when \(x\) does not occur at position \(k\) in \(y\).

To naturally speak of disjoint occurrences we make Definition 5. Note that we primarily use zero-based words in this book, i.e., other things being equal we prefer to call the first letter of a word \(x_0\), rather than \(x_1\).

Definition 5

A word \(x=x_0\dots x_{n-1}\), or an infinite word \(x=x_0x_1\dots \), with each \(x_i\in \Sigma \), is viewed as a function \(f_x:n\to \Sigma \) (respectively \(f_x:\mathbb {N} \to \Sigma \)) with \(f_x(i)=x_i\). An occurrence of \(y=y_0\dots y_{m-1}\) in \(x\) is a function \(f_x\upharpoonright [a,a+m-1]\) such that \(f_x(a+i)=y_i\) for each \(0\le i{\lt} m\).

For \(a,b\in \mathbb {N} \), let \([a,b]=\{ x\in \mathbb {N} \mid a\le x\le b\} \). Two occurrences \(f_x\upharpoonright [a,b]\), \(f_x\upharpoonright [c,d]\) are disjoint if \([a,b]\cap [c,d]=\emptyset \).

If moreover \([a,b+1]\cap [c,d]=\emptyset \) then the occurrences are strongly disjoint.

A subword that occurs at least twice in a word \(w\) is a repeated subword of \(w\). Let \(k\in \mathbb {N} \), \(k\ge 1\). A word \(x=x_0\dots x_{n-1}\), \(x_i\in \Sigma \), is \(k\)-rainbow if it has no repeated subword of length \(k\): there are no \(0\le i{\lt}j\le n-k\) with \(x\upharpoonright [i,i+k-1]=x\upharpoonright [j,j+k-1]\). A 1-rainbow word is also known simply as rainbow.
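
Testing the \(k\)-rainbow property by brute force is straightforward; the following Python sketch (the name `is_k_rainbow` is ours) simply checks that all length-\(k\) subwords are distinct.

```python
def is_k_rainbow(x, k):
    """True iff the word x has no repeated subword of length k."""
    subwords = [x[i:i + k] for i in range(len(x) - k + 1)]
    return len(subwords) == len(set(subwords))

assert is_k_rainbow("0123", 1)       # a rainbow word
assert not is_k_rainbow("0120", 1)   # the symbol 0 repeats
assert is_k_rainbow("00110", 2)      # length 2^2 + 2 - 1, maximal by Theorem 7 below
```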

In particular, the empty word \(\varepsilon \) occurs everywhere in every word. However, each word has exactly one occurrence of \(\varepsilon \), since all empty functions are considered to be equal (in set theory, at any rate).

Having properly defined “occurrence” in Definition 5, we can state and prove the trivial Lemma 6.

Lemma 6

Suppose \(n,t\) are positive integers with \(t\le n+1\). A word of length \(n\) has \(n+1-t\) occurrences of subwords of length \(t\).

Proof

Let \(x\) be a word of length \(n\). The occurrences of subwords of length \(t\) are

\[ f_x\upharpoonright [0,t-1], f_x\upharpoonright [1,t], \dots , f_x\upharpoonright [n-t,n-t+(t-1)]. \]
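
As a quick machine check of Lemma 6, the following Python sketch (the name `occurrence_count` is ours) counts occurrences of length-\(t\) subwords by their starting positions.

```python
def occurrence_count(x, t):
    """Number of occurrences of length-t subwords of the word x."""
    return len([x[i:i + t] for i in range(len(x) - t + 1)])

x = "banana"                                   # n = 6
assert occurrence_count(x, 2) == 6 + 1 - 2     # ba, an, na, an, na
```
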
Theorem 7

Let \(k,t\in \mathbb {N} \) with \(k\ge 1\). In an alphabet of cardinality \(k\), a \(t\)-rainbow word has length at most \(k^t+t-1\).

Proof

There are \(k^t\) words of length \(t\). Thus, by Lemma 6, a \(t\)-rainbow word of length \(n\) satisfies \(n+1-t \le k^t\), that is, \(n\le k^t+t-1\).

Theorem 7 has a converse, Theorem 9. To prove it we shall require the notion of a de Bruijn word.

Definition 8

A de Bruijn word of order \(n\) over an alphabet \(\Sigma \) is a sequence \(y\) such that every \(x\in \Sigma ^n\) occurs exactly once as a cyclic substring of \(y\).

Theorem 9

Let \(k,t\in \mathbb {N} \) with \(k\ge 1\). In an alphabet of cardinality \(k\), there exists a \(t\)-rainbow word of length \(k^t+t-1\).

Proof

Case \(t=0\): the empty word is a 0-rainbow word of length \(0=k^0+0-1\). Case \(t=1\): A 1-rainbow word of length \(k\) exists, namely any permutation of the symbols in \(\Sigma \) (Definition 5). For \(t{\gt}1\), let \(x\) be a de Bruijn word \(B(k,t)\) of length \(k^t\) and let \(w= x^2\upharpoonright (k^t+t-1)\). The length-\(t\) subwords of \(w\) are exactly the cyclic length-\(t\) subwords of \(x\), each occurring exactly once, so \(w\) is a \(t\)-rainbow word of length \(k^t+t-1\).
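
The construction in the proof can be carried out explicitly. The Python sketch below (the function name `de_bruijn` is ours) builds a de Bruijn word by the standard Lyndon-word (FKM) algorithm and verifies, for a small example, that the prefix of \(x^2\) of length \(k^t+t-1\) is \(t\)-rainbow.

```python
def de_bruijn(k, t):
    """A de Bruijn word B(k, t) over {0, ..., k-1}, as a list of symbols:
    every word of length t occurs exactly once as a cyclic subword.
    Standard Lyndon-word (FKM) construction."""
    a = [0] * (k * t)
    seq = []

    def db(pos, period):
        if pos > t:
            if t % period == 0:
                seq.extend(a[1:period + 1])
        else:
            a[pos] = a[pos - period]
            db(pos + 1, period)
            for j in range(a[pos - period] + 1, k):
                a[pos] = j
                db(pos + 1, pos)

    db(1, 1)
    return seq

k, t = 2, 3
x = de_bruijn(k, t)                              # e.g. [0, 0, 0, 1, 0, 1, 1, 1]
w = (x + x)[:k**t + t - 1]                       # w = x^2 restricted to length k^t + t - 1
subwords = [tuple(w[i:i + t]) for i in range(len(w) - t + 1)]
assert len(subwords) == k**t == len(set(subwords))   # all k^t subwords of length t are distinct
```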

Definition 10

Let \(\alpha \) be a word of length \(n\), and let \(\alpha _i\) be the \(i^{\text{th}}\) letter of \(\alpha \) for \(1\le i\le n\). We define the \(u^{\text{th}}\) power of \(\alpha \) for certain values of \(u\in \mathbb Q_{\ge 0}\) (the set of nonnegative rational numbers) as follows. For \(u\in \mathbb {N} \):

\begin{eqnarray*} \alpha ^0 & =& \varepsilon ,\\ \alpha ^{m+1} & =& \alpha ^m \mathbin {{+}{+}}\alpha . \end{eqnarray*}
  • If \(u=v+k/n\) where \(v\in \mathbb {N} \) and \(k\) is an integer with \(0{\lt}k{\lt}n\), then \(\alpha ^u\) denotes \(\alpha ^v \alpha _1\dots \alpha _k\) and is called a \(u\)-power.

The word \(w\) is \(u\)-power-free if no nonempty \(v\)-power, \(v\ge u\), occurs (Definition 5) in \(w\). In particular, 2-power-free is called square-free and 3-power-free is called cube-free. Let \(\mathbf{w}\) be an infinite word over the alphabet \(\Sigma \), and let \(x\) be a finite word over \(\Sigma \). Let \(u{\gt}0\) be a rational number. The word \(x\) is said to occur in \(\mathbf{w}\) with exponent \(u\) if \(x^{u}\) occurs in \(\mathbf{w}\) (Definition 5).

The reader may note that the definition of \(u\)-power-free is perhaps not obvious. For example, the word \(aba\) is not \(1.49\)-power-free: while it contains no \(1.49\)-power, it contains a \(1.5\)-power. This way of defining things enables Theorem 12 and goes back at least to Krieger  [ 11 , page 71 ] .
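
The following Python sketch (the names `power` and `is_u_power_free` are ours; exponents are represented by `fractions.Fraction` to avoid rounding) implements Definition 10 and the power-freeness convention just discussed, including the \(\mathtt{aba}\) example.

```python
from fractions import Fraction

def power(alpha, u):
    """The u-th power of the word alpha (Definition 10), for u = v + k/|alpha|."""
    u = Fraction(u)
    v, rem = divmod(u, 1)                  # integer part v, fractional part rem
    k = rem * len(alpha)                   # number of extra letters alpha_1 ... alpha_k
    assert k.denominator == 1, "u must have the form v + k/|alpha|"
    return alpha * int(v) + alpha[:int(k)]

assert power("0110", Fraction(3, 2)) == "011001"
assert power("01", 3) == "010101"

def is_u_power_free(w, u):
    """True iff no nonempty v-power with v >= u occurs in w."""
    u = Fraction(u)
    for i in range(len(w)):
        for j in range(i + 1, len(w) + 1):
            y = w[i:j]                     # a nonempty subword of w
            for p in range(1, len(y) + 1): # candidate period p: is y a (|y|/p)-power?
                v = Fraction(len(y), p)
                if v >= u and y == power(y[:p], v):
                    return False
    return True

assert not is_u_power_free("aba", Fraction(149, 100))   # aba contains the 3/2-power aba
assert is_u_power_free("aba", 2)                        # aba is square-free
```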

Definition 11

The critical exponent \(\mathrm{ce}(\mathbf{w})\) of an infinite word \(\mathbf{w}\) is defined by

\[ \mathrm{ce}(\mathbf{w}) = \sup \{ \alpha \in \mathbb Q \mid \mathbf{w} \text{ contains some $\alpha $-power}\} . \]
Theorem 12 Krieger  [ 11 ]

The critical exponent of \(\mathbf w\) is equal to

\[ \inf \{ \alpha \in \mathbb Q \mid \mathbf{w} \text{ is $\alpha $-power-free}\} . \]
Proof

Let

\begin{eqnarray*} S& =& \{ \alpha \in \mathbb Q \mid \mathbf{w} \text{ is $\alpha $-power-free}\} ,\\ T& =& \{ \alpha \in \mathbb Q \mid \mathbf{w} \text{ contains some $\alpha $-power}\} . \end{eqnarray*}

Paying careful attention to Definition 10, the word \(\mathbf w\) is \(\alpha \)-power-free iff for all \(\beta \ge \alpha \), \(\mathbf w\) contains no \(\beta \)-power. Therefore, \(S\) is upward closed. On the other hand, \(T\) is an upward dense subset of the complement of \(S\): if \(\alpha \notin S\) then \(\mathbf w\) contains a \(\beta \)-power for some \(\beta \ge \alpha \), so \(\beta \in T\). Therefore, \(\mathrm{ce}(\mathbf{w})=\sup T=\inf S\).

As an example of Definition 10, we have \(\mathtt{0110}^{3/2}=\mathtt{011001}\). Note that the expected Power Rule for Exponents fails for word exponentiation. In general, \((x^a)^b\ne x^{ab}\), for instance

\[ (01)^3=010101\ne 010010= ((01)^{3/2})^2. \]

Fix an alphabet \(\Sigma \), and let \(\Sigma ^+\) denote the set of nonempty words over \(\Sigma \).

Lemma 13 Lyndon and Schützenberger  [ 13 ] ; see  [ 14 , Theorem 2.3.2 ]

Let \(c,d,e\in \Sigma ^+\). The equation \(cd=de\) holds iff there exist a nonempty word \(u\), a word \(v\), and a natural number \(p\) such that \(c=uv\), \(d=(uv)^p u\), and \(e=vu\).
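
Readers who like to check such identities mechanically can brute-force Lemma 13 over short words. In the Python sketch below (the helper names `words` and `satisfies_rhs` are ours), both sides of the equivalence are compared for all triples of nonempty binary words of length at most 3.

```python
from itertools import product

def words(sigma, max_len):
    """All nonempty words over the alphabet sigma of length at most max_len."""
    return ["".join(p) for n in range(1, max_len + 1) for p in product(sigma, repeat=n)]

def satisfies_rhs(c, d, e):
    """Is there a nonempty u, a word v and p >= 0 with c = uv, d = (uv)^p u, e = vu?"""
    for i in range(1, len(c) + 1):
        u, v = c[:i], c[i:]
        if e != v + u:
            continue
        rest = d                       # d must be c^p u: strip copies of c, then compare with u
        while True:
            if rest == u:
                return True
            if not rest.startswith(c):
                break
            rest = rest[len(c):]
    return False

for c, d, e in product(words("ab", 3), repeat=3):
    assert (c + d == d + e) == satisfies_rhs(c, d, e)
```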

Theorem 14 Lyndon and Schützenberger  [ 13 ] ; see [ 14 , Theorem 2.3.3 ]

Let \(x,y\in \Sigma ^+\). Then the following four conditions are equivalent:

  1. \(xy=yx\).

  2. There exist \(z\in \Sigma ^+\) and integers \(k,l{\gt}0\) such that \(x=z^k\) and \(y=z^l\).

  3. There exist integers \(i,j{\gt}0\) such that \(x^i=y^j\).