Automatic Complexity

1 First steps in automatic complexity

The Kolmogorov complexity of a finite word w is, roughly speaking, the length of the shortest description w′ of w in a fixed formal language. The description w′ can be thought of as an optimally compressed version of w. Motivated by the non-computability of Kolmogorov complexity, Shallit and Wang [19] studied a deterministic finite automaton analogue.

Their notion of automatic complexity is an automata-based and length-conditional analogue of Sipser's distinguishing complexity CD ([21], [15, Definition 7.1.4]). This was pointed out by Mia Minnes in a 2012 review in the Bulletin of Symbolic Logic. Another precursor is the length-conditional Kolmogorov complexity [15, Definition 2.2.2].

The automatic complexity of Shallit and Wang is the minimal number of states of an automaton accepting only a given word among its equal-length peers. Finding such an automaton is analogous to the protein folding problem, where one looks for a minimum-energy configuration. The protein folding problem may be NP-complete [3], depending on how one formalizes it as a mathematical problem. For automatic complexity, the computational complexity is not known, but a certain generalization to equivalence relations gives an NP-complete decision problem [12].

In this chapter we start to develop the properties of automatic complexity.

1.1 Words

The set of all natural numbers is ℕ = {0, 1, 2, …}. Following the von Neumann convention, each natural number n ∈ ℕ is considered to be the set of its predecessors:

0 = ∅, 1 = {0}, and in general n = {0, 1, …, n − 1}.

The power set P(X) of a set X is the set of all the subsets of X:

P(X) = {A ∣ A ⊆ X}.

If Σ is an alphabet (a set), a word (sometimes called string) is a sequence of elements of Σ.

For computer implementations it is often convenient to use an alphabet that is an interval [0, b) in ℕ. When we want to emphasize that 0, 1, etc. are playing the role of symbols rather than numbers, we often typeset them as 𝟶, 𝟷, etc., respectively.

We denote concatenation of words by x ++ y or by juxtaposition xy. In the word w = xyz, y is called a subword or factor of w. Infinite words are denoted in boldface. For example, there is a unique infinite word w such that w = 0w, and we write w = 0^ω.

Let ⪯ denote the prefix relation, so that a ⪯ b iff a is a prefix of b, iff there is a word c such that b = ac.

The concatenation of a word x and a symbol a is written x :; a if a is appended on the right, and a :: x if a is appended on the left. It may seem most natural to define Σ^* by induction using x :; a, at least for speakers of languages that are read from left to right. The Lean proof assistant [4] (version 3) uses a :: x. That approach fits well with co-induction, if infinite words are ordered in order type ℕ [13].

For n ∈ ℕ, Σ^n is the set of words of length n over Σ. We may view σ ∈ Σ^n as a function with domain n and range Σ.

We view functions f : A → B as subsets of the cartesian product A × B,

f = {(x, y) ∣ y = f(x)}.

The set Σ^n is both the set of functions from n to Σ and the cartesian product (Σ^{n−1}) × Σ if n > 0. The empty word is denoted ε and the first symbol in a word x is denoted x(0).

We can define Σ^* = ⋃_{n ∈ ℕ} Σ^n. More properly, the set Σ^* is defined recursively by the rule that ε ∈ Σ^*, and whenever s ∈ Σ^* and a ∈ Σ, then s :; a ∈ Σ^*. We define concatenation by structural induction: for s, t ∈ Σ^*,

t ++ ε = t,   t ++ (s :; a) = (t ++ s) :; a.
Definition 1

The length of a word s ∈ Σ^* is defined by induction:

|ε| = 0,   |s :; a| = |s| + 1.

If A ⊆ B and f ⊆ B × C then f ↾ A = {(x, y) ∈ f ∣ x ∈ A} is the restriction of f to A.

The word σ is also denoted σ(0), σ(1), …, σ(|σ| − 1). By convention, instead of 0, 1, 0 we write simply 010.

Example 2

We have

0 ++ 1,0 = 0,1,0 = 0 :: 1,0 = 0,1 :; 0.
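These operations are easy to model concretely. The following sketch is my own illustration, not from the text: words are represented as Python tuples, snoc plays the role of :; and cons the role of ::, and ++ and length are defined by the structural inductions above.

```python
# Illustration only: words as Python tuples of symbols.
# snoc(s, a) models s :; a and cons(a, s) models a :: s.

def snoc(s, a):
    """Append the symbol a on the right of the word s."""
    return s + (a,)

def cons(a, s):
    """Prepend the symbol a on the left of the word s."""
    return (a,) + s

def concat(t, s):
    """t ++ s by structural induction on s:
    t ++ e = t  and  t ++ (s :; a) = (t ++ s) :; a."""
    if s == ():
        return t
    return snoc(concat(t, s[:-1]), s[-1])

def length(s):
    """|s| by induction: |e| = 0 and |s :; a| = |s| + 1."""
    return 0 if s == () else length(s[:-1]) + 1
```

With these definitions, Example 2 becomes the chain of equalities concat((0,), (1, 0)) == (0, 1, 0) == cons(0, (1, 0)) == snoc((0, 1), 0).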

1.1.1 Occurrences and powers

In this subsection we state some results from the subject “combinatorics on words” that will be used frequently.

The statement occurs(x, k, y), that x occurs at position k within y, may be defined by induction on the position:

occurs(x, 0, y) ⟺ ∃z, x ++ z = y,   occurs(x, n + 1, y) ⟺ ∃a ∈ Σ, occurs(a :: x, n, y).
Example 3

We have occurs(na,2,banana) and occurs(na,4,banana).
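The inductive definition of occurs can be transcribed directly. The sketch below is an illustration only, under the convention that words are Python strings and that the alphabet may be read off from y; it can be checked against Example 3.

```python
def occurs(x, k, y):
    """occurs(x, k, y): x occurs at position k in y, by induction on k.
    Base case: x occurs at position 0 iff x is a prefix of y.
    Inductive case: x occurs at position k iff a :: x occurs at
    position k - 1 for some symbol a."""
    if k == 0:
        return y.startswith(x)
    return any(occurs(a + x, k - 1, y) for a in set(y))
```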

The number of occurrences can be defined, without defining a notion of “occurrence”, as the cardinality of {k ∈ ℕ : occurs(x, k, y)}. To define disjoint occurrences, though, we do need a notion of “occurrence”.

Definition 4

Two occurrences of words a (starting at position i) and b (starting at position j) in a word x are disjoint if x = uavbw where u, v, w are words, |u| = i, and |uav| = j.

Type-theoretically [2] we may say that an occurrence of x in y is a pair (k, h) where h is a proof that occurs(x, k, y). Of course, we could also say that the occurrence is simply the number k, or even the triple (x, k, y), but in that case the object does not have its defining property within it, so to speak: the number k, and the triple (x, k, y), exist even when x does not occur at position k in y.

To speak naturally of disjoint occurrences we make Definition 5. Note that we primarily use zero-based words in this book, i.e., other things being equal we prefer to call the first letter of a word x_0 rather than x_1.

Definition 5

A word x = x_0 ⋯ x_{n−1}, or an infinite word x = x_0 x_1 ⋯, with each x_i ∈ Σ, is viewed as a function f_x : n → Σ with f_x(i) = x_i. An occurrence of y = y_0 ⋯ y_{m−1} in x is a function f_x ↾ [a, a + m − 1] such that f_x(a + i) = y_i for each 0 ≤ i < m.

For a, b ∈ ℕ, let [a, b] = {x ∈ ℕ ∣ a ≤ x ≤ b}. Two occurrences f_x ↾ [a, b], f_x ↾ [c, d] are disjoint if [a, b] ∩ [c, d] = ∅.

If moreover [a, b + 1] ∩ [c, d] = ∅ then the occurrences are strongly disjoint.

A subword that occurs at least twice in a word w is a repeated subword of w. Let k ∈ ℕ, k ≥ 1. A word x = x_0 ⋯ x_{n−1}, x_i ∈ Σ, is k-rainbow if it has no repeated subword of length k: there are no 0 ≤ i < j ≤ n − k with x_i ⋯ x_{i+k−1} = x_j ⋯ x_{j+k−1}. A 1-rainbow word is also known simply as rainbow.

In particular, the empty word ε occurs everywhere in every word. However, each word has exactly one occurrence of ε, since all empty functions are considered to be equal (in set theory, at any rate).

Having properly defined “occurrence” in Definition 5, we can state and prove the trivial Lemma 6.

Lemma 6

Suppose n, t are positive integers with t ≤ n + 1. A word of length n has n + 1 − t occurrences of subwords of length t.

Proof
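Lemma 6 and the k-rainbow property are easy to check by machine. The sketch below is an illustration only (words as Python strings): occurrences lists the length-t occurrences counted by Lemma 6, and is_rainbow tests for a repeated subword.

```python
def occurrences(y, t):
    """All occurrences of length-t subwords of y, as (position, subword)
    pairs; by Lemma 6 there are |y| + 1 - t of them."""
    return [(i, y[i:i + t]) for i in range(len(y) - t + 1)]

def is_rainbow(x, k):
    """x is k-rainbow iff no length-k subword of x is repeated."""
    factors = [f for _, f in occurrences(x, k)]
    return len(factors) == len(set(factors))
```

For instance, banana has 6 + 1 − 2 = 5 occurrences of length-2 subwords and is not 2-rainbow, since an is repeated.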
Theorem 7

Let k, t ∈ ℕ with k ≥ 1. In an alphabet of cardinality k, a t-rainbow word has length at most k^t + t − 1.

Proof

Theorem 7 has a converse, Theorem 9. To prove it we shall require the notion of a de Bruijn word.

Definition 8

A de Bruijn word of order n over an alphabet Σ is a sequence y such that every x ∈ Σ^n occurs exactly once as a cyclic substring of y.

Theorem 9

Let k, t ∈ ℕ with k ≥ 1. In an alphabet of cardinality k, there exists a t-rainbow word of length k^t + t − 1.

Proof
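The witness for Theorem 9 can be produced explicitly. The sketch below builds a de Bruijn sequence with the standard FKM (Fredricksen–Kessler–Maiorana, Lyndon-word) algorithm — a construction chosen here for concreteness; the text's proof may proceed differently — and then unrolls the cyclic sequence by repeating its first t − 1 symbols, giving an ordinary t-rainbow word of length k^t + t − 1.

```python
def de_bruijn(k, n):
    """de Bruijn sequence of order n over {0, ..., k-1} (k >= 2),
    via the standard FKM algorithm concatenating Lyndon words."""
    seq = []
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                seq.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return seq

def rainbow_word(k, t):
    """A t-rainbow word of length k**t + t - 1: the cyclic de Bruijn
    sequence of order t, unrolled by repeating its first t-1 symbols."""
    d = de_bruijn(k, t)
    return d + d[:t - 1]
```

For k = 2, t = 3 this yields a word of length 2³ + 2 = 10 whose eight length-3 subwords are pairwise distinct.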
Definition 10

Let α be a word of length n, and let α_i be the i-th letter of α for 1 ≤ i ≤ n. We define the u-th power of α for certain values of u ∈ ℚ_{≥0} (the set of nonnegative rational numbers) as follows. For u ∈ ℕ:

α^0 = ε,   α^{m+1} = α^m ++ α.
  • If u = v + k/n, where v ∈ ℕ and k is an integer with 0 < k < n, then α^u denotes α^v α_1 ⋯ α_k and is called a u-power.

The word w is u-power-free if no nonempty v-power with v ≥ u occurs (Definition 5) in w. In particular, 2-power-free is called square-free and 3-power-free is called cube-free. Let w be an infinite word over the alphabet Σ, and let x be a finite word over Σ. Let u > 0 be a rational number. The word x is said to occur in w with exponent u if x^u occurs in w (Definition 5).

The reader may note that the definition of u-power-free is perhaps not obvious. For example, the word aba is not 1.49-power-free: while it contains no 1.49-power, it contains a 1.5-power. This way of defining things enables Theorem 12 and goes back at least to Krieger [14, page 71].
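Definition 10 can be made concrete as follows. This is an illustration only, with words as Python strings and the exponent as a Fraction; the code checks that the fractional part times |α| is an integer, as the definition requires.

```python
from fractions import Fraction

def word_power(alpha, u):
    """alpha raised to the rational power u >= 0 (Definition 10):
    for u = v + k/n with n = |alpha| and 0 <= k < n,
    alpha**u = (alpha repeated v times) + alpha[:k]."""
    u = Fraction(u)
    n = len(alpha)
    v = u // 1            # integer part of the exponent
    k = (u % 1) * n       # fractional part scaled by the length
    if k.denominator != 1:
        raise ValueError("u is not of the form v + k/|alpha|")
    return alpha * int(v) + alpha[:int(k)]
```

For instance, word_power("ab", Fraction(3, 2)) returns "aba", exhibiting the 1.5-power inside aba mentioned above.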

Definition 11

The critical exponent ce(w) of an infinite word w is defined by

ce(w) = sup{α ∈ ℚ ∣ w contains some α-power}.
Theorem 12 Krieger [14]

The critical exponent of w is equal to

inf{α ∈ ℚ ∣ w is α-power-free}.
Proof

As an example of Definition 10, we have (0110)^{3/2} = 011001. Note that the expected Power Rule for Exponents fails for word exponentiation: in general, (x^a)^b ≠ x^{ab}; for instance

(01)^3 = 010101 ≠ 010010 = ((01)^{3/2})^2.

Fix an alphabet Σ, and let Σ^+ denote the set of nonempty words over Σ.

Lemma 13 Lyndon and Schützenberger [16]; see [18, Theorem 2.3.2]

Let c, d, e ∈ Σ^+. The equation cd = de holds iff there exist a nonempty word u, a word v, and a natural number p such that c = uv, d = (uv)^p u, and e = vu.

Theorem 14 Lyndon and Schützenberger [16]; see [18, Theorem 2.3.3]

Let x, y ∈ Σ^+. Then the following three conditions are equivalent:

  1. xy=yx.

  2. There exist z ∈ Σ^+ and integers k, l > 0 such that x = z^k and y = z^l.

  3. There exist integers i, j > 0 such that x^i = y^j.
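Theorem 14 can be probed computationally. In the sketch below (an illustration; the helper name common_root is my own), a common root z for x and y, when one exists, can always be taken of length gcd(|x|, |y|), so a single prefix check decides condition 2 — and hence, by the theorem, condition 1.

```python
from math import gcd

def common_root(x, y):
    """Return a word z with x = z**k and y = z**l (k, l > 0) if one
    exists, else None.  If any common root exists, the prefix of x of
    length gcd(|x|, |y|) is one."""
    if not x or not y:
        return None
    d = gcd(len(x), len(y))
    z = x[:d]
    if z * (len(x) // d) == x and z * (len(y) // d) == y:
        return z
    return None
```

By Theorem 14, common_root(x, y) is not None exactly when x + y == y + x for nonempty strings x and y.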

1.2 Automata

In Definition 15, some basic objects of study in this book are introduced.

Definition 15

Let Σ be a finite alphabet and let Q be a finite set whose elements are called states. A nondeterministic finite automaton (NFA) is a 5-tuple M = (Q, Σ, δ, q_0, F). The transition function δ : Q × Σ → P(Q) maps each (q, b) ∈ Q × Σ to a subset of Q. Within Q we find the initial state q_0 ∈ Q and the set of final states F ⊆ Q. The function δ is extended to a function δ* : Q × Σ^* → P(Q) by structural induction:

δ*(q, ε) = {q},   δ*(q, σ :; i) = ⋃_{s ∈ δ*(q, σ)} δ(s, i).

Overloading notation, we may also write δ = δ*. The language accepted by M is

L(M) = {x ∈ Σ^* ∣ δ*(q_0, x) ∩ F ≠ ∅}.

A deterministic finite automaton (DFA) is also a 5-tuple M = (Q, Σ, δ, q_0, F). In this case, δ : Q × Σ → Q is a total function and is extended to δ* : Q × Σ^* → Q by δ*(q, ε) = q and

δ*(q, σ :; i) = δ(δ*(q, σ), i).   (1)

If the domain of δ is a subset of Q × Σ, M is an incomplete DFA. In this case, M coincides with an NFA with a special property which we can check for effectively. We denote a partial function f from A to B by f : (⊆ A) → B or f : A ⇀ B.

Finally, the set of words accepted by M is

L(M) = {x ∈ Σ^* ∣ δ*(q_0, x) ∈ F}.
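The inductive extension δ* is straightforward to implement. Below is a sketch (an illustration; the dictionary encoding of δ is an assumption of this example) computing δ* for an NFA by iterating the union clause, together with the acceptance test. A DFA is the special case where every value of δ is a singleton.

```python
def delta_star(delta, q, word):
    """The set delta*(q, word), where delta maps (state, symbol) to a
    set of states and a missing key means the empty set.  Follows the
    induction delta*(q, e) = {q} and
    delta*(q, s :; i) = union of delta(r, i) over r in delta*(q, s)."""
    states = {q}
    for symbol in word:
        states = set().union(*(delta.get((r, symbol), set()) for r in states))
    return states

def accepts(delta, q0, final, word):
    """word is in L(M) iff delta*(q0, word) meets the final states."""
    return bool(delta_star(delta, q0, word) & final)

# A small example NFA over {a, b} with states 0 and 1:
example = {(0, "a"): {0, 1}, (1, "b"): {1}}
```

With final states {1}, the example automaton accepts ab but not ba.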

Automata may be viewed through the lens of graph theory: the automaton is a directed graph (digraph) with labeled edges, a state is a vertex, and a transition is an edge. We introduce this point of view now and return to it frequently.

Definition 16

A digraph D = (V, E) consists of a set of vertices V and a set of edges E ⊆ V². Let s, t ∈ V and let n ∈ ℤ, n ≥ 0. A walk of length n from s to t is a function Δ : {0, 1, …, n} → V such that Δ(0) = s, Δ(n) = t, and (Δ(k), Δ(k + 1)) ∈ E for each 0 ≤ k < n.

A cycle of length n = |Δ| − 1 in D is a walk from s to s, for some s ∈ V, such that whenever Δ(t_1) = Δ(t_2) with t_1 ≠ t_2, we have {t_1, t_2} = {0, n}. Two cycles are disjoint if their ranges are disjoint.
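For concreteness, the walk and cycle conditions of Definition 16 can be checked directly. This is an illustrative sketch, with a walk Δ encoded as the list of its values.

```python
def is_walk(E, seq):
    """seq = [v0, ..., vn] is a walk iff each consecutive pair is an edge."""
    return all((seq[k], seq[k + 1]) in E for k in range(len(seq) - 1))

def is_cycle(E, seq):
    """A cycle: a walk from s back to s whose only repeated values
    occur at the two endpoints 0 and n."""
    return (is_walk(E, seq) and seq[0] == seq[-1]
            and len(set(seq[:-1])) == len(seq) - 1)
```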

Equation (1) may be viewed as a closure property: if (q, σ, r) ∈ δ* then (q, σ :; i, δ(r, i)) ∈ δ*. Formally, then, δ* is the intersection of all sets that contain δ (viewing symbols as length-1 strings) and are closed under this closure property. Using the fact that ℕ is well-ordered and is the range of the length function on strings, we can define G(n) to be δ* restricted to words of length n, and then G(n) is defined from G ↾ n by G(n) = F(n, G ↾ n), where

F(n, g) = δ, if n = 0;
F(n, g) = {(q, σ :; i, δ(r, i)) ∣ (q, σ, r) ∈ g(n − 1)}, if n > 0 and n − 1 ∈ dom(g);
F(n, g) = ∅, if n > 0 and n − 1 ∉ dom(g).

Note that (G ↾ n)(n − 1) = G(n − 1), so that the third clause does not occur in our application. Since G ↾ 0 is the empty function, we need to carve out the special case n = 0.

Theorem 17 is the special case for the wellorder (ℕ, <) of Schimmerling [17, Theorem 3.8]. We apply it with B = {f ∣ f ⊆ Q × Σ^* × Q}.

Theorem 17

Let B be a set and let P be the set of all partial functions from ℕ to B. For each F : ℕ × P → B, there is a unique function G : ℕ → B such that

G(n) = F(n, G ↾ n)

for all n ∈ ℕ.

Remark 18

Sipser, 2nd edition [20], does not introduce δ* at all but discusses everything in terms of a sequence of values of δ. Shallit [18], and Hopcroft and Ullman [5], introduce δ* but do not give a justification for its existence.

Definition 19

Let Σ be an alphabet. Let DFA_Σ denote the class of all DFAs over Σ and let partDFA_Σ denote the class of all partial DFAs over Σ. If a DFA M over Σ has the property that it accepts a string x, but no other strings of length |x|, i.e.,

L(M) ∩ Σ^{|x|} = {x},

then we say that M accepts x uniquely. Let x ∈ Σ^* with |x| = n. Define A(x, Σ), the automatic complexity of x (with respect to Σ), to be the least number of states in any M ∈ DFA_Σ such that M accepts x uniquely.

If we consider x to be a function x : [n] → Σ, then x ∈ (Σ′)^* where Σ′ = range(x). We define Ã(x) = A(x, range(x)).

Similarly, we define A⁻(x, Σ), the partial automatic complexity of x, to be the least number of states in any M ∈ partDFA_Σ such that L(M) ∩ Σ^n = {x}.

In most cases below, if we write A(x) we intend Ã(x).
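For very short words, A(x, Σ) from Definition 19 can be computed by exhaustive search. The sketch below (an illustration only; exponential in the number of states) uses the observation that a complete DFA accepts x uniquely for some choice of final states iff the state reached on x differs from the state reached on every other word of length |x| — one may then take F to consist of that single state. Since A(x) ≤ |x| + 1 (a chain of |x| + 1 states, with all off-path transitions returning to the start, accepts x uniquely), the search terminates.

```python
from itertools import product

def automatic_complexity(x, alphabet):
    """Brute-force A(x, alphabet): the least m such that some complete
    DFA with m states (state 0 initial) sends x to a state reached by
    no other word of length |x|."""
    n, k = len(x), len(alphabet)
    others = [w for w in map("".join, product(alphabet, repeat=n)) if w != x]

    def run(trans, w):
        q = 0  # state 0 is the initial state
        for c in w:
            q = trans[q * k + alphabet.index(c)]
        return q

    for m in range(1, n + 2):  # A(x) <= |x| + 1, so this terminates
        for trans in product(range(m), repeat=m * k):
            target = run(trans, x)
            if all(run(trans, w) != target for w in others):
                return m
```

For example, over the binary alphabet a one-state DFA accepts every word of each length, so any word of length 1 already has complexity 2, in line with Theorem 20.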

Theorem 20

Let x be a word in a finite alphabet Σ. We have A⁻(x) ≤ A(x). If |x| = 1 and |Σ| > 1 then A⁻(x) = 1 < 2 = A(x).

Proof
Definition 21 [7, 19]

Let L(M) be the language recognized by the automaton M. Let w be a finite word. The (unique-acceptance) nondeterministic automatic complexity A_N(w) = A_{Nu}(w) of w is the minimum number of states of an NFA M such that M accepts w and the number of walks along which M accepts words of length |w| is 1.

The exact-acceptance nondeterministic automatic complexity A_{Ne}(w) of w is the minimum number of states of an NFA M such that M accepts w and L(M) ∩ Σ^{|w|} = {w}.
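The unique-walk condition in Definition 21 can be verified for a given NFA by dynamic programming: count, for each length, the number of labeled walks from the start state ending at each state. Below is a sketch (an illustration; the dictionary encoding of δ is an assumption of this example).

```python
def accepting_walks(delta, q0, final, n):
    """Number of length-n walks from q0 ending in a final state, where
    each step follows one labeled transition; delta maps
    (state, symbol) to a set of successor states."""
    count = {q0: 1}  # walks of the current length, keyed by end state
    for _ in range(n):
        nxt = {}
        for (q, _symbol), targets in delta.items():
            if q in count:
                for r in targets:
                    nxt[r] = nxt.get(r, 0) + count[q]
        count = nxt
    return sum(c for q, c in count.items() if q in final)

# The "line" NFA reading the word 010, with one transition per letter:
line = {(0, "0"): {1}, (1, "1"): {2}, (2, "0"): {3}}
```

The NFA line accepts 010 along exactly one walk of length 3, witnessing A_N(010) ≤ 4.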