This post continues the discussion on the Kleene theorem from here. The previous post gave a notion of regular expressions that had the same expressive power as nondeterministic automata. The idea was that a star corresponds to a graph where the set of vertices had one orbit, and where the edges where labelled by simpler regular expressions. In the case without atoms, such a graph is necessarily a self loop, which corresponds to the standard Kleene star. This post is about a stronger requirement: instead of saying that the graph has one orbit of vertices, we say that it has one orbit of edges.
For instance, in the equality atoms, if the vertices are atoms, then the previous post would allow a graph where we have edges for both equality and disequality, while the stronger requirement in this post will only allow one kind of edge.
The more restricted regular expressions are still expressively equivalent to the less restricted ones; but the proof is only in the equality atoms and the result is likely false in other atoms.
Definition (restricted atomic star). Define a restricted atomic star to be the special case of an atomic star (see here), but with the additional requirement that the edge relation is has one orbit as well (which implies that the vertex set has one orbit).
Definition (restricted atomic regular expression). A restricted atomic regular expression is one which uses the restricted atomic star instead of the usual one.
Theorem. For the equality atoms, atomic regular expressions are the same as restricted atomic regular expressions, and therefore the same as nondeterministic automata.
The rest of this post is devoted to proving the theorem. For a set and elements
, define
to be the set of sequences of pairs such that the source of the first pair is , the target of the last pair is
, and the target of each pair (except the last) is the source of the next pair. For the purpose of this proof, we say that a set
is simple if for every
, the set
is definable by a restricted regular expression.
To prove the theorem, it suffices to show that every orbit-finite set is simple. To define the language recognized by a nondeterministic automaton with states
, one can take the expression for
substitute each letter
by the set of labels that can be read when going from state
to state
. The rest of the post is devoted to showing that all orbit-finite sets are simple.
Lemma 1. Simple sets are closed under finite unions and surjective images of finitely supported functions.
Proof. For finite unions, the proof is as in the standard Kleene theorem, with some orbit finiteness involved, in other words the same as here. For images, one observes that if is a finitely supported function with domain
, then
By the above lemma, it suffices to show that for every , the set
, namely the set of nonrepeating tuples of
atoms, is simple. This is because every orbit finite set is a finite union of images of such sets under finitely supported functions. We prove that
is simple by induction on
.
In the rest of the proof, when , we use the expression path from
to
to refer to an element of
. We say that a path is nonforgetting if every vertex visited by the shares at least one atom with the source vertex. A path is called forgetting otherwise, i.e. its target does have any atoms in common with its source.
Lemma 2 . For every , the set of nonforgetting paths from
to
can be defined by a restricted atomic regular expression.
Proof. Let be the set of atom tuples which contain at least one atom in from
. The set in the statement of the lemma is
. It is not difficult to see that
is a finite union of images of the set
. Therefore, the result follows from 1 and the induction assumption.
A path is called minimal forgetting if it is forgetting, but after removing its last edge it is no longer forgetting. From lemma 2 it follows that the minimal forgetting paths from to
can be described by a restricted atomic regular expression.
An arbitrary path from
to
is either nonforgetting, or it can be decomposed as a finite concatenation of forgetting paths. In the latter case, this means that there is a decomposition
and a sequence of atom tuples
such that for every , the tuples
and
share no atoms in common, and the path
is a minimal forgetting path from
to
. The key property is that all of the pairs
are in the same orbit. Therefore, the paths that decompose into a forgetting paths can be described by using the restricted atomic star.
This completes the proof of the theorem.
Question. Does the theorem hold for other kinds of atoms?
From discussions with Sławek, it seems that the answer is no. For instance, the theorem seems to fail in the universal undirected graph, or even in . In both cases, a difficult language seems to be simply all possible sequences of atoms. What breaks down in the proof is the key property at the end, in atoms other than the equality atoms, there can be more than one way in which disjoint tuples of atoms can be related.