Alternation Hierarchies of First Order Logic with Regular Predicates

Abstract. We investigate the decidability of the definability problem for fragments of first order logic over finite words enriched with regular numerical predicates. In this paper, we focus on the quantifier alternation hierarchies of first order logic. We obtain that deciding this problem for each level of the alternation hierarchy of both first order logic and its two-variable fragment when equipped with all regular numerical predicates is not harder than deciding it for the corresponding level equipped with only the linear order. Relying on some recent results, this proves the decidability for each level of the alternation hierarchy of the two-variable first order fragment, while in the case of first order logic the question remains open for levels greater than two. The main ingredients of the proofs are syntactic transformations of first-order formulas as well as the infinitely testable property, a new algebraic notion on varieties that we define.


Introduction
The equivalence between regular languages and automata as well as monadic second order logic [3] and finite monoids [14] was the start of a domain of research that is still active today. In this article, we are interested in the logic on finite words, and more precisely the question we address is the definability problem for fragments of logic. Fragments of logic are defined as sets of monadic second order formulas satisfying some restrictions, and are equipped with a set of predicates called a signature. Then the definability problem of a fragment of logic F consists in deciding if a regular language can be defined by a formula of F. This question has already been considered and solved in many cases where the signature contains only the predicate <, which denotes the linear order over the positions of the word. For instance, a celebrated result by Schützenberger [19] and McNaughton and Papert [13] gave an effective algebraic characterization of languages definable by first order formulas. The decidability has often been achieved through algebraic means, showing a deep connection between algebraic and logical properties of a given regular language. In this article, we follow this approach.
We investigate the question of the behaviour of the decidability of some fragments when their signature is enriched with regular numerical predicates. These predicates are exactly the formulas of monadic second order logic without letter predicates. Intuitively they correspond to the maximal class of numerical predicates that can enrich the signature of a fragment of MSO, while keeping the definable languages regular. This question was already considered in the case of first order logic (FO) in [2] and one of its fragments: the formulas without quantifier alternation in [15].
The enrichment by regular numerical predicates arose in the context of Straubing's conjectures [23]. Roughly speaking, these conjectures state that deciding the definability of a regular language in a fragment of enriched logic corresponds to deciding its circuit complexity. It is known [15,23] that an enrichment of the classical fragments by regular numerical predicates is equivalent to an enrichment by the signature [<, +1, MOD], where +1 denotes the local predicates and MOD the modular predicates. A first step toward the study of fragments of logic with these predicates was taken by Straubing [22]. He obtained that adding the local predicates preserves the decidability for a large number of fragments. As a corollary of this work, Straubing obtained that the decidability of the alternation hierarchy of first order logic (BΣ_k) equipped with [<, +1] reduces to the decidability of the simpler one equipped with [<]. More recently, Kufleitner and Lauser [11] proved the decidability of the alternation hierarchy of the two-variable first order fragment (FO^2_k) equipped with [<, +1] by using the recent results [10,12] on the decidability of this hierarchy with [<].
In this context, the case of modular predicates is poorly understood. The study of this enrichment was first considered for first order logic in [2], and was extended to the first level of its alternation hierarchy with the successor predicate in [15], and later without it in [4]. The enrichment by a finite set of modular predicates was considered in [8]. Finally, the authors provided a characterization of the two-variable first order logic over the signature [<, MOD] in [6].
In this paper, we focus on the enrichment by all regular predicates and leave aside the question of the signature [<, MOD], which surprisingly turns out to be more intricate. The fragments we consider here are the quantifier alternation hierarchies of first order logic and of its two-variable counterpart. Our main result states that for both of these hierarchies, the decidability of each level equipped with regular numerical predicates reduces to the decidability of the same level with the signature [<, +1]. Then by using the recent decidability result of Kufleitner and Lauser [11], as well as the decidability of BΣ_2[<] by Place and Zeitoun [18], we deduce that the fragments FO^2_k[Reg], for any positive k, and BΣ_2[Reg] are decidable. Our main contributions are summarized in the next table.
Proof Methods. The proofs of the main results can be decomposed into two major steps. The first part is rather classical and shows that in the cases we consider, adding a finite number of modular predicates does not affect the decidability. The second part is dedicated to finding a systematic way to select, for a given regular language and a fragment, a finite number of modular predicates that can serve as a witness of its definability. This is done through a heavy use of the algebraic framework of varieties of semigroups. We introduce a new notion for varieties of semigroups that we call the infinitely testable property and show that this property is satisfied by the considered fragments. We then conclude by proving that this property allows us to find such a witness set of modular predicates that only depends on the input language.
Generalizations. While we focus in this article on the levels of the quantifier alternation hierarchies, our approach can be generalized to other fragments under certain conditions. The generality of our results is discussed in Remarks 4, 6, 10 and 12.
Organization of the Paper. Section 2 defines the logical and algebraic notions that will be used in the paper. The main results of the paper are presented in Sect. 3. Sections 4 and 5 are then dedicated to the proofs. Section 4 first discusses adding a finite number of predicates and reduces our decidability problems to a delay question, which can be summarized as being able to choose the proper finite set of modular predicates. Then Sect. 5 defines a new notion, the infinitely testable property, which is satisfied by the fragments that we consider and thereby gives a delay. Finally, we discuss in Sect. 5 some other results that can be directly obtained from our approach, as well as a related algebraic characterization of the two-variable first order logic with the regular numerical signature.

Preliminaries
Logic. We consider the monadic second order logic on finite words MSO[<] as usual (see [23] for example). We denote by A an alphabet and by a a letter of A. A word u over an alphabet A is a set of labelled positions ordered from 0 to |u| − 1. The set of words over A is denoted A^* and a subset L of A^* is called a language. We also denote by A^+ the set of non-empty words. A language is said to be defined by a formula if it corresponds exactly to the set of words that satisfy this formula. It is said to be regular if it is defined by an MSO[<] formula. When syntactic restrictions are applied to MSO[<], one defines fragments of logic that characterize subclasses of regular languages. The most well-known fragment is probably first order logic, whose expressive power was characterized thanks to the results of [13,19]. First order logic itself gave birth to its own zoo of fragments. These were defined using syntactic restrictions such as limiting the number of variables, or by enrichment of its signature. A fragment F with signature σ will be denoted F[σ] and will refer to the formulas as well as the class of languages it defines. We first define the different signatures that will appear throughout this paper, and then formally define the fragments that are considered here: the quantifier alternation hierarchies.
Signatures. We are interested in regular numerical predicates, which are numerical predicates that can only define regular languages. Simultaneously, Straubing [23] and Péladeau [15] defined three sets of regular numerical predicates that can be used as a base for all the regular numerical predicates. The first set is the singleton order {<} which is a binary predicate corresponding to the natural order on the positions of the input word. The second set is {min, max, S} and is called the local predicates. It is usually denoted +1. The predicates min and max are unary predicates that are satisfied respectively on the first and last positions. The predicate S, the successor, is a binary predicate satisfied if the second variable quantifies the successor of the first one.
Finally, we define, for each positive integer d, the modular predicates on d, denoted MOD^d, as the set, for i < d, of predicates MOD^d_i(x), which are unary predicates satisfied if the position quantified by x is congruent to i modulo d, and the predicates D^d_i, which are constants holding if the length of the input word is congruent to i mod d. We denote by MOD the union of the classes MOD^d, for any positive d.
Example 1. The language (A^2)^* a A^* is defined by the formula: ∃x a(x) ∧ MOD^2_0(x).
The signatures that we will consider for our fragments are unions of these three sets of regular numerical predicates, and will always contain the letter predicates. Abusing notation, we will also write Reg = {<} ∪ +1 ∪ MOD.
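The semantics of the modular predicates can be checked mechanically on small words. The following is a minimal sketch, assuming the two-letter alphabet A = {a, b}; the function names and the regular expression used as a reference description of (A^2)^* a A^* are illustrative choices, not notation from the paper.

```python
import re
from itertools import product

def mod_pred(d, i, x):
    """MOD^d_i(x): position x is congruent to i modulo d (positions start at 0)."""
    return x % d == i

def length_pred(d, i, word):
    """D^d_i: the length of the word is congruent to i modulo d."""
    return len(word) % d == i

def satisfies_example_1(word):
    """Exists x such that a(x) and MOD^2_0(x)."""
    return any(word[x] == "a" and mod_pred(2, 0, x) for x in range(len(word)))

# Reference description of (A^2)^* a A^* over the assumed alphabet {a, b}.
REFERENCE = re.compile(r"(?:[ab]{2})*a[ab]*")

def agree_up_to(max_len):
    """Compare the formula and the reference on every word of length <= max_len."""
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            if satisfies_example_1(w) != bool(REFERENCE.fullmatch(w)):
                return False
    return True
```

An exhaustive comparison of this kind is only a sanity check on short words, not a proof of equivalence.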

Fragments and Alternation Hierarchies. While MSO[Reg] = MSO[<], the equality does not hold for subclasses of MSO. For a signature σ, we denote by FO[σ] the class of first order formulas whose predicates belong to σ. Since the local predicates can be expressed in FO[<], the fragments FO[<] and FO[<, +1] define the same class of languages, called the Star-Free languages [13]. On the other hand, the fragment FO[<, MOD] is strictly more expressive [2].
The fragment FO^2 is the subclass of formulas of FO using only two variable symbols, which can be reused (see Example 2). Here, the class of languages defined by FO^2[<] is strictly contained in those defined by FO^2[<, +1] and FO^2[<, MOD] (see [6,25]).

Example 2. The language A^* a A^* b A^* a A^* can be described by the first order formula ∃x∃y∃z x < y < z ∧ a(x) ∧ b(y) ∧ a(z). This formula uses three variables x, y and z. However, by reusing x we get an equivalent formula that uses only two variables:
∃x a(x) ∧ (∃y x < y ∧ b(y) ∧ (∃x y < x ∧ a(x))). (a)
Now given a first order formula, one can compute a prenex normal form using De Morgan's laws. We define the quantifier alternation of a formula as the number of blocks of quantifiers ∀ and ∃ in its prenex normal form. For example, the formula ∃x∃y∀z x < y ∧ a(x) ∧ a(y) ∧ (x < z < y → c(z)) has a quantifier alternation of 2. It describes the language A^* a c^* a A^*. Then given a signature σ and a positive integer k, we denote by BΣ_k[σ] the set of prenex normal formulas of FO[σ] whose quantifier alternation is at most k. They form the levels of the quantifier alternation hierarchy over FO[σ].
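The effect of reusing a variable can be illustrated by evaluating both formulas by brute force. The sketch below assumes the usual two-variable rewriting ∃x a(x) ∧ (∃y x < y ∧ b(y) ∧ (∃x y < x ∧ a(x))), in which the inner ∃x rebinds x; the function names and the three-letter test alphabet are illustrative.

```python
from itertools import product

def three_variable(w):
    """Exists x, y, z with x < y < z, a(x), b(y) and a(z)."""
    n = len(w)
    return any(
        w[x] == "a" and w[y] == "b" and w[z] == "a"
        for x in range(n)
        for y in range(x + 1, n)
        for z in range(y + 1, n)
    )

def two_variable(w):
    """Exists x (a(x) and exists y (x < y and b(y) and exists x (y < x and a(x))))."""
    n = len(w)

    def inner_x(y):
        # Innermost quantifier: the variable x is rebound, only y is still free.
        return any(w[x] == "a" for x in range(y + 1, n))

    def inner_y(x):
        # Middle quantifier: y is bound, the outer x is free.
        return any(w[y] == "b" and inner_x(y) for y in range(x + 1, n))

    return any(w[x] == "a" and inner_y(x) for x in range(n))

def equivalent_up_to(max_len, alphabet="abc"):
    """Check that both formulas agree on every word of length <= max_len."""
    return all(
        three_variable(w) == two_variable(w)
        for n in range(max_len + 1)
        for w in product(alphabet, repeat=n)
    )
```

Each nested helper has a single free variable, which mirrors the fact that at any point of the two-variable formula only x and y are in scope.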
When σ is reduced to {<}, this hierarchy is called the Straubing-Thérien hierarchy [21,24]. Only the first [20] and second [18] levels are known to be decidable. For σ = {<} ∪ +1, this hierarchy is called the Dot-Depth hierarchy [5]. The decidability of each level reduces to the decidability of the corresponding level of the Straubing-Thérien hierarchy [22]. In both cases, the hierarchies are known to be strict, and cover all Star-Free languages. In this article, we also consider the alternation hierarchy of FO^2. To define formally the number of alternations of a formula, we cannot rely on the prenex normal form since the construction increases the number of variables. In particular, remark that FO^2 [7]. That said, the number of alternations is still a relevant parameter that can be defined as follows. Consider the parse tree naturally associated with a formula. For instance, (a) has ∃ as a root and the atomic formulas as the leaves. In a two-variable first order formula we count the maximal number of alternations appearing on a branch, i.e. between the root and a leaf, once the negations have been pushed down to the leaves. A more precise definition can be found in [28]. We denote by FO^2_k[σ] the formulas of FO^2[σ] that have at most k − 1 quantifier alternations. The hierarchy induced by FO^2_k[<] is known to be strict [28] and its definability problem is decidable [10,12]. Note that the hierarchy FO^2_k[<, +1] is also known to be decidable [11].
Algebra. We quickly present here the fundamental notions used by the proofs of the article (mainly Sect. 5) and refer the reader to [17] for a detailed approach. A (finite) semigroup is a finite set equipped with an associative internal law. A semigroup with a neutral element for this law is called a monoid. Recall that a semigroup S divides another semigroup T if S is a quotient of a subsemigroup of T. This defines a partial order on finite semigroups. Given a finite semigroup S, an element e of S is idempotent if ee = e. We denote by E(S) the set of idempotents of S. For any element x of S, there exists a positive integer n such that x^n is idempotent. We call this element the idempotent power of x and denote it by x^ω. One can check that the map x → x^ω is well defined.
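The idempotent power can be computed by walking through the successive powers of x until an idempotent is met. The following is a minimal sketch in which a semigroup is represented by a Python function `mul` implementing its law; this representation and the toy example (the integers modulo 6 under multiplication) are assumptions made for illustration.

```python
def idempotent_power(x, mul):
    """Return x^omega, the unique idempotent among the powers x, x^2, x^3, ..."""
    power = x
    seen = set()
    while power not in seen:
        if mul(power, power) == power:
            return power
        seen.add(power)
        power = mul(power, x)
    # In a finite semigroup the cycle of powers always contains an idempotent,
    # so this point is reached only if mul is not an associative finite law.
    raise ValueError("no idempotent power found")

def idempotents(elements, mul):
    """E(S): the set of elements e such that ee = e."""
    return {e for e in elements if mul(e, e) == e}

def mul_mod_6(x, y):
    """Toy semigroup law: multiplication of integers modulo 6."""
    return (x * y) % 6
```

For instance, the powers of 2 modulo 6 are 2, 4, 2, 4, ... and 4 is the only idempotent among them.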
A semigroup S recognizes a language L over an alphabet A via a morphism η : A^+ → S. Given a regular language L, we can compute its syntactic semigroup as the smallest semigroup that recognizes L, in the sense of division. For a morphism η : A^+ → S, the set η(A) is an element of the powerset semigroup of S. As such it has an idempotent power. The stability index of a morphism η is then defined as the smallest positive integer s such that η(A^s) = η(A^{2s}).
Remark that η(A^s) forms a subsemigroup of S, which we call the stable semigroup. A subset T of S is an ideal if the sets TS and ST are both included in T. A (pseudo-)variety of semigroups is a non-empty class of finite semigroups closed under division and finite product.
A fragment of logic is characterized by a variety if they recognize the same languages. By extension, a variety V will also refer to the class of languages it recognizes. The most famous example is the equality FO[<] = A [13,19], where A denotes the class of aperiodic semigroups, which are finite semigroups that are not divided by any non-trivial group. As for FO[<], the definability problem for a fragment of logic has often been solved thanks to an algebraic characterization ([20,24,25] for example). This decidability is sometimes obtained through profinite equations. For example, the variety of aperiodic semigroups A is defined by the equation x^{ω+1} = x^ω.
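The stability index can be computed directly from its definition by iterating the sets η(A^k). The sketch below assumes that the morphism is given by the list of images of the letters and that the law of S is a Python function `mul`; both are illustrative conventions rather than the paper's.

```python
def stability_index(letter_images, mul):
    """Return (s, eta(A^s)) for the smallest s >= 1 with eta(A^s) == eta(A^(2s)).

    letter_images is the collection eta(A); mul is the law of S.
    """
    base = frozenset(letter_images)
    layers = [None, base]            # layers[k] == eta(A^k) for k >= 1
    s = 1
    while True:
        # Extend the table until eta(A^(2s)) is available.
        while len(layers) <= 2 * s:
            previous = layers[-1]
            layers.append(frozenset(mul(x, y) for x in previous for y in base))
        if layers[s] == layers[2 * s]:
            return s, layers[s]
        s += 1

def add_mod(d):
    """Law of the cyclic group Z_d, used here to count lengths modulo d."""
    return lambda x, y: (x + y) % d
```

With every letter mapped to 1 in Z_d, the set η(A^k) is the singleton {k mod d}, so the search stops at s = d and the stable semigroup is {0}.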
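The equation x^{ω+1} = x^ω gives an effective membership test for the variety of aperiodic semigroups. The following is a minimal sketch, assuming a finite semigroup given by an iterable of elements and a Python function `mul` for its law; the two sample laws are illustrative.

```python
def is_aperiodic(elements, mul):
    """Check x^(omega+1) == x^omega for every element x of a finite semigroup."""
    for x in elements:
        power = x
        # Walk through x, x^2, x^3, ... until the idempotent power is reached.
        while mul(power, power) != power:
            power = mul(power, x)
        # power is now x^omega; multiply once more by x to get x^(omega+1).
        if mul(power, x) != power:
            return False
    return True

def add_mod_2(x, y):
    """The group Z_2, which is not aperiodic."""
    return (x + y) % 2

def saturating_add(x, y):
    """Addition on {1, 2, 3} capped at 3, an aperiodic semigroup."""
    return min(x + y, 3)
```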

Main Results
We present here the main results of this paper, which are reductions of decidability from any level of the first order hierarchies equipped with the complete regular signature to the corresponding level whose signature is reduced to the order. As the decidability of each level of the two-variable hierarchy is known, we get a decidability result. But as the decidability of the Straubing-Thérien hierarchy, and consequently that of the Dot-Depth hierarchy since the two problems are equivalent, is still open for any level greater than 2, we only get a transfer result in that case.
Remark 4. This approach could be applied to any abstract fragment characterized by a variety and expressive enough to contain the languages (ab)^+ and A^* a. At this level of abstraction, the operation of adding modular predicates corresponds to a wreath product by modular morphisms. However, for the sake of concise presentation, we focus on what we assume to be the most interesting corollaries of this approach: the alternation hierarchies with successor. This method can also be generalized to varieties that do not contain (ab)^+ and is therefore not dependent on the presence of the successor relation. However, this requires introducing the more involved framework of finite categories [27]. In this context, the infinitely testable property of a variety of semigroups, which is the key ingredient of the proof, lifts to the associated variety of semigroupoids.
[22], that reduces the decidability of BΣ_k[<, +1] to the decidability of BΣ_k[<], and the result of Kufleitner and Lauser [11] that proves the decidability of FO^2_k[<, +1]. The main issue is therefore to prove the first reduction. In order to obtain it, we decompose the proof into two important steps. The first one proves that adding a finite number of modular predicates preserves decidability, while the second one allows us to compute such a finite set that serves as a witness for a language to belong to the fragment.
While the first step is quite standard, the second introduces a new notion, the infinitely testable property, which allows us to solve the delay question for the fragments we consider.

The Delay Question
The objective of this section is to reduce the decidability question to another question, the delay question. Informally, the delay question asks which modular predicates a formula of the fragment would use to describe the input language. First, we deal with adding the modular predicates ranging over one specific congruence. The idea is to reduce the decidability of a partially enriched fragment to that of the input fragment. As in [6], this is done by transferring the modular information to an enriched alphabet. For any positive integer d, we denote by A_d = A × Z_d the enriched alphabet of A and by π_d : A_d^+ → A^+ the projection on the first component. To link this enrichment to the modular information, we also define the well-formed words K_d as the language of words (a_0, i_0) ... (a_n, i_n) such that for any 0 ≤ j ≤ n, i_j = j mod d. Finally, given a language L, we denote L_d = π_d^{-1}(L) ∩ K_d. The following proposition proves the reduction from the partially enriched fragment to the initial one by translating formulas for one language into formulas for the other.
Remark 6. Even if the previous proposition is only stated for the fragments considered in this article, its applications range over many more fragments. Indeed, it would hold for any expressive enough fragment, i.e. any fragment that can define the set of well-formed words over the enriched alphabet and that satisfies some closure properties.
Now that we have handled the addition of predicates for a single congruence, we make the following easy remark. The denomination stems from the Delay Theorem of [22], which solves a similar question for the enrichment by the successor predicate.
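The enriched alphabet construction can be made concrete as follows. This is a minimal sketch, assuming that words are Python strings, that letters of A × Z_d are pairs, and that the language L is given as a membership predicate `in_L`; all function names are illustrative.

```python
def enrich(word, d):
    """The unique well-formed word of K_d whose projection is `word`."""
    return [(a, j % d) for j, a in enumerate(word)]

def project(enriched):
    """pi_d: forget the second component of every letter."""
    return "".join(a for a, _ in enriched)

def is_well_formed(enriched, d):
    """Membership in K_d: the j-th letter carries the value j mod d."""
    return all(i == j % d for j, (_, i) in enumerate(enriched))

def in_L_d(enriched, d, in_L):
    """Membership in L_d, the intersection of pi_d^{-1}(L) and K_d."""
    return is_well_formed(enriched, d) and in_L(project(enriched))

def example_1(word):
    """The language of the formula 'exists x, a(x) and MOD^2_0(x)'."""
    return any(word[x] == "a" and x % 2 == 0 for x in range(len(word)))

def example_1_enriched(enriched):
    """Over A x Z_2, the same property is a plain letter test."""
    return any(letter == ("a", 0) for letter in enriched)
```

On well-formed words, the modular predicate MOD^2_0 is absorbed by the letter (a, 0), which is the sense in which the modular information is transferred to the alphabet.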

The Infinitely Testable Property
In this section, we conclude the proof of the main theorem by solving the delay question for the fragments we consider. We do so via an algebraic property on varieties that is satisfied by their characterizations. This property, which we call the infinitely testable property, is a new notion that we introduce and which is defined below. Informally, a variety is infinitely testable if the membership of a language in the variety only depends on sufficiently long words.
Definition. Given a semigroup S, the idempotents' ideal of S, denoted I_E(S), is the ideal of S generated by its idempotents. We then have I_E(S) = S E(S) S, where E(S) denotes the set of idempotents of S. Note also that given a morphism η : A^+ → S, it is the semigroup of all elements of S having an infinite number of preimages by η. An attentive reader may notice that I_E(S) is the set of all elements of S that are J-below an idempotent. A variety of semigroups V is said to be infinitely testable if the membership of a semigroup in V is equivalent to the membership of its idempotents' ideal in V. Informally, a variety is infinitely testable if its membership can be reduced to an algebraic condition on the idempotents' ideal. By extension, we say that a fragment of logic is infinitely testable if it is characterized by an infinitely testable variety.
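The idempotents' ideal can be computed directly from the identity I_E(S) = S E(S) S. The sketch below assumes a finite semigroup given by an iterable of elements and a Python function `mul`; the example, addition on {1, 2, 3} capped at 3, is an illustrative choice and corresponds to the morphism from a^+ sending a^n to min(n, 3).

```python
def idempotents_ideal(elements, mul):
    """Return I_E(S) as the set of products s * e * t with e idempotent."""
    elements = list(elements)
    idem = [e for e in elements if mul(e, e) == e]
    return {mul(mul(s, e), t) for s in elements for e in idem for t in elements}

def saturating_add(x, y):
    """Addition on {1, 2, 3} capped at 3."""
    return min(x + y, 3)

def mul_mod_6(x, y):
    """Multiplication of integers modulo 6, a monoid with identity 1."""
    return (x * y) % 6
```

In the capped example only the element 3 has infinitely many preimages (all a^n with n ≥ 3), and the computed ideal is {3}, in line with the remark on preimages above. When S contains an identity, as for multiplication modulo 6, the ideal is the whole of S.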

Example 7. The fragment FO[=] is equivalent to the aperiodic and commutative variety ACom. This fragment is also described by the equations xy = yx and x^{ω+1} = x^ω. This fragment is not infinitely testable. For instance, the syntactic semigroup of the singleton language {ab} has a trivial idempotents' ideal, while {ab} is not definable in FO[=].

Example 8. The fragment FO[+1] is equivalent to the languages whose syntactic semigroup belongs to the variety ACom * LI [23, Theorem VI.3.1]. This fragment is also described by the profinite equation
x^ω u y^ω v x^ω w y^ω = x^ω w y^ω v x^ω u y^ω. (b)
We now show that it is an infinitely testable fragment. Let L be a regular language and S its syntactic semigroup. We simply prove that if Eq. (b) is not satisfied by S, then it is not satisfied by I_E(S). Suppose that there exist x, y, u, v, w ∈ S such that Eq. (b) is not satisfied. Set x' = x^ω, y' = y^ω, u' = x^ω u y^ω, v' = y^ω v x^ω and w' = x^ω w y^ω. All the new variables belong to I_E(S) and they also fail to satisfy (b).
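The last step of Example 8 can be spelled out as follows. Write x', y', u', v', w' for the substituted elements x^ω, y^ω, x^ω u y^ω, y^ω v x^ω and x^ω w y^ω (the primes are notation introduced here). Since x^ω and y^ω are idempotent, x'^ω = x^ω and y'^ω = y^ω, and repeated factors collapse:

```latex
\begin{align*}
x'^{\omega} u' y'^{\omega} v' x'^{\omega} w' y'^{\omega}
  &= x^{\omega}\,(x^{\omega} u y^{\omega})\,y^{\omega}\,(y^{\omega} v x^{\omega})\,x^{\omega}\,(x^{\omega} w y^{\omega})\,y^{\omega}
   = x^{\omega} u y^{\omega} v x^{\omega} w y^{\omega},\\
x'^{\omega} w' y'^{\omega} v' x'^{\omega} u' y'^{\omega}
  &= x^{\omega}\,(x^{\omega} w y^{\omega})\,y^{\omega}\,(y^{\omega} v x^{\omega})\,x^{\omega}\,(x^{\omega} u y^{\omega})\,y^{\omega}
   = x^{\omega} w y^{\omega} v x^{\omega} u y^{\omega}.
\end{align*}
```

Both sides of (b) therefore take the same values on the primed elements as on the original ones, so a failure of (b) in S yields a failure of (b) inside I_E(S).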
Infinitely Testable Fragments. The infinitely testable property of the levels of the FO^2[<, +1] hierarchy is proved using the equational characterization obtained in [11], following Example 8. Because of the lack of an equational description for BΣ_k[<, +1], we use a more involved algebraic argument for this latter case.
Remark 10. The infinitely testable property of BΣ_k[<, +1] can be stated in a more general framework. Indeed, in the article of Tilson [27], a version of the delay theorem states that a semigroup belongs to V * LI if, and only if, its idempotents' category belongs to the variety of finite categories generated by V. In this framework of finite categories, the idempotents' category is defined from the semigroup S_E by removing the absorbing element 0. Therefore, one could argue that all varieties of semigroups of the form V * LI are infinitely testable.
Delay Theorem for Quantifier Hierarchies. We reach the key theorem of our presentation. It proves a delay for each level of the quantifier hierarchies over first order logic and its two-variable counterpart. The delay we obtain here is the stability index. Before proving this claim, let us remark that since a variety of semigroups is closed under division, this claim ends the proof. Indeed, if L belongs to F[σ, MOD^{ds}] then S_ds belongs to V and therefore I_E(S_ds) belongs to V as well. By division, I_E(S_s) belongs to V, and thanks to the infinitely testable hypothesis, we have that S_s belongs to V. Finally, we deduce that L_s belongs to F[σ]. We now aim to construct a division from I_E(S_s) to I_E(S_ds). This is done through the enriched alphabet. We introduce the projection h : A_ds^+ → A_s^+, which reduces the second component of each letter modulo s, and F_d the language of well-formed factors, which is the set of well-formed words that do not necessarily start with a letter of the form (a, 0). Note that L_ds = h^{-1}(L_s) ∩ K_ds. Let us remark also that a word not in F_s (resp. F_ds) has an absorbing zero as image by η_s (resp. η_ds). This zero being idempotent, it belongs to I_E(S_s) (resp. I_E(S_ds)). Finally, if two words of F_s have the same image by η_s, then they have the same length modulo s and their first (and consequently last) letters have the same enrichment.
Consider then a non-zero element x of I_E(S_s). We associate with x a word of A_ds^+ as follows. Since x belongs to I_E(S_s), there exists a word u of A_s^+ of length greater than s in the preimage of x. And since η_s(A_s^s) = η_s(A_s^{2s}) by definition of the stability index, for any k > 0 there exists a word v_k of A_s^+ of length greater than ks such that u ≡_L v_k and |u| = |v_k| mod s, since η_s(u) = η_s(v_k). Then for k sufficiently large, there exists a word w in h^{-1}(v_k) such that η_ds(w) belongs to I_E(S_ds). Note that by taking k as a multiple of d, we obtain a word w such that |u| mod s = |w| mod ds. Thus for each non-zero element x ∈ I_E(S_s), we can choose such a word, which we denote w_x. This justifies the definition of the function f : I_E(S_s) → I_E(S_ds) given by f(x) = η_ds(w_x) for x ≠ 0 and f(0) = 0. We conclude by proving that f is an injective morphism, and thus I_E(S_s) is a subsemigroup of I_E(S_ds).
The Map f Is a Morphism. Let x, y ∈ I_E(S_s). We show that f(xy) = f(x)f(y). First, we can assume without loss of generality that x ≠ 0 and y ≠ 0. We remark that since |w_x| mod ds = |h(w_x)| mod s, the concatenated word w_x w_y is well-formed if, and only if, h(w_x)h(w_y) is well-formed too. If xy ≠ 0, then xy has a well-formed preimage and w_x w_y is well-formed. Then as w_xy and w_x w_y are syntactically equivalent with respect to both F_ds and h^{-1}(L_s), we get η_ds(w_xy) = η_ds(w_x w_y) = η_ds(w_x)η_ds(w_y), meaning that f(xy) = f(x)f(y). Now if xy = 0, then either xy has no well-formed preimage or xy is a zero for π_s^{-1}(L). In the latter case, f(x)f(y) = 0 according to the previous point. If xy has no well-formed preimage, then w_x w_y is not well-formed and consequently f(x)f(y) = 0.
The Map f Is Injective. Let x, y ∈ I_E(S_s) be such that x ≠ y. Without loss of generality, we assume that x ≠ 0. Necessarily, there exist p, q ∈ S_s such that pxq ∈ η_s(L_s) if, and only if, pyq ∉ η_s(L_s). Let u and v be words from the preimages of p and q respectively. Then there exist two words u' ∈ h^{-1}(u) ∩ F_ds and v' ∈ h^{-1}(v) ∩ F_ds such that u' w_x v' ∈ L_ds if, and only if, u' w_y v' ∉ L_ds. Therefore, we have f(x) ≠ f(y) and f is injective.
Remark 12. Theorem 11 is only stated for the levels of the quantifier alternation hierarchies that we consider. The main reason for that is that it makes use of Proposition 5 which was also stated for these fragments. Actually, the theorem would hold for any infinitely testable fragment for which we can obtain a result similar to Proposition 5 (see Remark 6).
Discussion. The main result gives the decidability of the alternation hierarchy of FO^2[Reg]. This does not by itself settle the decidability of the full fragment FO^2[Reg]. But one can notice that Proposition 9 proves that FO^2[<, +1] is infinitely testable, and that Proposition 5 holds. Therefore, Theorem 11 gives the decidability of FO^2[Reg] as well. However, we prefer to give an elegant algebraic characterization of this fragment that one could translate into an equational description. This characterization draws a parallel with the characterization FO[Reg] = QA obtained in [2] and extends the characterization FO^2[<, MOD] = QDA obtained by the authors in [6]. A language L belongs to LDA if for any idempotent e of S_L, the monoid e S_L e belongs to DA. It belongs to QLDA if its stable semigroup belongs to LDA.
Theorem 13. FO^2[Reg] = QLDA.

Conclusion
In this paper, we proved that regarding the quantifier alternation hierarchy of the first order and its two-variable counterpart, dealing with all the regular numerical predicates is as difficult as dealing with the order predicate only. We chose a generic algebraic approach which introduced a new notion, the infinitely testable property, and proved that for fragments that are expressive enough, the decidability with enriched signature reduces to the simpler one.
While mainly applied to the levels of the quantifier alternation hierarchies, this approach can be used on other fragments that satisfy the same hypotheses, such as the fragment FO[+1]. This approach appears in fact to be a part of some more generic results that could also be applied to less expressive fragments. These results stem from the more intricate framework of varieties of finite categories, as considered in [4]. In this case, if the delay question is solved, then the decidability of the modular enriched fragment reduces to the decidability of the global of the initial variety. It is possible to adapt the definition of the infinitely testable property to varieties of categories, and to extend equational proofs like the one proposed in Example 8 to prove that this property holds. This generalized approach might provide the decidability of the hierarchy FO^2_k[<, MOD], which is not covered by our results.
An interesting fact is that, despite the different methods used to obtain a delay when adding modular predicates, the stability index has always turned out to be a delay, even in cases not covered by the approach mentioned above. Handling the addition of modular predicates in a general setting therefore seems achievable, but one first has to solve many questions, for example what a good notion of fragment of logic is. Surprisingly, a good case study would be the quite simple fragment FO[=]. Indeed, the global of this fragment is not infinitely testable, and it is unknown whether it admits the stability index as a delay.