Ambiguity, Nondeterminism and State Complexity of Finite Automata
TL;DR Summary
This paper surveys measures of ambiguity and nondeterminism in finite automata, focusing on their impact on state complexity, and reveals how increased ambiguity or nondeterminism can reduce the number of states needed in NFAs.
Abstract
Acta Cybernetica 23 (2017) 141–157. Ambiguity, Nondeterminism and State Complexity of Finite Automata Yo-Sub Han a , Arto Salomaa b , and Kai Salomaa c Abstract The degree of ambiguity counts the number of accepting computations of a nondeterministic finite automaton (NFA) on a given input. Alternatively, the nondeterminism of an NFA can be measured by counting the amount of guessing in a single computation or the number of leaves of the computation tree on a given input. This paper surveys work on the degree of ambiguity and on various nondeterminism measures for finite automata. In particu- lar, we focus on state complexity comparisons between NFAs with quantified ambiguity or nondeterminism. Keywords: finite automata, nondeterminism, degree of ambiguity, state complexity Dedicated to the memory of Zolt´ an ´ Esik (1951–2016). 1 Introduction Finite automata are a fundamental model of computation that has been system- atically studied since the 1950’s. At the same time many important questions on finite automata and regular languages remain open [7, 18, 52]. The last decades have seen much work on the descriptional complexity, or state complexity, of r
Mind Map
In-depth Reading
English Analysis
1. Bibliographic Information
1.1. Title
Ambiguity, Nondeterminism and State Complexity of Finite Automata
1.2. Authors
- Yo-Sub Han: Affiliated with the Basic Science Research Program through NRF funded by MEST (2015R1D1A1A01060097) and the Yonsei University Future-leading Research Initiative of 2016.
- Arto Salomaa
- Kai Salomaa: Supported by the Natural Sciences and Engineering Research Council of Canada Grant OGP0147224.
1.3. Journal/Conference
This paper was published in Acta Cybernetica, Vol. 25, No. 2, pp. 241–266, 2021. Acta Cybernetica is a well-regarded journal in theoretical computer science, focusing on areas such as automata theory, formal languages, and computational complexity.
1.4. Publication Year
2021 (based on confirmed publication in Acta Cybernetica).
1.5. Abstract
The paper surveys existing research on the concepts of ambiguity and nondeterminism in nondeterministic finite automata (NFAs). It defines the degree of ambiguity as the number of accepting computations for a given input string. It also explores various measures of nondeterminism, such as counting the amount of guessing in a single computation or the number of leaves in a computation tree. The primary focus of the survey is to compare the state complexity of NFAs when their ambiguity or nondeterminism is quantified, meaning how the minimum number of states required to recognize a language changes based on these measures.
1.6. Original Source Link
The official source link for the paper is: https://toc.yonsei.ac.kr/~emmous/papers/actacyb_amb_sc.pdf This is an officially published paper in Acta Cybernetica.
2. Executive Summary
2.1. Background & Motivation
The core problem the paper addresses revolves around understanding and quantifying the descriptive power of nondeterministic finite automata (NFAs) compared to deterministic finite automata (DFAs) and other variants. While DFAs and NFAs recognize the same class of languages (regular languages), NFAs can often represent these languages more compactly, requiring fewer states. This succinctness comes from nondeterminism and ambiguity.
The problem is important because finite automata are fundamental models in computation, and understanding their efficiency and complexity is crucial for designing efficient algorithms and systems. Previous research has explored state complexity (the minimum number of states required) for DFAs and NFAs. However, nondeterminism itself can be measured in various ways, and the ambiguity of an NFA (how many ways a string can be accepted) adds another layer of complexity.
Specific challenges or gaps in prior research include systematically comparing state complexity across different degrees of ambiguity and different amounts of nondeterminism, especially when these measures are unbounded or grow as a function of the input length. The paper's entry point is to survey and synthesize the existing work in these areas, highlighting known separation results and identifying open problems for future study.
2.2. Main Contributions / Findings
The paper's primary contributions are:
-
Comprehensive Survey: It provides a thorough overview of research on
degree of ambiguityand variousnondeterminism measuresforfinite automata. -
Focus on State Complexity Comparisons: It systematically reviews the
state complexitytrade-offs betweenNFAswith different quantified levels of ambiguity (e.g., unambiguous, finitely ambiguous, polynomially ambiguous, exponentially ambiguous) and nondeterminism (e.g., finite branching, tree width). -
Highlighting Separation Results: The paper emphasizes key theoretical results demonstrating
super-polynomial separationsinstate complexitybetween different classes ofNFAs. For instance, it notes that:UFAscan be exponentially more succinct thanDFAs.Finitly Ambiguous NFAs (FNFAs)can be exponentially more succinct thanUFAs.Polynomially Ambiguous NFAs (PNFAs)can be super-polynomially more succinct thanFNFAs.- General
NFAs(potentially exponentially ambiguous) can be super-polynomially more succinct thanPNFAs.
-
Quantifying Nondeterminism: It details various measures of nondeterminism, such as
guessing,branching,trace, andtree width, and their relationships. -
Spectrum Result for Finite Nondeterminism: It presents the "spectrum" result by Goldstine et al. [11], showing that different finite amounts of
branchingin anNFAcan lead to incremental savings instate complexity, often with precise bounds. -
Identification of Open Problems: The paper concludes by pointing out significant gaps in current knowledge, particularly regarding
state complexitycomparisons forNFAswhere the amount of nondeterminism is unbounded and measured as a function of input length. It suggests areas for future research, including applyingcommunication complexitytechniques to these unbounded measures and comparingNFAsbased on both ambiguity and nondeterminism.These findings solve the problem of consolidating scattered research, providing a clear landscape of known results, and charting a path for future exploration in the descriptional complexity of
finite automata.
3. Prerequisite Knowledge & Related Work
3.1. Foundational Concepts
To understand this paper, a reader needs a basic grasp of automata theory fundamentals and computational complexity concepts.
- Finite Automata (FA): A mathematical model of computation.
- Deterministic Finite Automaton (DFA): An
FAwhere for each state and input symbol, there is exactly one transition to a next state. This means the computation path for any given input string is unique.- Formal Definition: A
DFAis a tuple , where:- : A finite set of
states. - : A finite set of input symbols, called the
alphabet. - : The
transition function. For a given state and input symbol , gives a unique next state. - : The
initial state. - : The set of
final(oraccepting)states.
- : A finite set of
- Formal Definition: A
- Nondeterministic Finite Automaton (NFA): An
FAwhere for each state and input symbol, there can be zero, one, or multiple transitions to next states. This introduces the possibility of multiple computation paths for a single input string. AnNFAacceptsa string if at least one of its possible computation paths leads to afinal stateafter reading the entire string.- Formal Definition: An
NFAis a tuple , where:- : A finite set of
states. - : A finite set of input symbols, the
alphabet. - : The
transition function. For a given state and input symbol , returns a set of possible next states (which can be empty). denotes the power set of . - : The
initial state. - : The set of
finalstates.
- : A finite set of
- Formal Definition: An
- Deterministic Finite Automaton (DFA): An
- Regular Languages: A class of formal languages that can be recognized by
finite automata(bothDFAsandNFAsrecognize exactly the same class of languages). - State Complexity: For a given
regular language, thestate complexityof is the minimum number of states required for aDFAto recognize . Similarly,nondeterministic state complexityis the minimum number of states required for anNFAto recognize . The paper extends this to compareNFAswith different degrees of ambiguity or nondeterminism. - Ambiguity: In the context of
NFAs,ambiguityrefers to the number of distinctaccepting computationsanNFAcan have for a given input string.- An
unambiguous NFA (UFA)has at most oneaccepting computationfor any input string. - A
finitely ambiguous NFA (FNFA)has a bounded maximum number ofaccepting computationsfor any input string, regardless of its length. - A
polynomially ambiguous NFA (PNFA)has a maximum number ofaccepting computationsthat grows polynomially with the input string's length. - An
exponentially ambiguous NFAhas a maximum number ofaccepting computationsthat grows exponentially with the input string's length.
- An
- Nondeterminism Measures: Different ways to quantify the "amount of guessing" or "choices" an
NFAmakes. These includeguessing,branching,trace, andtree width, which are formally defined in the methodology section. - Regular Expressions: A sequence of characters that define a search pattern, often used to describe
regular languages. There's a relationship betweenregular expressionsandfinite automata. - Communication Complexity: A field in theoretical computer science that studies the amount of communication required between two or more parties to compute a function whose inputs are distributed among them. This technique is mentioned as a powerful tool for proving lower bounds in
state complexity. - Super-polynomially separated: A term used in
state complexityto indicate that simulating a device from one class (e.g.,NFA) with a device from another class (e.g.,PNFA) results in an increase in the number of states that grows faster than any polynomial function. For example, if an -stateNFArequires states in aPNFA, they aresuper-polynomially separated.
3.2. Previous Works
The paper extensively references prior research, forming the backbone of its survey. Key works include:
- Maslov (1970) [34]: One of the earliest works to consider the
state complexityof basic operations onregular languages. This laid the groundwork for quantifying the size offinite automata. - Book et al. (1971) [3]: First systematically considered
ambiguityinregular expressionsandfinite state machines, and the relationship between these concepts. They established fundamental ideas about howambiguityis defined and preserved. - Brüggemann-Klein and Wood (1998) [4]: Introduced the more restrictive notion of
one-unambiguity(or1-determinism) forregular expressions, where theposition automatonis deterministic. This showed that not allregular languagescan be denoted by1-unambiguous regular expressions, contrasting with the fact that all haveunambiguous regular expressions. - Ravikumar and Ibarra (1989) [42]: Systematically studied
size trade-offsbetweenunambiguous,finitely ambiguous,polynomially ambiguous, andexponentially ambiguous NFAs. They were pioneers in comparingstate complexitybased on degrees of ambiguity, establishing the concept ofsuper-polynomial separation. - Leung (1998, 2005) [30, 32]: Provided critical
separation results. Leung [30] established thatgeneral NFAscan besuper-polynomially separatedfromPNFAs, constructing -stateNFAsfor which any equivalentPNFAneeds states. Leung [32] also showed thatfinitely ambiguous NFAscan be exponentially more succinct thanUFAs( states for an -stateFNFA). - Hromkovič et al. (2002, 2011) [19, 20]: Used powerful
communication complexitytechniques to provestate complexity separations. Hromkovič et al. [19] provided a simplified proof for theNFA-to-PNFA separation(Theorem 6). Hromkovič and Schnitger [20] solved the long-standing open question by proving asuper-polynomial separationbetweenPNFAsandFNFAs(Theorem 7), and more generally betweenNFAswith differentpolynomial degrees of growthfor ambiguity (Theorem 8). They also studiednondeterminism measuresliketree widthandadvice. - Mandel and Simon (1977) [33] & Reutenauer (1977) [43]: Showed that it's
decidablewhether anNFAisfinitely ambiguousorpolynomially ambiguous, and Reutenauer provided an algorithm to compute thepolynomial degree of growth. - Weber and Seidl (1991) [50]: Provided a simpler structural characterization of
finitely ambiguousandpolynomially ambiguous NFAs, leading topolynomial time algorithmsfor deciding these properties and computing thepolynomial degree of growth. They also improved the upper bound for the maximalfinite degree of ambiguityof an -stateFNFA. - Kintala and Fischer (1980) [25] & Kintala and Wotschke (1980) [26]: Early work on quantifying
nondeterminism. Kintala and Wotschke first quantifiednondeterminisminfinite automatacomputations and observed a significant difference indeterminization size blow-upforNFAswith different finite numbers of nondeterministic choices. - Goldstine et al. (1990, 1992, 2002) [10, 11, 12]: Conducted extensive research on
descriptional complexityandnondeterminism. Goldstine et al. [11] established the "spectrum result" forfinite branching(Theorems 17 & 18), showing incremental savings instate complexityforNFAswith different finitebranchingamounts. They also showed an exponential blow-up for converting a generalNFAto afinite branching NFA. Goldstine et al. [12] explored the subtle relationship betweenambiguityandguessing. - Palioudakis et al. (2012, 2013, 2014, 2015) [38, 39, 40, 41]: Researched
tree widthand othernondeterminism measures. They characterizedstate complexityforNFAswithfinite tree width(Theorem 19), comparednondeterminism measures, and studied bounds for convertingNFAstoMDFAs. - Kappes (2000) [23]: Showed that an
NFAwith states andbranchingcan be simulated by adeterministic finite automaton with multiple initial states (MDFA)with states and initial states.
3.3. Technological Evolution
The study of finite automata began in the 1950s as a foundational model of computation. Initially, the focus was on their expressive power (what languages they can recognize). State complexity emerged as a field to quantify the size of automata.
The evolution of understanding nondeterminism and ambiguity in finite automata can be traced as follows:
-
Early 1970s: Initial systematic studies of
ambiguityinregular expressionsandNFAs(Book et al.). The concept ofunambiguous regular expressionswas introduced. -
Early 1980s: Quantifying
nondeterminismdirectly forfinite automatacomputations (Kintala and Wotschke), leading to observations aboutdeterminization size blow-up. -
Late 1980s - Early 1990s: Systematic comparisons of
state complexityforNFAswith differentdegrees of ambiguity(Ravikumar and Ibarra). Decidability results forfiniteandpolynomial ambiguity(Mandel and Simon, Reutenauer, Weber and Seidl). The "spectrum" ofnondeterminismforfinite branchingNFAs(Goldstine et al.). -
Late 1990s - 2000s: Major
separation resultsproved, showingsuper-polynomial succinctnessfor various classes ofNFAs(Leung, Hromkovič et al.). Introduction ofcommunication complexitytechniques for provingstate complexitylower bounds. Exploration of differentnondeterminism measuresliketree widthand their relationships. -
2010s - Present: Refinements of bounds, studies on unary alphabets, operations on
UFAs, anddescriptional complexityforinput-driven pushdown automata. Continued identification of open problems concerningunbounded nondeterminism.This paper's work fits into the current state of this timeline by providing a concise summary of these developments, synthesizing the known
state complexitycomparisons, and clearly outlining the remaining open questions in the field.
3.4. Differentiation Analysis
Compared to other surveys or foundational texts, this paper's core differentiation lies in its specific focus and synthesis:
-
Joint Consideration of Ambiguity and Nondeterminism: Many works might focus on one aspect or the other. This paper explicitly brings together the
degree of ambiguityand variousnondeterminism measures, analyzing their relationships and combined impact onstate complexity. -
Emphasis on State Complexity Comparisons and Separation Results: The paper is not just a definition of terms but a comprehensive review of the
succinctnesshierarchies among differentNFAclasses. It meticulously detailsexponentialandsuper-polynomial separation results, which are crucial for understanding the inherent computational power gains fromnondeterminismandambiguity. -
Clear Identification of Open Problems: A significant contribution is the explicit enumeration of open research questions, particularly concerning
unbounded nondeterminismand the application ofcommunication complexityto these problems. This provides a roadmap for future research in the field. -
Beginner-Friendly Definitions: While rigorous, the paper introduces and formally defines all the key measures (
degree of ambiguity,guessing,branching,tree width, etc.), making it accessible to those new to the specific sub-field ofdescriptional complexity.In essence, while the individual results surveyed are from various researchers, the paper's innovation is in its structured, comparative synthesis and its forward-looking perspective on the remaining theoretical challenges.
4. Methodology
The paper is a survey, so it doesn't propose a new methodology in the experimental sense. Instead, its "methodology" is to rigorously define the concepts and measures it surveys, and then present the known theoretical results (theorems and propositions) that compare these measures in terms of state complexity. This section will detail those definitions and the underlying principles.
4.1. Principles
The core principle guiding the analysis in this paper is the quantitative comparison of different forms of nondeterminism and ambiguity in finite automata through their state complexity. This involves:
- Formal Definition: Precisely defining various metrics for
ambiguityandnondeterminism. - Classification: Categorizing
NFAsbased on the growth rate or boundedness of these metrics (e.g.,finitely ambiguous,polynomially ambiguous). - Succinctness Comparison: Investigating the minimum number of states required for equivalent automata belonging to different classes, often leading to
separation results(e.g., exponential blow-ups). - Relationships: Exploring mathematical relationships and bounds between different
ambiguityandnondeterminismmeasures.
4.2. Core Methodology In-depth (Layer by Layer)
4.2.1. Basic Definitions and Notations
-
Set of Positive Integers:
-
Cardinality of a Set: for a finite set .
-
Alphabet: A finite set of symbols, .
-
Set of Strings: over an alphabet .
-
Empty String: .
-
Bounded Language: A subset of , where are (not necessarily distinct) elements.
-
Nondeterministic Finite Automaton (NFA): An
NFAis formally defined as a tuple :- : A finite set of
states. - : The input
alphabet. - : The
transition function. This function maps a state and an input symbol to a set of possible next states. For example, returns the set of states reachable from on input . - : The
initial state. - : The set of
final(oraccepting)states.
- : A finite set of
-
Extended Transition Function: The transition function is extended to handle strings: . For a state and string , is the set of all states reachable from by reading .
-
Language Recognized by an NFA: The language
L(A)recognized byNFAis the set of all strings for which at least one computation path starting from and reading ends in anaccepting state. . -
Deterministic Finite Automaton (DFA): A
DFAis a special case of anNFAwhere for all and , . This means there is at most one next state for any given state-symbol pair. The paper notes thatDFAscan have undefined transitions (i.e., for some pairs). -
State Complexity of a Language:
State complexityof a regular language : The number of states in the minimalDFArecognizing .Nondeterministic state complexityof a regular language : The number of states in a state-minimalNFArecognizing .
-
Computation of an NFA: For a string (where and ), a
computationof on is a sequence of states such that:- (the first state is reachable from the initial state via the first symbol).
- for (each subsequent state is reachable from the previous one via the next symbol).
- Either (a
complete computation), or and (the computation halts prematurely if no transition is defined).
-
Accepting Computation: A
complete computation(where ) that ends in afinal state(). -
Set of Computations: is the set of all computations of on .
-
Set of Accepting Computations: is the set of all
accepting computationsof on .
4.2.2. Degree of Ambiguity
The degree of ambiguity quantifies how many ways an NFA can accept a given string.
-
Degree of Ambiguity of an NFA on a String : is defined as the number of
accepting computationsof on .- Explanation: This counts all distinct paths from the initial state to a final state, consuming the entire string .
-
Degree of Ambiguity of an NFA on Strings of Length : is defined as the maximum
degree of ambiguityfor any string of length .- Explanation: This provides a measure of ambiguity that depends on the length of the input string.
-
Finite (Bounded) Ambiguity: An
NFAisfinitely ambiguousif the values for all are bounded by some constant. The maximum such bound is denoted:- Explanation: If an
NFAisfinitely ambiguous, there's a fixed upper limit on how many ways any string can be accepted, regardless of how long the string is.
- Explanation: If an
-
Unambiguous NFA (UFA): An
NFAisunambiguousif . This means any string has at most oneaccepting computation. EveryDFAis inherentlyunambiguous. -
Classes of NFAs by Ambiguity: Following Ravikumar and Ibarra [42], the paper categorizes
NFAsinto five classes based on theirdegree of ambiguitygrowth rate:- DFAs:
Deterministic Finite Automata. - UFAs:
Unambiguous NFAs(). - FNFAs:
Finitely Ambiguous NFAs( is a finite constant ). - PNFAs:
Polynomially Ambiguous NFAs. AnNFAisstrictly polynomially ambiguousif it is notfinitely ambiguousand there exists a polynomial such that for all . Thepolynomial degree of growthis the minimal degree of such a polynomial. - General (Potentially Exponentially Ambiguous) NFAs: An
NFAisstrictly exponentially ambiguousif it is notpolynomially ambiguous, meaning grows faster than any polynomial.
- DFAs:
4.2.3. Nondeterminism Measures
These measures quantify the "amount of guessing" or "choices" made by an NFA.
-
Branching of a Transition: For a state and input symbol , the
branchingis , which is the number of possible next states. -
Guessing of a Computation ( ): For a computation on a string , the
guessing(introduced by Goldstine et al. [11]) is defined as:- Explanation: This measure quantifies the
amount of guessinginbits of information.- : Represents the number of bits needed to choose the first state from the set of possible initial transitions (i.e., ).
- : Sums the bits needed to choose the next state at each subsequent
nondeterministicstep along the computation path . - If an
NFAis aDFA, all branching factors are 1, so , and theguessingis 0.
- Explanation: This measure quantifies the
-
Branching of a Computation ( ): The
branchingis defined as .- Explanation: This is the product of the
branchingfactors of the individual transitions along the computation path . It represents the total number of choices made along that specific path.
- Explanation: This is the product of the
-
Guessing of a String ( ): This is a "best-case" measure, defined as the
guessingof the bestaccepting computationfor .- Explanation: For a string accepted by the
NFA, this finds theaccepting computationthat involves the minimum amount ofguessing.
- Explanation: For a string accepted by the
-
Maximum Guessing of a String ( ): This is a "worst-case" measure, defined as the maximum
guessingover all computations (not necessarily complete or accepting) for .- Explanation: This captures the maximum amount of
nondeterminismanNFAmight encounter on any path for a given input, regardless of whether it leads to acceptance or completion.
- Explanation: This captures the maximum amount of
-
Branching of a String ( ): This corresponds to the
best-case guessing. -
Trace of a String ( ): This corresponds to the
worst-case guessing. -
Tree Width of a String ( ): This measure quantifies the total amount of
nondeterminismby counting all possible computation paths.- Explanation: This is equivalent to the number of leaves in the
computation treeof on . It's also called 'leaf size' in some literature.
- Explanation: This is equivalent to the number of leaves in the
-
Measures as Functions of Length ( ): Similar to
degree of ambiguity, thesenondeterminism measurescan be expressed as functions of input length by taking the maximum value over all strings of that length. If is any oftw, , , , or , then is:- Explanation: This allows for characterizing the growth rate of nondeterminism with input length.
-
Finite Function: The function of is
finiteif is finite.
4.2.4. NFA Class Separations
The paper defines what it means for one class of automata to be super-polynomially separated from another.
- Super-Polynomially Separated:
Consider two classes of automata, and . Class is
super-polynomially separatedfrom class if there exists a family of languages (for ) such that:- Each can be recognized by an -state automaton from class .
- For any polynomial
p(n)and sufficiently large , any automaton from class recognizing requires more thanp(n)states.
- Explanation: This means that converting an automaton from class to an equivalent one in class results in an
exponentialorsuper-polynomial size blow-upin the worst case. This quantifies the succinctness advantage of class over class .
5. Experimental Setup
As a survey paper focusing on theoretical concepts and existing results in state complexity, this paper does not describe a traditional experimental setup involving datasets, evaluation metrics, and baselines in the empirical sense. Instead, the "setup" of this field of study relies on:
5.1. Datasets
In the context of descriptional complexity research, "datasets" are regular languages designed to demonstrate specific state complexity lower bounds or separation results. These languages are not collections of real-world data but rather carefully constructed formal languages.
- Example from the Paper:
The paper mentions the family of languages , for , used by Leung [30].
- Source: This is a theoretically constructed language.
- Characteristics: It's a
regular language. It's designed to be recognized by a smalln-state NFAthat isexponentially ambiguous, but which requires a much largerPNFAto recognize it. - Example Data Sample (for ):
For , .
Strings in could include: ,
0,00,010,000,0100,0010,0110, etc. TheNFAfor is designed such that for certain strings, the number ofaccepting computationsgrows exponentially with the length or with . - Purpose: Such languages are chosen because they exhibit the worst-case behavior for the
state complexitytransformation being studied (e.g.,NFAtoPNFA). They are effective for validating theoretical bounds rather than empirical performance.
5.2. Evaluation Metrics
The primary evaluation metric throughout this research area is state complexity, which refers to the minimum number of states required by an automaton to recognize a given language.
-
State Complexity (SC):
- Conceptual Definition:
State complexityquantifies the size of the minimal automaton (eitherDFAorNFA) required to recognize a particularregular language. When comparing different classes of automata, it measures the succinctness advantage of one class over another. A higherstate complexityfor an equivalent automaton in a different class indicates asize blow-up. - Mathematical Formula: While there isn't a single universal formula,
state complexityis typically expressed as a function of the number of states in the original automaton. For instance, when converting an -stateNFAto an equivalentDFA, thestate complexityis often bounded by . - Symbol Explanation:
- : The number of states in the initial (source) automaton.
State complexityresult: Often expressed asO(f(n)),f(n), or specific bounds like or , indicating the number of states in the target automaton.
- Conceptual Definition:
-
Degree of Ambiguity Measures:
- : Number of
accepting computationsfor string . - : Maximum
degree of ambiguityfor strings of length . - : Finite maximum
degree of ambiguity. - These are used to classify
NFAsand are then correlated withstate complexity.
- : Number of
-
Nondeterminism Measures:
- :
Guessingof a computation (in bits). - :
Branchingof a computation (number of choices). - :
Best-case guessingfor string . - :
Worst-case guessingfor string . - :
Best-case branchingfor string . - :
Trace(worst-case branching) for string . - :
Tree width(number of computations) for string . - These measures are quantified for
NFAsand then their impact onstate complexityis analyzed.
- :
5.3. Baselines
In state complexity research, "baselines" are not competitor models in an experimental sense, but rather the alternative classes of finite automata against which the succinctness of a given NFA class is compared. The paper implicitly compares:
-
Unambiguous NFAs (UFAs)againstDFAs. -
Finitely Ambiguous NFAs (FNFAs)againstUFAs. -
Polynomially Ambiguous NFAs (PNFAs)againstFNFAs. -
General NFAsagainstPNFAs. -
NFAswith differentfinite branchingamounts against each other. -
NFAswithfinite tree widthagainstDFAs. -
NFAswithlimited nondeterminismagainstDeterministic Finite Automata with Multiple Initial States (MDFAs).The goal is to establish how much
nondeterminismorambiguitycan reduce the number of states required to recognize a language, by comparing thestate complexityof anNFAfrom one class to an equivalentNFAfrom a "less powerful" or "more restricted" class.
6. Results & Analysis
This section synthesizes the key findings presented in the paper, focusing on the state complexity comparisons for different degrees of ambiguity and nondeterminism.
6.1. Core Results Analysis
The paper's results are presented as theorems and observations, establishing decision problems, upper bounds, and crucial separation results regarding the state complexity of NFAs with quantified ambiguity or nondeterminism.
6.1.1. Degree of Ambiguity: Characterization and Complexity
- Decidability of Ambiguity Type:
- Theorem 1 (Weber and Seidl [50]): It is
decidableinpolynomial timewhether a givenNFAisfinitely ambiguous,strictly polynomially ambiguous, orstrictly exponentially ambiguous. Furthermore, thepolynomial degree of growthcan be computed inpolynomial time. This is a foundational result, allowing for algorithmic classification of NFAs by their ambiguity.
- Theorem 1 (Weber and Seidl [50]): It is
- Complexity of Testing Ambiguity Degree:
- Theorem 2 (Chan and Ibarra [5]): For a given
NFAand an integer , testing whether thedegree of ambiguityof is at least isPSPACE-complete. This implies that determining the exact finite degree of ambiguity for an arbitrary is computationally hard.
- Theorem 2 (Chan and Ibarra [5]): For a given
- Upper Bound for Finite Ambiguity:
- Theorem 3 (Weber and Seidl [50]): The
degree of ambiguityof an -stateFNFAis at most . This provides an upper bound on how large the number of accepting computations can be for an NFA that is finitely ambiguous.
- Theorem 3 (Weber and Seidl [50]): The
6.1.2. Ambiguity and State Complexity Separations
These results demonstrate the succinctness advantages of NFAs with higher degrees of ambiguity over those with lower degrees.
- UFA vs. DFA / FNFA vs. UFA:
- Theorem 4 (Leiss [29], Leung [32]):
- For each , there exists an
UFAwith states such that the minimal equivalentDFAhas states. (This highlights an exponential size blow-up during determinization ofUFAs). - For each , there exists an
FNFAwith states such that any equivalentUFAhas states. (This shows an exponential succinctness advantage forFNFAsoverUFAs).
- For each , there exists an
- Theorem 4 (Leiss [29], Leung [32]):
- Bounded Languages (Special Case):
- Theorem 5 (Ravikumar and Ibarra [42]): Any
NFArecognizing abounded languagecan be converted to anFNFAwith at mostpolynomial size blow-up. However, the classes ofFNFAs(respectively,UFAs) recognizingbounded languagesaresuper-polynomially separatedfromUFAs(respectively,DFAs). This suggests that forbounded languages, while ambiguities can be reduced to finite, there's still a significant cost to strict unambiguity or determinism.
- Theorem 5 (Ravikumar and Ibarra [42]): Any
- General NFA vs. PNFA:
- Theorem 6 (Leung [30], Hromkovič et al. [19]): The class of
NFAsissuper-polynomially separatedfrom the class ofPNFAs. This meansexponentially ambiguous NFAscan be drastically more succinct thanpolynomially ambiguousones. Leung's original proof yields an optimal state blow-up for the transformation.
- Theorem 6 (Leung [30], Hromkovič et al. [19]): The class of
- PNFA vs. FNFA and Different Polynomial Degrees:
- Theorem 7 (Hromkovič and Schnitger [20]): For , there exists a
PNFAwith a number of states polynomial in such that anyFNFArecognizingL(A)has at least states. This provides asuper-polynomial separationbetweenpolynomially ambiguousandfinitely ambiguous NFAs. - Theorem 8 (Hromkovič and Schnitger [20]): This generalizes Theorem 7. For positive integers and , there exist languages recognized by
NFAswithdegree of ambiguityand states, such that anyNFAfor withdegree of ambiguity(i.e., of a lower polynomial degree or finite) requires at least states. This establishes asuper-polynomial separationbetweenNFAshaving differentpolynomial degrees of growthfor ambiguity.
- Theorem 7 (Hromkovič and Schnitger [20]): For , there exists a
6.1.3. Limited Nondeterminism: Growth Rates and Complexity
- Tree Width Growth Rates:
- Theorem 9 (Hromkovič et al. [19]): For any
NFA, the function (tree widthon strings of length ) is eitherbounded by a constant,between linear and polynomial in, orin2^{\Theta(m)}
- Theorem 9 (Hromkovič et al. [19]): For any
(`exponential`). It cannot be unbounded and sublinear. This theorem provides a fundamental characterization of how `tree width` can grow.
* **Complexity of Bounded Guessing**:
* **Theorem 10 (Leung [31])**: For a given `NFA` , it is `PSPACE-complete` to decide whether (`guessing` on strings of length ) is bounded. This shows that determining if guessing remains finite is computationally hard.
* **Sublinear Guessing Growth**:
* **Theorem 11 (Simon [48], Goldstine et al. [12])**: For each , there is an `NFA` such that . This proves that `guessing` can be unbounded but `sublinear`, a unique characteristic not seen with `tree width`. This also implies `branching` can be .
* **Upper Bound for Finite Tree Width**:
* **Theorem 13 (Palioudakis et al. [38])**: The `tree width` of an -state `finite tree width NFA` is at most . Also, for every and , there exists an -state `NFA` with optimal `tree width` .
### 6.1.4. Relationships Between Nondeterminism Measures and Ambiguity
* **Proposition 1 (Palioudakis et al. [39])**: If is an `NFA` with finite `tree width`, then
\mathrm{tw}_A^{\mathrm{sup}} \leq \tau_A^{\mathrm{sup}} \leq 2^{\mathrm{tw}_A^{\mathrm{sup}} - 1}
* **Explanation**: This shows a direct relationship between the maximum `tree width` and the maximum `trace` (worst-case branching), indicating that `trace` can be exponentially larger than `tree width` in the worst case, but `tree width` is a lower bound for `trace`.
* **Theorem 14 (Hromkovič et al. [19])**: If is a `minimal NFA`, then for all
\operatorname*{max} (\gamma_A^{\operatorname*{max}}(m), \mathrm{da}_A(m)) \leq \mathrm{tw}_A(m) = O(\mathrm{da}_A(m) \cdot \gamma_A^{\operatorname*{max}}(m))
\$\$
* **Explanation**: This relates `maximum guessing`, `degree of ambiguity`, and `tree width`. It shows that `tree width` upper bounds both `maximum guessing` and `degree of ambiguity`, and is itself bounded by their product.
- Theorem 15 (Goldstine et al. [12]): Let be an
NFA. If (guessing) is non-constant andsublinear, then thecomplete ambiguityof must be unbounded. Conversely, if is in (bounded) or (linear), thencomplete ambiguitymay be either bounded or unbounded. This reveals a subtle connection: sublinear, unbounded guessing forces unbounded ambiguity.
6.1.5. Limited Nondeterminism and State Complexity Separations
- General NFA vs. Finite Branching NFA:
- Theorem 16 (Goldstine et al. [11]): For each , there exists an -state
NFAsuch that anyfinite branching NFArecognizingL(A)needs at least states. This is an exponential separation between generalNFAsand those restricted to finite branching.
- Theorem 16 (Goldstine et al. [11]): For each , there exists an -state
- Spectrum Result for Finite Branching: These theorems by Goldstine et al. [11] show how
state complexityvaries with different finite amounts ofbranching.- Theorem 17 (Goldstine et al. [11]): Let be a minimal
DFAof size , . Then:- (i) For , the optimal size of an
NFAwithbranchingforL(A)is at least . - (ii) For , an
NFAwithbranchingforL(A)has size at least .
- (i) For , the optimal size of an
- Theorem 18 (Goldstine et al. [11]): For , there exists a minimal
NFAwith states such that if denotes the optimal size of anNFAwithbranchingrecognizing , then the following relations hold:- (i) , and when .
- (ii) when .
- (iii) , when .
- Explanation: These theorems demonstrate a "spectrum" of
state complexityreductions. As (the allowedbranching) increases, thestate complexityforNFAsrecognizing certain languages decreases, showing a trade-off. For very low , the cost is exponential, but it reduces as approaches .
- Theorem 17 (Goldstine et al. [11]): Let be a minimal
- Finite Tree Width NFA vs. DFA:
- Theorem 19 (Palioudakis et al. [38]): For an
NFAwith states havingtree widthat most , the languageL(A)has aDFAof size . Furthermore, for every , there exists an -stateNFAwithtree widthover a binary alphabet such that the minimalDFAfor has states.- Explanation: Unlike general
NFAs,NFAswithfinite tree widthcan be determinized toDFAswithpolynomialsize blow-up (specifically, the sum of binomial coefficients, which is polynomial in for fixed ). This is a significant difference from the exponential blow-up for generalNFAs.
- Explanation: Unlike general
- Theorem 19 (Palioudakis et al. [38]): For an
- NFA with Finite Branching vs. MDFA:
- Theorem 20 (Kappes [23]): An
NFAwith states andbranchingcan be simulated by anMDFA(deterministic finite automaton with multiple initial states) with states and initial states. - Almost Matching Lower Bound (Palioudakis et al. [40]): For infinitely many , there exists an -state
NFAwithbranchingsuch that any equivalentMDFAneeds at least states.- Explanation: This shows a linear trade-off between
branchingand the number of states/initial states forMDFAs, demonstrating thatMDFAsare quite efficient in simulating finite branching.
- Explanation: This shows a linear trade-off between
- Theorem 20 (Kappes [23]): An
6.2. Data Presentation (Tables)
The paper presents its results primarily through theorems and propositions, stating bounds and relationships in text rather than in formal tables. Therefore, there are no specific tables to transcribe from the paper for this section. The quantitative results are embedded within the theorem statements as described above.
6.3. Ablation Studies / Parameter Analysis
As a survey paper, this work does not present original experimental results or ablation studies. It reviews existing theoretical results that often involve parameter analysis (e.g., the effect of in branching on state complexity) as part of their proofs or constructions. The theorems listed in the previous section implicitly cover this type of analysis by showing how state complexity varies with parameters like (number of states) or (degree of ambiguity/nondeterminism).
7. Conclusion & Reflections
7.1. Conclusion Summary
This paper provides a rigorous and comprehensive survey of the relationship between ambiguity, nondeterminism, and state complexity in finite automata. It meticulously defines various measures for quantifying both ambiguity (e.g., finite, polynomial, exponential) and nondeterminism (e.g., guessing, branching, tree width). A central theme is the descriptional complexity comparison, specifically how the succinctness (minimal number of states) of NFAs varies across different classes defined by these measures.
The survey highlights several key findings:
-
Super-polynomial separation resultsare well-established forNFAswith different degrees of ambiguity (e.g.,general NFAsare super-polynomially more succinct thanPNFAs,PNFAsare super-polynomially more succinct thanFNFAs, andFNFAsare exponentially more succinct thanUFAs). -
For
finite nondeterminism, the "spectrum" result by Goldstine et al. precisely quantifies thestate complexitysavings forNFAswith different finitebranchingvalues, showing an incremental trade-off. -
NFAswithfinite tree widthhave apolynomialdeterminization costtoDFAs, a stark contrast to theexponentialcost for generalNFAs. -
Relationships and bounds between different
nondeterminism measures(e.g.,tree widthandtrace) and betweenambiguityandguessinghave been established.Overall, the paper consolidates a significant body of theoretical research, providing a clear picture of the computational benefits of
nondeterminismandambiguityin terms ofautomata size.
7.2. Limitations & Future Work
The authors explicitly outline several limitations in the current state of research and suggest promising avenues for future work:
- State Complexity of Unbounded Nondeterminism: Very little is known about the
state complexityofNFAswhere the amount ofnondeterminismisunboundedand measured as a function of input length. This is a significant gap compared to the detailed results forfinite nondeterminism. - Succinctness Comparisons for Unbounded Branching/Tree Width: There are currently no known
succinctness comparisonsbetweenNFAsof differentunbounded branchingorunbounded tree width. This means it's unclear if different growth rates for these unbounded measures yield similarsuper-polynomial separationsas seen with ambiguity. - Application of Communication Complexity Techniques: The authors suggest exploring whether the powerful
communication complexity techniques(successfully used for ambiguity separations by Hromkovič et al.) can be applied to establishgood lower boundsandseparation resultsforNFAswherebranchingortree widthis measured as a function of input length. - Succinctness Comparison: Ambiguity vs. Nondeterminism Measures: A further topic of interest is comparing the
succinctnessofNFAsbased on theirdegree of ambiguityagainst those based on theirbranchingortree width. Only a few tentative results exist in this direction.
7.3. Personal Insights & Critique
This paper serves as an invaluable resource for anyone studying the descriptional complexity of finite automata.
-
Clarity and Structure: The paper's strength lies in its clear definitions and systematic presentation of complex theoretical results. For a beginner, the carefully defined terms and the structured survey approach make an otherwise daunting field much more accessible.
-
Identification of Gaps: The explicit outlining of open problems is a critical contribution. It not only summarizes what is known but also effectively guides future research by highlighting the frontiers of knowledge in this area. This encourages new researchers to tackle these challenging questions.
-
Theoretical vs. Practical Impact: While the results are highly theoretical, quantifying the inherent succinctness of
nondeterminismandambiguityhas fundamental implications for understanding the limits of computation and the trade-offs between computational models. For example, knowing that determinization can cause exponential blow-ups informs the design of algorithms that operate onNFAs. -
Complexity of Quantification: The array of
nondeterminism measures(guessing,branching,trace,tree width) and their subtle differences underscore the inherent complexity ofnondeterminismitself. The relationships between these measures andambiguity(e.g.,Theorem 15) reveal that these concepts are deeply intertwined in non-obvious ways. -
Opportunity for Communication Complexity: The suggestion to apply
communication complexitytechniques tounbounded nondeterminismmeasures is particularly intriguing. This powerful tool has proven successful in other areas ofcomputational complexityand could unlock significant breakthroughs in understanding the state complexity ofNFAswithfunction-of-lengthnondeterminism.One potential area for deeper exploration (though outside the scope of a survey) could be the practical implications or approximate scenarios where these theoretical worst-case bounds are observed. For a beginner, understanding why a specific language constructed to demonstrate an exponential blow-up is "hard" to determinize could provide deeper intuition beyond just the mathematical proof. However, the paper's focus is clearly on the theoretical landscape, and it executes that mission flawlessly.
Similar papers
Recommended via semantic vector search.