regular expression

the operators of regular expressions make use of the following 3 operations on languages:
  1. union of languages
  2. concatenation of languages
  3. closure of a language
we call them the regular operations, regular languages are closed under each one of them.
[cite:@john_automata_2006 definition 1.23]
regular expressions are defined inductively:
\textsc{BASIS}: the basis consists of three parts:
  1. the constants and are regular expressions, denoting the languages and , respectively. that is, , and .
  2. if is any symbol, then is a regular expression. this expression denotes the language . that is, . note that we use boldface font to denote an expression corresponding to a symbol. the correspondence, e.g. that refers to , should be obvious.
  3. a variable, usually capitalized and italic such as , is a variable, representing any language.
\textsc{INDUCTION}: there are four parts to the inductive step, one for each of the three operators and one for the introduction of parentheses.
  1. if and are regular expressions, then is a regular expression denoting the union of and . that is, .
  2. if and are regular expressions, then is regular expression denoting the concatenation of and . that is, .
  3. if is a regular expression, then is a regular expression, denoting the closure of . that is, .
  4. if is a regular expression, then , a parenthesized , is also a regular expression, denoting the same language as . formally; .
[cite:@john_automata_2006 section 3.1.2]
for convenience, we let be a shorthand for .
the following definition given by [cite:@sipser_comp_2012] is equivalent to john's.
say that is a regular expression if is
  1. for some in the alphabet ,
  2. ,
  3. ,
  4. , where and are regular expressions,
  5. , where and are regular expressions, or
  6. , where is a regular expression.
in items 1 and 2, the regular expressions and represent the languages and , respectively. in item 3, the regular expression represents the empty language. in items 4,5, and 6, the expressions represent the languages obtained by taking the union or concatenation of the languages and , or the star of language , respectively.
[cite:@sipser_comp_2012 definition 1.52]
dont confuse the regular expressions and . the expression represents the language containing a single string--namely, the empty string--whereas represents the language that doesn't contain any strings.
[cite:@sipser_comp_2012 chapter 1 regular languages]
like other algebras, the regular-expression operators have an assumed order of "precedence", which means that óperators are associated with their operands in a particular order. we are familiar with the notion of precedence from ordinary arithmetic expressions. for regular expressions, the following is the order of precedence for the operators:
  1. the star operator is of highest precedence. that is, it applies only to the smallest sequence of symbols to its left that is a well-formed regular expression.
  2. the concatenation operator.
  3. union
of course, sometimes we do not want the grouping in a regular expression to be as required by the precedence of the operators. If so, we are free to use parentheses to group operands exactly as we choose. In addition, there is never anything wrong with putting parentheses around operands that you want to group, even if the desired grouping is implied by the rules of precedence.
[cite:@john_automata_2006 chapter 3.1.3 precedence of regular-expression operators]
for any languages , the following algebraic properties hold
  • associativity: .
  • associative law for [[blk:1706801265][union]]: .
  • associative law for [[blk:1700165265][concatenation]]: .
  • . this law asserts that is the identity for union.
  • . this law asserts that is the identity for concatenation.
  • . this law asserts that is the annihilator for concatenation.
  • left [[blk:1663957762][distributive]] law of concatenation over union: .
  • right distributive law of concatenation over union: .
  • [[blk:1700242159][idempotence]] law for union: .
  • . this law law says that closing an expression that is already closed does not change the language.
  • .
  • .
  • .
  • .
  • . this rule is really the definition of the operator.
  • .
[cite:@john_automata_2006 chapter 3.4 algebraic laws for regular expressions]
[cite:@john_automata_2006 chapter 3.4.6 discovering laws for regular expressions]
  • the expression denotes strings over that end with .
  • .
  • .
  • the expression denotes all the words that include the substring , i.e. .
  • the expression which is equal to denotes all the strings over that dont contain the substring .
let be any regular expression, we have the following identities
however, exchanging and in the preceding identities may cause the equalities to fail. may not equal . may not equal .
[cite:@sipser_comp_2012 section 1.3 regular exxpressions]
a numerical constant that may include a fractional part and/or a sign may be described as a member of the language defined by the regex
where is the alphabet of decimal digits. examples of generated strings are: .
[cite:@sipser_comp_2012 chapter 1 regular languages]
regular expressions and finite automata are equivalent in their descriptive power. this fact is surprising because finite automata and regular expressions superficially appear to be rather different. however, any regular expression can be converted into a finite automaton that recognizes the language it describes, and vice versa.
a language is regular if and only if some regular expression describes it.
[cite:@sipser_comp_2012 chapter 1 regular languages]
automata theory course homework 4 for more practice problems