Regular expression
===Expressive power and compactness===
The formal definition of regular expressions is minimal on purpose and avoids defining <code>?</code> and <code>+</code>, since these can be expressed as follows: <code>a+</code>=<code>aa*</code>, and <code>a?</code>=<code>(a|ε)</code>. Sometimes the [[set complement|complement]] operator is added, to give a ''generalized regular expression''; here ''R<sup>c</sup>'' matches all strings in Σ* that do not match ''R''. In principle, the complement operator is redundant, because it does not grant any more expressive power. However, it can make a regular expression much more concise: eliminating a single complement operator can cause a [[double exponential function|double exponential]] blow-up of its length.<ref>{{harvtxt|Gelade|Neven|2008|p=332|loc=Thm.4.1}}</ref><ref>{{harvtxt|Gruber|Holzer|2008}}</ref><ref>Based on {{harvtxt|Gelade|Neven|2008}}, a regular expression of length about 850 such that its complement has a length about 2<sup>32</sup> can be found at [[:File:RegexComplementBlowup.png]].</ref>

Regular expressions in this sense express exactly the regular languages, that is, the class of languages accepted by [[deterministic finite automata]]. There is, however, a significant difference in compactness. Some classes of regular languages can only be described by deterministic finite automata whose size grows [[exponential growth|exponentially]] in the size of the shortest equivalent regular expressions. The standard example here is the family of languages ''L<sub>k</sub>'' consisting of all strings over the alphabet {''a'',''b''} whose ''k''th-from-last letter equals ''a''. On the one hand, a regular expression describing ''L''<sub>4</sub> is given by <math>(a\mid b)^*a(a\mid b)(a\mid b)(a\mid b)</math>. Generalizing this pattern to ''L<sub>k</sub>'' gives the expression:
: <math>(a\mid b)^*a\underbrace{(a\mid b)(a\mid b)\cdots(a\mid b)}_{k-1\text{ times}}. \, </math>

On the other hand, it is known that every deterministic finite automaton accepting the language ''L<sub>k</sub>'' must have at least 2<sup>''k''</sup> states. Luckily, there is a simple mapping from regular expressions to the more general [[nondeterministic finite automata]] (NFAs) that does not lead to such a blowup in size; for this reason NFAs are often used as alternative representations of regular languages. NFAs are a simple variation of the type-3 [[formal grammar|grammars]] of the [[Chomsky hierarchy]].<ref name="HopcroftMotwaniUllman01"/>

In the opposite direction, there are many languages easily described by a DFA that are not easily described by a regular expression. For instance, determining the validity of a given [[ISBN]] requires computing a value modulo 11, which can easily be implemented with an 11-state DFA. However, converting that DFA to a regular expression results in a 2.14-megabyte file.<ref>{{cite web |title=Regular expressions for deciding divisibility |url=https://s3.boskent.com/divisibility-regex/divisibility-regex.html |access-date=2024-02-21 |website=s3.boskent.com}}</ref>

Given a regular expression, [[Thompson's construction algorithm]] computes an equivalent nondeterministic finite automaton. A conversion in the opposite direction is achieved by [[Kleene's algorithm]].

Finally, many real-world "regular expression" engines implement features that cannot be described by the regular expressions in the sense of formal language theory; rather, they implement ''regexes''. See [[#Patterns for non-regular languages|below]] for more on this.
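
As a rough illustration of the compactness gap for ''L<sub>k</sub>'', the following Python sketch builds the usual (''k''+1)-state NFA for ''L<sub>k</sub>'' and runs the subset construction on it, counting the DFA states that are reached. The function names <code>nfa_for_lk</code> and <code>count_reachable_dfa_states</code> are illustrative and not taken from any library. The sketch only counts the states produced by this particular construction; it does not by itself prove that no smaller DFA exists.

```python
from itertools import chain


def nfa_for_lk(k):
    """Return (delta, start, accepting) for an NFA with k+1 states accepting L_k.

    State 0 loops on both letters and, on reading 'a', may also guess that
    this 'a' is the k-th letter from the end and move to state 1.
    States 1..k-1 advance on any letter; state k is accepting and has no
    outgoing transitions.
    """
    delta = {(0, "a"): {0, 1}, (0, "b"): {0}}
    for i in range(1, k):
        delta[(i, "a")] = {i + 1}
        delta[(i, "b")] = {i + 1}
    return delta, 0, {k}


def count_reachable_dfa_states(k):
    """Run the subset construction on the NFA for L_k and count reachable subsets."""
    delta, start, _accepting = nfa_for_lk(k)
    start_set = frozenset({start})
    seen = {start_set}
    stack = [start_set]
    while stack:
        current = stack.pop()
        for symbol in "ab":
            # Union of the NFA moves of every state in the current subset.
            successor = frozenset(
                chain.from_iterable(delta.get((q, symbol), ()) for q in current)
            )
            if successor not in seen:
                seen.add(successor)
                stack.append(successor)
    return len(seen)


if __name__ == "__main__":
    for k in range(1, 9):
        print(k, k + 1, count_reachable_dfa_states(k))
```

Each printed row lists ''k'', the number of NFA states (''k''+1), and the number of DFA states reached by the subset construction. For these small values of ''k'' the last column equals 2<sup>''k''</sup>, which is consistent with the lower bound stated above.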