Regular expression
====POSIX basic and extended====
In the [[POSIX]] standard, Basic Regular Expression ('''BRE''') syntax requires that the [[metacharacter]]s <code>( )</code> and <code>{ }</code> be written <code>\(\)</code> and <code>\{\}</code>, whereas Extended Regular Expression ('''ERE''') syntax does not.

{| class="wikitable"
|-
! Metacharacter
! Description
|- valign="top"
!<code>^</code>
|Matches the starting position within the string. In line-based tools, it matches the starting position of any line.
|- valign="top"
!<code>.</code>
|Matches any single character (many applications exclude [[newline]]s, and exactly which characters are considered newlines is flavor-, character-encoding-, and platform-specific, but it is safe to assume that the line feed character is included). Within POSIX bracket expressions, the dot character matches a literal dot. For example, <code>a.c</code> matches "abc", etc., but <code>[a.c]</code> matches only "a", ".", or "c".
|- valign="top"
!<code>[ ]</code>
|A bracket expression. Matches a single character that is contained within the brackets. For example, <code>[abc]</code> matches "a", "b", or "c". <code>[a-z]</code> specifies a range which matches any lowercase letter from "a" to "z". These forms can be mixed: <code>[abcx-z]</code> matches "a", "b", "c", "x", "y", or "z", as does <code>[a-cx-z]</code>.
The <code>-</code> character is treated as a literal character if it is the last or the first (after the <code>^</code>, if present) character within the brackets: <code>[abc-]</code>, <code>[-abc]</code>, <code>[^-abc]</code>. Backslash escapes are not allowed. The <code>]</code> character can be included in a bracket expression if it is the first (after the <code>^</code>, if present) character: <code>[]abc]</code>, <code>[^]abc]</code>.
|- valign="top"
!<code>[^ ]</code>
|Matches a single character that is not contained within the brackets. For example, <code>[^abc]</code> matches any character other than "a", "b", or "c". <code>[^a-z]</code> matches any single character that is not a lowercase letter from "a" to "z". As with ordinary bracket expressions, literal characters and ranges can be mixed.
|- valign="top"
!<code>$</code>
|Matches the ending position of the string or the position just before a string-ending newline. In line-based tools, it matches the ending position of any line.
|- valign="top"
!<code>( )</code>
|Defines a marked subexpression, also called a capturing group, which is essential for extracting the desired part of the text (see also the next entry, <code>\''n''</code>). ''BRE mode requires {{nowrap|<code>\( \)</code>}}.''
|- valign="top"
!<code>\''n''</code>
|Matches what the ''n''th marked subexpression matched, where ''n'' is a digit from 1 to 9. This construct is defined in the POSIX standard.<ref>{{cite book |section-url=https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03_06 |publisher=The Open Group |title=The Open Group Base Specifications Issue 7, 2018 edition |section=9.3.6 BREs Matching Multiple Characters |year=2017 |access-date=December 10, 2023}}</ref> Some tools allow referencing more than nine capturing groups. Also known as a back-reference, this feature is supported in BRE mode.
|- valign="top"
!<code>*</code>
|Matches the preceding element zero or more times. For example, <code>ab*c</code> matches "ac", "abc", "abbbc", etc. <code>[xyz]*</code> matches "", "x", "y", "z", "zx", "zyx", "xyzzy", and so on. <code>(ab)*</code> matches "", "ab", "abab", "ababab", and so on.
|- valign="top"
!{{nowrap|<code>{''m'',''n''}</code>}}
|Matches the preceding element at least ''m'' and not more than ''n'' times. For example, <code>a{3,5}</code> matches only "aaa", "aaaa", and "aaaaa". A few older regex implementations do not support this construct. BRE mode requires <code>{{nowrap|\{''m'',''n''\}}}</code>.
|}

'''Examples:'''
* <code>.at</code> matches any three-character string ending with "at", including "hat", "cat", "bat", "4at", "#at" and " at" (starting with a space).
* <code>[hc]at</code> matches "hat" and "cat".
* <code>[^b]at</code> matches all strings matched by <code>.at</code> except "bat".
* <code>[^hc]at</code> matches all strings matched by <code>.at</code> other than "hat" and "cat".
* <code>^[hc]at</code> matches "hat" and "cat", but only at the beginning of the string or line.
* <code>[hc]at$</code> matches "hat" and "cat", but only at the end of the string or line.
* <code>\[.\]</code> matches any single character surrounded by "[" and "]" since the brackets are escaped, for example: "[a]", "[b]", "[7]", "[@]", "[]]", and "[ ]" (bracket space bracket).
* <code>s.*</code> matches "s" followed by zero or more characters, for example: "s", "saw", "seed", "s3w96.7", and "s6#h%(>>>m n mQ".

According to Russ Cox, the POSIX specification requires ambiguous subexpressions to be handled differently from the way Perl handles them. The committee replaced Perl's rules with rules that are simple to explain, but these "simple" rules are actually more complex to implement: they were incompatible with pre-existing tooling and made it essentially impossible to define a "lazy match" (see below) extension. As a result, very few programs actually implement the POSIX subexpression rules (even when they implement other parts of the POSIX syntax).<ref>{{cite web |title=Regular Expression Matching: the Virtual Machine Approach |url=https://swtch.com/~rsc/regexp/regexp2.html |author=Russ Cox |year=2009 |website=swtch.com |quote=Digression: POSIX Submatching}}</ref>
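The example patterns and the subexpression ambiguity described above can be tried out in a short sketch. It uses Python's <code>re</code> module as a stand-in, which is an assumption of this illustration: <code>re</code> is a Perl-style engine, not a POSIX BRE or ERE matcher, although the simple patterns listed above behave the same way in it. The helper names (<code>check_examples</code>, <code>anchored_examples</code>, <code>perl_style_submatches</code>), the extra non-matching sample strings, and the pattern <code>(a|ab)(c|bcd)(d*)</code> are hypothetical choices made for this sketch and do not come from the POSIX standard.

```python
import re

# Patterns and matching samples come from the examples list; the
# "should not match" strings are added here for illustration only.
# Python's re module is a Perl-style engine, not a POSIX BRE/ERE matcher.
EXAMPLES = [
    (r".at", ["hat", "cat", "bat", "4at", "#at", " at"], ["at", "hats"]),
    (r"[hc]at", ["hat", "cat"], ["bat"]),
    (r"[^b]at", ["hat", "cat"], ["bat"]),
    (r"[^hc]at", ["bat", "4at"], ["hat", "cat"]),
    (r"\[.\]", ["[a]", "[b]", "[7]", "[@]", "[]]", "[ ]"], ["[ab]", "a"]),
    (r"s.*", ["s", "saw", "seed", "s3w96.7", "s6#h%(>>>m n mQ"], ["as"]),
]


def check_examples():
    """Return (pattern, text) pairs whose whole-string match result is unexpected."""
    failures = []
    for pattern, should_match, should_not_match in EXAMPLES:
        for text in should_match:
            if re.fullmatch(pattern, text) is None:
                failures.append((pattern, text))
        for text in should_not_match:
            if re.fullmatch(pattern, text) is not None:
                failures.append((pattern, text))
    return failures


def anchored_examples():
    """Show that ^ and $ restrict where [hc]at may be found."""
    return {
        "start_hit": re.search(r"^[hc]at", "hat trick") is not None,
        "start_miss": re.search(r"^[hc]at", "the hat") is not None,
        "end_hit": re.search(r"[hc]at$", "the cat") is not None,
        "end_miss": re.search(r"[hc]at$", "cat nap") is not None,
    }


def perl_style_submatches():
    """Return the overall match and the groups a Perl-style engine picks.

    The first alternative that leads to an overall match wins, so the
    first group is "a" even though the longer "ab" would also work.
    """
    m = re.match(r"(a|ab)(c|bcd)(d*)", "abcd")
    return m.group(0), m.groups()


if __name__ == "__main__":
    print(check_examples())
    print(anchored_examples())
    print(perl_style_submatches())
```

Both rule sets match the whole string "abcd" here; they differ only in how the groups are divided. A matcher following the POSIX subexpression rules would be expected to prefer the longer first group, "ab", which is the kind of divergence Cox describes.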