Jump to content
Main menu
Main menu
move to sidebar
hide
Navigation
Main page
Recent changes
Random page
Help about MediaWiki
Special pages
Niidae Wiki
Search
Search
Appearance
Create account
Log in
Personal tools
Create account
Log in
Pages for logged out editors
learn more
Contributions
Talk
Editing
String (computer science)
(section)
Page
Discussion
English
Read
Edit
View history
Tools
Tools
move to sidebar
hide
Actions
Read
Edit
View history
General
What links here
Related changes
Page information
Appearance
move to sidebar
hide
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
== Formal theory == {{See also|Tuple}} Let Σ be a [[finite set]] of distinct, unambiguous symbols (alternatively called characters), called the [[Alphabet (formal languages)|alphabet]]. A '''string''' (or '''word'''<ref>{{cite book |last1=Fletcher |first1=Peter |last2=Hoyle |first2=Hughes |last3=Patty |first3=C. Wayne |year=1991 |title=Foundations of Discrete Mathematics |publisher=PWS-Kent |isbn= 0-53492-373-9 |page=114 |quote=Let Σ be an alphabet. A '''nonempty word over Σ''' is a finite sequence with domain ''I<sub>n</sub>'' (for some ''n'' ∈ ℕ) and codomain Σ.}}</ref> or '''expression'''<ref>{{cite book |last=Shoenfield |first=Joseph R. |authorlink=Joseph R. Shoenfield |year=2010 |orig-date=1967 |title=Mathematical Logic |edition=Reprint |publisher=CRC Press |isbn=978-156881135-2 |page=2 |quote=Any finite sequence of symbols of a language is called an ''expression'' of that language.}}</ref>) over Σ is any finite [[sequence]] of symbols from Σ.<ref name="partee">{{cite book |author1=Barbara H. Partee |author2=Alice ter Meulen|author2-link=Alice ter Meulen |author3=Robert E. Wall |title=Mathematical Methods in Linguistics |publisher=Kluwer |year=1990}}</ref> For example, if Σ = {0, 1}, then ''01011'' is a string over Σ. The ''[[length]]'' of a string ''s'' is the number of symbols in ''s'' (the length of the sequence) and can be any [[non-negative integer]]; it is often denoted as |''s''|. The ''[[empty string]]'' is the unique string over Σ of length 0, and is denoted ''ε'' or ''λ''.<ref name="partee"/><ref>{{cite book| author=John E. Hopcroft, Jeffrey D. Ullman| title=Introduction to Automata Theory, Languages, and Computation| year=1979| publisher=Addison-Wesley| isbn=0-201-02988-X| url-access=registration| url=https://archive.org/details/introductiontoau00hopc}} Here: sect.1.1, p.1</ref> The set of all strings over Σ of length ''n'' is denoted Σ<sup>''n''</sup>. For example, if Σ = {0, 1}, then Σ<sup>2</sup> = {00, 01, 10, 11}. We have Σ<sup>0</sup> = {ε} for every alphabet Σ. The set of all strings over Σ of any length is the [[Kleene star|Kleene closure]] of Σ and is denoted Σ<sup>*</sup>. In terms of Σ<sup>''n''</sup>, :<math>\Sigma^{*} = \bigcup_{n \in \mathbb{N} \cup \{0\}} \Sigma^{n}</math> For example, if Σ = {0, 1}, then Σ<sup>*</sup> = {ε, 0, 1, 00, 01, 10, 11, 000, 001, 010, 011, ...}. Although the set Σ<sup>*</sup> itself is [[countably infinite]], each element of Σ<sup>*</sup> is a string of finite length. A set of strings over Σ (i.e. any [[subset]] of Σ<sup>*</sup>) is called a ''[[formal language]]'' over Σ. For example, if Σ = {0, 1}, the set of strings with an even number of zeros, {ε, 1, 00, 11, 001, 010, 100, 111, 0000, 0011, 0101, 0110, 1001, 1010, 1100, 1111, ...}, is a formal language over Σ. === Concatenation and substrings === ''[[Concatenation]]'' is an important [[binary operation]] on Σ<sup>*</sup>. For any two strings ''s'' and ''t'' in Σ<sup>*</sup>, their concatenation is defined as the sequence of symbols in ''s'' followed by the sequence of characters in ''t'', and is denoted ''st''. For example, if Σ = {a, b, ..., z}, ''s'' = {{code|bear}}, and ''t'' = {{code|hug}}, then ''st'' = {{code|bearhug}} and ''ts'' = {{code|hugbear}}. String concatenation is an [[associative]], but non-[[commutative]] operation. The empty string ε serves as the [[identity element]]; for any string ''s'', ε''s'' = ''s''ε = ''s''. Therefore, the set Σ<sup>*</sup> and the concatenation operation form a [[monoid]], the [[free monoid]] generated by Σ. In addition, the length function defines a [[monoid homomorphism]] from Σ<sup>*</sup> to the non-negative integers (that is, a function <math>L: \Sigma^{*} \mapsto \mathbb{N} \cup \{0\}</math>, such that <math>L(st)=L(s)+L(t)\quad \forall s,t\in\Sigma^*</math>). A string ''s'' is said to be a ''[[substring]]'' or ''factor'' of ''t'' if there exist (possibly empty) strings ''u'' and ''v'' such that ''t'' = ''usv''. The [[binary relation|relation]] "is a substring of" defines a [[partial order]] on Σ<sup>*</sup>, the [[least element]] of which is the empty string. === Prefixes and suffixes === A string ''s'' is said to be a [[Substring#Prefix|prefix]] of ''t'' if there exists a string ''u'' such that ''t'' = ''su''. If ''u'' is nonempty, ''s'' is said to be a ''proper'' prefix of ''t''. Symmetrically, a string ''s'' is said to be a [[Substring#Suffix|suffix]] of ''t'' if there exists a string ''u'' such that ''t'' = ''us''. If ''u'' is nonempty, ''s'' is said to be a ''proper'' suffix of ''t''. Suffixes and prefixes are substrings of ''t''. Both the relations "is a prefix of" and "is a suffix of" are [[prefix order]]s. === Reversal === The reverse of a string is a string with the same symbols but in reverse order. For example, if ''s'' = abc (where a, b, and c are symbols of the alphabet), then the reverse of ''s'' is cba. A string that is the reverse of itself (e.g., ''s'' = madam) is called a [[palindrome#Computation theory|palindrome]], which also includes the empty string and all strings of length 1. === Rotations === A string ''s'' = ''uv'' is said to be a rotation of ''t'' if ''t'' = ''vu''. For example, if Σ = {0, 1} the string 0011001 is a rotation of 0100110, where ''u'' = 00110 and ''v'' = 01. As another example, the string abc has three different rotations, viz. abc itself (with ''u''=abc, ''v''=ε), bca (with ''u''=bc, ''v''=a), and cab (with ''u''=c, ''v''=ab). === Lexicographical ordering === It is often useful to define an [[order theory|ordering]] on a set of strings. If the alphabet Σ has a [[total order]] (cf. [[alphabetical order]]) one can define a total order on Σ<sup>*</sup> called [[lexicographical order]]. The lexicographical order is [[total order|total]] if the alphabetical order is, but is not [[well-founded]] for any nontrivial alphabet, even if the alphabetical order is. For example, if Σ = {0, 1} and 0 < 1, then the lexicographical order on Σ<sup>*</sup> includes the relationships ε < 0 < 00 < 000 < ... < 0001 < ... < 001 < ... < 01 < 010 < ... < 011 < 0110 < ... < 01111 < ... < 1 < 10 < 100 < ... < 101 < ... < 111 < ... < 1111 < ... < 11111 ... With respect to this ordering, e.g. the infinite set { 1, 01, 001, 0001, 00001, 000001, ... } has no minimal element. See [[Shortlex]] for an alternative string ordering that preserves well-foundedness. For the example alphabet, the shortlex order is ε < 0 < 1 < 00 < 01 < 10 < 11 < 000 < 001 < 010 < 011 < 100 < 101 < 0110 < 111 < 0000 < 0001 < 0010 < 0011 < ... < 1111 < 00000 < 00001 ... === String operations === A number of additional operations on strings commonly occur in the formal theory. These are given in the article on [[string operations]]. === Topology === [[File:Hamming distance 3 bit binary.svg|thumb|150px|(Hyper)cube of binary strings of length 3]] Strings admit the following interpretation as nodes on a graph, where ''k'' is the number of symbols in Σ: * Fixed-length strings of length ''n'' can be viewed as the integer locations in an ''n''-dimensional [[hypercube]] with sides of length ''k''-1. * Variable-length strings (of finite length) can be viewed as nodes on a [[k-ary tree|perfect ''k''-ary tree]]. * [[Infinite string]]s (otherwise not considered here) can be viewed as infinite paths on a ''k''-node [[complete graph]]. The natural topology on the set of fixed-length strings or variable-length strings is the discrete topology, but the natural topology on the set of infinite strings is the [[limit topology]], viewing the set of infinite strings as the [[inverse limit]] of the sets of finite strings. This is the construction used for the [[p-adic|''p''-adic numbers]] and some constructions of the [[Cantor set]], and yields the same topology. [[Isomorphism]]s between string representations of topologies can be found by normalizing according to the [[lexicographically minimal string rotation]].
Summary:
Please note that all contributions to Niidae Wiki may be edited, altered, or removed by other contributors. If you do not want your writing to be edited mercilessly, then do not submit it here.
You are also promising us that you wrote this yourself, or copied it from a public domain or similar free resource (see
Encyclopedia:Copyrights
for details).
Do not submit copyrighted work without permission!
Cancel
Editing help
(opens in new window)
Search
Search
Editing
String (computer science)
(section)
Add topic