Jump to content
Main menu
Main menu
move to sidebar
hide
Navigation
Main page
Recent changes
Random page
Help about MediaWiki
Special pages
Niidae Wiki
Search
Search
Appearance
Create account
Log in
Personal tools
Create account
Log in
Pages for logged out editors
learn more
Contributions
Talk
Editing
Unicode
(section)
Page
Discussion
English
Read
Edit
View history
Tools
Tools
move to sidebar
hide
Actions
Read
Edit
View history
General
What links here
Related changes
Page information
Appearance
move to sidebar
hide
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
=== Codespace and code points === ''The Unicode Standard'' defines a ''codespace'':<ref name="Glossary">{{Cite web |title=Glossary of Unicode Terms |url=https://unicode.org/glossary/ |access-date=16 March 2010}}</ref> a sequence of integers called ''[[code point]]s''<ref name=":0">{{Cite book |url=https://www.unicode.org/versions/Unicode16.0.0/core-spec/chapter-2/#G25564 |title=The Unicode Standard Version 16.0 β Core Specification |year=2024 |chapter=2.4 Code Points and Characters}}</ref> in the range from 0 to {{val|1114111}}, notated according to the standard as {{tt|U+0000}}β{{tt|U+10FFFF}}.<ref>{{Cite book |url=https://www.unicode.org/versions/Unicode16.0.0/core-spec/chapter-3/#G2212 |title=The Unicode Standard, Version 16.0 |year=2024 |chapter=3.4 Characters and Encoding}}</ref> The codespace is a systematic, architecture-independent representation of ''The Unicode Standard''; actual text is processed as binary data via one of several Unicode encodings, such as [[UTF-8]]. In this normative notation, the two-character prefix <code>U+</code> always precedes a written code point,<ref>{{Cite mailing list |url=https://unicode.org/mail-arch/unicode-ml/y2005-m11/0060.html |title=Re: Origin of the U+nnnn notation |date=8 November 2005 |mailing-list=Unicode Mail List Archive}}</ref> and the code points themselves are written as [[hexadecimal]] numbers. At least four hexadecimal digits are always written, with [[leading zero]]s prepended as needed. For example, the code point {{unichar|F7|Division sign}} is padded with two leading zeros, but {{unichar|13254|Egyptian hieroglyph O004}} ([[File:Hiero O4.png|class=skin-invert-image|text-bottom|15px]]) is not padded.<ref>{{Cite web |date=September 2024 |title=Appendix A: Notational Conventions |url=https://www.unicode.org/versions/Unicode16.0.0/core-spec/appendix-a/ |website=The Unicode Standard |publisher=Unicode Consortium}}</ref> There are a total of {{val|1112064}} valid code points within the codespace.<ref>{{cite book |title=The Unicode Standard |publisher=[[The Unicode Consortium]] |isbn=978-1-936213-01-6 |edition=6.0 |location=Mountain View, California, US |at=3.9 Unicode Encoding Forms |chapter=Conformance |quote=Each encoding form maps the Unicode code points U+0000..U+D7FF and U+E000..U+10FFFF |chapter-url=https://www.unicode.org/versions/Unicode16.0.0/core-spec/chapter-3/#G7404}}</ref> This number arises from the limitations of the [[UTF-16]] character encoding, which can encode the 2<sup>16</sup> code points in the range {{tt|U+0000}} through {{tt|U+FFFF}} except for the 2<sup>11</sup> code points in the range {{tt|U+D800}} through {{tt|U+DFFF}}, which are used as surrogate pairs to encode the 2<sup>20</sup> code points in the range {{tt|U+10000}} through {{tt|U+10FFFF}}.
Summary:
Please note that all contributions to Niidae Wiki may be edited, altered, or removed by other contributors. If you do not want your writing to be edited mercilessly, then do not submit it here.
You are also promising us that you wrote this yourself, or copied it from a public domain or similar free resource (see
Encyclopedia:Copyrights
for details).
Do not submit copyrighted work without permission!
Cancel
Editing help
(opens in new window)
Search
Search
Editing
Unicode
(section)
Add topic