===Huffman coding===
{{Main|Huffman coding}}
Because arithmetic coding doesn't compress one datum at a time, it can get arbitrarily close to entropy when compressing [[Independent and identically distributed random variables|IID]] strings. By contrast, using the extension of [[Huffman coding]] (to strings) does not reach entropy unless all probabilities of alphabet symbols are powers of two, in which case both Huffman and arithmetic coding achieve entropy.

When naively Huffman coding binary strings, no compression is possible, even if entropy is low (e.g. {0, 1} has probabilities {0.95, 0.05}): Huffman encoding assigns 1 bit to each value, resulting in a code of the same length as the input. By contrast, arithmetic coding compresses bits well, approaching the optimal compression ratio of

:<math>1 - [-0.95 \log_2(0.95) - 0.05 \log_2(0.05)] \approx 71.4\%.</math>

One simple way to address Huffman coding's suboptimality is to concatenate symbols ("blocking") to form a new alphabet in which each new symbol represents a sequence of original symbols (in this case bits) from the original alphabet. In the above example, grouping sequences of three symbols before encoding would produce new "super-symbols" with the following frequencies:

* {{samp|000}}: 85.7%
* {{samp|001}}, {{samp|010}}, {{samp|100}}: 4.5% each
* {{samp|011}}, {{samp|101}}, {{samp|110}}: 0.24% each
* {{samp|111}}: 0.0125%

With this grouping, Huffman coding averages 1.3 bits for every three symbols, or 0.433 bits per symbol, compared with one bit per symbol in the original encoding, i.e., <math>56.7\%</math> compression. Allowing arbitrarily large sequences gets arbitrarily close to entropy, just like arithmetic coding, but requires huge codes to do so, so it is not as practical as arithmetic coding for this purpose.

An alternative is [[Run-length encoding|encoding run lengths]] via Huffman-based [[Golomb coding|Golomb-Rice codes]]. Such an approach allows simpler and faster encoding/decoding than arithmetic coding or even Huffman coding, since the latter requires table lookups. In the {0.95, 0.05} example, a Golomb-Rice code with a four-bit remainder achieves a compression ratio of <math>71.1\%</math>, far closer to the optimum than using three-bit blocks. Golomb-Rice codes only apply to [[Bernoulli process|Bernoulli]] inputs such as the one in this example, however, so they are not a substitute for blocking in all cases.
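
The figures quoted above can be checked directly. The following Python sketch (an illustration only, not part of any standard implementation; the helper <code>huffman_lengths</code> is a name introduced here) computes the entropy bound for the {0.95, 0.05} source, the frequencies of the three-symbol blocks, and the average length of a Huffman code built over those blocks.

<syntaxhighlight lang="python">
# Illustrative sketch: reproduces the ~71.4% entropy bound, the 3-symbol block
# frequencies, and the ~0.433 bits/symbol (~56.7% compression) quoted above
# for the source with probabilities {0: 0.95, 1: 0.05}.
import heapq
import itertools
import math

p = {'0': 0.95, '1': 0.05}

# Entropy of the source and the corresponding best possible compression ratio.
entropy = -sum(q * math.log2(q) for q in p.values())
print(f"entropy = {entropy:.4f} bits/symbol, ratio = {1 - entropy:.1%}")  # ~71.4%

# Probabilities of the three-symbol "super-symbols" obtained by blocking.
blocks = {''.join(b): math.prod(p[s] for s in b)
          for b in itertools.product('01', repeat=3)}

def huffman_lengths(probs):
    """Return the code length (in bits) Huffman coding assigns to each symbol."""
    heap = [(q, [sym]) for sym, q in probs.items()]
    lengths = {sym: 0 for sym in probs}
    heapq.heapify(heap)
    while len(heap) > 1:
        q1, syms1 = heapq.heappop(heap)
        q2, syms2 = heapq.heappop(heap)
        for sym in syms1 + syms2:   # each merge adds one bit to these symbols' codes
            lengths[sym] += 1
        heapq.heappush(heap, (q1 + q2, syms1 + syms2))
    return lengths

lengths = huffman_lengths(blocks)
bits_per_symbol = sum(blocks[b] * lengths[b] for b in blocks) / 3
print(f"Huffman on 3-symbol blocks: {bits_per_symbol:.3f} bits/symbol, "
      f"ratio = {1 - bits_per_symbol:.1%}")   # ~0.433 bits/symbol, ~56.7%
</syntaxhighlight>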
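
Similarly, the 71.1% figure for the Golomb-Rice approach can be checked by averaging the Rice code length over the geometric distribution of run lengths. The sketch below is again only illustrative; it assumes runs of 0s terminated by a single 1 and a Rice parameter of four remainder bits.

<syntaxhighlight lang="python">
# Illustrative sketch: expected performance of a Golomb-Rice code with a
# four-bit remainder on the {0.95, 0.05} Bernoulli source described above.
p1 = 0.05   # probability of the rare symbol "1"
k = 4       # Rice parameter: four-bit remainder, M = 2**k = 16

def rice_length(n, k):
    """Bits to encode run length n: unary quotient + terminator + k-bit remainder."""
    return (n >> k) + 1 + k

# Expected code length per run, summing over the geometric run-length
# distribution P(n) = (1 - p1)**n * p1 (n zeros followed by a one); the tail
# beyond n = 10000 is negligible.
expected_bits = sum(((1 - p1) ** n) * p1 * rice_length(n, k) for n in range(10000))
expected_run = 1 / p1    # expected number of input symbols consumed per run
bits_per_symbol = expected_bits / expected_run
print(f"{bits_per_symbol:.4f} bits/symbol, ratio = {1 - bits_per_symbol:.1%}")  # ~71.1%
</syntaxhighlight>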