Jump to content
Main menu
Main menu
move to sidebar
hide
Navigation
Main page
Recent changes
Random page
Help about MediaWiki
Special pages
Niidae Wiki
Search
Search
Appearance
Create account
Log in
Personal tools
Create account
Log in
Pages for logged out editors
learn more
Contributions
Talk
Editing
Lempel–Ziv–Welch
(section)
Page
Discussion
English
Read
Edit
View history
Tools
Tools
move to sidebar
hide
Actions
Read
Edit
View history
General
What links here
Related changes
Page information
Appearance
move to sidebar
hide
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
===Decoding=== To decode an LZW-compressed archive, one needs to know in advance the initial dictionary used, but additional entries can be reconstructed as they are always simply [[concatenation]]s of previous entries. {| class="wikitable" style="text-align: right; margin-left: auto; margin-right: auto;" |- ! colspan="2" | Input ! scope="col" width="6em" rowspan="2" | Output Sequence ! colspan="4" | New Dictionary Entry ! rowspan="2" | Comments |- ! Bits !! Code ! scope="col" width="6em" colspan="2" | Full ! scope="col" width="6em" colspan="2" | Conjecture |- | 10100 || 20 | style="text-align: center;" | T | style="border-right: none;" | | style="border-left: none;" | | style="border-right: none;" | 27: | style="border-left: none;" | T? || |- | 01111 || 15 | style="text-align: center;" | O | style="border-right: none;" | 27: | style="border-left: none;" | TO | style="border-right: none;" | 28: | style="border-left: none;" | O? || |- | 00010 || 2 | style="text-align: center;" | B | style="border-right: none;" | 28: | style="border-left: none;" | OB | style="border-right: none;" | 29: | style="border-left: none;" | B? || |- | 00101 || 5 | style="text-align: center;" | E | style="border-right: none;" | 29: | style="border-left: none;" | BE | style="border-right: none;" | 30: | style="border-left: none;" | E? || |- | 01111 || 15 | style="text-align: center;" | O | style="border-right: none;" | 30: | style="border-left: none;" | EO | style="border-right: none;" | 31: | style="border-left: none;" | O? || |- | 10010 || 18 | style="text-align: center;" | R | style="border-right: none;" | 31: | style="border-left: none;" | OR | style="border-right: none;" | 32: | style="border-left: none;" | R? | style="text-align: left;" | created code 31 (last to fit in 5 bits) |- | 001110 || 14 | style="text-align: center;" | N | style="border-right: none;" | 32: | style="border-left: none;" | RN | style="border-right: none;" | 33: | style="border-left: none;" | N? | style="text-align: left;" | so start reading input at 6 bits |- | 001111 || 15 | style="text-align: center;" | O | style="border-right: none;" | 33: | style="border-left: none;" | NO | style="border-right: none;" | 34: | style="border-left: none;" | O? || |- | 010100 || 20 | style="text-align: center;" | T | style="border-right: none;" | 34: | style="border-left: none;" | OT | style="border-right: none;" | 35: | style="border-left: none;" | T? || |- | 011011 || 27 | style="text-align: center;" | TO | style="border-right: none;" | 35: | style="border-left: none;" | TT | style="border-right: none;" | 36: | style="border-left: none;" | TO? || |- | 011101 || 29 | style="text-align: center;" | BE | style="border-right: none;" | 36: | style="border-left: none;" | TOB | style="border-right: none;" | 37: | style="border-left: none;" | BE? | style="text-align: left;" | 36 = TO + 1st symbol (B) of |- | 011111 || 31 | style="text-align: center;" | OR | style="border-right: none;" | 37: | style="border-left: none;" | BEO | style="border-right: none;" | 38: | style="border-left: none;" | OR? | style="text-align: left;" | next coded sequence received (BE) |- | 100100 || 36 | style="text-align: center;" | TOB | style="border-right: none;" | 38: | style="border-left: none;" | ORT | style="border-right: none;" | 39: | style="border-left: none;" | TOB? || |- | 011110 || 30 | style="text-align: center;" | EO | style="border-right: none;" | 39: | style="border-left: none;" | TOBE | style="border-right: none;" | 40: | style="border-left: none;" | EO? || |- | 100000 || 32 | style="text-align: center;" | RN | style="border-right: none;" | 40: | style="border-left: none;" | EOR | style="border-right: none;" | 41: | style="border-left: none;" | RN? || |- | 100010 || 34 | style="text-align: center;" | OT | style="border-right: none;" | 41: | style="border-left: none;" | RNO | style="border-right: none;" | 42: | style="border-left: none;" | OT? || |- | 000000 || 0 | style="text-align: center;" | # | style="border-right: none;" | | style="border-left: none;" | | style="border-right: none;" | | style="border-left: none;" | || |- |} At each stage, the decoder receives a code X; it looks X up in the table and outputs the sequence χ it codes, and it conjectures χ + ? as the entry the encoder just added – because the encoder emitted X for χ precisely because χ + ? was not in the table, and the encoder goes ahead and adds it. But what is the missing letter? It is the first letter in the sequence coded by the ''next'' code Z that the decoder receives. So the decoder looks up Z, decodes it into the sequence ω and takes the first letter z and tacks it onto the end of χ as the next dictionary entry. This works as long as the codes received are in the decoder's dictionary, so that they can be decoded into sequences. What happens if the decoder receives a code Z that is not yet in its dictionary? Since the decoder is always just one code behind the encoder, Z can be in the encoder's dictionary only if the encoder ''just'' generated it, when emitting the previous code X for χ. Thus Z codes some ω that is χ + ?, and the decoder can determine the unknown character as follows: # The decoder sees X and then Z, where X codes the sequence χ and Z codes some unknown sequence ω. # The decoder knows that the encoder just added Z as a code for χ + some unknown character ''c'', so ω = χ + ''c''. # Since ''c'' is the first character in the input stream after χ, and since ω is the string appearing immediately after χ, ''c'' must be the first character of the sequence ω. # Since χ is an initial substring of ω, ''c'' must also be the first character of χ. # So even though the Z code is not in the table, the decoder is able to infer the unknown sequence and adds χ + (the first character of χ) to the table as the value of Z. This situation occurs whenever the encoder encounters input of the form ''cScSc'', where ''c'' is a single character, ''S'' is a string and ''cS'' is already in the dictionary, but ''cSc'' is not. The encoder emits the code for ''cS'', putting a new code for ''cSc'' into the dictionary. Next it sees ''cSc'' in the input (starting at the second ''c'' of ''cScSc'') and emits the new code it just inserted. The argument above shows that whenever the decoder receives a code not in its dictionary, the situation must look like this. Although input of form ''cScSc'' might seem unlikely, this pattern is fairly common when the input stream is characterized by significant repetition. In particular, long strings of a single character (which are common in the kinds of images LZW is often used to encode) repeatedly generate patterns of this sort.
Summary:
Please note that all contributions to Niidae Wiki may be edited, altered, or removed by other contributors. If you do not want your writing to be edited mercilessly, then do not submit it here.
You are also promising us that you wrote this yourself, or copied it from a public domain or similar free resource (see
Encyclopedia:Copyrights
for details).
Do not submit copyrighted work without permission!
Cancel
Editing help
(opens in new window)
Search
Search
Editing
Lempel–Ziv–Welch
(section)
Add topic