====Discrete cosine transform====
[[File:JPEG example subimage.svg|thumb|256px|The 8×8 sub-image shown in 8-bit grayscale]]
Next, each 8×8 block of each component (Y, Cb, Cr) is converted to a [[frequency-domain]] representation, using a normalized, two-dimensional type-II discrete cosine transform (DCT) (see Citation 1 in discrete cosine transform). The DCT is sometimes referred to as "type-II DCT" in the context of a family of transforms as in [[Discrete cosine transform#DCT-II|discrete cosine transform]], and the corresponding inverse (IDCT) is denoted as "type-III DCT".

As an example, one such 8×8 8-bit subimage might be:

:<math> \left[ \begin{array}{rrrrrrrr} 52 & 55 & 61 & 66 & 70 & 61 & 64 & 73 \\ 63 & 59 & 55 & 90 & 109 & 85 & 69 & 72 \\ 62 & 59 & 68 & 113 & 144 & 104 & 66 & 73 \\ 63 & 58 & 71 & 122 & 154 & 106 & 70 & 69 \\ 67 & 61 & 68 & 104 & 126 & 88 & 68 & 70 \\ 79 & 65 & 60 & 70 & 77 & 68 & 58 & 75 \\ 85 & 71 & 64 & 59 & 55 & 61 & 65 & 83 \\ 87 & 79 & 69 & 68 & 65 & 76 & 78 & 94 \end{array} \right]. </math>

Before computing the DCT of the 8×8 block, its values are shifted from a positive range to one centered on zero. For an 8-bit image, each entry in the original block falls in the range <math>[0, 255]</math>. The midpoint of the range (in this case, the value 128) is subtracted from each entry, so that the modified range is <math>[-128, 127]</math>. This step reduces the dynamic range requirements in the DCT processing stage that follows.
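
The level shift described above can be sketched in Python as follows. This is only an illustration: the function name <code>level_shift</code> and its <code>bit_depth</code> parameter are invented for this example and do not come from any JPEG library.

```python
# Illustrative sketch of the level shift applied before the DCT.
# BLOCK is the example 8x8 sub-image from the text (8-bit samples).
BLOCK = [
    [52, 55, 61, 66, 70, 61, 64, 73],
    [63, 59, 55, 90, 109, 85, 69, 72],
    [62, 59, 68, 113, 144, 104, 66, 73],
    [63, 58, 71, 122, 154, 106, 70, 69],
    [67, 61, 68, 104, 126, 88, 68, 70],
    [79, 65, 60, 70, 77, 68, 58, 75],
    [85, 71, 64, 59, 55, 61, 65, 83],
    [87, 79, 69, 68, 65, 76, 78, 94],
]


def level_shift(block, bit_depth=8):
    """Subtract the range midpoint, 2**(bit_depth - 1), from every sample.

    Hypothetical helper: for 8-bit data the midpoint is 128, so samples
    in [0, 255] are mapped to [-128, 127].
    """
    midpoint = 1 << (bit_depth - 1)
    return [[sample - midpoint for sample in row] for row in block]


shifted = level_shift(BLOCK)
print(shifted[0])  # first row of the shifted block
```
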
This step results in the following values:

:<math>g= \begin{array}{c} x \\ \longrightarrow \\ \left[ \begin{array}{rrrrrrrr} -76 & -73 & -67 & -62 & -58 & -67 & -64 & -55 \\ -65 & -69 & -73 & -38 & -19 & -43 & -59 & -56 \\ -66 & -69 & -60 & -15 & 16 & -24 & -62 & -55 \\ -65 & -70 & -57 & -6 & 26 & -22 & -58 & -59 \\ -61 & -67 & -60 & -24 & -2 & -40 & -60 & -58 \\ -49 & -63 & -68 & -58 & -51 & -60 & -70 & -53 \\ -43 & -57 & -64 & -69 & -73 & -67 & -63 & -45 \\ -41 & -49 & -59 & -60 & -63 & -52 & -50 & -34 \end{array} \right] \end{array} \Bigg\downarrow y. </math>

[[File:Dctjpeg.png|thumb|The DCT transforms an 8×8 block of input values to a [[linear combination]] of these 64 patterns. The patterns are referred to as the two-dimensional DCT ''basis functions'', and the output values are referred to as ''transform coefficients''. The horizontal index is <math>u</math> and the vertical index is <math>v</math>.]]
The next step is to take the two-dimensional DCT, which is given by:

:<math>\ G_{u,v} = \frac{1}{4} \alpha(u) \alpha(v) \sum_{x=0}^7 \sum_{y=0}^7 g_{x,y} \cos \left[\frac{(2x+1)u\pi}{16} \right] \cos \left[\frac{(2y+1)v\pi}{16} \right] </math>

where
* <math>\ u</math> is the horizontal [[spatial frequency]], for the integers <math>\ 0 \leq u < 8</math>.
* <math>\ v</math> is the vertical spatial frequency, for the integers <math>\ 0 \leq v < 8</math>.
* <math>\alpha(u)</math> and <math>\alpha(v)</math> are normalizing scale factors that make the transformation [[orthonormal]], with <math> \alpha(i) = \begin{cases} \frac{1}{\sqrt{2}}, & \mbox{if }i=0 \\ 1, & \mbox{otherwise} \end{cases} </math>
* <math>\ g_{x,y}</math> is the pixel value at coordinates <math>\ (x,y)</math>
* <math>\ G_{u,v}</math> is the DCT coefficient at coordinates <math>\ (u,v).</math>

If we perform this transformation on our matrix above, we get the following (rounded to two decimal places):

:<math>G= \begin{array}{c} u \\ \longrightarrow \\ \left[ \begin{array}{rrrrrrrr} -415.38 & -30.19 & -61.20 & 27.24 & 56.12 & -20.10 & -2.39 & 0.46 \\ 4.47 & -21.86 & -60.76 & 10.25 & 13.15 & -7.09 & -8.54 & 4.88 \\ -46.83 & 7.37 & 77.13 & -24.56 & -28.91 & 9.93 & 5.42 & -5.65 \\ -48.53 & 12.07 & 34.10 & -14.76 & -10.24 & 6.30 & 1.83 & 1.95 \\ 12.12 & -6.55 & -13.20 & -3.95 & -1.87 & 1.75 & -2.79 & 3.14 \\ -7.73 & 2.91 & 2.38 & -5.94 & -2.38 & 0.94 & 4.30 & 1.85 \\ -1.03 & 0.18 & 0.42 & -2.42 & -0.88 & -3.02 & 4.12 & -0.66 \\ -0.17 & 0.14 & -1.07 & -4.19 & -1.17 & -0.10 & 0.50 & 1.68 \end{array} \right] \end{array} \Bigg\downarrow v. </math>

Note the top-left corner entry, whose magnitude is much larger than that of any other entry. This is the [[DC bias|DC]] coefficient (also called the constant component), which is proportional to the average value of the block and so sets its overall level. The remaining 63 coefficients are the AC coefficients (also called the alternating components).<ref>{{cite web|url=http://forum.doom9.org/showthread.php?p=184647#post184647|title=DC / AC Frequency Questions - Doom9's Forum|website=forum.doom9.org|access-date=16 October 2017|archive-date=17 October 2017|archive-url=https://web.archive.org/web/20171017042422/http://forum.doom9.org/showthread.php?p=184647#post184647|url-status=live}}</ref> The advantage of the DCT is its tendency to aggregate most of the signal in one corner of the result, as may be seen above.
The quantization step to follow accentuates this effect while simultaneously reducing the overall size of the DCT coefficients, resulting in a signal that is easy to compress efficiently in the entropy stage.

The DCT temporarily increases the bit-depth of the data, since the DCT coefficients of an 8-bit/component image can take 11 or more bits to store (depending on the fidelity of the DCT calculation). This may force the codec to temporarily use 16-bit numbers to hold these coefficients, doubling the size of the image representation at this point; these values are typically reduced back to 8-bit values by the quantization step. The temporary increase in size at this stage is not a performance concern for most JPEG implementations, since typically only a very small part of the image is stored in full DCT form at any given time during the image encoding or decoding process.
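
As a rough illustration of the DCT formula and of the bit-depth remark above, the following Python sketch evaluates the two-dimensional DCT directly from its definition. The helper names <code>alpha</code> and <code>dct_8x8</code> are invented for this example, and the quadruple loop is chosen for clarity rather than speed; practical codecs typically use much faster factorizations. The arrays are indexed as <code>g[y][x]</code> and <code>G[v][u]</code>, matching the row and column layout of the matrices shown above.

```python
import math

# Example 8x8 sub-image from the text (8-bit samples, before level shift).
BLOCK = [
    [52, 55, 61, 66, 70, 61, 64, 73],
    [63, 59, 55, 90, 109, 85, 69, 72],
    [62, 59, 68, 113, 144, 104, 66, 73],
    [63, 58, 71, 122, 154, 106, 70, 69],
    [67, 61, 68, 104, 126, 88, 68, 70],
    [79, 65, 60, 70, 77, 68, 58, 75],
    [85, 71, 64, 59, 55, 61, 65, 83],
    [87, 79, 69, 68, 65, 76, 78, 94],
]


def alpha(i):
    """Normalizing scale factor: 1/sqrt(2) for index 0, otherwise 1."""
    return 1 / math.sqrt(2) if i == 0 else 1.0


def dct_8x8(g):
    """Direct (slow) evaluation of the 2-D type-II DCT of an 8x8 block.

    Hypothetical helper: takes g[y][x] and returns G[v][u].
    """
    G = [[0.0] * 8 for _ in range(8)]
    for v in range(8):
        for u in range(8):
            total = 0.0
            for y in range(8):
                for x in range(8):
                    total += (g[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            G[v][u] = 0.25 * alpha(u) * alpha(v) * total
    return G


# Level-shift the samples by 128, then transform.
shifted = [[sample - 128 for sample in row] for row in BLOCK]
G = dct_8x8(shifted)
print(f"DC coefficient: {G[0][0]:.3f}")  # sum of shifted samples / 8

# Worst case for the DC term: every shifted sample equals -128.
flat = [[-128] * 8 for _ in range(8)]
worst_dc = dct_8x8(flat)[0][0]
print(f"DC of an all -128 block: {worst_dc:.1f}")
```

The DC term of the example block is the sum of the shifted samples divided by 8 (−3323 / 8 = −415.375), which agrees with the top-left entry of the matrix above. For a block in which every shifted sample is −128, the DC term is −1024, a value that needs 11 bits in two's complement; this is consistent with the bit-depth increase described in the text.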