===Layer III===
{{Main|MPEG-1 Audio Layer III}}

MPEG-1 Audio Layer III (the first version of [[MP3]]) is a [[lossy]] audio format designed to provide acceptable quality at about 64 kbit/s for monaural audio over single-channel ([[basic rate interface|BRI]]) [[ISDN]] links, and 128 kbit/s for stereo sound.

====History/ASPEC====
[[File:2016-07 ASPEC 91 Bonn.jpg|thumb|ASPEC 91 in the [[Deutsches Museum Bonn]], with encoder (below) and decoder]]
MPEG-1 Audio Layer III was derived from the ''Adaptive Spectral Perceptual Entropy Coding'' (ASPEC) codec developed by Fraunhofer as part of the [[EUREKA 147]] pan-European inter-governmental research and development initiative for the development of digital audio broadcasting. ASPEC was adapted to fit in with the Layer II model (frame size, filter bank, FFT, etc.), to become Layer III.<ref name=santa_clara90/>

ASPEC was itself based on ''Multiple adaptive Spectral audio Coding'' (MSC) by [[E. F. Schroeder]],<!--at ???--> ''Optimum Coding in the Frequency domain'' (OCF), the [[doctoral thesis]] of [[Karlheinz Brandenburg]] at the [[University of Erlangen-Nuremberg]], ''Perceptual Transform Coding'' (PXFM) by [[J. D. Johnston]] at [[AT&T Corporation|AT&T]] [[Bell Labs]], and ''Transform coding of audio signals'' by [[Y. Mahieux]] and [[J. Petit]] at [[Institut für Rundfunktechnik]] (IRT/CNET).<ref name=perceptual_coding>{{Citation |first1=Ted |last1=Painter |first2=Andreas |last2=Spanias |title=Perceptual Coding of Digital Audio (Proceedings of the IEEE, VOL. 88, NO. 4) |date=April 2000 |publisher=[[Proceedings of the IEEE]] |url=http://www.ee.columbia.edu/~marios/courses/e6820y02/project/papers/Perceptual%20coding%20of%20digital%20audio%20.pdf |access-date=2016-11-11 |url-status=dead |archive-url=https://web.archive.org/web/20060916012236/http://www.ee.columbia.edu/~marios/courses/e6820y02/project/papers/Perceptual%20coding%20of%20digital%20audio%20.pdf |archive-date=September 16, 2006}}</ref>

====Technical details====
MP3 is a frequency-domain audio [[Transform coding|transform encoder]]. Although it uses some of the lower-layer functions, MP3 is quite different from MP2.

MP3 works on 1152 samples like MP2, but needs to take multiple frames for analysis before frequency-domain (MDCT) processing and quantization can be effective. It outputs a variable number of samples, using a bit buffer to enable this variable bitrate (VBR) encoding while maintaining 1152-sample output frames. This causes a significantly longer delay before output, which has caused MP3 to be considered unsuitable for studio applications where editing or other processing needs to take place.<ref name=audio_tutorial/>

MP3 does not benefit from the 32 sub-band polyphase filter bank, instead just using an 18-point MDCT transformation on each output to split the data into 576 frequency components, and processing it in the frequency domain.<ref name=telos_audio/> This extra [[wikt:granularity|granularity]] allows MP3 to have a much finer psychoacoustic model, and to apply appropriate quantization to each band more carefully, providing much better low-bitrate performance.

Frequency-domain processing imposes some limitations as well, making temporal resolution 12 or 36 times worse than that of Layer II. As a result, quantization artifacts from transient sounds, such as percussive and other high-frequency events, spread over a larger window.
This results in audible smearing and [[pre-echo]].<ref name=audio_tutorial>{{Citation|first=Davis |last=Pan |title=A Tutorial on MPEG/Audio Compression |pages=8 |date=Summer 1995 |publisher=IEEE MultiMedia Journal |url=https://www.cs.columbia.edu/~coms6181/slides/6R/mpegaud.pdf |archive-url=https://web.archive.org/web/20040919073530/https://www.cs.columbia.edu/~coms6181/slides/6R/mpegaud.pdf |url-status=dead |archive-date=2004-09-19 |access-date=2008-04-09 }}</ref> MP3 uses pre-echo detection routines and VBR encoding, which allows it to temporarily increase the bitrate during difficult passages, in an attempt to reduce this effect. It can also switch from the normal 36-sample quantization window to three short 12-sample windows, reducing the temporal (time) length of quantization artifacts.<ref name=audio_tutorial/> However, in choosing a fairly small window size to keep its temporal response adequate to avoid the most serious artifacts, MP3 becomes much less efficient at frequency-domain compression of stationary, tonal components.

Being forced to use a ''hybrid'' time-domain (filter bank) / frequency-domain (MDCT) model to fit in with Layer II simply wastes processing time and compromises quality by introducing aliasing artifacts. MP3 has an aliasing cancellation stage specifically to mask this problem, but that stage itself produces frequency-domain energy which must be encoded in the audio. This energy is pushed to the top of the frequency range, where most people have limited hearing, in the hope that the distortion it causes will be less audible.

Layer II's 1024-point FFT does not cover all samples, and would omit several entire MP3 sub-bands, where quantization factors must be determined. MP3 instead uses two passes of FFT analysis for spectral estimation, to calculate the global and individual masking thresholds. This allows it to cover all 1152 samples.
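
The block sizes quoted so far can be cross-checked with a little arithmetic. The following is only an illustrative sketch: the constants are the figures given in this section, the variable and function names are made up for the example, and it assumes the general MDCT property that a window of N samples yields N/2 coefficients.

```python
# Cross-check of the Layer III block sizes quoted in the text.
# Constants are figures from this section; all names are illustrative only.

SUBBANDS = 32          # outputs of the polyphase filter bank
FRAME_SAMPLES = 1152   # samples per frame
LONG_WINDOW = 36       # samples in the normal (long) window
SHORT_WINDOW = 12      # samples in each of the three short windows
FFT_POINTS = 1024      # size of the Layer II FFT


def mdct_coefficients(window_length: int) -> int:
    """Assumed MDCT property: a window of N samples yields N // 2 coefficients."""
    return window_length // 2


# Coefficients per sub-band with one long window.
long_coeffs = mdct_coefficients(LONG_WINDOW)
# Coefficients per sub-band with three short windows.
short_coeffs = 3 * mdct_coefficients(SHORT_WINDOW)
# Frequency components across all sub-bands.
frequency_components = SUBBANDS * long_coeffs
# Samples of a frame that a single 1024-point FFT cannot cover.
samples_missed_by_one_fft = FRAME_SAMPLES - FFT_POINTS

print("coefficients per sub-band (long window):", long_coeffs)
print("coefficients per sub-band (3 short windows):", short_coeffs)
print("frequency components:", frequency_components)
print("samples missed by one FFT pass:", samples_missed_by_one_fft)
```

Both window modes give the same number of coefficients per sub-band, so switching between them trades frequency resolution for time resolution without changing the amount of data per block.
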
Of the two passes, it uses the global masking threshold level from the more critical one, that with the most difficult audio.

In addition to Layer II's intensity-encoded joint stereo, MP3 can use middle/side (mid/side, m/s, MS, matrixed) joint stereo. With mid/side stereo, certain frequency ranges of both channels are merged into a single (middle, mid, L+R) mono channel, while the sound difference between the left and right channels is stored as a separate (side, L-R) channel. Unlike intensity stereo, this process does not discard any audio information. When combined with quantization, however, it can exaggerate artifacts.

If the difference between the left and right channels is small, the side channel will be small, which can save as much as 50% of the bitrate, with an associated quality improvement. If the difference between left and right is large, standard (discrete, left/right) stereo encoding may be preferred, as mid/side joint stereo will not provide any benefits. An MP3 encoder can switch between m/s stereo and full stereo on a frame-by-frame basis.<ref name=telos_audio/><ref name=joint_stereo_spatial>{{Citation|first=Jurgen |last=Herre |title=From Joint Stereo to Spatial Audio Coding |pages=2 |date=October 5, 2004 |publisher=[[International Conference on Digital Audio Effects]] |url=http://dafx04.na.infn.it/WebProc/Proc/P_157.pdf |archive-url=https://web.archive.org/web/20060405112352/http://dafx04.na.infn.it/WebProc/Proc/P_157.pdf |url-status=dead |archive-date=April 5, 2006 |access-date=2008-04-17 }}</ref><ref name=lame_ms>{{Citation |first=Roberto |last=Amorim |title=GPSYCHO - Mid/Side Stereo |date=September 19, 2006 |publisher=[[LAME]] |url=http://lame.sourceforge.net/ms_stereo.php |access-date=2016-11-11 |url-status=live |archive-url=https://web.archive.org/web/20161216140230/http://lame.sourceforge.net/ms_stereo.php |archive-date=December 16, 2016 }}</ref>

Unlike Layers I and II, MP3 uses variable-length [[Huffman coding]] (applied after the perceptual coding stage) to further reduce the bitrate, without any further quality loss.<ref name=mpeg_audio_faq/><ref name=audio_tutorial/>

====Quality====
MP3's more fine-grained and selective quantization does prove notably superior to MP2 at lower bitrates. It is able to provide nearly equivalent audio quality to Layer II at an approximately 15% lower bitrate.<ref name=ebu_surround_test_2007/><ref name=stereo_aac_tests/><!--First ref proves the point, second scares off MP3 fans that feel like arguing--> 128 kbit/s is considered the "sweet spot" for MP3, meaning that it provides generally acceptable quality stereo sound on most music, and that there are [[diminishing returns|diminishing]] quality improvements from increasing the bitrate further.

MP3 is also regarded as exhibiting artifacts that are less annoying than those of Layer II, when both are used at bitrates too low to provide faithful reproduction.

Layer III audio files use the extension ".mp3". <!--aliasing compensation: still need more details-->
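
The mid/side joint stereo described under Technical details can be illustrated with a minimal sketch. It is not how an actual encoder is written: it works on plain integer time-domain samples rather than on frequency ranges, it omits any normalising scale factor, and the function names `ms_encode` and `ms_decode` and the sample values are hypothetical. It only shows that the L+R / L-R matrixing is reversible, and that the side channel is small when the two channels are similar.

```python
# Minimal sketch of mid/side matrixing on integer samples.
# Assumptions: time-domain integers, no scale factor, hypothetical names.

def ms_encode(left, right):
    """Return (mid, side) where mid = L + R and side = L - R."""
    mid = [l + r for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Invert ms_encode: L = (M + S) / 2 and R = (M - S) / 2.

    M + S = 2L and M - S = 2R are always even, so integer
    division is exact and no information is lost.
    """
    left = [(m + s) // 2 for m, s in zip(mid, side)]
    right = [(m - s) // 2 for m, s in zip(mid, side)]
    return left, right


# Two nearly identical channels (made-up sample values).
left = [100, 102, -50, 7]
right = [98, 101, -49, 7]

mid, side = ms_encode(left, right)
decoded_left, decoded_right = ms_decode(mid, side)

print("mid: ", mid)
print("side:", side)
print("round trip exact:", (decoded_left, decoded_right) == (left, right))
```

Because the side values stay close to zero for similar channels, they need few bits after quantization, which is where the bitrate saving comes from. Any loss arises only when the two channels are subsequently quantized.
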