{{Short description|Speech analysis and encoding technique}}
{{Use American English|date=June 2021}}
'''Linear predictive coding''' ('''LPC''') is a method used mostly in [[audio signal processing]] and [[speech processing]] for representing the [[spectral envelope]] of a [[Digital data|digital]] [[signal (information theory)|signal]] of [[Speech communication|speech]] in [[data compression|compressed]] form, using the information of a [[linear prediction|linear]] [[predictive modelling|predictive model]].<ref>{{cite book |last= Deng |first= Li |author2=Douglas O'Shaughnessy |title= Speech processing: a dynamic and optimization-oriented approach |publisher= [[Marcel Dekker]] |year= 2003 |pages= 41–48 |isbn= 978-0-8247-4040-5 |url=https://books.google.com/books?id=136wRmFT_t8C&pg=PA41}}</ref><ref>{{cite book | title=Fundamentals of Speaker Recognition | publisher=Springer-Verlag | author=Beigi, Homayoon | year=2011 | location=Berlin | isbn=978-0-387-77591-3}}</ref> LPC is the most widely used method in [[speech coding]] and [[speech synthesis]]. It is a powerful speech analysis technique and a useful method for encoding good-quality speech at a low [[bit rate]].

==Overview==
LPC starts with the assumption that a speech signal is produced by a buzzer at the end of a tube (for [[Voice (phonetics)|voiced]] sounds), with occasional added hissing and popping sounds (for [[Voicelessness|voiceless]] sounds such as [[sibilant]]s and [[plosive]]s). Although apparently crude, this [[source–filter model]] is actually a close approximation of the reality of speech production. The [[glottis]] (the space between the vocal folds) produces the buzz, which is characterized by its intensity ([[loudness]]) and [[frequency]] (pitch). The [[vocal tract]] (the throat and mouth) forms the tube, which is characterized by its resonances; these resonances give rise to [[formant]]s, or enhanced frequency bands in the sound produced. Hisses and pops are generated by the action of the tongue, lips and throat during sibilants and plosives.

LPC analyzes the speech signal by estimating the formants, removing their effects from the speech signal, and estimating the intensity and frequency of the remaining buzz. The process of removing the formants is called inverse filtering, and the remaining signal after the subtraction of the filtered modeled signal is called the residue. The numbers that describe the intensity and frequency of the buzz, the formants, and the residue signal can be stored or transmitted elsewhere. LPC synthesizes the speech signal by reversing the process: use the buzz parameters and the residue to create a source signal, use the formants to create a filter (which represents the tube), and run the source through the filter, resulting in speech.

Because speech signals vary with time, this process is done on short chunks of the speech signal, which are called frames; generally, 30 to 50 frames per second give intelligible speech with good compression.
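The following is a minimal sketch of this per-frame analysis and resynthesis, using the autocorrelation method with the Levinson–Durbin recursion; the frame length, predictor order, window, function names and use of NumPy/SciPy are illustrative choices rather than anything prescribed by LPC itself:

<syntaxhighlight lang="python">
import numpy as np
from scipy.signal import lfilter

def lpc_coefficients(frame, order=10):
    """Estimate prediction-filter coefficients A(z) = 1 + a1*z^-1 + ... + ap*z^-p
    for one frame, using the autocorrelation method and the Levinson-Durbin recursion."""
    x = frame * np.hamming(len(frame))
    r = np.correlate(x, x, mode="full")[len(x) - 1 : len(x) + order]  # r[0..order]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]                                   # prediction-error power
    for i in range(1, order + 1):
        k = -np.dot(a[:i], r[i:0:-1]) / err      # i-th reflection coefficient
        a[1:i + 1] += k * a[i - 1::-1]           # update the predictor coefficients
        err *= 1.0 - k * k
    return a, err

# One 30 ms frame at 8 kHz (white noise stands in for real speech samples here).
frame = np.random.randn(240)
a, gain = lpc_coefficients(frame, order=10)

residual = lfilter(a, [1.0], frame)              # inverse filtering: remove the formants
reconstructed = lfilter([1.0], a, residual)      # synthesis: excitation through 1/A(z)
</syntaxhighlight>

In a coder, the per-frame coefficients together with a compact description of the excitation, rather than the raw residual itself, are what is quantized and transmitted.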
==Early history==
Linear prediction (signal estimation) goes back to at least the 1940s, when [[Norbert Wiener]] developed a mathematical theory for calculating the best [[Wiener filter|filters]] and predictors for detecting signals hidden in noise.<ref>{{cite journal | author=B.S. Atal | title=The history of linear prediction | year=2006 | pages=154–161 | volume=23 | issue=2 | journal=IEEE Signal Processing Magazine| doi=10.1109/MSP.2006.1598091 | bibcode=2006ISPM...23..154A | s2cid=15601493 | url=https://www.researchgate.net/publication/3321695}}</ref><ref name="Sasahira">{{cite journal |author1=Y. Sasahira |author2=S. Hashimoto | title=Voice pitch changing by Linear Predictive Coding Method to keep the Singer's Personal Timbre | year=1995 | format=pdf| publisher = Michigan Publishing|url=https://quod.lib.umich.edu/cgi/p/pod/dod-idx/voice-pitch-changing.pdf?c=icmc;idno=bbp2372.1995.118;format=pdf}}</ref> Soon after [[Claude Shannon]] established a [[A Mathematical Theory of Communication|general theory of coding]], work on predictive coding was done by [[C. Chapin Cutler]],<ref>{{cite patent | inventor=C. C. Cutler | title=Differential quantization of communication signals | pubdate=1952-07-29 | country=US|number=2605361}}</ref> [[Bernard M. Oliver]]<ref>{{cite journal | author=B. M. Oliver | title=Efficient coding | journal = The Bell System Technical Journal |year=1952 | volume=31 | issue=4 | pages=724–750 | publisher=Nokia Bell Labs| doi=10.1002/j.1538-7305.1952.tb01403.x }}</ref> and Henry C. Harrison.<ref>{{cite journal | author=H. C. Harrison | title=Experiments with linear prediction in television | year=1952 | volume=31 | pages=764–783 | journal=Bell System Technical Journal| issue=4 | doi=10.1002/j.1538-7305.1952.tb01405.x }}</ref> In 1955, [[Peter Elias]] published two papers on predictive coding of signals.<ref>{{cite journal | author=P. Elias | title=Predictive coding I | year=1955 | pages=16–24 | volume=IT-1 no. 1 | journal=IRE Trans. Inform. Theory| doi=10.1109/TIT.1955.1055126 }}</ref><ref>{{cite journal | author=P. Elias | title=Predictive coding II | year=1955 | pages=24–33 | volume=IT-1 no. 1 | journal=IRE Trans. Inform. Theory| doi=10.1109/TIT.1955.1055116 }}</ref>

Linear predictors were applied to speech analysis independently by [[Fumitada Itakura]] of [[Nagoya University]] and Shuzo Saito of [[Nippon Telegraph and Telephone]] in 1966, and in 1967 by [[Bishnu S. Atal]], [[Manfred R. Schroeder]] and John Burg. Itakura and Saito described a statistical approach based on [[maximum likelihood estimation]]; Atal and Schroeder described an [[adaptive filter|adaptive linear predictor]] approach; Burg outlined an approach based on the [[maximum entropy spectral estimation|principle of maximum entropy]].<ref name="Sasahira" /><ref>{{cite journal |author1=S. Saito |author2=F. Itakura | title=Theoretical consideration of the statistical optimum recognition of the spectral density of speech | date=Jan 1967 | journal=J. Acoust. Soc. Jpn.}}</ref><ref>{{cite journal |author1=B.S. Atal |author2=M.R. Schroeder | title=Predictive coding of speech | year=1967 | journal=Conf. Communications and Proc}}</ref><ref>{{cite journal | author=J.P. Burg | title=Maximum Entropy Spectral Analysis | year=1967 | journal=Proceedings of 37th Meeting, Society of Exploration Geophysics, Oklahoma City}}</ref> In 1969, Itakura and Saito introduced a method based on [[partial correlation]] (PARCOR), [[Glen Culler]] proposed real-time speech encoding, and [[Bishnu S. Atal]] presented an LPC speech coder at the Annual Meeting of the [[Acoustical Society of America]]. In 1971, real-time LPC using [[16-bit computing|16-bit]] LPC hardware was demonstrated by [[Philco-Ford]]; four units were sold.<ref name="Gray">{{cite journal |last1=Gray |first1=Robert M. |author1-link=Robert M. Gray |title=A History of Realtime Digital Speech on Packet Networks: Part II of Linear Predictive Coding and the Internet Protocol |journal=Found. Trends Signal Process. |date=2010 |volume=3 |issue=4 |pages=203–303 |doi=10.1561/2000000036 |url=https://ee.stanford.edu/~gray/lpcip.pdf |archive-url=https://ghostarchive.org/archive/20221009/https://ee.stanford.edu/~gray/lpcip.pdf |archive-date=2022-10-09 |url-status=live |issn=1932-8346|doi-access=free }}</ref>

LPC technology was advanced by Bishnu Atal and [[Manfred Schroeder]] during the 1970s{{ndash}}1980s.<ref name="Gray"/> In 1978, Atal and Vishwanath ''et al.'' of BBN developed the first [[variable bitrate|variable-rate]] LPC algorithm.<ref name="Gray"/> The same year, Atal and [[Manfred R. Schroeder]] at Bell Labs proposed an LPC speech [[codec]] called [[adaptive predictive coding]], which used a [[psychoacoustic]] coding algorithm exploiting the masking properties of the human ear.<ref name="Schroeder2014">{{cite book|last1=Schroeder|first1=Manfred R.|title=Acoustics, Information, and Communication: Memorial Volume in Honor of Manfred R. Schroeder|date=2014|publisher=Springer|isbn=9783319056609|chapter=Bell Laboratories|page=388|chapter-url=https://books.google.com/books?id=d9IkBAAAQBAJ&pg=PA388}}</ref><ref>{{cite book|last1=Atal|first1=B.|last2=Schroeder|first2=M.|title=ICASSP '78. IEEE International Conference on Acoustics, Speech, and Signal Processing |chapter=Predictive coding of speech signals and subjective error criteria |date=1978|volume=3|pages=573–576|doi=10.1109/ICASSP.1978.1170564}}</ref> This later became the basis for the [[perceptual coding]] technique used by the [[MP3]] [[audio compression (data)|audio compression]] format, introduced in 1993.<ref name="Schroeder2014"/> [[Code-excited linear prediction]] (CELP) was developed by Schroeder and Atal in 1985.<ref>{{cite book|last1=Schroeder|first1=Manfred R.|author1-link=Manfred R. Schroeder|last2=Atal|first2=Bishnu S.|title=ICASSP '85. IEEE International Conference on Acoustics, Speech, and Signal Processing |chapter=Code-excited linear prediction (CELP): High-quality speech at very low bit rates |author2-link=Bishnu S. Atal|date=1985|volume=10|pages=937–940|doi=10.1109/ICASSP.1985.1168147|s2cid=14803427}}</ref>

LPC is the basis for [[voice-over-IP]] (VoIP) technology.<ref name="Gray"/> In 1972, [[Bob Kahn]] of [[Defense Advanced Research Projects Agency|ARPA]], with Jim Forgie of [[Lincoln Laboratory]] (LL) and Dave Walden of [[BBN Technologies]], started the first developments in packetized speech, which would eventually lead to voice-over-IP technology. In 1973, according to Lincoln Laboratory's informal history, the first real-time 2400 [[bit]]/[[Second|s]] LPC was implemented by Ed Hofstetter. In 1974, the first real-time two-way LPC packet speech communication was accomplished over the [[ARPANET]] at 3500 bit/s between Culler-Harrison and Lincoln Laboratory.

==LPC coefficient representations==
LPC is frequently used for transmitting spectral envelope information, and as such it has to be tolerant of transmission errors. Transmitting the filter coefficients directly (see [[linear prediction]] for a definition of the coefficients) is undesirable, since they are very sensitive to errors: a very small error can distort the whole spectrum or, worse, make the prediction filter unstable. There are more robust representations such as [[log area ratio]]s (LAR), [[line spectral pairs]] (LSP) decomposition and [[reflection coefficient]]s. Of these, LSP decomposition in particular has gained popularity because it ensures the stability of the predictor and because spectral errors remain local for small coefficient deviations.
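As a minimal sketch of one such conversion (assuming NumPy; the function names are illustrative), the step-down (backward Levinson) recursion below recovers the reflection coefficients from the direct-form predictor coefficients; the synthesis filter is stable exactly when every reflection coefficient has magnitude less than one:

<syntaxhighlight lang="python">
import numpy as np

def lpc_to_reflection(a):
    """Convert direct-form predictor coefficients [1, a1, ..., ap] into
    reflection coefficients k1..kp via the step-down (backward Levinson) recursion."""
    a = np.asarray(a, dtype=float)
    p = len(a) - 1
    k = np.zeros(p)
    for i in range(p, 0, -1):
        k[i - 1] = a[i]
        if abs(k[i - 1]) >= 1.0:
            break                                # |k_i| >= 1: predictor is unstable
        # step down from an order-i predictor to an order-(i-1) predictor
        a = (a[:i] - k[i - 1] * a[i:0:-1]) / (1.0 - k[i - 1] ** 2)
    return k

def predictor_is_stable(a):
    """The synthesis filter 1/A(z) is stable exactly when every |k_i| < 1."""
    return bool(np.all(np.abs(lpc_to_reflection(a)) < 1.0))
</syntaxhighlight>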
==Applications==
LPC is the most widely used method in [[speech coding]] and [[speech synthesis]].<ref>{{cite journal |last1=Gupta |first1=Shipra |title=Application of MFCC in Text Independent Speaker Recognition |journal=International Journal of Advanced Research in Computer Science and Software Engineering |date=May 2016 |volume=6 |issue=5 |pages=805–810 (806) |s2cid=212485331 |issn=2277-128X |url=https://pdfs.semanticscholar.org/2aa9/c2971342e8b0b1a0714938f39c406f258477.pdf |archive-url=https://web.archive.org/web/20191018231621/https://pdfs.semanticscholar.org/2aa9/c2971342e8b0b1a0714938f39c406f258477.pdf |url-status=dead |archive-date=2019-10-18 |access-date=18 October 2019}}</ref> It is generally used for speech analysis and resynthesis. It is used as a form of voice compression by phone companies, for example in the [[GSM]] standard. It is also used for [[COMSEC|secure]] wireless, where voice must be [[digitize]]d, [[encryption|encrypted]] and sent over a narrow voice channel; an early example of this is the US government's [[Navajo I]].

LPC synthesis can be used to construct [[vocoder]]s in which musical instruments are used as an excitation signal to the time-varying filter estimated from a singer's speech. This is somewhat popular in [[electronic music]]. [[Paul Lansky]] made the well-known computer music piece [[notjustmoreidlechatter]] using linear predictive coding.<ref>{{cite web | archive-url=https://web.archive.org/web/20171224031037/http://paul.mycpanel.princeton.edu/liner_notes/morethanidlechatter.html |url=http://paul.mycpanel.princeton.edu/liner_notes/morethanidlechatter.html | accessdate=2024-06-02 | archive-date=2017-12-24 | title = More Than Idle Chatter | first = Paul | last = Lansky }}</ref> A 10th-order LPC was used in the popular 1980s [[Speak & Spell (game)|Speak & Spell]] educational toy.

LPC predictors are used in [[Shorten (file format)|Shorten]], [[MPEG-4 ALS]], [[FLAC]], the [[SILK]] [[audio codec]], and other [[lossless compression|lossless]] audio codecs.
LPC has received some attention as a tool for the tonal analysis of violins and other stringed musical instruments.<ref name=tai>{{cite journal|last=Tai|first=Hwan-Ching|author2=Chung, Dai-Ting |title=Stradivari Violins Exhibit Formant Frequencies Resembling Vowels Produced by Females|journal=Savart Journal|date=June 14, 2012|volume=1|issue=2|url=http://savartjournal.org/index.php/sj/article/view/16/pdf}}</ref>

==See also==
*[[Akaike information criterion]]
*[[Audio compression (data)|Audio compression]]
*[[Code-excited linear prediction]] (CELP)
*[[FS-1015]]
*[[FS-1016]]
*[[Generalized filtering]]
*[[Linear prediction]]
*[[Linear predictive analysis]]
*[[Pitch estimation]]
*[[Warped linear predictive coding]]

==References==
{{Reflist}}

==Further reading==
*{{Cite journal|last=O'Shaughnessy|first=D.|year=1988|title=Linear predictive coding|journal=IEEE Potentials|volume=7|issue=1|pages=29–32|doi=10.1109/45.1890|s2cid=12786562}}
*{{Cite book|first1=Alan | last1=Bundy | author-link1=Alan Bundy | first2=Lincoln | last2=Wallen| title=Catalogue of Artificial Intelligence Tools | chapter=Linear Predictive Coding | author-link2=Lincoln Wallen | year=1984 | series=Symbolic Computation | doi=10.1007/978-3-642-96868-6_123 | pages=61| isbn=978-3-540-13938-6 }}
*{{cite book|last=El-Jaroudi|first=Amro|title=Wiley Encyclopedia of Telecommunications|year=2003|chapter=Linear Predictive Coding|doi=10.1002/0471219282.eot155|isbn=978-0471219286}}

==External links==
*[http://soundlab.cs.princeton.edu/software/rt_lpc/ real-time LPC analysis/synthesis learning software]
*[http://www.vintagecomputing.com/index.php/archives/528 30 years later Dr Richard Wiggins Talks Speak & Spell development]
*[http://www-ee.stanford.edu/~gray/dl.html Robert M. Gray, IEEE Signal Processing Society, Distinguished Lecturer Program]

{{Compression Methods}}

{{DEFAULTSORT:Linear Predictive Coding}}
[[Category:Audio codecs]]
[[Category:Lossy compression algorithms]]
[[Category:Speech codecs]]
[[Category:Digital signal processing]]
[[Category:Japanese inventions]]
[[Category:Data compression]]