===1970–1990===
* '''1971''' – [[DARPA]] funded five years of ''Speech Understanding Research'', a speech recognition research program seeking a minimum vocabulary size of 1,000 words. They thought [[natural-language understanding|speech ''understanding'']] would be key to making progress in speech ''recognition'', but this later proved untrue.<ref>{{Cite web |last=John Makhoul |title=ISCA Medalist: For leadership and extensive contributions to speech and language processing |url=https://www.superlectures.com/interspeech2016/isca-medalist-for-leadership-and-extensive-contributions-to-speech-and-language-processing |url-status=live |archive-url=https://web.archive.org/web/20180124071005/https://www.superlectures.com/interspeech2016/isca-medalist-for-leadership-and-extensive-contributions-to-speech-and-language-processing |archive-date=24 January 2018 |access-date=23 January 2018 |df=dmy-all}}</ref> [[BBN Technologies|BBN]], [[IBM]], [[Carnegie Mellon]] and [[Stanford Research Institute]] all participated in the program.<ref>{{Cite magazine |last1=Blechman |first1=R. O. |last2=Blechman |first2=Nicholas |date=23 June 2008 |title=Hello, Hal |url=https://www.newyorker.com/magazine/2008/06/23/hello-hal |url-status=live |archive-url=https://web.archive.org/web/20150120042048/http://www.newyorker.com/magazine/2008/06/23/hello-hal |archive-date=20 January 2015 |access-date=17 January 2015 |magazine=The New Yorker |df=dmy-all}}</ref><ref>{{Cite journal |last=Klatt |first=Dennis H. |year=1977 |title=Review of the ARPA speech understanding project |journal=The Journal of the Acoustical Society of America |volume=62 |issue=6 |pages=1345–1366 |bibcode=1977ASAJ...62.1345K |doi=10.1121/1.381666}}</ref> This revived speech recognition research after John Pierce's letter.
* '''1972''' – The IEEE Acoustics, Speech, and Signal Processing group held a conference in Newton, Massachusetts.
* '''1976''' – The first [[ICASSP]] was held in [[Philadelphia]]; it has since been a major venue for the publication of research on speech recognition.<ref>{{Cite web |last=Rabiner |date=1984 |title=The Acoustics, Speech, and Signal Processing Society. A Historical Perspective |url=http://www.ece.ucsb.edu/Faculty/Rabiner/ece259/Reprints/216_historical%20perspective.pdf |url-status=live |archive-url=https://web.archive.org/web/20170809113828/http://www.ece.ucsb.edu/Faculty/Rabiner/ece259/Reprints/216_historical%20perspective.pdf |archive-date=9 August 2017 |access-date=23 January 2018 |df=dmy-all}}</ref>

During the late 1960s, [[Leonard E. Baum|Leonard Baum]] developed the mathematics of [[Markov chain]]s at the [[Institute for Defense Analyses]]. A decade later, at CMU, Raj Reddy's students [[James K. Baker|James Baker]] and [[Janet M. Baker]] began using the [[hidden Markov model]] (HMM) for speech recognition.<ref>{{Cite web |date=12 January 2015 |title=First-Hand:The Hidden Markov Model – Engineering and Technology History Wiki |url=http://ethw.org/First-Hand:The_Hidden_Markov_Model |url-status=live |archive-url=https://web.archive.org/web/20180403191314/http://ethw.org/First-Hand:The_Hidden_Markov_Model |archive-date=3 April 2018 |access-date=1 May 2018 |website=ethw.org |df=dmy-all}}</ref> James Baker had learned about HMMs during a summer job at the Institute for Defense Analyses while he was an undergraduate.<ref name="James Baker interview" /> The use of HMMs allowed researchers to combine different sources of knowledge, such as acoustics, language, and syntax, in a unified probabilistic model.
* By the '''mid-1980s''', [[Frederick Jelinek|Fred Jelinek's]] team at IBM had created a voice-activated typewriter called Tangora, which could handle a 20,000-word vocabulary.<ref>{{Cite web |date=2012-03-07 |title=Pioneering Speech Recognition |url=http://www-03.ibm.com/ibm/history/ibm100/us/en/icons/speechreco/ |url-status=dead |archive-url=https://web.archive.org/web/20150219080748/http://www-03.ibm.com/ibm/history/ibm100/us/en/icons/speechreco/ |archive-date=19 February 2015 |access-date=18 January 2015 |df=dmy-all}}</ref> Jelinek's statistical approach put less emphasis on emulating the way the human brain processes and understands speech, in favor of statistical modeling techniques such as HMMs. (Jelinek's group independently discovered the application of HMMs to speech.<ref name="James Baker interview">{{Cite web |title=James Baker interview |url=http://www.sarasinstitute.org/Audio/JimBaker(2006).mp3 |url-status=live |archive-url=https://web.archive.org/web/20170828105222/http://www.sarasinstitute.org/Audio/JimBaker(2006).mp3 |archive-date=28 August 2017 |access-date=9 February 2017 |df=dmy-all}}</ref>) This was controversial with linguists, since HMMs are too simplistic to account for many common features of human languages.<ref>{{Cite journal |last1=Huang |first1=Xuedong |last2=Baker |first2=James |last3=Reddy |first3=Raj |date=January 2014 |title=A historical perspective of speech recognition |url=https://dl.acm.org/doi/fullHtml/10.1145/2500887 |journal=Communications of the ACM |language=en |volume=57 |issue=1 |pages=94–103 |doi=10.1145/2500887 |issn=0001-0782 |s2cid=6175701 |archive-url=https://web.archive.org/web/20231208161616/https://dl.acm.org/doi/fullHtml/10.1145/2500887 |archive-date=2023-12-08}}</ref> However, the HMM proved to be a highly useful way of modeling speech and replaced dynamic time warping as the dominant speech recognition algorithm in the 1980s.<ref>{{Cite report
|url=http://www.ece.ucsb.edu/faculty/Rabiner/ece259/Reprints/354_LALI-ASRHistory-final-10-8.pdf |title=Automatic speech recognition–a brief history of the technology development |last1=Juang |first1=B. H. |last2=Rabiner |first2=Lawrence R. |page=10 |access-date=17 January 2015 |archive-url=https://web.archive.org/web/20140817193243/http://www.ece.ucsb.edu/Faculty/Rabiner/ece259/Reprints/354_LALI-ASRHistory-final-10-8.pdf |archive-date=17 August 2014 |url-status=live}}</ref><ref>{{Cite journal |last=Li |first=Xiaochang |date=2023-07-01 |title="There's No Data Like More Data": Automatic Speech Recognition and the Making of Algorithmic Culture |url=https://www.journals.uchicago.edu/doi/10.1086/725132 |journal=Osiris |language=en |volume=38 |pages=165–182 |doi=10.1086/725132 |issn=0369-7827 |s2cid=259502346}}</ref>
* '''1982''' – Dragon Systems, founded by James and [[Janet M. Baker]],<ref>{{Cite web |title=History of Speech Recognition |url=http://www.dragon-medical-transcription.com/history_speech_recognition.html |archive-url=https://web.archive.org/web/20150813223326/http://dragon-medical-transcription.com/history_speech_recognition.html |archive-date=13 August 2015 |access-date=17 January 2015 |website=Dragon Medical Transcription}}</ref> was one of IBM's few competitors.