Speech recognition
===Pre-1970===
* '''1952''' – Three Bell Labs researchers, Stephen Balashek,<ref>{{Cite news |date=22 July 2012 |title=Obituaries: Stephen Balashek |url=https://obits.nj.com/obituaries/starledger/obituary.aspx?page=lifestory&pid=158702138 |work=The Star-Ledger |access-date=9 September 2024 |archive-date=4 April 2019 |archive-url=https://web.archive.org/web/20190404231352/https://obits.nj.com/obituaries/starledger/obituary.aspx?page=lifestory&pid=158702138 |url-status=live }}</ref> R. Biddulph, and K. H. Davis built a system called "Audrey"<ref>{{Cite web |title=IBM-Shoebox-front.jpg |url=https://cdn57.androidauthority.net/wp-content/uploads/2012/04/IBM-Shoebox-front.jpg |access-date=4 April 2019 |publisher=androidauthority.net |archive-date=9 August 2018 |archive-url=https://web.archive.org/web/20180809153221/https://cdn57.androidauthority.net/wp-content/uploads/2012/04/IBM-Shoebox-front.jpg |url-status=live }}</ref> for single-speaker digit recognition. Their system located the [[formants]] in the power spectrum of each utterance.<ref>{{Cite web |last1=Juang |first1=B. H. |last2=Rabiner |first2=Lawrence R. |title=Automatic speech recognition – a brief history of the technology development |url=http://www.ece.ucsb.edu/faculty/Rabiner/ece259/Reprints/354_LALI-ASRHistory-final-10-8.pdf |url-status=live |archive-url=https://web.archive.org/web/20140817193243/http://www.ece.ucsb.edu/Faculty/Rabiner/ece259/Reprints/354_LALI-ASRHistory-final-10-8.pdf |archive-date=17 August 2014 |access-date=17 January 2015 |page=6}}</ref>
* '''1960''' – [[Gunnar Fant]] developed and published the [[source-filter model of speech production]].
* '''1962''' – [[IBM]] demonstrated the speech recognition capability of its 16-word "Shoebox" machine at the [[1962 World's Fair]].<ref name="PCW.Siri">{{Cite magazine |last=Melanie Pinola |date=2 November 2011 |title=Speech Recognition Through the Decades: How We Ended Up With Siri |url=https://www.pcworld.com/article/243060/speech_recognition_through_the_decades_how_we_ended_up_with_siri.html |access-date=22 October 2018 |magazine=PC World |archive-date=3 November 2018 |archive-url=https://web.archive.org/web/20181103105727/https://www.pcworld.com/article/243060/speech_recognition_through_the_decades_how_we_ended_up_with_siri.html |url-status=live }}</ref>
* '''1966''' – [[Linear predictive coding]] (LPC), a [[speech coding]] method, was first proposed by [[Fumitada Itakura]] of [[Nagoya University]] and Shuzo Saito of [[Nippon Telegraph and Telephone]] (NTT) while they were working on speech recognition.<ref name="Gray">{{Cite journal |last=Gray |first=Robert M. |date=2010 |title=A History of Realtime Digital Speech on Packet Networks: Part II of Linear Predictive Coding and the Internet Protocol |url=https://ee.stanford.edu/~gray/lpcip.pdf |journal=Found. Trends Signal Process. |volume=3 |issue=4 |pages=203–303 |doi=10.1561/2000000036 |issn=1932-8346 |doi-access=free |access-date=9 September 2024 |archive-date=9 October 2022 |archive-url=https://ghostarchive.org/archive/20221009/https://ee.stanford.edu/~gray/lpcip.pdf |url-status=live }}</ref>
* '''1969''' – Funding at [[Bell Labs]] dried up for several years after the influential [[John R. Pierce|John Pierce]] wrote an open letter criticizing speech recognition research and withdrew its funding.<ref name="jasapierce">{{Cite journal |last=John R. Pierce |author-link=John R. Pierce |date=1969 |title=Whither speech recognition? 
|journal=Journal of the Acoustical Society of America |volume=46 |issue=4B |pages=1049–1051 |bibcode=1969ASAJ...46.1049P |doi=10.1121/1.1911801}}</ref> This defunding lasted until Pierce retired and [[James L. Flanagan]] took over.

[[Raj Reddy]] was the first person to take on continuous speech recognition, as a graduate student at [[Stanford University]] in the late 1960s. Previous systems required users to pause after each word. Reddy's system issued spoken commands for playing [[chess]].

Around this time, Soviet researchers invented the [[dynamic time warping]] (DTW) algorithm and used it to build a recognizer capable of operating on a 200-word vocabulary.<ref>{{Cite book |last1=Benesty |first1=Jacob |title=Springer Handbook of Speech Processing |last2=Sondhi |first2=M. M. |last3=Huang |first3=Yiteng |date=2008 |publisher=Springer Science & Business Media |isbn=978-3540491255}}</ref> DTW processed speech by dividing it into short frames, e.g. 10 ms segments, and processing each frame as a single unit. Although DTW would be superseded by later algorithms, the frame-based approach carried on. Speaker independence remained unsolved during this period.
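
The frame-and-align idea behind DTW can be illustrated with a minimal sketch. This is a textbook formulation rather than a reconstruction of the early Soviet recognizer, whose features and path constraints are not described here. The helper names <code>frame_signal</code>, <code>frame_energy</code>, and <code>dtw_distance</code> are hypothetical, and the single energy value per frame is an assumption chosen only to keep the example short.

```python
import math


def frame_signal(samples, sample_rate, frame_ms=10):
    """Split a list of samples into consecutive, non-overlapping frames."""
    frame_len = int(sample_rate * frame_ms / 1000)
    if frame_len <= 0:
        raise ValueError("frame length must be at least one sample")
    # Drop a trailing partial frame so every frame has the same length.
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


def frame_energy(frame):
    """Assumed one-number feature: mean squared amplitude of a frame."""
    return sum(x * x for x in frame) / len(frame)


def dtw_distance(seq_a, seq_b):
    """Cumulative DTW cost between two sequences of scalar features."""
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] is the cheapest alignment of seq_a[:i] with seq_b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(seq_a[i - 1] - seq_b[j - 1])
            # Each step may repeat a frame of either sequence or advance both.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]
```

In this sketch, an utterance would be reduced to one feature per frame with <code>[frame_energy(f) for f in frame_signal(samples, 16000)]</code> and compared against a stored template for each vocabulary word, with the lowest cost taken as the match. Because the alignment may stretch or compress either sequence, the same word spoken at different speeds can still align at low cost.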