== Applications ==

Speech synthesis has long been a vital assistive technology, and its application in this area is significant and widespread. It allows environmental barriers to be removed for people with a wide range of disabilities. The longest-standing application has been in [[screen reader]]s for people with visual impairment, but text-to-speech systems are now commonly used by people with [[dyslexia]] and other [[Reading disability|reading disabilities]] as well as by pre-literate children.<ref>{{Cite journal |last1=Brunow |first1=David A. |last2=Cullen |first2=Theresa A. |date=2021-07-03 |title=Effect of Text-to-Speech and Human Reader on Listening Comprehension for Students with Learning Disabilities |url=https://www.tandfonline.com/doi/full/10.1080/07380569.2021.1953362 |journal=Computers in the Schools |language=en |volume=38 |issue=3 |pages=214–231 |doi=10.1080/07380569.2021.1953362 |hdl=11244/316759 |s2cid=243101945 |issn=0738-0569|hdl-access=free }}</ref> They are also frequently employed to aid those with severe [[speech impairment]], usually through a dedicated [[voice output communication aid]].<ref>{{Cite book |last1=Triandafilidi |first1=Ioanis I. |last2=Tatarnikova |first2=T. M. |last3=Poponin |first3=A. S. |title=2022 Wave Electronics and its Application in Information and Telecommunication Systems (WECONF) |chapter=Speech Synthesis System for People with Disabilities |date=2022-05-30 |chapter-url=https://ieeexplore.ieee.org/document/9803600 |location=St. Petersburg, Russian Federation |publisher=IEEE |pages=1–5 |doi=10.1109/WECONF55058.2022.9803600 |isbn=978-1-6654-7083-4|s2cid=250118756 }}</ref> Techniques for personalizing a synthetic voice to better match a person's personality or historical voice are becoming available.<ref>{{Cite book |last1=Zhao |first1=Yunxin |last2=Song |first2=Minguang |last3=Yue |first3=Yanghao |last4=Kuruvilla-Dugdale |first4=Mili |title=2021 IEEE EMBS International Conference on Biomedical and Health Informatics (BHI) |chapter=Personalizing TTS Voices for Progressive Dysarthria |date=2021-07-27 |chapter-url=https://ieeexplore.ieee.org/document/9508522 |location=Athens, Greece |publisher=IEEE |pages=1–4 |doi=10.1109/BHI50953.2021.9508522 |isbn=978-1-6654-0358-0|s2cid=236982893 }}</ref> A noted application of speech synthesis was the [[Reading machine|Kurzweil Reading Machine for the Blind]], which incorporated text-to-phonetics software based on work from [[Haskins Laboratories]] and a black-box synthesizer built by [[Votrax]].<ref>{{Cite journal |date=1984 |title=Evolution of Reading Machines for the Blind: Haskins Laboratories' Research as a Case History |url=https://www.rehab.research.va.gov/jour/84/21/1/pdf/cooper.pdf |journal=[[Journal of Rehabilitation Research and Development]] |volume=21 |issue=1}}</ref>
[[File:Stephen Hawking.StarChild.jpg|thumb|upright=.7|left|[[Stephen Hawking]] was one of the most famous people to use a speech computer to communicate.]]
Speech synthesis techniques are also used in entertainment productions such as games and animations. In 2007, Animo Limited announced the development of a software application package based on its speech synthesis software FineSpeech, explicitly geared towards customers in the entertainment industries, able to generate narration and lines of dialogue according to user specifications.<ref>{{cite news|url=http://www.animenewsnetwork.com/news/2007-05-02/speech-synthesis-software |title=Speech Synthesis Software for Anime Announced |work=Anime News Network |date=2007-05-02 |access-date=2010-02-17}}</ref> The application reached maturity in 2008, when NEC [[Biglobe]] announced a web service that allows users to create phrases from the voices of characters from the Japanese [[anime]] series ''[[Code Geass: Lelouch of the Rebellion R2]]''.<ref>{{cite web|url=http://www.animenewsnetwork.com/news/2008-09-09/code-geass-voice-synthesis-service-offered-in-japan |title=Code Geass Speech Synthesizer Service Offered in Japan |publisher=Animenewsnetwork.com |date=2008-09-09 |access-date=2010-02-17}}</ref> 15.ai has been frequently used for [[content creation]] in various [[fandom]]s, including the [[My Little Pony: Friendship Is Magic fandom|''My Little Pony: Friendship Is Magic'' fandom]], the ''[[Team Fortress 2]]'' fandom, the ''[[Portal (series)|Portal]]'' fandom, and the ''[[SpongeBob SquarePants]]'' fandom.{{citation needed|date=June 2024}}

Text-to-speech aids for people with disabilities and impaired communication have become widely available. Text-to-speech is also finding new applications; for example, speech synthesis combined with [[speech recognition]] allows for interaction with mobile devices via [[natural language processing]] interfaces. Some users have also created AI [[virtual assistant]]s using 15.ai and external voice control software.<ref name="automaton2"/><ref name="Denfaminicogamer2"/>

Text-to-speech is also used in second language acquisition. Voki, for instance, is an educational tool created by Oddcast that allows users to create their own talking avatars with different accents. The avatars can be emailed, embedded on websites, or shared on social media.

Content creators have used voice cloning tools to recreate their voices for podcasts,<ref name=":162">{{Cite web |date=2023-06-20 |title=Now hear this: Voice cloning AI startup ElevenLabs nabs $19M from a16z and other heavy hitters |url=https://venturebeat.com/ai/now-hear-this-voice-cloning-ai-startup-elevenlabs-nabs-19m-from-a16z-and-other-heavy-hitters/ |access-date=2023-07-25 |website=VentureBeat |language=en-US}}</ref><ref>{{Cite web |date=April 9, 2023 |title=Sztuczna inteligencja czyta głosem Jarosława Kuźniara. Rewolucja w radiu i podcastach |url=https://www.press.pl/tresc/75988,sztuczna-inteligencja-czyta-glosem-jaroslawa-kuzniara_-to-zapowiedz-rewolucji-w-radiu-i-podcastach |access-date=2023-04-25 |website=Press.pl |language=pl}}</ref> narration,<ref name=":13"/> and comedy shows.<ref>{{Cite magazine |last=Knibbs |first=Kate |title=Generative AI Podcasts Are Here. Prepare to Be Bored |url=https://www.wired.com/story/generative-ai-podcasts-boring/ |magazine=Wired |language=en-US |issn=1059-1028 |access-date=2023-07-25}}</ref><ref>{{Cite web |last=Suciu |first=Peter |title=Arrested Succession Parody On YouTube Features 'Narration' By AI-Generated Ron Howard |url=https://www.forbes.com/sites/petersuciu/2023/05/09/arrested-succession-parody-on-youtube-features-narration-by-ai-generated-ron-howard/ |access-date=2023-07-25 |website=Forbes |language=en}}</ref><ref>{{Cite news |last=Fadulu |first=Lola |date=2023-07-06 |title=Can A.I. Be Funny? This Troupe Thinks So. |language=en-US |work=The New York Times |url=https://www.nytimes.com/2023/07/06/nyregion/artificial-intelligence-comedy.html |access-date=2023-07-25 |issn=0362-4331}}</ref> Publishers and authors have also used such software to narrate audiobooks and newsletters.<ref name=":2">{{Cite web |last=Kanetkar |first=Riddhi |title=Hot AI startup ElevenLabs, founded by ex-Google and Palantir staff, is set to raise $18 million at a $100 million valuation. Check out the 14-slide pitch deck it used for its $2 million pre-seed. |url=https://www.businessinsider.com/elevenlabs-ai-voice-intelligence-startup-raises-2-million-2023-1 |access-date=2023-07-25 |website=Business Insider |language=en-US}}</ref><ref name=":02">{{Cite web |date=January 30, 2023 |title=AI-Generated Voice Firm Clamps Down After 4chan Makes Celebrity Voices for Abuse |url=https://www.vice.com/en/article/ai-voice-firm-4chan-celebrity-voices-emma-watson-joe-rogan-elevenlabs/ |access-date=2023-02-03 |website=Vice.com |language=en}}</ref> Another area of application is AI video creation with talking heads. Web apps and video editors like Elai.io or [[Synthesia (company)|Synthesia]] allow users to create video content involving AI avatars, who are made to speak using text-to-speech technology.<ref>{{cite web |title=Usage of text-to-speech in AI video generation |url=https://elai.io/ |website=elai.io |access-date=10 August 2022}}</ref><ref>{{cite web |title=AI Text to speech for videos|url=https://www.synthesia.io/text-to-speech|website=synthesia.io |access-date=12 October 2023}}</ref>

Speech synthesis is a valuable computational aid for the analysis and assessment of speech disorders. A [[voice quality]] synthesizer, developed by Jorge C. Lucero et al. at the [[University of Brasília]], simulates the physics of [[phonation]] and includes models of vocal frequency jitter and tremor, airflow noise and laryngeal asymmetries.<ref name=":0" /> The synthesizer has been used to mimic the [[timbre]] of [[dysphonic]] speakers with controlled levels of roughness, breathiness and strain.<ref name=":1" />

=== Singing synthesis ===
{{excerpt|Music technology (electronic and digital)|Vocal synthesis after 2010s}}