== Applications ==
Speech synthesis has long been a vital assistive technology, and its application in this area is significant and widespread. It helps remove environmental barriers for people with a wide range of disabilities. The longest-standing application is in [[screen reader]]s for people with visual impairment, but text-to-speech systems are now commonly used by people with [[dyslexia]] and other [[Reading disability|reading disabilities]] as well as by pre-literate children.<ref>{{Cite journal |last1=Brunow |first1=David A. |last2=Cullen |first2=Theresa A. |date=2021-07-03 |title=Effect of Text-to-Speech and Human Reader on Listening Comprehension for Students with Learning Disabilities |url=https://www.tandfonline.com/doi/full/10.1080/07380569.2021.1953362 |journal=Computers in the Schools |language=en |volume=38 |issue=3 |pages=214–231 |doi=10.1080/07380569.2021.1953362 |hdl=11244/316759 |s2cid=243101945 |issn=0738-0569|hdl-access=free }}</ref> They are also frequently employed to aid those with severe [[speech impairment]], usually through a dedicated [[voice output communication aid]].<ref>{{Cite book |last1=Triandafilidi |first1=Ioanis I. |last2=Tatarnikova |first2=T. M. |last3=Poponin |first3=A. S. |title=2022 Wave Electronics and its Application in Information and Telecommunication Systems (WECONF) |chapter=Speech Synthesis System for People with Disabilities |date=2022-05-30 |chapter-url=https://ieeexplore.ieee.org/document/9803600 |location=St. Petersburg, Russian Federation |publisher=IEEE |pages=1–5 |doi=10.1109/WECONF55058.2022.9803600 |isbn=978-1-6654-7083-4|s2cid=250118756 }}</ref> Techniques to personalize a synthetic voice so that it better matches a person's personality or historical voice are becoming available.<ref>{{Cite book |last1=Zhao |first1=Yunxin |last2=Song |first2=Minguang |last3=Yue |first3=Yanghao |last4=Kuruvilla-Dugdale |first4=Mili |title=2021 IEEE EMBS International Conference on Biomedical and Health Informatics (BHI) |chapter=Personalizing TTS Voices for Progressive Dysarthria |date=2021-07-27 |chapter-url=https://ieeexplore.ieee.org/document/9508522 |location=Athens, Greece |publisher=IEEE |pages=1–4 |doi=10.1109/BHI50953.2021.9508522 |isbn=978-1-6654-0358-0|s2cid=236982893 }}</ref> A noted application of speech synthesis was the [[Reading machine|Kurzweil Reading Machine for the Blind]], which incorporated text-to-phonetics software based on work from [[Haskins Laboratories]] and a black-box synthesizer built by [[Votrax]].<ref>{{Cite journal |date=1984 |title=Evolution of Reading Machines for the Blind: Haskins Laboratories' Research as a Case History |url=https://www.rehab.research.va.gov/jour/84/21/1/pdf/cooper.pdf |journal=[[Journal of Rehabilitation Research and Development]] |volume=21 |issue=1}}</ref>

[[File:Stephen Hawking.StarChild.jpg|thumb|upright=.7|left|[[Stephen Hawking]] was one of the most famous people to use a speech computer to communicate.]]
Speech synthesis techniques are also used in entertainment productions such as games and animations.
In 2007, Animo Limited announced the development of a software application package based on its speech synthesis software FineSpeech, explicitly geared towards customers in the entertainment industries, which can generate narration and lines of dialogue according to user specifications.<ref>{{cite news|url=http://www.animenewsnetwork.com/news/2007-05-02/speech-synthesis-software |title=Speech Synthesis Software for Anime Announced |work=Anime News Network |date=2007-05-02 |access-date=2010-02-17}}</ref> The application reached maturity in 2008, when NEC [[Biglobe]] announced a web service that allows users to create phrases from the voices of characters from the Japanese [[anime]] series ''[[Code Geass: Lelouch of the Rebellion R2]]''.<ref>{{cite web|url=http://www.animenewsnetwork.com/news/2008-09-09/code-geass-voice-synthesis-service-offered-in-japan |title=Code Geass Speech Synthesizer Service Offered in Japan |publisher=Animenewsnetwork.com |date=2008-09-09 |access-date=2010-02-17}}</ref> 15.ai has been frequently used for [[content creation]] in various [[fandom]]s, including the [[My Little Pony: Friendship Is Magic fandom|''My Little Pony: Friendship Is Magic'' fandom]], the ''[[Team Fortress 2]]'' fandom, the ''[[Portal (series)|Portal]]'' fandom, and the ''[[SpongeBob SquarePants]]'' fandom.{{citation needed|date=June 2024}}

Text-to-speech aids for disability and impaired communication have become widely available. Text-to-speech is also finding new applications; for example, speech synthesis combined with [[speech recognition]] allows for interaction with mobile devices via [[natural language processing]] interfaces. Some users have also created AI [[virtual assistant]]s using 15.ai and external voice control software.<ref name="automaton2"/><ref name="Denfaminicogamer2"/>

Text-to-speech is also used in second language acquisition.
Voki, for instance, is an educational tool created by Oddcast that allows users to create their own talking avatar with a choice of accents. These avatars can be emailed, embedded on websites or shared on social media.

Content creators have used voice cloning tools to recreate their voices for podcasts,<ref name=":162">{{Cite web |date=2023-06-20 |title=Now hear this: Voice cloning AI startup ElevenLabs nabs $19M from a16z and other heavy hitters |url=https://venturebeat.com/ai/now-hear-this-voice-cloning-ai-startup-elevenlabs-nabs-19m-from-a16z-and-other-heavy-hitters/ |access-date=2023-07-25 |website=VentureBeat |language=en-US}}</ref><ref>{{Cite web |date=April 9, 2023 |title=Sztuczna inteligencja czyta głosem Jarosława Kuźniara. Rewolucja w radiu i podcastach |url=https://www.press.pl/tresc/75988,sztuczna-inteligencja-czyta-glosem-jaroslawa-kuzniara_-to-zapowiedz-rewolucji-w-radiu-i-podcastach |access-date=2023-04-25 |website=Press.pl |language=pl}}</ref> narration,<ref name=":13"/> and comedy shows.<ref>{{Cite magazine |last=Knibbs |first=Kate |title=Generative AI Podcasts Are Here. Prepare to Be Bored |url=https://www.wired.com/story/generative-ai-podcasts-boring/ |magazine=Wired |language=en-US |issn=1059-1028 |access-date=2023-07-25}}</ref><ref>{{Cite web |last=Suciu |first=Peter |title=Arrested Succession Parody On YouTube Features 'Narration' By AI-Generated Ron Howard |url=https://www.forbes.com/sites/petersuciu/2023/05/09/arrested-succession-parody-on-youtube-features-narration-by-ai-generated-ron-howard/ |access-date=2023-07-25 |website=Forbes |language=en}}</ref><ref>{{Cite news |last=Fadulu |first=Lola |date=2023-07-06 |title=Can A.I. Be Funny? This Troupe Thinks So. |language=en-US |work=The New York Times |url=https://www.nytimes.com/2023/07/06/nyregion/artificial-intelligence-comedy.html |access-date=2023-07-25 |issn=0362-4331}}</ref> Publishers and authors have also used such software to narrate audiobooks and newsletters.<ref name=":2">{{Cite web |last=Kanetkar |first=Riddhi |title=Hot AI startup ElevenLabs, founded by ex-Google and Palantir staff, is set to raise $18 million at a $100 million valuation. Check out the 14-slide pitch deck it used for its $2 million pre-seed. |url=https://www.businessinsider.com/elevenlabs-ai-voice-intelligence-startup-raises-2-million-2023-1 |access-date=2023-07-25 |website=Business Insider |language=en-US}}</ref><ref name=":02">{{Cite web |date=January 30, 2023 |title=AI-Generated Voice Firm Clamps Down After 4chan Makes Celebrity Voices for Abuse |url=https://www.vice.com/en/article/ai-voice-firm-4chan-celebrity-voices-emma-watson-joe-rogan-elevenlabs/ |access-date=2023-02-03 |website=Vice.com |language=en}}</ref>

Another area of application is AI video creation with talking heads. Web apps and video editors such as Elai.io or [[Synthesia (company)|Synthesia]] allow users to create video content featuring AI avatars that are made to speak using text-to-speech technology.<ref>{{cite web |title=Usage of text-to-speech in AI video generation |url=https://elai.io/ |website=elai.io |access-date=10 August 2022}}</ref><ref>{{cite web |title=AI Text to speech for videos|url=https://www.synthesia.io/text-to-speech|website=synthesia.io |access-date=12 October 2023}}</ref>

Speech synthesis is a valuable computational aid for the analysis and assessment of speech disorders. A [[voice quality]] synthesizer, developed by Jorge C. Lucero et al.
at the [[University of Brasília]], simulates the physics of [[phonation]] and includes models of vocal frequency jitter and tremor, airflow noise and laryngeal asymmetries.<ref name=":0" /> The synthesizer has been used to mimic the [[timbre]] of [[dysphonic]] speakers with controlled levels of roughness, breathiness and strain.<ref name=":1" />

=== Singing synthesis ===
{{excerpt|Music technology (electronic and digital)|Vocal synthesis after 2010s}}