People robbed of the ability to talk due to a stroke or another medical condition may soon have real hope of regaining a voice thanks to technology that harnesses brain activity to produce synthesized speech, researchers said on Wednesday.
Scientists at the University of California, San Francisco, implanted electrodes into the brains of volunteers and decoded signals in cerebral speech centres to guide a computer-simulated version of their vocal tract - lips, jaw, tongue and larynx - to generate speech through a synthesizer.
This speech was mostly intelligible, though somewhat slurred in parts, raising hope among the researchers that with some improvements a clinically viable device could be developed in the coming years for patients with speech loss.
“We were shocked when we first heard the results - we couldn’t believe our ears. It was incredibly exciting that a lot of the aspects of real speech were present in the output from the synthesizer,” said study co-author and UCSF doctoral student Josh Chartier. “Clearly, there is more work to get this to be more natural and intelligible but we were very impressed by how much can be decoded from brain activity.”
Stroke, brain injuries, cancer and ailments such as cerebral palsy, amyotrophic lateral sclerosis (ALS), Parkinson’s disease and multiple sclerosis can take away a person’s ability to speak.
Some people use devices that track eye or residual facial muscle movements to laboriously spell out words letter by letter, but producing text or synthesized speech this way is slow, typically no more than 10 words per minute. Natural speech usually runs at 100 to 150 words per minute.
The five volunteers, all capable of speaking, were given the opportunity to take part because they were epilepsy patients who were already due to have electrodes temporarily implanted in their brains to map the source of their seizures before neurosurgery. Future studies will test the technology on people who are unable to speak.
The volunteers read aloud while activity in brain regions involved in language production was tracked. The researchers discerned the vocal tract movements needed to produce the speech, and created a “virtual vocal tract” for each participant that could be controlled by their brain activity and produce synthesized speech.
“Very few of us have any real idea, actually, of what’s going on in our mouth when we speak,” said neurosurgeon Edward Chang, senior author of the study published in the journal Nature. “The brain translates those thoughts of what you want to say into movements of the vocal tract, and that’s what we’re trying to decode.”
The researchers were more successful in synthesizing slower speech sounds like “sh” and less successful with abrupt sounds like “b” and “p.” The technology did not work as well when the researchers tried to decode the brain activity directly into speech, without using a virtual vocal tract.
“We are still working on making the synthesized speech crisper and less slurred. This is in part a consequence of the algorithms we are using, and we think we should be able to get better results as we improve the technology,” Chartier said.
“We hope that these findings give hope to people with conditions that prevent them from expressing themselves that one day we will be able to restore the ability to communicate, which is such a fundamental part of who we are as humans,” he added.