March 2017 – The Register and Medical Xpress

Kids who've never heard need 'habilitation' – they've never had a skill to rehabilitate

Getting a computer to understand speech is already a tough nut to crack. A group of Australian researchers wants to take on something much harder: teaching once-deaf babies to talk. Why so tough? Think about what happens when you talk to Siri or Cortana or Google on a phone: the speech recognition system has to distinguish your “OK Google” (for example) from background noise; it has to react to “OK Google” rather than “OK something else”; and it has to parse your speech to act on the command. And you already know how to talk.
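To make that first step concrete, here's a minimal sketch – not drawn from any product, and assuming a 16 kHz mono recording held in a NumPy array – of the crude “is anyone speaking?” energy gate such a pipeline starts with. The function names are illustrative only; real wake-word systems follow a gate like this with trained keyword-spotting and full recognition models.

```python
import numpy as np

def frame_rms(signal, frame_len=400, hop=160):
    """Split a 16 kHz mono signal into 25 ms frames (10 ms hop)
    and return each frame's RMS energy."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop)]
    return np.array([np.sqrt(np.mean(f ** 2)) for f in frames])

def speech_frames(signal, ratio=3.0):
    """Crudest first stage of any voice pipeline: flag frames whose
    energy sits well above an estimated noise floor. Everything else
    -- wake-word matching, parsing the command -- happens downstream
    of a gate like this."""
    energy = frame_rms(signal.astype(np.float64))
    noise_floor = np.percentile(energy, 20)  # assume quietest 20% is noise
    return energy > ratio * noise_floor
```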

The Swinburne University team working on an app called GetTalking can't make even that single assumption, because they're trying to solve a different problem. When a baby receives a cochlear implant to take over the work of their malfunctioning inner ear, they need to learn something brand new: how to associate the sounds they can now hear with the sounds their own mouth makes. Getting those kids started in the world of conversation is a matter of “habilitation” – no “rehabilitation” here, because there isn't a capability to recover.

Children interact well with apps. Can one teach children to talk?

GetTalking is the brainchild of Swinburne senior lecturer Belinda Barnet, and the genesis of the idea was her own experience as mother to a child with a cochlear implant. As she explained: “With my own daughter – she had an implant at 11 months old – I could afford to take a year off to teach her to talk. This involves lots of repetitive exercises.” That time and attention, she explained, is the big predictor of success. In the roughly 10 years since it became standard practice to provide implants to babies at or before 12 months of age, 80 per cent of recipients have achieved speech within the normal range. What defines the 20 per cent who don't get to that point? Inability, whether because of family income or distance from the city, to “spend a year sitting on the carpet with flash-cards”. That puts parents in rural or regional locations, and low-income mothers, at a disadvantage, Barnet said.


Belinda Barnet

The idea for which Barnet and associate professor Rachael McDonald sought funding looks simple: an app running on an iPad or other tablet that gives the baby a bright visual reward for speaking. It nonetheless tests the boundaries of artificial intelligence (AI) and speech recognition, because of a very difficult starting point: how can an app respond to speech when the baby has never learned to speak? Barnet elaborated on other ways child development interplays with what the app and the AI need to do. “When a child has not heard any sound, they don't understand that a noise has an effect on the environment. So the first thing has to be a visual reward for an articulation.” At 12 months, she continued, children respond well to visual rewards – and even an “ahhh” or “ohhh” should get a response from the app, if (a big if, even for machine learning) it's a deliberate articulation.
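Nothing public describes how GetTalking will make that call, but a toy heuristic shows why “deliberate” is such a big if: treat a sound as a deliberate vocalisation only if it's voiced (vowels like “ahhh” have a low zero-crossing rate) and sustained for a few hundred milliseconds, which a dropped toy or a slap on the screen isn't. The thresholds and names below are illustrative, not the team's.

```python
import numpy as np

def zero_crossing_rate(frame):
    """Fraction of sample-to-sample sign changes: low for voiced
    vowels like 'ahhh', high for broadband noise and clatter."""
    return np.mean(np.abs(np.diff(np.signbit(frame).astype(int))))

def looks_deliberate(signal, sr=16000, frame_len=400, hop=160,
                     min_ms=300, energy_ratio=3.0, max_zcr=0.15):
    """Toy heuristic only: call a sound 'deliberate' if it is loud,
    voiced and sustained for at least min_ms. A thrown toy hitting
    the tablet is loud but brief and broadband, so it fails."""
    frames = [signal[i:i + frame_len].astype(np.float64)
              for i in range(0, len(signal) - frame_len + 1, hop)]
    energy = np.array([np.sqrt(np.mean(f ** 2)) for f in frames])
    floor = np.percentile(energy, 20)
    voiced = [e > energy_ratio * floor and zero_crossing_rate(f) < max_zcr
              for e, f in zip(energy, frames)]
    longest = run = 0
    for v in voiced:                      # longest voiced run, in frames
        run = run + 1 if v else 0
        longest = max(longest, run)
    return longest * hop / sr * 1000 >= min_ms
```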

Leon Sterling, a Swinburne computer science researcher, had his interest piqued as a member of the university panel assessing the project, and brings long experience of AI research to it. He explained the hidden complexities behind what needs to present itself as a simple app: “You've got to get the signal, you have to extract the signals, separate them from the background noise, the parents speaking, et cetera.”
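That “separate them from the background noise” step has a textbook baseline worth sketching. The snippet below is not the team's code – it assumes the librosa audio library and a recording whose first half-second is noise only – but it shows spectral subtraction, the simplest way to strip a steady background (a TV, a fan) out of a signal.

```python
import numpy as np
import librosa

def spectral_subtract(noisy, sr=16000, noise_secs=0.5):
    """Textbook spectral subtraction: estimate the noise spectrum
    from an assumed noise-only lead-in, subtract it from every
    frame's magnitude, and resynthesise with the original phase.
    Crude, but it is the 'separate speech from background' step
    in its simplest form."""
    stft = librosa.stft(noisy, n_fft=512, hop_length=128)
    mag, phase = np.abs(stft), np.angle(stft)
    noise_frames = int(noise_secs * sr / 128)
    noise_profile = mag[:, :noise_frames].mean(axis=1, keepdims=True)
    clean_mag = np.maximum(mag - noise_profile, 0.0)  # floor at zero
    return librosa.istft(clean_mag * np.exp(1j * phase), hop_length=128)
```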

Swinburne's Leon Sterling

Most of those problems have precedent, but GetTalking needs yet more machine learning – for instance, to measure the child's engagement with the app. “You've got to look at the ability to observe, to tag video strings together with audio strings.” The team understands that an app can't replace a speech therapist or parent, only support them – and that adds new complexities, like “building in the knowledge of how children interact with physiotherapists. You need to understand the developmental stages of children when they're interacting with the app.”
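The article doesn't say how engagement will actually be scored, but the shape of the problem – lining audio events up against video observations – can be sketched. Assume some hypothetical gaze- or face-tracking model has already produced spans in which the child is attending to the screen; everything here is illustrative.

```python
from dataclasses import dataclass

@dataclass
class Vocalisation:
    start: float  # seconds into the session
    end: float

def engagement_score(vocalisations, attention_spans):
    """Hypothetical measure: the fraction of the child's vocalisations
    that fall inside spans where the video stream shows them attending
    to the screen. attention_spans is a list of (start, end) pairs."""
    def attended(v):
        return any(s <= v.start and v.end <= e
                   for s, e in attention_spans)
    if not vocalisations:
        return 0.0
    return sum(attended(v) for v in vocalisations) / len(vocalisations)
```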

After distinguishing between speech and “the kid threw a bit of pumpkin at the screen”, the app has to respond at a second stage, called “word approximation”: recognising what word the child was trying to say, and rewarding them for words and approximations of words. Here, the system has to at once recognise that “da” might be an approximation of “daddy” (with a reward), and support the child's development from approximation to whole words. “That's quite difficult. That needs to be cross-matched with thousands of articulations from normally-speaking babies,” Barnet explained. Sterling added another layer the system has to learn: “Is 'da' today the same 'da' as the same child said the other day?” Swinburne's BabyLab will help here, with a large collection of the speech samples the GetTalking team needs. Those samples will help GetTalking respond to a word approximation by re-articulating the correct word, “and show the baby a picture of what they're saying”.
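How that cross-matching will work hasn't been published, but a classical baseline for the “does 'da' approximate 'daddy'?” question is dynamic time warping over MFCC features, which tolerates a toddler's stretched or hurried timing. A sketch, assuming the librosa library and a dictionary of reference recordings; the function names are ours, not the team's:

```python
import librosa

def dtw_distance(utterance, reference, sr=16000):
    """Compare two short utterances: extract MFCC features, then
    align them with dynamic time warping so a slow 'daaa' can
    still match a quick 'da'."""
    mfcc_a = librosa.feature.mfcc(y=utterance, sr=sr, n_mfcc=13)
    mfcc_b = librosa.feature.mfcc(y=reference, sr=sr, n_mfcc=13)
    D, _ = librosa.sequence.dtw(X=mfcc_a, Y=mfcc_b, metric='euclidean')
    # Normalise accumulated cost by path length so long and short
    # reference words are comparable.
    return D[-1, -1] / (mfcc_a.shape[1] + mfcc_b.shape[1])

def best_match(utterance, references, sr=16000):
    """Return the reference word whose recording is closest to the
    child's articulation -- e.g. mapping 'da' to 'daddy'."""
    return min(references,
               key=lambda w: dtw_distance(utterance, references[w], sr))
```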

Sterling's previous experience with AI to help children comes from a Swinburne project teaching an off-the-shelf “NAO” robot to help with physiotherapy. Since 2015, he's been part of a team using the familiar humanoid robots to keep children recovering from injuries engaged with their physio. The university's input is to write software specific to physio – for example, demonstrating exercises to children and encouraging them to keep up with it. That work has given Swinburne a handle on how children interact with technology, and while a robot is no replacement for a physiotherapist, “you can't have a health professional with you 24/7”.

As both Barnet and Sterling emphasised, it's impossible to replace the role of the speech therapist or parent. “I've been working in AI research for 35 years,” Sterling said. “People have consistently overestimated what they expect.” Rather than outright automation, Sterling says, most of the time what matters is to provide AI as an aid for people – “how to make a richer experience for people, to help people with their environment”. In the case of GetTalking, one thing he reckons the AI behind the app will do well is diagnose whether or not the child is making progress. “It's a co-design problem; you work with speech therapists, parents, kids – and see what works,” Sterling said.

GetTalking is in its early stages, with support from the National Acoustic Laboratories (the research division of Hearing Australia). After the app development stages, GetTalking will need a clinical trial to demonstrate its effectiveness. Those aren't cheap, but Barnet said she hopes to secure federal funding at that point. Since disadvantage is so strongly associated with holding back children who receive the implants, Barnet's hope is that GetTalking could be free to those who need it.

It's possible that not everything the GetTalking team needs has to be written from scratch. While their speech recognition might have to be built from the ground up, both Barnet and Sterling said the team is looking at how a long-standing project, LENA, could lighten the development load. The LENA Project has its own focus – measuring a child's early language development from birth to 48 months – but some of its components look tantalising: speech recognition and analysis directed at GetTalking's target age group.

Apple never revealed the price it paid to acquire the team that developed Siri, but rumours of US$150 million don't sound unreasonable – and Siri takes its input from someone who already knows how to speak. For all the effort that's gone into speech recognition and AI, the task remains difficult enough that it has been automated for only a couple of per cent of the world's languages.