Jun 28, 2019
“I create personalities for the computers that people talk to.” This quote comes from Wally Brill, Google’s Head of Conversation Design Advocacy & Education, who joins us this week on the Shiny New Object Podcast. In this conversation, he takes us on a journey from record producer to interactive opera composer and explains how he ended up leading Voice design for Google.
Wally points out that being able to hold a conversation does not qualify you to teach a computer how to speak. There’s a “deeper layer” to conversation that we don’t always notice. For example, when we talk to each other we use “micro-expressions”, which act as conversational cues and indicate our underlying emotions. But Voice assistants can’t see these micro-expressions. It’s as if Google Home or Alexa only ever talk to you with their eyes closed. Wally’s job is to design human-like conversations for Google Assistant even though it doesn’t see and speak as we do. We aren’t born with the ability to read a room; we learn it from experience, which is why we have to teach it to computers. Wally goes deep into his craft, explaining the importance of ‘Prosody’, which “is the melody of speech.” This melody is nuanced and not obvious to everyone, which is why there’s an abundance of ex-music professionals working in Voice User Experience design.
Wally’s Shiny New Object is Text-to-Speech. Midway through 2018, Google presented its ‘Duplex’ project, an AI function that talks over the phone like a human on your behalf, so you can make dinner reservations without actually speaking to anyone yourself. This illustrates Wally’s stance that speech is the easiest interface to use. Learning how to use a computer is difficult; talking isn’t. He believes that, eventually, Voice will be sophisticated enough to understand all of the nuances in language, resulting in a seamless experience.
In a heated debate, Wally and Tom discuss the ways in which these advancements in Text-to-Speech will help people in real need. Voice Tech should be solving issues that affect people who can’t afford or don’t have access to smart speakers. Wally tells us that Text-to-Speech is helping illiterate people communicate and consume information through their mobiles, giving them access to the internet in a way that wasn’t possible before. This could make a tangible difference to the lives of millions of people, not just those who would like ordering a Frappuccino on UberEats to be a tad easier.
You can hear more from Wally in person at the brilliant www.madfestlondon.com/picnic on the 10th of July.
This podcast is sponsored by Khoros.com. Khoros was very recently born from the coming together of Spredfast and Lithium. Khoros is a software company with over 15 years of leadership in marketing, care, and communities. It now offers one platform that unites marketing and care and helps brands create customers for life.
A big thank you to pavegen.com/ for providing the venue for this podcast. Pavegen are a UK energy and data firm who are working on converting footfall into energy that is stored off-grid.
Written by Jack Mitchell
Subscribe to the Shiny New Object podcast on Apple Podcasts here -
Listen to the Shiny New Object podcast on Spotify here - https://open.spotify.com/show/03SUtq4qPFOhz0MYTAdOTX?si=1LLDVPy3Tp-o435EwVBwIw