Spoken Voice AI Applications in the Smart Home – with Peter Cahill from Voysis

Raghav Bharadwaj

Raghav serves as an Analyst at Emerj, covering AI trends across major industry updates and conducting qualitative and quantitative research. He previously worked for Frost & Sullivan and Infiniti Research.

Episode Summary: Over the last couple of years, there has been a small but definite shift in businesses' primary interface focus from mobile to voice. With home assistant devices like the Amazon Echo and the Google Home becoming more commonplace, we focus on how voice-based AI applications are being used by businesses today and what this adoption will look like in the future.

In this week’s episode of AI in Industry, we speak with Peter Cahill, the founder and CEO of Voysis, a voice AI platform that enables voice-based natural language instruction, search, and discovery. Peter explores areas where voice-related AI applications are being used by businesses in B2B and B2C spaces today and what this might look like in five years.

Guest: Peter Cahill, Founder and CEO of Voysis

Expertise: Text to speech, Natural language processing

Brief recognition: Peter earned a bachelor’s degree from the Dublin Institute of Technology and a PhD from the University of Dublin in text to speech and next generation localization. He then served as a lecturer at the University of Dublin for three years before founding Voysis.

Big Idea

Peter claims that the next big step after mobile applications in customer-facing verticals is going to be the integration of voice. According to Peter, over the last couple of years, voice has grown from a ‘far-out’ AI application to one that performs within reasonable levels of accuracy for specific use-cases, such as home assistants like the Amazon Echo or the Google Home, while consumers have learned how to use this technology better.

Peter adds that he sees voice becoming ubiquitous in customer-facing applications in the future, because these applications tend to have the data needed to configure a voice platform readily available. Interestingly, Peter also seems to suggest that the home entertainment sector (TVs, for example) will be one of the first adopters of this technology, due to the limited set of functions to be performed (like switching channels or turning the device on or off) and the availability of structured data that the platform needs to train on.

  • For example, a typical cable TV company offers several hundred channels, and a remote is not the most efficient input format in such a situation. In the future, a viewer could just say ‘I want to watch the heavyweight fight tonight’ and the TV would contextually understand the query and switch to the correct sports channel.
  • Other examples for the home entertainment sector include using voice to turn the TV on or off, or just saying ‘I want to listen to Justin Bieber’ and having the platform choose the right tracks and play them from the user’s preferred device in the room that they are currently in.

What makes home entertainment a good fit for voice AI is the relatively small number of tasks that the platform needs to understand and act upon. Integration is also made easier by the fact that the data required to configure the platform is readily available and structured in most cases.
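To illustrate why a small task set helps, here is a minimal Python sketch that maps a transcribed utterance to one of a handful of intents using keyword patterns. It is not how Voysis or any production assistant works; the intent names, the patterns, and the `match_intent` function are all hypothetical, and a real platform would use trained language models rather than regular expressions.

```python
import re

# Hypothetical intent table: a few home-entertainment tasks, each with
# trigger patterns. Named groups capture the "slot" the user mentioned.
INTENTS = {
    "power_on": [r"\bturn (the )?tv on\b", r"\bturn on\b", r"\bswitch on\b"],
    "power_off": [r"\bturn (the )?tv off\b", r"\bturn off\b", r"\bswitch off\b"],
    "play_music": [r"\blisten to (?P<artist>.+)"],
    "watch": [r"\bwatch (?P<program>.+)"],
}


def match_intent(utterance):
    """Return (intent_name, slots) for the first matching pattern.

    Returns (None, {}) when no pattern matches the utterance.
    """
    text = utterance.lower().strip()
    for intent, patterns in INTENTS.items():
        for pattern in patterns:
            found = re.search(pattern, text)
            if found:
                return intent, found.groupdict()
    return None, {}


if __name__ == "__main__":
    for spoken in [
        "I want to watch the heavyweight fight tonight",
        "I want to listen to Justin Bieber",
        "Turn the TV off",
    ]:
        print(spoken, "->", match_intent(spoken))
```

With only four intents, the table stays small enough to review by hand, which is the point Peter makes about limited task ranges. The harder part, resolving a phrase like "the heavyweight fight tonight" to an actual channel, would still depend on structured programme data that this sketch leaves out.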

Because eCommerce shares these characteristics with home entertainment, voice-based AI technology is already being used in the eCommerce industry, for example to improve product search.

Configuring the AI for voice in eCommerce starts with feeding the product catalog into the platform as input data. The next step is tagging that data so that the platform can contextually understand a voiced search query. This tagging is typically human-guided: although a few companies claim to have automated the process, there isn’t a clear proof of concept yet.

For example, in a typical eCommerce application, a customer could just say ‘I want black leather shoes for $50’ and the AI platform would understand that black and leather are adjectives describing the shoe and contextually return search results for all black leather shoes under $50.
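The catalog-then-tagging flow described above can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the Voysis implementation: the `CATALOG` entries, the field names, and the helper functions are all hypothetical, the tags are assumed to have been applied by hand, and the dollar amount in the query is treated as a price ceiling.

```python
import re

# Hypothetical, hand-tagged product catalog (the "data input" step).
CATALOG = [
    {"name": "Oxford dress shoe", "category": "shoes", "color": "black",
     "material": "leather", "price": 45.0},
    {"name": "Canvas sneaker", "category": "shoes", "color": "black",
     "material": "canvas", "price": 30.0},
    {"name": "Leather boot", "category": "shoes", "color": "brown",
     "material": "leather", "price": 48.0},
    {"name": "Premium loafer", "category": "shoes", "color": "black",
     "material": "leather", "price": 120.0},
]


def build_vocabulary(catalog, fields=("category", "color", "material")):
    """Map each tagged value (e.g. 'leather') to its field (e.g. 'material')."""
    vocab = {}
    for item in catalog:
        for field in fields:
            vocab[item[field]] = field
    return vocab


def parse_query(query, vocab):
    """Extract attribute filters and an optional price ceiling from a query."""
    filters = {}
    for word in re.findall(r"[a-z]+", query.lower()):
        if word in vocab:
            filters[vocab[word]] = word
    price = re.search(r"\$(\d+(?:\.\d+)?)", query)
    max_price = float(price.group(1)) if price else None
    return filters, max_price


def search(catalog, filters, max_price):
    """Return items matching every filter and priced at or below max_price."""
    return [
        item for item in catalog
        if all(item[field] == value for field, value in filters.items())
        and (max_price is None or item["price"] <= max_price)
    ]


if __name__ == "__main__":
    vocab = build_vocabulary(CATALOG)
    filters, max_price = parse_query("I want black leather shoes for $50", vocab)
    for item in search(CATALOG, filters, max_price):
        print(item["name"], item["price"])
```

The sketch only recognizes words that appear as tags in the catalog, which is why the quality of the tagging step matters so much: an untagged attribute is invisible to the search.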

Business leaders looking to adopt voice as the next layer of customer-facing operations today should check two things to maximize their chances of successful integration: whether their intended application handles a relatively limited range of tasks, and whether structured data and well-established data collection processes are already in place.

Interview Highlights with Peter Cahill from Voysis

The main questions Peter answered on this topic are listed below. Listeners can use the embedded podcast player (at the top of this post) to jump ahead to sections they might be interested in:

  • (2:00) What’s now possible with AI for voice that was not possible 2 years ago?
  • (13:18) How does the technology actually work? What is involved in adding a voice layer to, say, an eCommerce application?
  • (21:45) When you look ahead 5 years, where do you see voice-based AI applications becoming ubiquitous?

Header image credit: Adobe Stock
