The Mechanics and Business Value of Speech Technologies
Technology is constantly distracting us. Omar Tawakol of Voicea (formerly Voicera) wants to use voice software to help improve our attention and provide a valuable service to the world of meetings. Meetings may have a terrible reputation, but Voicea is hoping to change that with a new type of voice assistant, EVA, that uses speech technologies to help us both focus our attention fully and record what happens in meetings without distraction.
How Speech Technologies are Transforming Meetings
There are three pillars of voice according to Tawakol:
- Information-centric: This is classic voice tech. There’s no social context, and anyone can ask a question.
- Action-centric: Your smart assistant, Siri or Alexa for example, allows you to use voice not just to find information but to perform a task or action.
- Conversation-centric: This pillar is not consumer-facing or tied to a device. Instead, it uses your conversations with other people to trigger actions.
Gartner predicts that soon meetings will mostly use some type of virtual assistant, making that last type of voice technology even more critical. People spend a ton of time in meetings, but the workflow doesn’t really translate to pure action. Someone has to remember what’s said and decided before anything actually occurs.
Meetings are a huge waste. We can’t necessarily get rid of meetings, but we mostly agree that they’re terrible. For Tawakol, transforming meetings through the targeted use of AI is a critical component of reducing that waste, in two ways:
- Focus: AI can take notes for you so that you don’t forget. It helps you to follow up and streamlines the process. It can also help you to maintain face time so that you aren’t too focused on taking notes and missing the human connection.
- FOMO: Even when you can’t attend the meeting, or you don’t have to, AI can summarize information and send you the vital portions.
Voice Assistants in Business
There were a few critical pieces to building this type of tech. Tawakol and his team had to consider these features to build something that would be helpful and not just another shiny object.
Different Social Cues
When you’re at home, you can speak directly to the bot to get things done. At first, Tawakol’s team followed this lean-forward model. What emerged is that this kind of direct command is socially unacceptable in a business environment. Who is everyone talking to? How does it interrupt the conversation?
They decided to build the model around a lean-back option instead. The idea is that the leader of the meeting can speak to attendees just as they typically would, and the AI assistant picks up on those vocal cues to get things started. There’s no need for a blatant command to take notes. Instead, the assistant listens to the meeting agenda and can move forward more like a human.
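To make the lean-back idea concrete, here is a minimal sketch of picking up cues from ordinary meeting talk rather than from explicit commands. It is not Voicea's implementation: the cue phrases, the `CUE_PATTERNS` table, and the `find_highlights` helper are all hypothetical, and a real system would likely rely on trained models rather than fixed phrases.

```python
import re

# Hypothetical cue phrases (an assumption for illustration, not EVA's real triggers).
CUE_PATTERNS = {
    "action_item": re.compile(r"\b(action item|i will|i'll|let's follow up)\b", re.I),
    "decision": re.compile(r"\b(we decided|we agreed|the decision is)\b", re.I),
}


def find_highlights(utterances):
    """Scan (speaker, text) pairs and label those containing a cue phrase."""
    highlights = []
    for speaker, text in utterances:
        for label, pattern in CUE_PATTERNS.items():
            if pattern.search(text):
                highlights.append((label, speaker, text))
    return highlights


meeting = [
    ("Ana", "Thanks everyone for joining."),
    ("Ben", "I'll send the budget draft by Friday."),
    ("Ana", "Great, we agreed to ship in March."),
]
highlights = find_highlights(meeting)
```

Nobody in this toy meeting addresses the assistant directly, yet Ben's commitment and Ana's decision are both captured as highlights.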
Design Challenges
Despite this lack of explicit commands, the goal of the assistant isn’t to blend into the background the way your home system tries to do. Instead, the assistant makes sure that everyone in the meeting is aware that AI is present.
It emails attendees to announce that it will be at the meeting. It makes a vocal announcement and has an actual video presence through a moving avatar. Anyone who is in the meeting can delete, which protects the individual and provides trust. Plus, ear muffs allow you to turn the assistant off when the time is appropriate.
The team is working on reducing the discoverability of recordings. This also builds trust by eliminating the recording while retaining the notes taken during the meeting. It addresses privacy concerns and the worry that recordings could later be used against the company or an individual.
On top of this, EVA is androgynous. Tawakol believes that the preference for female voices could be an implicit bias. EVA is a character whose default voice is male to avoid replicating that bias. Participants can change it to female if they prefer.
The assistant is there to capture essential moments by being more intuitive. This produces fewer transcripts by highlighting critical points throughout the meeting and allowing participants to relabel the highlights.
Accuracy in Meeting Notes
In real-time meetings, noise is high. Most voice assistants are highly accurate on short, clean audio, a description that fits almost no real-world meeting. Voicea uses an ensemble model that provides better accuracy than any individual engine, including each of the leading voice providers. While there are still plenty of issues and challenges, this is a big step.
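The article does not describe how Voicea's ensemble works, so the following is only a sketch of one simple ensembling idea: run the same audio through several engines and keep the transcript that agrees most with the others. The `pick_consensus` function and the sample transcripts are hypothetical; production systems typically combine engines at the word level with confidence scores.

```python
from difflib import SequenceMatcher


def word_similarity(a, b):
    """Similarity between two transcripts, compared word by word (0.0 to 1.0)."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()


def pick_consensus(hypotheses):
    """Return the transcript with the highest average agreement with the rest."""
    if len(hypotheses) == 1:
        return hypotheses[0]
    best, best_score = None, -1.0
    for i, hyp in enumerate(hypotheses):
        others = [h for j, h in enumerate(hypotheses) if j != i]
        score = sum(word_similarity(hyp, o) for o in others) / len(others)
        if score > best_score:
            best, best_score = hyp, score
    return best


# Made-up outputs from three hypothetical engines for the same utterance.
engine_outputs = [
    "send the report by friday",
    "send the report by friday",
    "sand the reporter by fried day",
]
consensus = pick_consensus(engine_outputs)
```

The intuition is that independent engines tend to make different mistakes, so the reading they agree on is more likely to be right than any single outlier.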
Accuracy isn’t the only metric for quality. Other factors matter too: some mistakes are bigger deals than others, and the clarity and context of the overall recording also count. Still, with this system, Voicea gets us closer to a functional AI assistant that can accurately transcribe meetings without a heavy human labor load.
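One way to express the idea that some mistakes matter more than others is a word error rate in which errors on important words cost more. The sketch below is an illustration under stated assumptions, not a metric the article attributes to Voicea: the `weighted_wer` function, the `key_terms` set, and the weight of 3.0 are all hypothetical choices.

```python
def weighted_wer(reference, hypothesis, key_terms, key_weight=3.0):
    """Word error rate where errors on key reference words cost key_weight."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()

    def cost(word):
        return key_weight if word in key_terms else 1.0

    # dp[i][j] = cheapest way to turn ref[:i] into hyp[:j]
    dp = [[0.0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        dp[i][0] = dp[i - 1][0] + cost(ref[i - 1])  # delete a reference word
    for j in range(1, len(hyp) + 1):
        dp[0][j] = dp[0][j - 1] + 1.0  # insert an extra word
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            match = ref[i - 1] == hyp[j - 1]
            substitute = dp[i - 1][j - 1] + (0.0 if match else cost(ref[i - 1]))
            delete = dp[i - 1][j] + cost(ref[i - 1])
            insert = dp[i][j - 1] + 1.0
            dp[i][j] = min(substitute, delete, insert)

    total = sum(cost(w) for w in ref)
    return dp[-1][-1] / total if total else 0.0


reference = "ship the release on friday"
key_terms = {"release", "friday"}  # assumed to be the words that matter here
minor = weighted_wer(reference, "ship a release on friday", key_terms)
major = weighted_wer(reference, "ship the release on monday", key_terms)
```

Both transcripts contain exactly one wrong word, but getting the deadline wrong scores three times worse than getting an article wrong, which plain word error rate would not show.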
Voicea integrates with a host of other programs, including Slack. When it connects to your workflow, it provides true assistance. Tawakol predicts it could be a mainstay in the enterprise in the next few years, so keep an eye on how it evolves with better and better voice features.