Alireza Kenarsari-Anhari: Picovoice Enabling Developers To Train and Deploy State-of-the-Art On-Device Speech Models

Updated: November 21, 2022
Categories: Interviews
Made in CA Exclusive Interview

The views and opinions expressed in the interviews published on Made in CA are those of the interviewees and do not reflect the official policy or position of Made in CA.

The information provided through these interviews is for informational purposes only and does not constitute an endorsement or recommendation of any products, services, or individuals featured. We strongly encourage readers to consult with appropriate professionals or authorities in the relevant fields for accurate information and advice.

Picovoice is the first and only ubiquitous on-device voice AI platform.

Picovoice offers speech-to-text, voice search, wake word, speech-to-intent (intent detection), and voice activity detection engines. Its stack can run on anything from embedded devices to web browsers, providing an immersive experience not achievable by any FAANG.

Picovoice’s mission is to be “the developer-first platform for adding voice to anything.”

Tell us about yourself.

I decided to start Picovoice when I was a senior engineer at Amazon. I had expertise and patents in deep learning and speech recognition. Voice AI, especially in the consumer space with Alexa and Siri, was growing fast, and there were flaws in the market. Only big tech had access to machine learning experts to train state-of-the-art voice models. I wanted to build a voice AI platform that enables millions of other enterprises that don’t have FAANG resources.

If you could go back in time a year or two, what piece of advice would you give yourself?

Everybody says running, or even working at, a startup is different from working at a FAANG company. Most people who leave FAANG for smaller companies do so to have more impact, because there is only so much you can do at a big organization where you’re one of tens of thousands of employees. I felt the same. However, startups can be a culture shock for successful FAANG employees. Your impact is significant even when the results are not good. It requires mental toughness, grit, and resilience. You have to get things done with almost no resources, which takes agility and adaptability. Startups have to innovate faster to compete with FAANG, which has brand and money. It feels like you are sprinting a marathon. Plus, there are no fancy employee perks. It sounds crazy, yet it’s crazier than it sounds. It’s not for everybody.

What problem does your business solve?

Picovoice enables developers to train and deploy state-of-the-art on-device speech models. It follows modern development principles and enables organizations to apply user-centric and iterative processes to streamline the user experience for building accurate, private, and cost-effective voice products. Picovoice offers ubiquitous voice products that run across platforms and that are ready to be deployed anywhere, anytime, and instantly.

What is the inspiration behind your business?

Before Picovoice, voice technology was exclusive to a handful of multinationals. Most enterprises couldn’t afford it, and most developers didn’t have access to state-of-the-art models. Even if you could afford it, you had to go through a long and exhausting sales process. I founded Picovoice to build the developer-first platform for adding voice to anything.

Today, what developers can do with Picovoice in a day used to take months, if not years. Picovoice serves everyone: hobbyists, early-stage startups, and Fortune 500 companies.

What is your magic sauce?

Picovoice technology brings an innovative and fresh approach to voice recognition. The magic sauce is that we dare to do things differently. While everybody is focused on building accurate speech models, we’re focused on building speech models that are both accurate and efficient. We have the smallest models on the market, yet they beat cloud accuracy. Hence the name Pico, and the ability to run across platforms.

What is the plan for the next 5 years? What do you want to achieve?

Our north star is to be the voice AI platform for developers. We are working to enable more developers and more applications. We have a packed roadmap with new products, features, and languages. Recently, after introducing the Free Tier with no platform, product, SDK, or time limit, we announced a startup program that offers a 90 per cent discount.

What is the biggest challenge you’ve faced so far?

Picovoice prides itself on running across platforms without sacrificing accuracy, privacy, affordability, or reliability. That is very hard to achieve for open-domain, large-vocabulary speech recognition, i.e., speech-to-text. Building on-device speech-to-text models that beat cloud accuracy was the toughest technical challenge we faced. We were able to fit hundreds of thousands of words in 20 MB, but it was not easy.

How can people get involved?

For more information, please visit www.picovoice.ai.

Everyone can create a free Picovoice Console account and start building in seconds. Enterprises can engage commercially by contacting sales.