Multilingual AI voice and dubbing startup pyannoteAI has announced that it has raised EUR 8.1m (USD 8.9m) in seed funding to grow and scale its core technology — speaker diarization — in multiple languages. Speaker diarization uses AI to understand who said what in a conversation.
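Concretely, a diarization system takes audio and returns time-stamped segments labeled by anonymous speaker identities, which can then be paired with a transcript to attribute each utterance. A minimal sketch of that idea in Python (the segment layout and `who_said_what` helper are illustrative assumptions, not pyannoteAI's actual API):

```python
# Illustrative sketch of diarization output: (start, end, speaker) segments.
# Names and data here are hypothetical, not pyannoteAI's real interface.
from dataclasses import dataclass

@dataclass
class Segment:
    start: float   # segment start, in seconds
    end: float     # segment end, in seconds
    speaker: str   # anonymous label, e.g. "SPEAKER_00"

def who_said_what(segments, words):
    """Attach each transcribed word (timestamp, text) to the speaker
    whose diarization segment contains that timestamp."""
    attributed = []
    for t, text in words:
        speaker = next(
            (s.speaker for s in segments if s.start <= t < s.end),
            "UNKNOWN",
        )
        attributed.append((speaker, text))
    return attributed

segments = [
    Segment(0.0, 2.5, "SPEAKER_00"),
    Segment(2.5, 4.0, "SPEAKER_01"),
]
words = [(0.4, "Hello"), (1.1, "everyone"), (2.8, "Hi")]
print(who_said_what(segments, words))
# → [('SPEAKER_00', 'Hello'), ('SPEAKER_00', 'everyone'), ('SPEAKER_01', 'Hi')]
```

Real systems must also handle overlapping speech and rapid turn-taking, which is exactly the "chaos" the company describes below.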
The round was led by Crane Venture Partners and Serena, and included participation from angel investors Julien Chaumond, CTO at Hugging Face, and Alexis Conneau, ex-Research Scientist at Meta and Co-creator of the Advanced Voice Mode neural network at OpenAI.
Vincent Molina, Co-Founder and CEO of pyannoteAI, told Slator, “The AI voice industry today is mostly focused on 1:1 conversations between humans and AI. But real-life conversations aren’t like that. They’re full of multi-speaker situations, [with] overlapping speech, interruptions, and short and chaotic speech turns.”
“For AI to truly become part of our daily lives, it has to thrive in that chaos. Focusing on the speaker is the key to unlocking the full value of spoken conversations,” he stated.
pyannoteAI aims to solve the “speaker problem,” transforming how AI listens, processes, and understands those conversations, using speaker identification to reveal meaning in voice interactions.
Molina told Slator that “the entire voice AI landscape is accelerating — across all layers: infrastructure, applications, models, and more. Since we’re positioned at the very beginning of the value chain, and because we serve the entire ecosystem, we received strong interest from a wide range of investors. We encountered a lot of traction from investors across the EU, UK, and US.”
“We see more and more demand for streaming capabilities to process voice in real-time, such as live dubbing of events.” — Vincent Molina, Co-Founder and CEO, pyannoteAI
Multilingual Applications
According to Molina, pyannoteAI supports “all languages, as it is based on voice sound. We are building a platform that empowers developers across the globe to build AI products based on voice, […] [and] we see more and more demand for streaming capabilities to process voice in real-time, such as live dubbing of events.”
“We are providing the tools for developers to build voice-based products that can bring all of AI’s potential to our everyday life. We want to accelerate and serve use cases everywhere it matters for our customers, which means through an API, on-premise, and on-device for application use cases,” he added.
Molina told Slator that the company already has “strong traction everywhere in the US, EU, India, and Japan, serving various use cases such as healthcare consultations, courtroom transcriptions, meetings, customer service, [and more].”
pyannoteAI confirmed that it will use the funds to grow its community, build a commercial platform, and hire voice AI research and engineering experts to deliver its “Speaker Platform” vision of deeper speaker insights.