Speechmatics Introduces Flow API for Advanced Speech Interactions

July 30 2024, 02:10

Speechmatics, a pioneer in speech recognition and currently a leader in multilanguage transcription technology, has announced the launch of an API that will enable developers to build voice interactions into any product, including AI assistants and agents. Flow combines Speechmatics' real-time automatic speech recognition (ASR) with large language models (LLMs) and text-to-speech capabilities, offering a complete solution for voice-based interactions that are accurate, responsive, and secure.

Having always focused on enterprise applications, Speechmatics knows how companies have long struggled to implement voice assistants that can accurately understand diverse accents and languages, maintain a natural conversation flow, and ensure data privacy. Existing solutions often fall short in accuracy, latency, or flexibility, limiting their effectiveness in real-world business applications. 

As the Cambridge, UK-based company explains, Flow is built on the foundations of Speechmatics' ASR technology, which understands speech in 50 languages, across diverse accents, and even in noisy environments. With secure infrastructure, low latency, and the ability to integrate with any preferred LLM, Flow offers flexibility and security for enterprise-ready voice interactions.
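To make the architecture concrete, the combination described here (speech recognition feeding a language model, whose reply is turned back into speech) can be pictured as three swappable stages. The following is only a minimal sketch under stated assumptions: `VoicePipeline`, `fake_asr`, `fake_llm`, and `fake_tts` are hypothetical stand-ins invented for illustration and are not part of the Flow API, whose actual interface is not described in the announcement.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures; none of these names come from Speechmatics.
Transcriber = Callable[[bytes], str]   # audio in, text out (ASR)
Responder = Callable[[str], str]       # user text in, reply text out (LLM)
Synthesizer = Callable[[str], bytes]   # reply text in, audio out (TTS)


@dataclass
class VoicePipeline:
    """Chains ASR -> LLM -> TTS; each stage can be replaced independently."""
    transcribe: Transcriber
    respond: Responder
    synthesize: Synthesizer

    def handle_turn(self, audio: bytes) -> bytes:
        text = self.transcribe(audio)    # speech to text
        reply = self.respond(text)       # text to reply
        return self.synthesize(reply)    # reply to speech


# Toy stages that treat UTF-8 bytes as "audio" so the sketch runs offline.
def fake_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoicePipeline(fake_asr, fake_llm, fake_tts)
reply_audio = pipeline.handle_turn(b"hello")

# Swapping in a different "LLM" leaves the other two stages untouched.
shouting = VoicePipeline(fake_asr, lambda t: t.upper(), fake_tts)
shouted_audio = shouting.handle_turn(b"hello")
```

Keeping the language-model stage behind a plain function boundary is one simple way to picture what "any preferred LLM" could mean in practice; a production system would also stream audio rather than process whole turns.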

"Flow represents a significant leap forward in enterprise voice technology," says Trevor Back, Speechmatics Chief Product Officer. "By combining our world-class ASR with advanced conversational AI capabilities, we're enabling businesses to create more natural, efficient, and secure voice interactions across a wide range of applications."

Virtual Assistants
Using Speechmatics' best-in-class real-time engine, Flow enables the creation of virtual assistants that respond quickly to questions, hear and understand every word as it is being said, and handle interruptions and crosstalk extremely well. Because the API supports speaker identification and diarization (the process of partitioning an audio stream containing human speech into segments according to the identity of each speaker), an assistant can address multiple people by name, ignore speakers until called upon, or ignore background voices even when they are clearly heard, depending on the situation.
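As an illustration of how diarization labels could drive that behavior, here is a minimal sketch, assuming the transcript arrives as speaker-labelled segments. The `Segment` type, the speaker labels, the `select_addressed` function, and the wake-word rule are all hypothetical and chosen for illustration; they do not reflect how Flow itself decides whom to listen to.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass
class Segment:
    speaker: str  # hypothetical diarization label, e.g. "S1"
    text: str     # transcribed words for this segment


def select_addressed(
    segments: Iterable[Segment],
    wake_word: str = "assistant",
    ignored: FrozenSet[str] = frozenset(),
) -> List[Segment]:
    """Keep segments from speakers who have called on the assistant.

    A speaker is ignored until one of their segments contains the wake
    word; speakers listed in `ignored` (e.g. a television) are always
    dropped, even if the wake word is heard clearly in their speech.
    """
    active = set()
    kept = []
    for seg in segments:
        if seg.speaker in ignored:
            continue                      # background voice: never respond
        if wake_word in seg.text.lower():
            active.add(seg.speaker)       # this speaker has called on us
        if seg.speaker in active:
            kept.append(seg)
    return kept


conversation = [
    Segment("S1", "What time is the meeting"),
    Segment("S2", "Assistant, play some music"),
    Segment("TV", "Assistant sales are rising, analysts say"),
    Segment("S2", "A bit louder please"),
    Segment("S1", "I still think it is at three"),
]
addressed = select_addressed(conversation, ignored=frozenset({"TV"}))
```

In this toy conversation only the two segments from speaker S2 are kept: S1 never calls on the assistant, and the television is excluded outright.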

Developers can integrate Flow into existing products and services through an API, allowing for quick deployment and customization to meet specific business needs. Flow also supports custom prompts for tailoring the assistant to specific customer needs, and it will be able to draw on internal documentation to give accurate responses to specific customer queries.

This new approach for Speechmatics builds upon the company's extensive experience in speech technology, combining the latest breakthroughs in AI and ML to accurately understand and transcribe human speech into text in real time. Because Speechmatics has long powered transcription for large media organizations, from real-time live captioning to metadata generation for large television archives, Flow benefits from the extensive vocabulary built up through that work. Speechmatics processes over 500 years of transcription worldwide every month, in 50 languages, and can translate 69 language pairs.

Speechmatics pioneered machine learning in speech recognition, and its neural networks consider acoustics, languages, dialects, multiple speakers, punctuation, capitalization, context, and implicit meanings. Combining the latest AI-driven speech capabilities allows Speechmatics to offer a solution that provides summaries, topics, sentiment, translation, and more, across use cases and industries.

The range of use cases for the Flow API is vast. It includes generalist AI assistants, but it can also be used for any AI agent or product that would benefit from people being able to speak to it. Speechmatics has just opened up a waitlist for Flow, before a general release later in 2024.
www.speechmatics.com/flow

About Joao Martins

Since 2013, Joao Martins has led audioXpress as editor-in-chief of the US-based magazine and website, the leading audio electronics, audio product development and design publication, while also working as international editor for Voice Coil, the leading periodical for...