HEAD acoustics Launches VoCAS Speech Recognition Evaluation Software

December 1 2016, 03:10
German company HEAD acoustics has launched VoCAS (Voice Control Analysis System), a software analysis solution for evaluating speech recognition systems. From voice control in vehicles to speech commands for smartphones, tablets or telephone hotlines, VoCAS allows quick and objective quality evaluation of Automatic Speech Recognition (ASR) systems, which are found in an increasing number of applications today, under realistic and repeatable test conditions.

Founded in 1986 by Prof. Dr. Klaus Genuit, HEAD acoustics GmbH is one of the world’s leading companies for integrated acoustics solutions as well as sound and vibration analysis. It is widely recognized for its contributions to the development of hardware and software for measuring, analyzing, and optimizing speech and audio quality, in particular for the automotive and telecommunications industries. The company also develops audio solutions for IT, office, and household appliances, as well as for companies and institutions working in the area of acoustic environment protection. Along with its own research and development work, HEAD acoustics is involved in research projects and cooperates with universities and other scientific institutions worldwide.

The result of extensive development work on speech analysis, HEAD acoustics’ new VoCAS software takes into account crucial factors such as background noise, language or accent, which significantly influence the performance of voice control systems. VoCAS allows the use of predefined test sequences for such ASR systems to determine their quality, to analyze their weaknesses and to optimize them based on the results.
VoCAS configurable test sequences. ©HEAD acoustics GmbH
Depending on the device under test (DUT) and the requested test case, the appropriate test sequence can be defined in VoCAS. From speech commands for vehicle navigation (“Navigate to New York airport”) to speech commands for a call via mobile phone (“Call John Doe”), all possible commands for controlling an ASR system can be evaluated. Each test sequence consists of different elements and is processed sequentially. These elements are, for example, playing test sentences or background noises, inserting pauses for acoustical feedback of the voice control system or the evaluation of the DUT. All elements can be arranged flexibly, added as often as required and adjusted individually (volume, length etc.).
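The structure described above, an ordered list of individually adjustable elements, can be illustrated with a minimal Python sketch. This is not VoCAS code or its API: the class names `Step` and `VoiceTestSequence`, the field names, the file names and the level values are all hypothetical and chosen only to mirror the description.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One element of a test sequence (hypothetical model, not the VoCAS API)."""
    kind: str                # "noise", "speech", "pause" or "evaluate"
    source: str = ""         # audio file or scenario name; empty for pauses
    level_db: float = 0.0    # playback level adjustment (illustrative value)
    duration_s: float = 0.0  # length of the element in seconds


@dataclass
class VoiceTestSequence:
    name: str
    steps: list = field(default_factory=list)

    def add(self, step):
        """Append an element; elements may be added as often as required."""
        self.steps.append(step)
        return self

    def run(self):
        """Walk the elements strictly in order and return a readable log."""
        log = []
        for index, step in enumerate(self.steps, start=1):
            log.append(f"{index}: {step.kind} {step.source}".strip())
        return log


# Hypothetical sequence for a mobile phone call command.
sequence = (
    VoiceTestSequence("call-by-name")
    .add(Step("noise", "cafeteria", level_db=-6.0, duration_s=10.0))
    .add(Step("speech", "call_john_doe.wav", duration_s=2.0))
    .add(Step("pause", duration_s=3.0))  # wait for the system's acoustic feedback
    .add(Step("evaluate", "DUT"))
)

for line in sequence.run():
    print(line)
```

Because the sequence is plain ordered data, running it twice yields the same order of elements, which is the property that makes a test repeatable.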

Each test sequence is reproducible. For background noise, a wide range of realistic sound scenarios is available (e.g. cafeteria, vehicle, train station). The user can test the measurement object with various parameter sets, such as different speakers, languages, background noises, destination addresses or persons to be called.
Example of a voice test sequence for mobile phones using Voice Control Analysis Software (VoCAS). ©HEAD acoustics GmbH

Testing different voice control systems requires audio source databases with appropriate speech commands. Audio databases in VoCAS can be individually expanded by importing the user's own speech recordings. In addition, VoCAS provides an integrated recorder for recording individual speech commands quickly and easily. Larger lists of imported or recorded audio files can be cut, filtered and adjusted to defined speech levels automatically. VoCAS also offers the possibility to manually tag each speech command with keywords. Recordings often contain the same command but exist in different acoustic variants, because different languages, speakers or accents were used. With the help of the tagging system, VoCAS systematically guides the user through the requested variants, creates the appropriate measurement sequence and helps to keep an overview.
The solution also provides a clear representation of test results. Both percentage values (e.g. 60 % of speech commands recognized, 40 % not recognized) and color highlighting can be chosen for easier interpretation. Furthermore, a direct comparison of different voice control systems is possible. All available attributes (e.g. utterance, speaker, language, background noise) can be selected for result presentation. This enables the user to check which test sentences have passed or failed the test for given attributes. The results can be exported to Microsoft Excel for further post-processing.
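The arithmetic behind such a result view is simple and can be sketched in Python. This is only an illustration, not how VoCAS computes or stores results: the function names, the record layout and the five sample results are invented for the example, with the sample chosen so that the overall rate matches the 60 % figure mentioned above.

```python
from collections import defaultdict


def overall_rate(results):
    """Percentage of speech commands that were recognized."""
    recognized = sum(1 for r in results if r["recognized"])
    return 100.0 * recognized / len(results)


def rate_by_attribute(results, attribute):
    """Recognition rate in percent, grouped by one attribute (e.g. 'noise')."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for r in results:
        key = r[attribute]
        totals[key] += 1
        if r["recognized"]:
            hits[key] += 1
    return {key: 100.0 * hits[key] / totals[key] for key in totals}


# Invented sample data, for illustration only.
results = [
    {"utterance": "Call John Doe", "speaker": "A", "noise": "cafeteria", "recognized": True},
    {"utterance": "Call John Doe", "speaker": "B", "noise": "cafeteria", "recognized": False},
    {"utterance": "Call John Doe", "speaker": "A", "noise": "vehicle", "recognized": True},
    {"utterance": "Navigate to New York airport", "speaker": "A", "noise": "vehicle", "recognized": True},
    {"utterance": "Navigate to New York airport", "speaker": "B", "noise": "vehicle", "recognized": False},
]

print(overall_rate(results))
print(rate_by_attribute(results, "noise"))
print(rate_by_attribute(results, "speaker"))
```

Grouping by a different attribute only changes the key passed to `rate_by_attribute`, which is why a single set of test runs can be viewed per utterance, speaker, language or background noise.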
VoCAS allows fast benchmarking of different ASR systems and software versions under realistic and reproducible test conditions. The analysis software is compatible with other HEAD acoustics products. The front end MFE VI.1 can be controlled via VoCAS for playback and monitoring of speech recordings via the artificial head measurement system and for mouth equalization. Furthermore, the background noise simulation systems 3PASS, HAE-BGN or HAE-car can be managed via the Voice Control Analysis Software.