A year ago, we explored new microspeaker technologies and the challenges they face in transforming the consumer electronics industry, just as microelectromechanical systems (MEMS) did when they displaced electret condenser microphones (ECMs). A year later, while we all have had more than our share of “distractions,” we will expand this view to new MEMS devices for audio and discuss the implications of other game-changing technologies that reduce size, increase performance, extend battery life, and enhance features and functionality.
When first-generation hearables (e.g., Doppler Labs’ Here One) hit the market, the promise was a device that could be worn extensively and become as much a part of you as your smartphone. Great, except the battery life was an hour or so! For sure, batteries are not going to achieve 10× energy density anytime soon, but in the last year there have been steps forward on device runtime, from Qualcomm’s QCC Bluetooth SoCs to Vesper’s “wake-on-sound” and Ambiq’s ultra-low power Apollo4 SPOT Arm processors. Most of these products have design wins and will reach consumer products in 2021.
Small Devices
Let’s start our exploration with MEMS, the technology of very small devices, usually consisting of a micro-sensor/transducer and an application-specific integrated circuit (ASIC) that amplifies and otherwise processes the signal or data. High-tech is associated with fast-moving development, yet MEMS evolution and ramp-up over the last three decades have been slow and painful.
The semiconductor industry’s favorite joke regarding MEMS development roadmaps is that they are calculated in dog years (each one equal to seven human years). However, MEMS devices became practical when they could be manufactured with high yields using integrated circuit (IC) fabrication and device packaging processes. With every smartphone boasting three or four MEMS mics, and every laptop, tablet, and smart speaker carrying its own share, the processes now provide the foundation for ever more ambitious applications.
The global audio market continues its growth path. According to market research firm Yole Développement, the global consumer market for microphones, microspeakers, and audio ICs is set to grow at a healthy compound annual growth rate (CAGR) of 6.6%, from $14.1 billion in 2018 to $20.8 billion in 2024. The microspeaker market, which now amounts to $9.1 billion, is expected to grow at a CAGR of 3% to $10.9 billion in 2024.
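As a quick sanity check on the growth figures quoted above, the compound annual growth rate can be recomputed from the start and end values. This is a minimal sketch, assuming both forecasts cover the six years from 2018 to 2024; the function name `cagr` is illustrative and not taken from the Yole report.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.066 means 6.6%)."""
    return (end_value / start_value) ** (1 / years) - 1

# Figures quoted in the text, in billions of US dollars.
# Assumption: both forecasts run over the six years from 2018 to 2024.
total_audio = cagr(14.1, 20.8, 6)
microspeakers = cagr(9.1, 10.9, 6)

print(f"Audio market CAGR: {total_audio:.1%}")
print(f"Microspeaker CAGR: {microspeakers:.1%}")
```

Under that assumption, both results land within rounding distance of the quoted 6.6% and 3% rates.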
However, it has taken more than 20 years for MEMS mics to fully displace ECMs. On one front, MEMS mics have continued to slowly improve in stability, signal-to-noise ratio, and acoustic overload point. At the same time, there are quite a few critical steps in MEMS fabrication, and getting high yields on every step always seemed to be another development phase away. It took more than 20 years to produce the first billion MEMS microphones and two years to produce the second billion, whereas production over the last few years has reached 1 billion units per month.
Today, MEMS microphones totally dominate smartphones, tablets, laptops, portable media players, speech recognition systems, personal computers, surveillance cameras, 3D cameras, radars, anti-theft alarms, headphones, smart speakers, music recorders, and various smart home voice command appliances, including air conditioners, refrigerators, and service robots. But concurrently the most dramatic progress is in lower power consumption, multi-functionality, and on-board functions as an alternative to connecting to the cloud.
Diverse Technologies
This time around we see commercialization of an eclectic group of diverse technologies, some not precisely MEMS but all relevant for earphone, microspeaker, and/or consumer audio applications. Most have gone through significant development and manufacturing barriers to reach the edge of mass production. But I don’t need a crystal ball for this, just a rear-view mirror to see how the consumer electronics industry was transformed the last time around by the switch from ECMs to MEMS.
MEMS devices are not just microphones, but accelerometers, vibration/shock sensors (e.g., burglar alarms and air bag sensors), gyros, and now microspeakers and earphone transducers. The implementation of MEMS smartphone receivers, earphones, and microspeakers is daunting compared to mics due to the far higher excursion requirements. The first MEMS speakers are now becoming viable commercial products, but we also need to consider their practical applications, unit costs, and acoustical strengths and weaknesses.
The microspeaker and earphone driver market is about $10 billion annually. The appeal of shifting production, even automated speaker production lines, over to semiconductor foundries is mind-boggling. The titans of microspeaker manufacturing typically have about 50,000 employees, while MEMS foundries producing similar quantities of devices have staffs of fewer than 500. Yes, the wafers from the foundry will still need to be “packaged,” but a couple of zeros in workforce numbers will still be eliminated. And then there is the promise (already being delivered by USound) of automated pick-and-place of MEMS speakers for surface-mount technology (SMT) population and reflow, rather than hand soldering of billions of speakers. With salaries rising in China, MEMS microspeakers will have a dramatic impact on staffing, along with other far-reaching implications.
While MEMS microphones have taken the lion’s share of the microphone market, microspeaker transducers for earphones, smartphone receivers, and most other speakers remain almost 100% electrodynamic (a magnetic structure with a voice coil). The micro-mechanism in a MEMS mic only needs enough movement to respond to the acoustic signal, whereas MEMS speakers need to move air. The typical smartphone speaker diaphragm footprint is 10mm×15mm (the TWS earphone driver has about a 6mm diameter), with about 0.5mm to 0.7mm Xmax peak excursion.
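To put the "move air" requirement in numbers, the dimensions above can be turned into a rough swept volume. This is only a sketch under stated assumptions: the diaphragm is treated as a rigid piston whose effective area equals the full footprint, and the TWS driver is assumed to share the same excursion range, which the text does not state separately. The function name is illustrative.

```python
import math

def swept_volume_mm3(area_mm2: float, peak_excursion_mm: float) -> float:
    """Air volume displaced in one direction by a rigid piston (area x stroke)."""
    return area_mm2 * peak_excursion_mm

# Smartphone speaker: 10 mm x 15 mm footprint, 0.5 mm to 0.7 mm Xmax.
phone_area = 10.0 * 15.0
phone_low = swept_volume_mm3(phone_area, 0.5)
phone_high = swept_volume_mm3(phone_area, 0.7)

# TWS driver: about 6 mm diameter.
# Assumption: same excursion as the low end of the smartphone range.
tws_area = math.pi * (6.0 / 2) ** 2
tws_low = swept_volume_mm3(tws_area, 0.5)

print(f"Smartphone speaker: {phone_low:.0f} to {phone_high:.0f} mm^3")
print(f"TWS driver (assumed 0.5 mm Xmax): {tws_low:.1f} mm^3")
```

A real diaphragm has a smaller effective radiating area than its footprint, so these figures are upper bounds rather than measured values.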
The air-moving power of MEMS speakers is significantly less than that of even the lowest-performance microspeakers. Some of the new contenders point out that their sound production technology does not follow the conventional physics of moving-diaphragm transducers. Just a caveat here: perhaps the rules are different, but they may bring other challenges with them. MEMS speakers promise to be ideal receivers for true wireless stereo (TWS) earphones, and are especially appealing for in-ear monitors (IEMs) that are two-way or more (with a “woofer”), in-canal hearing aids, and implantable hearing devices (cochlear and auditory brainstem implants).
These applications require only a very small "air pumping volume" for adequate acoustic output due to the enclosed duct and close proximity to the middle ear. A more ambitious step is in-canal IEM earphones, which require not much more acoustic output than implantable transducers. IEM tweeter transducers are another application (many IEM earphones are two-way or more designs using balanced armature drivers). The first MEMS speakers are already on the market today, with USound having just launched its second-generation piezo MEMS devices, xMEMS sampling earphone MEMS speakers, and Arioso close behind.
So, let’s have a look at the companies that are exploring this technology.
Ambiq Micro
www.ambiq.com
Ambiq’s place in the sun is its ultra-low power microcontroller (MCU) Systems-on-Chip (SoCs), and the company has recently launched the Apollo4 system processors. A complete hardware and software solution enables battery-powered devices (e.g., TWS earphones) to achieve a higher level of intelligence and the extended run times required for always-on applications.
The Apollo4 is an application processor/coprocessor for battery-powered devices, including TWS earphones, smartwatches, fitness bands, far-field voice remotes, predictive health and maintenance devices, smart security devices, and smart home devices. It leverages Arm’s Cortex-M4 and Arm Artisan physical IP for wide dynamic range and low-power operation. With up to 2MB of MRAM and 1.8MB of SRAM, the Apollo4 has more than enough compute and storage to handle complex algorithms and neural networks. External memory is supported through multi-bit SPI and eMMC interfaces. The Apollo4 is available now with CSP and BGA offerings, as well as an Apollo4 Blue with Bluetooth LE.
Arioso Systems GmbH
www.arioso-systems.com
Arioso Systems GmbH, a spin-off from Fraunhofer Institute for Photonic Microsystems (IPMS) in Germany, is continuing the development and commercialization of a novel disruptive microspeaker chip and amplifier technology for audio products. Targeting in-ear headphones and hearables, the pure silicon-MEMS based technology enables fast scaling to mass markets. It is based on a low-voltage electrostatic MEMS drive compatible with mainstream CMOS manufacturing technologies. Dozens of moving beams located in the interior of a chip replace the diaphragm membrane. These sound-generating beams consist of three electrodes within an electrostatic field, which cause a force along each beam and an in-plane deflection. This principle is based on the patented Nanoscopic Electrostatic Drive (NED) from the Fraunhofer IPMS and is a bit reminiscent of the squeezing action of the Heil air motion transformer (AMT).
Arioso plans to shrink the TWS microspeaker from around 400mm³ to less than 50mm³, reducing the overall size while making room for power-demanding features such as additional sensors or chips for edge computing in the TWS earphone.
Audio Pixels
www.audiopixels.com.au
Audio Pixels, founded in July 2006, has developed a revolutionary technological platform for reproducing sound. Its patented Digital Sound Reconstruction platform employs ultrasonic techniques to generate sound waves directly from a digital audio stream using MEMS rather than conventional speakers. This innovation, still in the developmental phase, promises speaker products that deliver better performance than conventional speaker technologies in a 1mm thick package. During 2020, Audio Pixels made considerable progress, getting ever closer to releasing its new technology.
Axign B.V.
www.axign.nl
Axign’s ticket for inclusion in this directory is its development of the AX5689 Digital Audio Converter and Amplifier Controller chip. Mainstream switching Class-D audio amplifiers are well-known for their high efficiency, not for their audiophile reproduction, until now. Axign’s AX5689-based solutions can be configured for 1 to 16+ channels, from low to high power per channel, and can be optimized for performance and cost. Axign offers cost-competitive solutions with superb audio quality for active speakers, smart speakers, party speakers, TV soundbars, streaming audio amplifiers, subwoofers, and more.
The AX5689 works through a “post filter feedback” solution in combination with a wideband digital control loop. Axign technology places the feedback point behind the low-pass output filter, across the loudspeaker terminals, and suppresses audible artifacts with a stable fifth-order digital control loop. This results in a loudspeaker-independent frequency response with full control over the loudspeaker.
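Why feedback taken after the output filter makes the response less dependent on the loudspeaker can be illustrated with a simplified frequency-domain sketch. Everything here is an assumption for illustration: the filter component values are hypothetical, the loudspeaker is modeled as a pure resistance, and the loop is a plain proportional gain rather than Axign's fifth-order digital control loop. Stability is not examined.

```python
import math

def lc_filter_gain(freq_hz, load_ohm, inductance=10e-6, capacitance=1e-6):
    """Complex gain of a second-order LC low-pass filter driving a resistive load."""
    s = 2j * math.pi * freq_hz
    return 1 / (1 + s * inductance / load_ohm + s**2 * inductance * capacitance)

def closed_loop_gain(freq_hz, load_ohm, loop_gain=100.0):
    """Gain when feedback is taken after the filter, at the speaker terminals."""
    forward = loop_gain * lc_filter_gain(freq_hz, load_ohm)
    return forward / (1 + forward)

def spread_db(gain_fn, freq_hz, loads=(4.0, 8.0)):
    """Level difference in dB between two load resistances at one frequency."""
    low, high = (abs(gain_fn(freq_hz, r)) for r in loads)
    return abs(20 * math.log10(high / low))

open_loop_spread = spread_db(lc_filter_gain, 20_000)
closed_loop_spread = spread_db(closed_loop_gain, 20_000)

print(f"Without feedback: {open_loop_spread:.2f} dB between 4 and 8 ohm loads")
print(f"Post-filter feedback: {closed_loop_spread:.5f} dB between 4 and 8 ohm loads")
```

With the filter inside the loop, the load-dependent term is divided by the loop gain, so the difference between the two loads at 20kHz shrinks from a few tenths of a dB to a negligible amount in this toy model.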
Even difficult-to-drive loudspeakers will sound balanced and detailed, resulting in Class-A audio performance at Class-D efficiency. Axign has the credentials for these ambitious goals, having been founded in 2014 by former Philips and NXP employees who worked on pioneering Class-D efforts. Market acceptance has also validated Axign’s claims, with high-volume mass production programs reaching the market early in Q4 2020, led by Harman Kardon’s new Citation AMP.
Fraunhofer Institute for Silicon Technology (ISIT)
www.isit.fraunhofer.de
Inside Fraunhofer there are around 72 institutes and several of them are doing research on MEMS loudspeakers. Arioso (mentioned above) is a spin-off of Fraunhofer IPMS and is responsible for the transfer of the electrostatic MEMS speaker to the market.
Development continues on a piezoelectric MEMS speaker with concentrically cascaded lead zirconate titanate actuators, making it the first integrated two-way MEMS speaker. Based on this concept, the first speaker prototypes have been fabricated.
Fraunhofer ISIT and Fraunhofer Institute for Digital Media Technology (IDMT) introduced the world’s first fully integrated MEMS microloudspeaker technology at the 145th AES Convention in New York in 2018. Since then the technology has been successfully demonstrated for different applications including in-ear headphone systems. The core element is a chip loudspeaker manufactured solely using standard and thus very cost-efficient MEMS technology. It features high-efficiency piezoelectric bending actuators, which are directly used for sound generation. In combination with a dedicated driving and signal processing unit, the MEMS loudspeakers enable impressive performance at very small device size.
Presently both institutes are further optimizing the MEMS speaker technology. ISIT is currently developing a new MEMS speaker generation featuring 120dB SPL, even smaller size, higher sensitivity and full CMOS compatibility. Developments at IDMT focus on new low-noise driving electronics with adaptive signal processing enabling auto frequency response adjustment to any target curve.
GraphAudio
www.graphaudio.com
GraphAudio’s electrostatic driver boasts a pure graphene diaphragm functioning as part of the “motor.” Its initial product is an earphone using a graphene diaphragm sandwiched between electrodes. It’s essentially an electrostatic speaker, but graphene is used instead of a metalized polymer film diaphragm. The ability to power graphene earphones and speakers using conventional mobile battery power expands the application beyond just the boutique end of the market. Demonstration earphones have been produced and shown with audiophile-quality results. Work on its MEMS implementation continues.
Nanusens
www.nanusens.com
Nanusens’ significant contribution to audio MEMS technology is combining sensors built inside CMOS with the adjunct circuitry, all on the same die. All other MEMS devices wire together the transducer and a separate circuit (ASIC) inside a single package. Nanusens’ patent-pending technology shrinks MEMS sensors and fabricates nanoscale sensors (NEMS or Nano Electro Mechanical Systems) along with the control electronics using standard CMOS processes. This creates single-chip solutions that are significantly smaller than the equivalent multi-part MEMS.
The freed-up space can be used for larger batteries, extending operating life (e.g., by up to 20% in TWS earbuds). Many different sensors can be built into the same tiny chip without taking up more space. The first product in the pipeline from Nanusens is a 3D motion detector for TWS earbuds to implement tap and double tap for control, wake-on-movement and sleep-on-rest functions with an optional 3D accelerometer. Also coming is a bone conduction sensor for noise cancellation integrated into its single-chip solution.
ORA Graphene Audio
www.ora-sound.com
ORA Graphene Audio Inc. has developed GrapheneQ, one of the first high-content graphene technologies to enter the market and the first in the audio industry. The proprietary, "freestanding" nanomaterial exploits the exciting mechanical properties of graphene and was specifically designed and optimized for use as diaphragms in acoustic transducers.
With a very rare combination of high stiffness, low density, and an unmatched high damping/loss factor, GrapheneQ enables the design of smaller/lighter, more energy efficient loudspeakers, all while fundamentally improving sound quality. ORA’s patented manufacturing process has the ability to precisely shape graphene oxide into typical loudspeaker membrane geometries (e.g., cones, domes, ribs, bumps, sandwich skins, etc.). More importantly, the process is highly scalable, which has caught the attention of some of the biggest consumer brands, already prototyping graphene powered audio solutions for the next generation of TVs, smart speakers, headphones, laptops, and even smartphones.
The company is currently working with major OEMs, seeking to include ORA’s GrapheneQ technology inside the next generation of TVs, home audio, hearing aids, cellphones, laptops, tablets, automobiles, and other consumer electronics. ORA engages with industry through paid OEM development kits.
Sonion
www.sonion.com
Sonion’s electret electrostatic in-ear monitor (IEM) tweeter provides smoother, cleaner sound at higher sound levels than balanced armatures. The initial offering is a super-tweeter operating from 7kHz. Considering Sonion’s heritage with hearing aid transducers, in-ear voice is surely on the development roadmap. The electret tweeter does not have the high bias requirement of electrostatics. It is currently offered with a size of 3.55mm×3.55mm×1.27mm.
Syntiant
www.syntiant.com
Syntiant is one of a new category of “specialized audio edge processors with an audio-specific instruction set, machine learning to run neural workloads.” The Syntiant NDP100 processor consumes less power and provides higher throughput than the typical low-power MCU solution, which helps with latency and power consumption budget issues. These processors can deliver enough compute power to process complex audio with AI and machine-learning algorithms at a fraction of the energy of an application processor or generic DSP implementation. Syntiant’s processors enable customized voice control interfaces to be implemented on-device, across multiple products and use cases, enabling wake words, command words, speaker identification, and event detection, free from cloud connectivity, ensuring privacy and security.
Currently implemented in a wide range of battery-powered devices, from smartphones to smart speakers and earbuds, and built from the ground up for these demanding applications, Syntiant’s NDP processors offer power efficiency and higher throughput at half the die size compared to current MCU and DSP solutions. Syntiant has partnered with Infineon to combine the NDP processors with Infineon’s XENSIV MEMS mics for low-latency voice interaction, supporting audio event and environment classification, and sensor analytics. Another partnership is with Sensory for a deep learning, multi-lingual voice interface for battery-powered devices. The combined solution merges Sensory’s TrulyHandsfree wake word engine and voice control with Syntiant’s NDPs, bringing low-latency, real-time inference to edge devices. This cooperation enables manufacturers to implement seamless “voice” commands in dozens of languages.
TDK InvenSense
www.tdk.com
www.invensense.tdk.com
TDK PiezoListen microspeakers are based on a piezo haptic device. The design has 12 layers of piezoelectric material stacked so that displacement and maximum sound level are increased, with response down to 200Hz. The microspeakers are intended for consumer products (e.g., tablet computers and TVs). The speaker comes in two types: a “wide-range” type and a “high-range” type. Though it does not quite reach the low-end response of conventional speakers, it is adequate for tablets and laptops, and TDK is exploring automotive applications. TDK PiezoListen microspeaker actuators are available from Mouser and other TDK distributors.
USound GmbH
www.usound.com
USound is an Austrian startup that has developed a MEMS microspeaker, which integrates a piezo ceramic element with a cantilever scheme. USound is in production and has design wins of earphone and headphone drivers as well as variants that can be used as tweeters in mobile devices and as supplementary speakers for surround sound in smartphones.
USound was able to overcome the limitation of traditional piezo transducers and, with its innovative MEMS concept, has proven that it can generate large displacements. USound has developed and already shipped several hundred thousand of what it believes are the smallest and first MEMS loudspeakers in the world. Key points confirmed are form factor and weight, along with reflow solder compatibility.
USound successfully developed its second-generation MEMS speakers called Conamara (pictured above), improving the performance of the first-generation Ganymede by 6dB SPL, reaching above 120dB for in-ear headphones, and reducing THD by half. Conamara is a round MEMS speaker for TWS that has been successfully qualified for IPx8 water resistance by surviving a 30-minute test under a 3m water column. Conamara comes in two diameters, 6mm and 10mm, with a thickness of 1.3mm.
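For readers less used to decibel arithmetic, the two improvements quoted for Conamara are about the same size in ratio terms. A minimal sketch, assuming the 20·log10 amplitude convention for both SPL and THD; the helper names are illustrative.

```python
import math

def db_to_pressure_ratio(db: float) -> float:
    """Sound pressure ratio for a level change in dB (20*log10 convention)."""
    return 10 ** (db / 20)

def ratio_to_db(ratio: float) -> float:
    """Level change in dB for an amplitude ratio."""
    return 20 * math.log10(ratio)

spl_gain_ratio = db_to_pressure_ratio(6.0)  # roughly double the sound pressure
thd_change_db = ratio_to_db(0.5)            # halving is roughly -6 dB

print(f"+6 dB SPL is about {spl_gain_ratio:.2f}x the sound pressure")
print(f"Halving THD is about {thd_change_db:.1f} dB")
```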
Another milestone is the completed development of an ASIC power amplifier called Leda. The device is optimized for driving the typical loads of a MEMS piezo speaker. Leda implements an innovative patented concept for a charge recovery loop that boosts its efficiency and makes it a perfect choice for TWS headphones.
USound has begun sales of its audio smart glasses under the Fauna brand. The technical solution uses a two-way system with a MEMS tweeter and an electrodynamic woofer. This configuration has been compared to Bose and Huawei products and is clearly the best trade-off between miniaturization and audio quality.
xMEMS Labs
www.xmems.com
Emerging from stealth mode, xMEMS Labs introduced Montara, the first fully monolithic MEMS speaker. Since July, global OEMs of TWS earbuds, in-ear monitors, and smart glasses have been working on Montara-based prototypes using Qualcomm and other leading Bluetooth SoC platforms. Montara implements the entire speaker (actuator and diaphragm/membrane) in silicon, reducing package height to less than 1mm and eliminating the driver matching and calibration otherwise needed because of inherent variability in membrane assembly.
The combination of Montara’s monolithic capacitive piezo-MEMS manufacturing and silicon membrane material results in precise actuation, quality sound, speaker-to-speaker uniformity, and repeatability. The “array-of-cells” configuration minimizes membrane energy storage, enabling fast reaction to input signals.
The unique attributes of xMEMS Labs’ technology result in a transducer that is 100 times faster (15μs average group delay and 2° phase delay) than a traditional voice coil microspeaker, which is expected to contribute to next-generation active noise cancelation (ANC) implementations across a more extended frequency range. This short group delay means higher resolution and more definition, especially in the critical mid- and high-frequency bands.
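One way to see why transducer delay matters for ANC is to convert delay into phase lag, since anti-noise that arrives with a large phase error no longer cancels. The sketch below treats the transducer as a pure time delay, which is a simplification, and the voice coil figure is only inferred from the "100 times faster" claim rather than taken from a measurement.

```python
def phase_lag_degrees(delay_s: float, freq_hz: float) -> float:
    """Phase lag of a pure time delay at one frequency: 360 * f * tau."""
    return 360.0 * freq_hz * delay_s

MEMS_DELAY_S = 15e-6                      # group delay quoted in the text
VOICE_COIL_DELAY_S = 100 * MEMS_DELAY_S   # assumption inferred from "100 times faster"

for freq in (100, 1_000, 5_000):
    mems = phase_lag_degrees(MEMS_DELAY_S, freq)
    coil = phase_lag_degrees(VOICE_COIL_DELAY_S, freq)
    print(f"{freq:>5} Hz: MEMS {mems:6.1f} deg, voice coil {coil:7.1f} deg")
```

Because phase lag grows linearly with frequency for a fixed delay, a shorter delay keeps the phase error small up to a higher frequency, which is consistent with the extended ANC bandwidth the company expects.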
Extended frequency response and amplitude greater than incumbent balanced armatures, without their in-band resonances, give earbud and IEM designers more flexibility in tailoring frequency response curves for virtually any application. With superior mid/high clarity and fidelity, higher SPL/mm³, and wider bandwidth, xMEMS’ speakers are expected to be a strong alternative to multi-driver balanced armature approaches.
This article was originally published in Voice Coil, December 2020. The article was edited from the original version.