Audio Electronics: Is Digital Jitter Really a Problem?

February 7 2024, 16:10
Since the dawn of the digital era decades ago, the timing error known as "jitter" in A/D/A converters has been accused of harming audio fidelity. Some people claim that jitter adds audible distortion and noise, affects bass fullness, and even harms stereo imaging width and depth. This is an artifact that is difficult to generate artificially in controlled amounts so there haven't been many tests. This article explains what jitter is, what causes it, and even lets you hear jitter in controlled amounts so you can decide for yourself how damaging it really is.
 

In an earlier article, “Artifact Audibility” [1], I wrote about the audibility of various artifacts including distortion and noise. Several audio examples are provided so people can learn at what levels artifacts can be heard using their own stereo system or headphones. Jitter is another artifact that’s often complained about, but it’s difficult to generate artificially in controlled amounts so there haven’t been many tests.

What Is Jitter?
Jitter is a noise-like artifact caused by slight timing errors in the master clock that controls the transfers between analog and digital formats. Jitter occurs both during audio capture into an analog-to-digital converter (ADC) and output from the digital-to-analog converter (DAC). This article will focus mostly on DAC jitter because consumers and their devices have no control over jitter that’s already been recorded. Jitter is specified in units of time, typically nanoseconds (billionths of one second). This defines how early or late the clock signal might switch. Figure 1 shows the square wave output of an ideal clock at top, then with some amount of jitter added below. In the first block the clock’s output turns on slightly late, then turns off a little earlier than it was supposed to. This type of jitter is called Random because the timing deviations are more or less random based on thermal noise variations.
 
Figure 1: Jitter causes the clock to transition at incorrect times, which creates noise in the audio output.

Jitter is a timing error, which is why it’s expressed as a time span rather than an amount of phase shift or frequency deviation. In the audio, however, it manifests as FM sidebands: noise-like artifacts that are added to the music. Depending on the nature of the jitter, the sidebands may be harmonically related or unrelated to the music. The spectral content can also vary. Both of these affect how audible the jitter will be when music is playing because of the masking effect. Noise containing frequencies similar to the music source will be less audible than noise at frequencies farther away. Absolute volume or SPL also affects audibility: The louder something is, the more detail we can hear clearly. In that earlier “Artifact Audibility” article I made the point that jitter noise in modern digital devices is typically below the noise floor of a CD, even for inexpensive consumer-grade gear. In my experience that is far too soft to be audible.

For a given amount of jitter, the level of noise added to the audio also varies with the source frequency. Jitter affects higher frequencies more simply because the timing error is a larger percentage of the total period. One microsecond of jitter is 0.1% of the period at 1kHz, but 1.0% at 10kHz. The graph shown in Figure 2 from The Art of Digital Audio by John Watkinson [2] shows the amount of jitter noise you can expect at different frequencies, with the noise floor of various bit depths as a reference. For comparison, jitter is typically under 0.5 nanoseconds (ns) even with modest consumer devices, so more than 100dB below the music. In various audibility tests people were unable to detect jitter unless it was greater than 30ns.
 
Figure 2: Effects of sample clock jitter on signal-to-noise ratio (SNR) at different frequencies, compared with theoretical noise floors of systems with different resolutions. (Image courtesy of The Art of Digital Audio by John Watkinson).
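
The frequency dependence described above can be checked with a few lines of arithmetic. The sketch below is not taken from Watkinson's book. It assumes the standard textbook approximation for a full-scale sine wave sampled with random clock jitter, SNR = -20·log10(2π·f·tj), where tj is the RMS jitter in seconds. The function names are made up for this illustration.

```python
import math


def period_fraction_percent(jitter_s, freq_hz):
    """Return the timing error as a percentage of one signal period."""
    return 100.0 * jitter_s * freq_hz


def jitter_limited_snr_db(jitter_rms_s, freq_hz):
    """Approximate best-case SNR (dB) of a full-scale sine wave.

    Assumes random, uncorrelated clock jitter that is small compared
    with the signal period (the usual small-angle approximation).
    """
    return -20.0 * math.log10(2.0 * math.pi * freq_hz * jitter_rms_s)


if __name__ == "__main__":
    # One microsecond of jitter as a share of the period at two frequencies.
    for freq in (1_000, 10_000):
        share = period_fraction_percent(1e-6, freq)
        print(f"{freq} Hz: 1 us is {share:.1f}% of the period")

    # Jitter-limited SNR for a few jitter amounts mentioned in the text.
    for jitter in (0.5e-9, 7.6e-9, 30e-9):
        for freq in (1_000, 10_000):
            snr = jitter_limited_snr_db(jitter, freq)
            print(f"{jitter * 1e9:.1f} ns at {freq} Hz: about {snr:.0f} dB")
```

Under this approximation every tenfold increase in either frequency or jitter costs 20dB of SNR, which is why the curves in graphs of this kind slope downward toward high frequencies.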

One possible exception is jitter in the audio stream of an HDMI connection. In February 2009 the British magazine Hi-Fi News published jitter measurements for four popular receivers. All but one had substantially more jitter in their HDMI outputs than in their SPDIF outputs. One measured as high as 7.6ns through its HDMI output, compared to only 183 picoseconds (ps) through its SPDIF output. But even that seems to fall below the level of audibility.

What Causes Jitter?
Every digital audio device that contains a clock starts with an oscillator that generates the clock frequency. The best oscillators use a thin slab of quartz crystal sandwiched between metal plates that’s induced to oscillate by applying an electric current to the plates. When the highest accuracy isn’t needed, ceramic can be used instead of quartz. Both elements have a very narrow bandwidth (high Q), much like a tuning fork, and so vibrate at a single stable frequency. The resonant frequency of a quartz slab varies with ambient temperature, so when the highest accuracy is needed the oscillator circuit board is placed into a tiny oven whose temperature is tightly regulated to just above the ambient temperature. Figure 3 shows the schematic for one type of crystal oscillator. Here the crystal is placed in the positive feedback loop of an op-amp, thus fostering oscillation. At every cycle the op-amp’s output reverses direction when its input reaches a specific “transition” voltage.
 
Figure 3: This oscillator puts a quartz crystal in the positive feedback loop of an op-amp.

All electronic components possess some amount of thermal noise, which is a tiny voltage caused by random molecular motion. For example, a 10kΩ resistor at room temperature generates about 1.3µV (microvolts) RMS over a 10kHz bandwidth. Since resistors set the transition voltage, tiny amounts of their thermal noise can change the transition voltage by random amounts, so the clock switches slightly early or late. Noise from the power supply can also get into the oscillator circuit. If that noise contains mains components at 60Hz or 50Hz, the resultant jitter can be modulated by that frequency.
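
For readers who want to check resistor noise figures themselves, thermal (Johnson-Nyquist) noise follows the well-known formula v = sqrt(4·k·T·R·B), where k is Boltzmann's constant, T the temperature in kelvin, R the resistance, and B the bandwidth. The short sketch below evaluates it. The function name and the choice of 300K for "room temperature" are assumptions made for this example.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann's constant


def thermal_noise_vrms(resistance_ohm, bandwidth_hz, temperature_k=300.0):
    """RMS thermal noise voltage of a resistor over a given bandwidth.

    temperature_k defaults to 300 K, an assumed value for room temperature.
    """
    return math.sqrt(
        4.0 * BOLTZMANN_J_PER_K * temperature_k * resistance_ohm * bandwidth_hz
    )


if __name__ == "__main__":
    # Resistance and bandwidth from the example in the text.
    noise = thermal_noise_vrms(resistance_ohm=10_000, bandwidth_hz=10_000)
    print(f"10 kOhm over 10 kHz: {noise * 1e6:.2f} uV RMS")
```

Because the voltage grows with the square root of both resistance and bandwidth, quadrupling either one only doubles the noise.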

Even when jitter is too minimal to affect perceived audio quality, it can still be measured. In June 2010 Sound On Sound (SOS) magazine (UK) published an article detailing its tests of seven external master clock products, all of which claimed that their low jitter amounts improve audio quality over the clocks built into typical converters and sound cards [3].

External clocks are useful in complex audio systems that contain many digital devices that need to operate together, such as in radio and TV stations. But when SOS technical editor Hugh Robjohns compared the noise and distortion of several DACs, he found that every one of them measured lower noise and distortion using its internal clock than when connected to the external clock products.

From his conclusion: “Overall, it should be clear from these tests that employing an external master clock cannot and will not improve the sound quality of a digital audio system.” So, while an outboard clock product is useful when multiple devices must be sync’d to the same word clock, an external clock won’t improve audio quality in a typical setup with one sound card or DAC. [3]

Perhaps the most important point about jitter audibility was summed up in The Audio Critic, where digital audio expert Robert W. Adams of Analog Devices wrote: “Traditional THD+N versus frequency tests and FFT spectrum plots for input signals of various frequencies are adequate to cover the effects caused by jitter. There is no reason to single out distortion components caused by jitter as distinct from those caused by such other effects as D/A nonlinearity, op-amp distortion, etc.” [4]

So clearly there’s no basis for believing that jitter can affect bass fullness or stereo imaging as is commonly claimed. Fullness is a frequency response issue that’s easily verified, and imaging changes would require much larger timing errors that also differ between the left and right channels.

Audio Examples
Earlier I mentioned that jitter is difficult to generate artificially in controlled amounts. Fortunately, I discovered the Distort program available for free at distortaudio.org. Distort accepts input from wave files, and lets you add various artifacts in controlled amounts such as harmonic distortion, dither, noise, hum and buzz, reduced bit depth (quantization noise), and of course jitter.
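
To make the idea concrete, here is a minimal sketch of one simple way to model random jitter in software. It is not how Distort works internally, which the program's documentation would have to confirm. It assumes Gaussian, sample-to-sample independent timing errors applied to a pure sine wave, then measures how much noise that adds. All function and parameter names are hypothetical.

```python
import math
import random


def simulate_jitter(freq_hz, jitter_rms_s, sample_rate=44_100,
                    n_samples=44_100, seed=1):
    """Return (clean, jittered) sample lists for a full-scale sine wave.

    Each jittered sample is taken at its ideal time plus a random
    Gaussian timing error whose standard deviation is jitter_rms_s.
    """
    rng = random.Random(seed)
    clean, jittered = [], []
    for n in range(n_samples):
        t = n / sample_rate
        t_err = rng.gauss(0.0, jitter_rms_s)
        clean.append(math.sin(2.0 * math.pi * freq_hz * t))
        jittered.append(math.sin(2.0 * math.pi * freq_hz * (t + t_err)))
    return clean, jittered


def snr_db(clean, jittered):
    """Ratio of signal power to added-noise power, in dB."""
    signal_power = sum(c * c for c in clean)
    noise_power = sum((j - c) ** 2 for c, j in zip(clean, jittered))
    if noise_power == 0.0:
        return math.inf  # no jitter was applied
    return 10.0 * math.log10(signal_power / noise_power)


if __name__ == "__main__":
    for jitter in (10e-9, 10e-6, 100e-6):
        clean, jittered = simulate_jitter(freq_hz=1_000, jitter_rms_s=jitter)
        print(f"{jitter * 1e6:g} us of jitter on 1 kHz: "
              f"SNR {snr_db(clean, jittered):.1f} dB")
```

Subtracting the clean samples from the jittered ones isolates the jitter noise by itself. Real music contains many frequencies at once, so noise levels measured on actual recordings will differ from this single-tone case.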

To let readers hear for themselves at what level jitter is audible I created four sets of example music files, each with increasing amounts of jitter. These music examples are mostly “gentle, open sounding” productions, rather than loud compressed pop tunes that might be too dense to notice small amounts of added noise or distortion. All the tunes were ripped from their original CDs as full-quality wave files. Note that I applied an aggressive amount of jitter to make it obvious. The smallest amount is 10ns, about 10-20 times more than even consumer-grade audio devices. At 10ns I can’t hear a difference from the original source, but maybe younger listeners with better ears can. Then I added 10µs of jitter (1,000 times more) which brought the noise floor up to around -30dBFS, and then 100µs, which raised the noise to -15dBFS. At 100µs you can clearly hear the Tss Tss Tss percussive sound of the jitter artifacts as well as the generally gritty quality.

Before adding jitter, I normalized the files to around -4dBFS to guarantee that the audio will be well below any DAC’s clipping level. Rather than require readers to juggle a large series of audio examples I created a five-minute video that plays and identifies each tune four times in a row with increasing amounts of jitter. I also extracted just the jitter from two of the songs, so you can hear what the jitter alone sounds like. YouTube videos apply lossy compression to the audio, so I rendered this video in both QuickTime and Video for Windows formats using CD-quality PCM encoding. Then I hosted them on my own website to retain their full audio quality. Use the links listed under Project Files to either stream the video or save the files to your computer.

Conclusion
Assuming you watched the video, I think the most important revelation is that even 10 times more jitter than is usual for inexpensive audio devices is innocuous and doesn’t harm the music. The usual scapegoats “fullness and width” are clearly not affected by the addition of jitter noise. Even with the jitter at 10µs, which is huge and limits the signal-to-noise ratio (SNR) to only 35dB, you can appreciate the power of the masking effect. Unlike tape hiss and vinyl record crackles and pops, jitter noise is present only when the music plays. So, it’s always masked by the music and never audible between songs. Then, why do so many people believe that bass response and imaging are affected by jitter?

Through my research in room acoustics, I believe the acoustic phenomenon known as comb filtering is the most plausible explanation for many of the differences people claim to hear from cables, power conditioners, isolation devices, low-jitter external clocks, ultra-high sample rates, and so forth. Comb filtering is a type of frequency response error that occurs when direct sound from the loudspeakers combines in the air with reflections off the walls, floor, ceiling, and other nearby objects. This causes a change in the frequency response reaching your ears, even across surprisingly small distances.
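
As a rough illustration of why small distances matter, the sketch below computes where the response notches fall for a single reflection. It assumes an idealized case: one reflection at the same level as the direct sound and a speed of sound of 343 meters per second. Cancellation then occurs wherever the extra path length equals an odd number of half wavelengths. The function name is invented for this example.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees C


def comb_notches_hz(path_difference_m, max_hz=20_000.0):
    """Notch frequencies for direct sound plus one equal-level reflection.

    path_difference_m is how much farther the reflection travels than
    the direct sound. Notches fall at odd multiples of 1 / (2 * delay).
    """
    delay_s = path_difference_m / SPEED_OF_SOUND_M_S
    notches = []
    k = 0
    while True:
        freq = (2 * k + 1) / (2.0 * delay_s)
        if freq > max_hz:
            break
        notches.append(freq)
        k += 1
    return notches


if __name__ == "__main__":
    # Compare two listening positions whose path differences vary by 10 cm.
    for diff_m in (0.343, 0.443):
        first_three = [round(f) for f in comb_notches_hz(diff_m)[:3]]
        print(f"path difference {diff_m} m: first notches at {first_three} Hz")
```

Changing the path difference by only a few centimeters moves every notch, so the response at the ear changes along with it.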

My earlier article “Why We Believe” (see Resources) explains more about this phenomenon, and shows how moving your head even a few inches can change what you hear by a very large amount. The difference people hear may in fact be real, but it’s not caused by changing wires or adding a “power” product or isolation device.


Project Files
Download the demo video from Ethan Winer’s website in either format:
Video for Windows (57 MB): https://ethanwiner.com/jitter.avi
QuickTime (64 MB): https://ethanwiner.com/jitter.mov

References
[1] E. Winer, “Artifact Audibility,” http://ethanwiner.com/audibility.html
[2] J. Watkinson, The Art of Digital Audio, Routledge, 2000.
[3] H. Robjohns, “Does Your Studio Need a Digital Master Clock?” Sound On Sound, June 2010, www.soundonsound.com/sos/jun10/articles/masterclocks.htm
[4] R. W. Adams, “Clock Jitter, D/A Converters, and Sample-Rate Conversion,” Analog Devices, Inc., Wilmington, MA.

Resources
Distortion Audibility Tester, https://distortaudio.org
E. Winer, “Why We Believe,” http://ethanwiner.com/believe.html

This article was originally published in audioXpress, January 2024
About Ethan Winer
Ethan Winer has been an audio engineer and professional musician for more than 45 years. His Cello Rondo music video has received nearly 2 million views on YouTube and other websites, and his book The Audio Expert published by Focal Press, now in its second ed...