In a professional workflow, Dither is applied to audio clips (or mixes) when reducing word length. The process mitigates the errors that occur when bits are removed from digital audio samples. I thought I’d cover the basics.
Digital Audio
Digital Audio consists of individual samples, each stored as a fixed number of bits. Quantization is the process that creates those values: it converts the continuous range of amplitudes present in analog audio into a fixed set of discrete values. Bit Depth (a.k.a. Word Length or Resolution) is the number of bits used to store each sample’s amplitude. It indicates the available vertical (amplitude) precision. Higher bit depths (more bits per sample) provide finer amplitude resolution, resulting in an extended Dynamic Range.
Each bit contributes roughly 6 dB of Dynamic Range. Theoretically, 16-bit audio has a Dynamic Range of about 96 dB, and 24-bit audio has a Dynamic Range of about 144 dB. However, in order to accurately assess Dynamic Range in practice we must also account for the inherent noise floor, specifically the amplitude of its highest spectral component relative to the maximum Peak value that a system is capable of reproducing. Dynamic Range is the measurement of this ratio or range.
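The "6 dB per bit" rule of thumb comes from the fact that each added bit doubles the number of available amplitude steps, and a doubling of amplitude is 20·log10(2), or about 6.02 dB. Here is a minimal sketch of that arithmetic. The function name is my own, purely for illustration, and the figures are theoretical ceilings that ignore the noise floor of real gear.

```python
import math

def theoretical_dynamic_range_db(bits: int) -> float:
    """Ratio between full scale and one quantization step, in dB."""
    return 20 * math.log10(2 ** bits)

for bits in (16, 24):
    print(f"{bits}-bit: {theoretical_dynamic_range_db(bits):.1f} dB")

# 16-bit: 96.3 dB
# 24-bit: 144.5 dB
```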
Signal to Noise Ratio (SNR) is the range between the nominal average signal level and the average level of the noise floor. Audio with an extended Dynamic Range can exhibit a higher SNR than audio with a reduced Dynamic Range. In essence, 24-bit audio allows you to work with additional headroom without any increase in noise compared to 16-bit audio.
Word Length Reduction
Truncation is the removal of bits with no compensating replacement. Forcing each sample onto the coarser grid of a lower resolution creates Quantization Errors, and because those errors follow the signal they are heard as artifacts and distortion rather than as plain noise. Dither is a technique that adds a small amount of low-level noise to audio before word length reduction. This noise decorrelates the Quantization Error from the signal, so the distortion is replaced by a steady, low-level noise floor that is far less audible. The process preserves the fidelity and Dynamic Range of audio through bit-depth conversion and/or reduced bit-depth exporting.
There is a trade-off: you are replacing bad noise with alternative “good” noise that is smoother, less audible, and much more consistent.
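To make the "bad noise versus good noise" idea concrete, here is a rough sketch that reduces a very quiet test tone (peaking at 0.4 of one 16-bit step) in two ways: plain truncation, and rounding after adding triangular (TPDF) dither. Everything in it is my own illustration: the names, the tone level, and the choice of TPDF dither are assumptions, and this is not what any particular DAW or plugin does internally.

```python
import math
import random

PERIOD = 480       # samples per cycle of the test tone (arbitrary choice)
N = 48_000         # number of samples (100 full cycles)
AMPLITUDE = 0.4    # tone peak level, in units of one 16-bit step (LSB)

def truncate(x: float) -> int:
    """Discard everything below one step, as dropping the low bits does."""
    return math.floor(x)

def tpdf_dither(rng: random.Random) -> float:
    """Triangular noise spanning +/-1 LSB: the sum of two uniform values."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)

def dither_and_round(x: float, rng: random.Random) -> int:
    """Add dither noise first, then round to the nearest step."""
    return math.floor(x + tpdf_dither(rng) + 0.5)

def harmonic_level(samples, harmonic: int) -> float:
    """Amplitude of the sine component at `harmonic` times the tone frequency."""
    w = 2 * math.pi * harmonic / PERIOD
    total = sum(s * math.sin(w * n) for n, s in enumerate(samples))
    return abs(2 * total / len(samples))

rng = random.Random(0)  # fixed seed so the run is repeatable
source = [AMPLITUDE * math.sin(2 * math.pi * n / PERIOD) for n in range(N)]
truncated = [truncate(s) for s in source]
dithered = [dither_and_round(s, rng) for s in source]

for name, data in (("truncated", truncated), ("dithered", dithered)):
    print(name,
          "fundamental:", round(harmonic_level(data, 1), 3),
          "3rd harmonic:", round(harmonic_level(data, 3), 3))
```

The truncated version turns the tone into a square wave, so it reports the wrong level for the fundamental and a strong third harmonic that was never in the source. The dithered version should report a fundamental close to 0.4 with almost no third harmonic: the distortion has been exchanged for a constant, signal-independent hiss.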
Noise Shaping is a supplemental option that pushes the Dither noise into frequency ranges where human hearing is less sensitive. The total amount of noise may actually be greater, but it is perceived as quieter.
(Take a look at the Noise Shaped frequency response curve in the attached image. There is a clear visual indication of increased gain at higher frequencies).
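For the curious, the simplest form of Noise Shaping can be sketched as first-order error feedback: the error made on one sample is subtracted from the next sample before it is quantized, which tilts the noise toward high frequencies. This is only an illustrative assumption on my part. Commercial dither algorithms use higher-order filters tuned to hearing sensitivity curves, and the helper names below are hypothetical.

```python
import math
import random

def tpdf_dither(rng: random.Random) -> float:
    """Triangular noise spanning +/-1 LSB: the sum of two uniform values."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)

def reduce_dithered(samples, rng):
    """Plain dither: add noise, round to the nearest step."""
    return [math.floor(s + tpdf_dither(rng) + 0.5) for s in samples]

def reduce_noise_shaped(samples, rng):
    """Dither plus first-order error feedback (noise filter 1 - z^-1)."""
    out = []
    prev_error = 0.0
    for s in samples:
        target = s - prev_error          # compensate for the last error
        q = math.floor(target + tpdf_dither(rng) + 0.5)
        prev_error = q - target          # total error made on this sample
        out.append(q)
    return out

def error_power(source, reduced, block: int = 1) -> float:
    """Mean-square error after averaging over `block` samples.

    block=1 measures the total error power; a larger block acts as a
    crude low-pass filter, so it measures low-frequency error only.
    """
    errors = [r - s for s, r in zip(source, reduced)]
    means = [sum(errors[i:i + block]) / block
             for i in range(0, len(errors) - block + 1, block)]
    return sum(m * m for m in means) / len(means)

rng = random.Random(0)  # fixed seed so the run is repeatable
source = [0.4 * math.sin(2 * math.pi * n / 480) for n in range(48_000)]
plain = reduce_dithered(source, rng)
shaped = reduce_noise_shaped(source, rng)

for name, data in (("plain dither", plain), ("noise shaped", shaped)):
    print(name,
          "total:", round(error_power(source, data), 4),
          "low band:", round(error_power(source, data, block=100), 6))
```

The shaped version should show more total error power than plain dither but far less of it in the low band, which is the same trade the frequency response curve illustrates.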
Podcasting
So what does this all mean for the typical Podcast Producer? Is Dither just another obscure aspect of professional Audio Mastering and/or Post Production that can be safely ignored?
Consider the following variables:
If you are recording spoken word using properly configured gear in a reasonably quiet and optimized environment, there is no discernible advantage to recording 24-bit audio in preparation for 16-bit encoding and delivery. In my opinion 16-bit audio from acquisition to distribution will be more than adequate.
If you elect to record 24-bit audio, and you are not properly implementing word length reduction to 16-bit, you are essentially negating the advantages of the original higher resolution audio. In essence, fidelity degradation (artifacts/distortion) will occur because the Quantization Errors are left unmasked. This is not my opinion – it is a fact.
Remember, I’m specifically referring to spoken word audio slated for Podcast distribution. If you are tracking music, well then by all means make full use of the advantages of higher resolution audio recording.
Consider this: The stand-alone version of iZotope’s Ozone 8 Mastering Suite processes all imported audio to 32 bit word length. The manual specifically states:
“Ozone processes files at 32-bit so Dither is desirable for files being exported to values lower than 32-bit …
… When exporting to a bit depth lower than 32-bit, checking this (Dither option) box will apply high-quality dithering to the exported file. This allows you to preserve the sound quality and dynamic range of a higher bit depth, when exporting the audio file to a lower bit depth.”
Most DAWs include Dither options. In some cases it’s by way of a plugin. You may also notice Dither options included in application Preferences or Export dialogs.
Hopefully after reading this article you will understand what Dither is, its purpose, and whether you should consider implementing it. Please note: Dither must be applied at the very last stage of any processing chain.
-paul.