Do you know the technology behind storing audio as digital data? In today’s recording era, we can store our favourite music and audio files on all kinds of digital devices. The principles behind converting sound into digital form are something every audio engineer should know. Hence, the question for today is – What is Audio Sampling?
In this article, I hope to clarify the fundamental principles of audio signal processing. We will go through the relevant topics surrounding audio sampling: the side effects of sampling audio, how it affects sound quality, and its impact on human hearing. Ready to start learning? Let’s begin!
Sampling – What is that?
In order to have a clearer understanding of how digital sound reproduction works, we first need to familiarise ourselves with the concept of sampling itself. In signal processing, sampling is the act of reducing a continuous signal to a discrete signal.
A typical example of this process is the conversion of sound waves (a continuous signal) into a sequence of separate, distinct samples (a discrete-time signal). To put it simply, a sample is a value (or set of values) at a point in time and/or space.
In order to extract samples from a continuous signal (in this case, sound waves), a subsystem or operation called a sampler is needed. Theoretically speaking, an ideal sampler produces samples equal to the instantaneous value of the sound wave at any desired point in time.
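To make this concrete, here is a minimal sketch of an ideal sampler in Python. The function names and the 1 kHz test tone are my own illustrative choices, not part of any standard API:

```python
import math

def sample(signal, sample_rate_hz, duration_s):
    """Reduce a continuous signal (a function of time) to discrete samples."""
    n_samples = int(sample_rate_hz * duration_s)
    period = 1.0 / sample_rate_hz
    # An ideal sampler records the instantaneous value at each sample instant.
    return [signal(n * period) for n in range(n_samples)]

# A 1 kHz sine tone standing in for a "continuous" sound wave
# (a function of time in seconds).
tone = lambda t: math.sin(2 * math.pi * 1000 * t)

samples = sample(tone, sample_rate_hz=48_000, duration_s=0.001)
print(len(samples))  # 48 samples for 1 ms of audio at 48 kHz
```

In a real converter the "signal" is a voltage on a wire rather than a Python function, but the idea is the same: read the instantaneous value at regular intervals.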
The two main topics that we will be discussing in this article are audio applications (sampling rates and bit depth) and aliasing.

Audio applications
In most consumer and professional audio applications, analog audio signals are converted to digital signals using a sampling method called pulse-code modulation (PCM). The full conversion chain includes an analog-to-digital converter (ADC), a digital-to-analog converter (DAC), storage (for example, a PC hard disk) and transmission.
When you take a closer look at digital systems, they are really discrete-time, discrete-level versions of earlier electrical analog systems. What makes digital systems great is that they allow us to store, retrieve and transmit signals without any loss of signal quality.
Sampling rates are measured in “S/s”, meaning “samples per second”. To illustrate, “1 MS/s” means one million samples per second. Keep in mind that in audio, the unit “Hz” is commonly used to express the number of samples per second (the sampling rate).
In order to accurately capture sounds covering the whole human hearing range (20 to 20,000 Hz), such as when recording music in studios or playing back audio at live events, audio waveforms are typically sampled at 44.1 kHz (the CD standard), 48 kHz, 88.2 kHz, or 96 kHz.
These sampling rates follow the Nyquist Theorem, which states that in order to faithfully reconstruct a signal, it has to be sampled at a rate of at least twice its highest frequency component. Take note that sampling audio at rates higher than 50 to 60 kHz will not provide any useful information to human listeners (due to the limits of the human hearing range).
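The Nyquist Theorem boils down to a one-line formula: the sampling rate must be at least 2 × the highest frequency you want to capture. A quick illustrative check (the `nyquist_rate` helper is a made-up name for this sketch, not a library function):

```python
def nyquist_rate(max_frequency_hz):
    """Minimum sampling rate that can faithfully capture content up to
    max_frequency_hz, per the Nyquist Theorem (fs >= 2 * f_max)."""
    return 2 * max_frequency_hz

# The human hearing range tops out around 20 kHz:
print(nyquist_rate(20_000))  # 40000

# All the standard audio rates exceed this minimum:
print([rate >= nyquist_rate(20_000) for rate in (44_100, 48_000, 96_000)])
# [True, True, True]
```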
You might have heard of a growing trend in the professional audio industry of using sampling frequencies well beyond the standard requirement, such as 96 kHz and even 192 kHz. Higher sampling rates allow for more relaxed low-pass filter designs in ADCs and DACs, and also provide more processing options at a higher audio quality.
Some common audio sampling rates (recommended by the Audio Engineering Society):
- 44.1 kHz – The Audio CD standard, widely used with MPEG-1 audio. Sony originally adopted this rate because it could be recorded on video equipment running at 25 frames per second (PAL) or 30 frames per second (using an NTSC monochrome video recorder), while still covering the 20 kHz bandwidth that was standard for much professional analog recording equipment.
- 48 kHz – Used by professional digital video equipment such as tape recorders and video servers. This rate was initially picked because it can reconstruct frequencies up to 22 kHz and works with NTSC video running at 29.97 frames per second. Much professional audio equipment, such as mixers and digital recorders, uses a 48 kHz sampling rate.
- 96 kHz – Adopted for DVD-Audio, some LPCM DVD tracks, BD-ROM (Blu-ray Disc) audio tracks, and HD DVD (High-Definition DVD) audio tracks. Some professional production equipment supports 96 kHz sampling. This frequency is double the 48 kHz standard typically used with professional audio equipment.
The topic of bit depth deserves an article of its own, but for this post I will briefly explain the concept. Bit depth is the number of bits of information in every sample, and it determines the resolution of each sample.
The general rule is that the higher the bit depth of the recording, the more accurate the digital representation of the sound wave’s amplitude will be. A 16-bit recording has a theoretical dynamic range of 96 dB, and a 24-bit recording has a range of 144 dB.
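The 96 dB and 144 dB figures come from the rule of thumb that each bit adds about 6.02 dB of dynamic range, i.e. 20·log10(2^bits). A quick check in Python (the helper name is my own, used only for illustration):

```python
import math

def dynamic_range_db(bit_depth):
    """Theoretical dynamic range of a recording with the given bit depth.
    Each extra bit doubles the number of amplitude levels (~6.02 dB)."""
    return 20 * math.log10(2 ** bit_depth)

print(round(dynamic_range_db(16)))  # 96
print(round(dynamic_range_db(24)))  # 144
```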
Given that most digital signal processing operations can have a very high dynamic range, it is common practice among professional engineers (and producers) to mix and master audio tracks at 32-bit, and then convert them to 16 or 24 bit for distribution.
Aliasing

By now, you should have realised that the CD-standard sampling rate is 44.1 kHz. However, if we strictly followed the Nyquist Theorem, we should only need a sampling rate of 40 kHz, since the upper limit of the human hearing range is 20 kHz. So the question we need to ask is: why use a higher sampling rate?
Aliasing is an inherent problem in digital signal reconstruction. It happens when a frequency different from the one originally sampled is reproduced, as a result of the sampling rate not being high enough. By following the Nyquist Theorem, all frequencies below half the sampling rate (in this case, 20 kHz) will not be aliased.
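Here is a small illustration of aliasing: a 30 kHz tone sampled at 40 kHz produces exactly the same sample values as a phase-inverted 10 kHz tone, so after sampling the two are indistinguishable (variable and function names are my own, for this sketch only):

```python
import math

fs = 40_000  # sampling rate in Hz

def sampled_sine(freq_hz, n_samples=8):
    """Samples of a sine tone at freq_hz, taken at rate fs."""
    return [math.sin(2 * math.pi * freq_hz * n / fs) for n in range(n_samples)]

# 30 kHz is above the Nyquist limit (fs/2 = 20 kHz), so it aliases:
# its samples match those of a phase-inverted 10 kHz (= fs - 30 kHz) tone.
above = sampled_sine(30_000)
alias = sampled_sine(10_000)
print(all(abs(a + b) < 1e-9 for a, b in zip(above, alias)))  # True
```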
However, for a frequency that is exactly half of the sampling rate (20 kHz, with a 40 kHz rate), the samples can land exactly on the zero crossings of the wave, which ultimately produces a sound wave that is completely flat (entirely silent). Thus, a sampling rate higher than 40 kHz is needed in order to keep the highest frequencies in the hearing range.
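This zero-crossing effect is easy to demonstrate: sampling a 20 kHz sine at exactly 40 kHz lands every sample on a zero of the wave, so the sampled signal is silent (a small illustrative sketch, not any library’s API):

```python
import math

fs = 40_000       # sampling rate in Hz
f = fs // 2       # 20 kHz: exactly half the sampling rate

# Every sample of sin(2*pi*f*n/fs) = sin(pi*n) is a zero crossing,
# so the sampled signal is completely flat.
samples = [math.sin(2 * math.pi * f * n / fs) for n in range(10)]
print(all(abs(s) < 1e-9 for s in samples))  # True
```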
A sampling rate of 44.1 kHz can reproduce frequencies up to just below 22.05 kHz, which is above the human hearing range. In practice, though, frequencies near the hearing limit might not be sampled accurately. Hence, professional audio devices use a sampling rate of 48 kHz to provide enough headroom to minimise audible aliasing artifacts.
That is all I have for you today. Hopefully these concepts were not too hard to grasp, especially for those of you who are completely new to this whole audio sampling thing!
Do leave comments or questions below, and share this article with your friends!