So you are seriously looking forward to a professional career in the audio industry — but do you know how sound actually affects the human brain? As an audio engineer, whatever your discipline, you will always be dealing with sound in relation to the audience’s perception. So the very important question you need to be asking yourself is: what is psychoacoustics?
In today’s article, I aim to give you enough information about this very important subject of psychoacoustics without making it too overwhelming (especially for beginners), so that by the end you will have a good fundamental understanding of the various concepts surrounding it. So, are you ready for another awesome lesson!? (seriously though, don’t answer me) Then, let’s begin!
In a nutshell, psychoacoustics is the scientific study of the perception of sound. More specifically, it is a branch of science that studies the psychological and physiological impact of sound (including speech and music) on humans. Psychoacoustics is also considered a branch of psychophysics.
Hearing sound is not solely a mechanical process of waves propagating; it is also “sensory” and “perceptual” in nature. For example, when you hear a noise, that noise arrives at your ear as a mechanical sound wave propagating (travelling) through the air, but within your ear it is transformed into neural “action potentials” (also known as “nerve impulses”).
Your brain then receives these nerve impulses, and that is where the “noise” is perceived. Hence, to solve many problems in the audio world (especially in acoustics), it is essential not just to observe the acoustical response of the environment, but also to take into account the fact that both the ear and the brain play a vital role in a person’s listening experience.
An example of an audio application that takes advantage of psychoacoustics is data compression (such as MP3). The process of compressing audio files takes into account the fact that significant signal processing occurs in the inner ear to convert sound waves into neural stimuli. Hence the differences between waveforms introduced by data compression may be inaudible, at least to the untrained ear.
Now let us take a look at some of the various concepts surrounding psychoacoustics:
- Limits of perception
- Sound localization
- Masking effects
- Missing fundamental
- Applied psychoacoustics
Limits of perception
In general, humans can hear sounds within the frequency range of 20 Hz to 20,000 Hz (20 kHz). The upper limit normally decreases as people get older (most adults are said to be unable to hear sounds beyond 16 kHz). Under strict laboratory conditions, the lowest frequency identified as a musical tone is 12 Hz, and tones between 4 Hz and 16 Hz can be perceived through the body’s sense of touch.
Within the range of 1000 – 2000 Hz, the frequency resolution of the ear is 3.6 Hz. This means that any change in pitch larger than 3.6 Hz is audible in a controlled environment. However, smaller pitch differences can also be identified through other means. For instance, when two close pitches interfere, they are often heard as a repetitive variation in the volume of the tone. This causes a “tremolo” effect that occurs at a rate equal to the difference between the frequencies of the two tones (also known as “beating”).
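The beating described above is easy to sketch in code. Here is a minimal Python illustration (the 440/444 Hz tone pair is just an example choice): the beat rate is simply the difference between the two frequencies, and the summed signal’s envelope dips to zero at that rate.

```python
import math

def beat_frequency(f1, f2):
    """Beat (tremolo) rate heard when two close tones interfere."""
    return abs(f1 - f2)

def two_tone_sample(f1, f2, t):
    """Instantaneous value of two equal-amplitude tones summed at time t."""
    return math.sin(2 * math.pi * f1 * t) + math.sin(2 * math.pi * f2 * t)

# Two tones 4 Hz apart beat 4 times per second: by the identity
# sin(a) + sin(b) = 2*cos((a-b)/2)*sin((a+b)/2), the envelope is
# 2*cos(pi*(f1-f2)*t), which vanishes every 1/(2*4) = 0.125 s.
print(beat_frequency(440, 444))                  # 4
print(abs(two_tone_sample(440, 444, 0.125)) < 1e-9)  # True: an envelope null
```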
In terms of the intensity of audible sounds, the human eardrum can sense variations in sound pressure from the smallest changes of a few micropascals (a unit of pressure) to greater than 100 kPa. For convenience, sound pressure level (SPL) measurements are logarithmic (dB), with all sound pressures referenced to 20 µPa (so 1 Pa corresponds to 94 dB SPL).
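The dB SPL scale mentioned above is just a logarithm of the pressure ratio against the 20 µPa reference. A quick Python sketch:

```python
import math

P_REF = 20e-6  # reference pressure: 20 micropascals, defined as 0 dB SPL

def pascals_to_db_spl(p):
    """Convert an RMS sound pressure in pascals to dB SPL."""
    return 20 * math.log10(p / P_REF)

print(round(pascals_to_db_spl(20e-6)))  # 0   (the threshold-of-hearing reference)
print(round(pascals_to_db_spl(1.0)))    # 94  (1 Pa, a common calibration level)
```

The factor of 20 (rather than 10) appears because intensity is proportional to pressure squared.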
Thus, the lower limit of audibility is referenced at 0 dB, but the upper limit is not as clearly defined. The upper limit depends more on the point at which the ear is physically harmed or risks “noise-induced” hearing loss. A more in-depth analysis of the lower limit of audibility reveals that the “minimum threshold” at which a sound can be perceived depends on its frequency.
To find the lower limits of audibility, the minimum intensities at which test tones of various frequencies can be heard are measured. From this, an absolute threshold of hearing (ATH) curve can be plotted. The results reveal that the ear has a peak of sensitivity (i.e. its lowest ATH) between 1 – 5 kHz; however, the threshold changes with age, and older people are less sensitive to anything above 2 kHz.
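The ATH curve is often approximated with a closed-form fit. The sketch below uses Terhardt’s well-known approximation (the same family of formula used in MP3-style psychoacoustic models); the exact coefficients are from that published fit, not from this article.

```python
import math

def ath_db(f_hz):
    """Approximate absolute threshold of hearing in dB SPL
    (Terhardt's fit, widely used in MP3-style psychoacoustic models)."""
    f = f_hz / 1000.0  # work in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)

# The threshold dips (the ear is most sensitive) a few kHz up,
# and rises steeply toward both ends of the audible range:
for f in (100, 1000, 3300, 10000):
    print(f, round(ath_db(f), 1))
```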
You can find the ATH as the lowest of the “equal-loudness contours”. These contours show, for each frequency, how high (or low) the sound pressure level (dB SPL) must be for tones across the audible range to be perceived as equally loud.
Fletcher and Munson pioneered the equal-loudness contours at Bell Labs in 1933. Using pure tones reproduced through headphones, they collected data and plotted it on a graph later known as the “Fletcher–Munson curves”. Since loudness is subjective and thus difficult to measure, the Fletcher–Munson curves had to be averaged over many test volunteers.
Sound localization
The process of determining where a specific sound source is located is called “sound localization”. The subtle differences in tone, loudness and timing between your two ears allow your brain to locate sound sources. Localization involves the three-dimensional position: the azimuth (horizontal angle), the zenith (vertical angle), and the distance (for static sounds) or velocity (for moving sounds).
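The timing cue between the ears can be sketched with a deliberately simplified model (the 21 cm ear spacing and the plane-wave geometry are illustrative assumptions, not a real head model):

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C
EAR_SPACING = 0.21      # m; rough adult ear-to-ear distance (an assumption)

def interaural_time_difference(azimuth_deg):
    """Simplified interaural time difference (ITD): the extra travel
    time to the far ear for a distant source at the given horizontal
    angle (0 degrees = straight ahead)."""
    return (EAR_SPACING / SPEED_OF_SOUND) * math.sin(math.radians(azimuth_deg))

print(round(interaural_time_difference(90) * 1e6))  # 612 microseconds, source at one side
print(interaural_time_difference(0))                # 0.0: no timing cue from dead ahead
```

Note that a source directly ahead gives a zero timing difference, which is one reason front/back and vertical confusions are common — the brain must fall back on spectral cues instead.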
Humans, together with most four-legged animals, are good at detecting direction in the horizontal plane, but not as effective in the vertical, due to the symmetrical placement of our ears. Some species of owl have asymmetrically positioned ears, which let them detect sound in all three planes (useful for hunting small animals at night).
Amazing isn’t it?
Masking effects
This is a relatively simple concept to understand. In many situations throughout our busy lives, a perfectly clear, audible sound can be masked by another sound. Take, for example, a conversation at a bus stop — it can be almost impossible to hear one another if a huge truck is driving past. This phenomenon is known as “masking”: a quieter sound is “masked” if it is overshadowed by the presence of a louder sound.
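As a toy illustration only — real psychoacoustic masking models are far more elaborate, with frequency-dependent critical bands and spreading functions — here is the basic shape of a simultaneous-masking rule in Python. The 100 Hz bandwidth and 10 dB margin are made-up illustrative numbers:

```python
def is_masked(masker_db, masker_hz, probe_db, probe_hz,
              bandwidth_hz=100.0, margin_db=10.0):
    """Toy simultaneous-masking rule (illustrative assumption, not a
    real psychoacoustic model): a probe tone counts as masked if it
    falls near the masker in frequency and sits well below its level."""
    in_band = abs(probe_hz - masker_hz) <= bandwidth_hz
    much_quieter = probe_db <= masker_db - margin_db
    return in_band and much_quieter

print(is_masked(80, 1000, 40, 1050))  # True: quiet nearby tone is drowned out
print(is_masked(80, 1000, 40, 4000))  # False: far outside the masker's band
```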
Missing fundamental
The concept of the “missing fundamental” deserves an article of its own to be fully explained, but we shall cover it very briefly here.
A “missing fundamental” means that the overtones of a note suggest a fundamental frequency is present, even though the sound actually lacks a component at the fundamental frequency itself. Our brain identifies a pitch not just by its fundamental frequency, but also by its higher harmonics. Confused? Okay, let’s look at an example.
When a note (say from a bass instrument) has a pitch of 100 Hz, the sound actually contains frequencies at integer multiples of that fundamental: in essence, you are hearing 100, 200, 300, 400, 500 Hz and so forth. Even if a loudspeaker cannot reproduce frequencies as low as 100 Hz, you’d still hear the pitch as “100 Hz”, thanks to its overtones and the way our brain perceives them!
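You can demonstrate this numerically. The sketch below builds a signal from harmonics 200 – 500 Hz only — no energy at 100 Hz at all — and recovers a 100 Hz pitch via autocorrelation, which is one simple stand-in for how periodicity can be detected (the sample rate and search range are arbitrary choices for the demo):

```python
import math

SR = 8000  # sample rate in Hz
F0 = 100   # the "missing" fundamental in Hz
N = 800    # 0.1 s of audio

# Harmonics 2..5 only (200..500 Hz): the 100 Hz component itself is absent.
signal = [sum(math.sin(2 * math.pi * k * F0 * n / SR) for k in range(2, 6))
          for n in range(N)]

def best_lag(x, lo, hi):
    """Lag (in samples) maximizing the autocorrelation — i.e. the
    repetition period the signal 'locks onto' as its pitch."""
    return max(range(lo, hi),
               key=lambda lag: sum(x[n] * x[n + lag] for n in range(len(x) - lag)))

lag = best_lag(signal, 40, 120)
print(SR / lag)  # 100.0 -- the perceived pitch, despite no 100 Hz energy
```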
Applied psychoacoustics
The study of psychoacoustics allows for high-quality lossy compression of audio signals, without most of us noticing a distinct drop in quality. This is done by knowing which parts of any given digital audio signal can be removed (or heavily compressed) without the human ear being able to perceive the difference.
A great analogy to further illustrate this process: imagine how a sharp clap of the hands in an empty room would be painfully loud and sharp, yet after a car backfires on a busy road outside, it is hardly noticeable. This idea of masking forms the basis of the perceptual models used by almost all modern lossy audio compression formats such as MP3, WMA and MPEG-1 audio.
Given that the ears have various limitations in perceiving sounds (the high-frequency limit, the absolute threshold of hearing, temporal and simultaneous masking), a compression algorithm can be designed to give lower priority to sounds outside the human hearing range and carefully shift bits away from less distinct components towards the important ones, ensuring that the sounds you are most likely to perceive are of the best quality.
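The bit-shifting idea above can be sketched as a toy allocator (illustrative only — real codecs iterate this against a full psychoacoustic model). Each frequency band gets bits in proportion to how far its level rises above its masking/hearing threshold; bands below threshold are inaudible and get none:

```python
def allocate_bits(band_levels_db, thresholds_db, total_bits):
    """Toy perceptual bit allocation (not a real codec's algorithm):
    distribute bits in proportion to each band's audible headroom
    above its threshold; fully masked bands receive zero bits."""
    headroom = [max(0.0, level - threshold)
                for level, threshold in zip(band_levels_db, thresholds_db)]
    total = sum(headroom)
    if total == 0:
        return [0] * len(headroom)
    return [round(total_bits * h / total) for h in headroom]

# Band 3 sits below its threshold (masked), so it receives no bits at all.
print(allocate_bits([60, 70, 20], [30, 30, 40], total_bits=128))  # [55, 73, 0]
```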
The various concepts and topics within the subject of psychoacoustics are also very relevant to psychological studies in music, and music therapy.
Psychoacoustics includes the study of pitch, timbre, loudness and duration of musical sounds. All these topics are at the centre of “music cognition” (the perceived structure of music). Other concepts such as auditory illusions and sound localization are also very relevant to the study of musical compositions and the design of venues for live performances.
The computer science, computer engineering, and computer networking industries have long been associated with psychoacoustics. Internet pioneers J. C. R. Licklider and Bob Taylor both had academic backgrounds in psychoacoustics, and companies like BBN Technologies initially specialized in acoustics consulting before moving on to design and build the first packet-switched computer networks.
Psychoacoustics finds its way into many fields, ranging from software development (developers mapping experimental mathematical patterns) to digital signal processing (audio compression algorithms such as MP3 using psychoacoustic models to maximise compression ratios), as well as defence research, where acoustic devices emitting frequencies intended to impair or incapacitate are being explored for military use.
You can also see it in music today, where artists create new auditory experiences by masking certain frequencies of instruments, thereby enhancing other frequencies as a result. Another common application lies in the design of small loudspeakers, where the “missing fundamental” phenomenon lets listeners still perceive low fundamental notes that these speakers cannot actually reproduce.
And finally, we have come to the end of this article (it was a rather lengthy one, wasn’t it?). I hope this has given you folks a decent understanding of some of the various concepts within psychoacoustics!
Do share, and leave a comment if you like! Cheers!