The loudness war and normalization practices of publishing platforms (e.g. streaming, broadcasting) made sure that loudness became a much-discussed topic. You can find a lot of tips, approaches and explanations about loudness in connection with compression, limiting and publishing. But what about the dynamics of a track? Dynamics are closely connected to loudness and we think that they deserve much more attention than they usually get. So, let’s talk dynamics (and yes, a little bit about loudness too)!
Dynamics (& Loudness)
The loudness and dynamics of a track are inherently linked. By reducing the dynamics with a compressor or a limiter, the loudness can be increased. Quiet and loud sections are squeezed closer together (peaks are going to be cut when using a limiter) and the newly gained “space” can be used to increase the overall level of the track.
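To make this trade-off tangible, here is a minimal sketch. It uses a crude hard clip as a stand-in for a limiter (a real limiter works with look-ahead and smooth gain reduction) and the RMS level as a rough proxy for loudness. The sample values, the threshold and the helper names are made up for illustration.

```python
import math

def rms_db(samples):
    """RMS level in dB relative to full scale (a rough proxy for loudness)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square)

def hard_clip(samples, threshold):
    """Cut every peak above the threshold (crude stand-in for a limiter)."""
    return [max(-threshold, min(threshold, s)) for s in samples]

def normalize_peak(samples, target_peak=1.0):
    """Scale the signal so its largest absolute sample hits target_peak."""
    peak = max(abs(s) for s in samples)
    return [s * target_peak / peak for s in samples]

# Hypothetical signal: mostly quiet material with two loud transients.
original = [0.1, -0.2, 1.0, -0.3, 0.2, -0.1, 0.25, -1.0, 0.3, -0.2]

limited = hard_clip(original, threshold=0.5)  # peaks are cut
louder = normalize_peak(limited)              # the gained "space" is used for level

print(f"original: {rms_db(original):.1f} dB RMS")
print(f"limited : {rms_db(louder):.1f} dB RMS at the same peak level")
```

Both versions peak at full scale, but the limited one has a higher average level because quiet and loud parts now sit closer together.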
For quite a while, one of the main goals of producers was to make a track as loud as possible, leading to the formula “louder = better”. Why? If one track is louder than another during playback, chances are that the louder track will immediately stand out as the more powerful one and seem to have higher audio quality. It’s all in our heads, but try it yourself by listening to the same track at two different levels…
Over the past decade, normalization practices of streaming platforms have started to make sure that all uploaded tracks are played back at the same perceived loudness level to avoid this loudness bias. Therefore, a track no longer gets our attention just for being louder than the others and that’s actually a good thing for its dynamics. By regulating loudness, something great has re-emerged. We now have more creative choices and the freedom to shape the dynamics of music in a way that suits its musical content. If you want to know more about normalization and streaming platforms, check out this article.
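The basic idea of loudness normalization can be sketched in a few lines. The target of -14 LUFS and the function name are assumptions for this example. Actual targets differ between platforms, and some platforms only turn loud tracks down without turning quiet ones up.

```python
def normalization_gain_db(integrated_lufs, target_lufs=-14.0):
    """Playback gain in dB needed to move a track to the target loudness.

    target_lufs is an assumed example value, not a specific platform's setting.
    """
    return target_lufs - integrated_lufs

# Two hypothetical masters of the same song (integrated loudness in LUFS).
masters = {"loud master": -8.0, "dynamic master": -14.0}

for name, lufs in masters.items():
    gain = normalization_gain_db(lufs)
    print(f"{name}: {lufs} LUFS, playback gain {gain:+.1f} dB")
```

In this sketch the loud master is simply turned down by 6 dB, so the extra limiting buys no loudness advantage during playback.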
Dynamics from a creative point of view
The importance of dynamics starts at the very beginning of creating music, in the arrangement and the choice of instruments. You can use dynamics to create an arc of suspense, thereby keeping your track interesting and exciting across its entire length (think of the drop in EDM).
Dynamics also play a central role in defining the character of a piece of music. While high dynamics help to make songs livelier and more natural sounding, reduced dynamics induce a continuous vibe and help to maintain an ongoing flow of energy.
Each genre usually has a distinct aesthetic ideal regarding its dynamics. To name just a few examples: Classical music or acoustic tracks typically aim for increased dynamics and little distortion since they focus on naturalness, spaciousness and transparency. Metal or hard EDM tracks, however, are characterized by distortion and a dense, powerful sound pattern. In these genres, dynamics, although important for transparency, are usually secondary.
So it’s not loudness that defines the “right” amount and style of limiting or compression, but rather the envisioned dynamics of a track. For example, our limiter plug-in smart:limit uses the dynamics value and not the loudness value as a learning target when computing the limiter parameters. It doesn’t make sense to aim for a certain loudness if that doesn’t fit the style of your track (e.g. making a pop track overly dynamic simply to avoid it becoming too loud).
The influence of dynamics on the character and aesthetics of an audio track
High dynamics
|Benefits|Potential drawbacks|
|natural and lively sound|track may “fall apart” due to a non-homogeneous acoustic image|
|punchy and transient|potential lack of power and constant energy|
|good spaciousness (natural decay from “loud” to “quiet”)|problems during playback on systems with a small dynamic range|
|no distortion (caused by dynamics processing)|quiet sections may become inaudible when played back in a noisy environment|
Low dynamics
|Benefits|Potential drawbacks|
|powerful and ongoing flow of energy|track may become flat and lose tension|
|dense and compact acoustic image|naturalness of real-world sound sources may be reduced|
|high playback level when played back on a system without loudness normalization|spaciousness and transparency are limited (no natural decay possible)|
Dynamics and technical limitations of recording media
The maximum (theoretical) dynamic range of human hearing is approximately 120-140dB (varying according to frequency) – this is the range between the minimum and maximum sound pressure that the ear can handle.
The ear converts changes in air pressure into sound, and it can perceive sound pressures ranging from 20 μPa (hearing threshold) to 20 Pa (pain threshold). The sound pressure level (SPL) in dB is computed as 20 · log10(p1/p0), where p0 is the reference sound pressure (the hearing threshold of 20 μPa) and p1 is the measured sound pressure. Using this formula, we can see that whispering has an SPL of about 20dB, a conversation has a level of about 60dB and loud club music hits our ears with an SPL of 100dB or even higher.
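The SPL formula is easy to try out. The following sketch converts between sound pressure in pascal and SPL in dB, using 20 μPa as the reference pressure. The function names are our own choice for this example.

```python
import math

P0 = 20e-6  # reference sound pressure in pascal (hearing threshold)

def spl_from_pressure(pressure_pa):
    """Sound pressure level in dB SPL for a sound pressure given in pascal."""
    return 20 * math.log10(pressure_pa / P0)

def pressure_from_spl(spl_db):
    """Sound pressure in pascal for a level given in dB SPL (inverse formula)."""
    return P0 * 10 ** (spl_db / 20)

for label, spl_db in [("whispering", 20), ("conversation", 60), ("club music", 100)]:
    print(f"{label}: {spl_db} dB SPL = {pressure_from_spl(spl_db):.4f} Pa")

print(f"20 Pa (pain threshold): {spl_from_pressure(20.0):.0f} dB SPL")
```

Every increase of 20dB corresponds to a tenfold increase in sound pressure, which is why a range from 20 μPa to 20 Pa (a factor of one million) spans 120dB.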
While the maximum dynamic range is up to 140dB, the human ear cannot handle the entire range at once. It adapts its sensitivity to the average input level in order to protect the inner ear (e.g. the eardrum becomes “stiff” and less sensitive when you are in a very loud environment). In other words, the ear has its own internal “compressor”.
Early recording media had a limited dynamic range, and audio needed to be compressed due to such technical restrictions. Cassettes and vinyl only provided a dynamic range of 60dB to 70dB, so the quietest (noise floor) and loudest parts of a track had to fall within this range.
The CD uses a resolution of 16 bits, providing a dynamic range of 96dB, and with dithering it can even reach a perceived 120dB. So in terms of dynamics, there was basically no longer any need to compress signals due to technical restrictions. Modern media typically use 24-bit encoding, which leads to a dynamic range of more than 140dB. As this is more than the ear can (even theoretically) handle, the technical limitations of modern recording media don’t need to be considered.
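The figures for 16 and 24 bits follow from the ratio between full scale and a single quantization step, which works out to roughly 6dB per bit. Here is a minimal sketch of that rule of thumb. It ignores dithering and noise shaping, and the function name is our own.

```python
import math

def pcm_dynamic_range_db(bits):
    """Ratio between full scale and one quantization step, expressed in dB."""
    return 20 * math.log10(2 ** bits)

for bits in (16, 24):
    print(f"{bits}-bit PCM: about {pcm_dynamic_range_db(bits):.1f} dB")
```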
There are several different measures and values that describe the dynamics of a track. Whenever you are working with a dynamics metering tool, you’ll come across some of them. Therefore, it’s good to have an idea of how these values are measured and what they tell you.
PLR (Peak to long-term Loudness Ratio)
The PLR is often used to describe the dynamics of a track. It is the difference between the integrated loudness and the maximum true peak value measured over the whole observation period (e.g. the entire track). It provides a macro perspective of the dynamics of a track.
PLR = Max. Peak - Integrated Loudness (computed over the whole observation period)
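As a small worked example, the sketch below computes the PLR from two readings you would take from a loudness meter. It assumes the maximum true peak (in dBTP) and the integrated loudness (in LUFS) have already been measured, and it does not implement the measurement itself. The function name and the readings are hypothetical.

```python
def plr_db(max_true_peak_dbtp, integrated_lufs):
    """Peak to long-term loudness ratio: max. true peak minus integrated loudness."""
    return max_true_peak_dbtp - integrated_lufs

# Hypothetical meter readings for two masters.
print(plr_db(max_true_peak_dbtp=-1.0, integrated_lufs=-14.0))  # dynamic master
print(plr_db(max_true_peak_dbtp=-0.5, integrated_lufs=-8.0))   # heavily limited master
```

The first master yields a PLR of 13dB, the second one 7.5dB, so the smaller value points to the more heavily limited track.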
PSR (Peak to short-term Loudness Ratio)
PSR is the difference between the short-term loudness and the maximum true peak value measured in windows of three seconds. In very loud and highly compressed parts of a song, PSR can show you if there is still a healthy amount of dynamics left.
PSR = Max. Peak - Short Term Loudness (computed within a 3 second window)
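Since PSR is computed per window, a track yields a whole series of PSR values rather than a single number. The sketch below assumes you already have one maximum true peak reading and one short-term loudness reading per 3 second window. The readings and the function name are made up for illustration.

```python
def psr_series_db(peaks_dbtp, short_term_lufs):
    """PSR per window: max. true peak minus short-term loudness."""
    if len(peaks_dbtp) != len(short_term_lufs):
        raise ValueError("need one peak and one loudness value per window")
    return [peak - loudness for peak, loudness in zip(peaks_dbtp, short_term_lufs)]

# Hypothetical readings for four windows: verse, build-up, drop, outro.
peaks = [-6.0, -3.0, -1.0, -4.0]
short_term = [-20.0, -14.0, -7.0, -18.0]

for window, psr in enumerate(psr_series_db(peaks, short_term), start=1):
    print(f"window {window}: PSR = {psr:.1f} dB")
```

In this example the third window (the drop) has the lowest PSR, which marks it as the most densely compressed part of the track.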
Dynamics in our plug-in smart:limit
smart:limit uses the median of all measured PSR values to measure the dynamics. Although the PLR value is a more commonly used descriptor for dynamics, we have found that our descriptor more closely matches the actual – also short term – dynamics of a track.
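To illustrate the idea of a median-based descriptor, here is a minimal sketch. It is not the actual smart:limit implementation, which is not described here. It only shows how a median condenses a series of PSR values into one number, and both the values and the function name are hypothetical.

```python
from statistics import median

def median_psr_db(psr_values_db):
    """Condense per-window PSR values into a single dynamics value."""
    return median(psr_values_db)

# Hypothetical PSR values (in dB) measured across a track.
psr_values_db = [14.0, 11.0, 6.0, 14.0, 7.0, 6.5, 12.0]

print(f"median PSR: {median_psr_db(psr_values_db):.1f} dB")
```

A median is barely affected by a handful of unusually dynamic or unusually dense windows, so it reflects what most of the track sounds like.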
Our tests have shown that the PLR value can sometimes be a bit misleading since it’s basically another descriptor for the integrated loudness. For example, if multiple tracks are normalized to -1dB True Peak, the PLR value of each track is simply the absolute value of its integrated loudness minus 1 (a track with an integrated loudness of -14 LUFS ends up with a PLR of 13dB).
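A quick check with a few made-up tracks shows this effect. All tracks in the sketch are assumed to peak at -1dB True Peak, and the loudness values and names are hypothetical.

```python
TRUE_PEAK_DBTP = -1.0  # every track is assumed to be normalized to this peak

def plr_at_fixed_peak_db(integrated_lufs):
    """PLR of a track whose max. true peak equals TRUE_PEAK_DBTP."""
    return TRUE_PEAK_DBTP - integrated_lufs

# Hypothetical integrated loudness values in LUFS.
tracks = {"ballad": -16.0, "pop": -10.0, "hard EDM": -7.0}

for name, lufs in tracks.items():
    print(f"{name}: {lufs} LUFS, PLR = {plr_at_fixed_peak_db(lufs):.1f} dB")
```

With the peak fixed, ranking the tracks by PLR gives exactly the same order as ranking them by integrated loudness, so the PLR adds no new information about their dynamics.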