Avoiding the Collapse: From Stereo to Mono (Compatibility)

November 16, 2022 | Know-how

Mono compatibility in the context of modern digital mixing means that the overall sound and feeling of a stereo mix should remain intact when played back in mono (typically on a single speaker). While the stereo image will obviously collapse to mono, the spectral balance as well as the presence of the different sources should stay unchanged. The keyword here is “should”. A few serious issues can ruin a stereo mix when it is played in mono. We explain the technical background of these issues and how you can avoid them.

Why is mono compatibility (still) important?

Intuitively, a lot of people tend to think that music is played back in stereo most of the time. And that’s true because when listening to music in a car, on a stereo hi-fi system at home or when wearing headphones, we typically have a left and right speaker that are able to recreate the stereo image intended by the mixing engineer. But besides these typical stereo listening situations, there is a surprisingly high number of scenarios where we are listening to music in mono (from a single speaker or from multiple speakers, all playing back the same signal):

  • Most modern wireless speakers (e.g. Amazon Echo, Sonos, Google Nest) are typically used as single mono speakers and not as stereo setups.
  • In a club or in a restaurant, music is very often played back in mono from multiple speakers. Since a left-right localization is not possible there, all speakers simply play back the same (mono) signal.
  • A lot of cell phones still use a single speaker, or a stereo pair with the two speakers in very close proximity. Many mono compatibility issues (like frequency masking, see below) will also occur in these setups.

What are the main problems when playing a mix back in mono?

Mixing in stereo adds an extra creative layer of “panorama” and “width”. It allows us to position sound sources at different spatial locations, making it possible to reproduce realistic acoustic images, e.g. an orchestra or a band playing in front of us, where some sources are located on the left, while others are located in the center or on the right, or to simply create wide layers of sound. The additional information provided by the width of a track also enables us to acoustically separate sources overlapping in frequency by placing them in different positions of the acoustic image. When a stereo mix is played back in mono, this additional creative layer vanishes, and that can lead to two major problems: phase cancellation and frequency masking.

 

Issues and how to fix them

From stereo to mono

When a stereo signal is played back in mono, the left and the right channel are merged into a single waveform. This process of mono downmixing can be expressed by the following simple formula:

Mono = (Left + Right)/2

So, the mono signal is simply the sum of the left and the right channel divided by two. The division by two ensures that the peak level of the mono signal can never exceed the higher of the two channel peak levels.
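
As a minimal sketch of the formula above, the downmix can be written sample by sample in Python. The function name `downmix_to_mono` and the short sample lists are made up for illustration; real audio would be read from a file or a buffer.

```python
def downmix_to_mono(left, right):
    """Return the mono downmix (Left + Right) / 2, sample by sample."""
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")
    return [(l + r) / 2 for l, r in zip(left, right)]


# Hypothetical sample values in the range -1.0 .. +1.0
left = [0.5, -1.0, 0.25, 0.0]
right = [0.5, 1.0, -0.75, 0.8]

mono = downmix_to_mono(left, right)
print(mono)  # [0.5, 0.0, -0.25, 0.4]
```

Note the second sample: the channels carry opposite values there, so the mono result is zero although both channels are at full level.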

The problem with the phase …

When summing two signals, the waveforms are combined. This combination of overlapping waves is also called interference. Depending on the shape, frequency and temporal relation (phase offset) of the signals, this interference can be positive, also called constructive (peaks and valleys add up and the waveform becomes larger), or negative, also called destructive (peaks and valleys cancel each other out and the waveform becomes smaller).


It’s easiest to understand the effect of interference when looking at a simple sine wave. If the same sine wave (same phase, amplitude and frequency) is present on two channels, summing them together will lead to a sine wave with double the amplitude. Conversely, if one of the sine waves is phase-shifted so that its valleys line up with the peaks of the other wave (180° phase shift), the two signals will cancel each other out when summed together.
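
The two cases can be checked numerically. The following sketch assumes a 100 Hz test tone at a 48 kHz sample rate (both values are chosen freely for illustration) and sums the raw signals without the division by two, so that the doubling is visible.

```python
import math

SAMPLE_RATE = 48000  # samples per second (assumed)
FREQ = 100.0         # test tone in Hz (assumed)
NUM_SAMPLES = 480    # one full period of 100 Hz at 48 kHz


def sine(phase_deg):
    """One period of a unit-amplitude sine wave with the given phase offset."""
    phase = math.radians(phase_deg)
    return [
        math.sin(2 * math.pi * FREQ * n / SAMPLE_RATE + phase)
        for n in range(NUM_SAMPLES)
    ]


def peak(signal):
    """Largest absolute sample value."""
    return max(abs(s) for s in signal)


in_phase = [a + b for a, b in zip(sine(0), sine(0))]
out_of_phase = [a + b for a, b in zip(sine(0), sine(180))]

print(round(peak(in_phase), 6))      # 2.0
print(round(peak(out_of_phase), 6))  # 0.0
```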

The shape of any real-world signal like music is much more complex than a single sine wave. In fact, any signal can be expressed as a combination of a huge number of sine waves interfering with each other, which is the idea behind the Fourier transform. Hence, summing the left and right channel of a stereo signal will practically never cancel out the entire signal, as we get a mix of positive and negative interference across its components. Still, problematic temporal relationships between similar signal components on both channels can lead to a so-called comb-filtering effect.

A comb filter emerges when two signals that carry similar frequency components with a 180° phase shift are summed together. These frequency components cancel each other out, leading to a metallic and hollow sound. Emerging comb filters are a typical problem especially at lower frequencies, where wavelengths are long and signal changes are comparatively slow.

 

The comb-like filter shape is based on the fact that a certain temporal offset between two signals cancels out all frequencies for which the offset represents a 180° phase shift. For example, a time shift of 5 ms causes a 180° phase shift for 100 Hz components (half of their 10 ms period) and for all odd multiples of 100 Hz (e.g. 300 Hz, 500 Hz, etc.), while all even multiples (e.g. 200 Hz, 400 Hz, etc.) interfere positively.
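
The 5 ms example can be reproduced with a few lines of Python. This is a sketch under the assumption that both channels carry the same signal at the same level and differ only by the time shift. In that case, summing a sine with its delayed copy and dividing by two gives a gain of |cos(π · f · delay)|. The function names are chosen for illustration only.

```python
import math


def comb_gain(freq_hz, delay_ms):
    """Gain of (x(t) + x(t - delay)) / 2 for a sine wave at freq_hz."""
    return abs(math.cos(math.pi * freq_hz * delay_ms / 1000.0))


def notch_frequencies(delay_ms, max_hz):
    """Frequencies up to max_hz for which the delay equals a 180° phase shift."""
    first = 1000.0 / (2.0 * delay_ms)  # the delay is half a period here
    notches = []
    k = 0
    while (2 * k + 1) * first <= max_hz:
        notches.append((2 * k + 1) * first)  # odd multiples cancel
        k += 1
    return notches


print(notch_frequencies(5.0, 1000))     # [100.0, 300.0, 500.0, 700.0, 900.0]
print(round(comb_gain(100.0, 5.0), 6))  # 0.0 (full cancellation)
print(round(comb_gain(200.0, 5.0), 6))  # 1.0 (level unchanged)
```

Shorter delays push the first notch upward: with 1 ms, the first cancellation sits at 500 Hz.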

The problems with frequency masking …

While we’ve seen that time shifts can be one cause for unwanted filtering effects when downmixing a signal to mono, another effect is unwanted frequency masking caused by the loss of the width layer.

As outlined above, a mono signal is generated by summing the left and right channel of a stereo signal. The loss of the width layer means that all signals covering a certain frequency region now come from the same direction and are no longer separated by their spatial distance to each other. This collapse of all sources into one location can lead to problematic masking effects. A mix with clearly distinguishable sources in stereo may sound muddy in mono, and quiet components may even be fully masked by competing sources.

 

An effect that can further aggravate this unwanted masking is the division of the summed stereo signal by 2. If a signal is hard-panned to the left or right, it is only present on one channel, while a signal coming from the center is equally present on both channels. So, when summing up the channels and dividing the result by 2, the level of a signal panned to the center remains the same (it’s added to itself and divided by 2). However, a signal hard-panned to one side loses about 6 dB in level (it’s added to silence and divided by 2).

For mixing, this means that the wider a source is panned to one side, the more level it will lose when summed to mono.
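
The level loss can be worked out directly from the downmix formula. The sketch below assumes the simple model used in this article: a centered source sits at full level on both channels, a hard-panned source at full level on one channel only. The half-panned case with a gain of 0.5 on the far channel is a made-up intermediate value; the pan law of an actual mixer or DAW may distribute levels differently.

```python
import math


def to_db(gain):
    """Convert a linear gain factor to decibels."""
    return 20 * math.log10(gain)


def mono_gain(left_gain, right_gain):
    """Gain of a source in the mono sum (L + R) / 2, given its gain per channel."""
    return (left_gain + right_gain) / 2


center = mono_gain(1.0, 1.0)       # equally present on both channels
half_left = mono_gain(1.0, 0.5)    # hypothetical partial pan
hard_left = mono_gain(1.0, 0.0)    # present on the left channel only

print(round(to_db(center), 2))     # 0.0
print(round(to_db(half_left), 2))  # -2.5
print(round(to_db(hard_left), 2))  # -6.02
```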

 

Correlation: Bringing it all together

Now that we know that phase problems as well as frequency masking of overlapping sources can lead to problems in a mono downmix, let’s have a look at a number that helps us to identify both kinds of problem – the correlation value.

Simply put, the correlation value is a metric for the “similarity” of two signals. Hence, it’s a good indicator for the perceived width of the mix and for spotting potential phase cancellation problems. Although the actual correlation value of a signal heavily depends on the mix (instruments, number of sources etc.), it’s good to keep the following rules of thumb in mind when analyzing a mix:

 

  • The closer the correlation value is to +1, the more similar the left and right channel are and the smaller the perceived width of the signal is.

  • The closer the value is to zero, the more unrelated the left and right channel are and the larger the perceived width is. Close to 0, the signal becomes very wide and summing may cause unwanted frequency masking.

  • A correlation below 0 indicates out-of-phase components that will typically lead to unwanted phase cancellation if the signal is summed to mono.
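
To make these rules of thumb concrete, here is a minimal sketch of one common way to compute such a value: the normalized correlation of the two channels without any time offset. It is not taken from a specific metering tool. Actual correlation meters typically evaluate a short, moving time window rather than the whole signal, and the three test signals are artificial.

```python
import math


def correlation(left, right):
    """Normalized correlation of two channels at zero lag, between -1 and +1."""
    cross = sum(l * r for l, r in zip(left, right))
    energy = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    if energy == 0.0:
        return 0.0  # convention chosen here: silence counts as uncorrelated
    return cross / energy


N = 480  # one full period of the test tone
tone = [math.sin(2 * math.pi * n / N) for n in range(N)]
shifted = [math.cos(2 * math.pi * n / N) for n in range(N)]  # same tone, 90° shift
inverted = [-s for s in tone]                                # same tone, 180° shift

identical_value = correlation(tone, tone)
unrelated_value = correlation(tone, shifted)
inverted_value = correlation(tone, inverted)
```

With these inputs, `identical_value` comes out at about +1, `unrelated_value` at about 0 and `inverted_value` at about -1, matching the three cases in the list above.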

 

 

What can I do to avoid mono-compatibility problems?

Handle stereo widening effects with care
Effects that manipulate the phase of your signal to create width are a typical problem for mono compatibility. The additional width comes from intentionally “decorrelating” the left and right signal. While the result may sound great in stereo, chances are that your mono downmix sounds hollow and flat.

Avoid negative correlation values
If the correlation meter tends toward negative values, check for signal components that may be out of phase. Sometimes inverting the polarity (often labeled “phase”) of one channel helps to resolve problems with phase cancellation. You can also try to slightly time-shift signal components that are present on both channels, for example with a plug-in that delays a track by a certain number of samples.
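
The idea of delaying one channel by a number of samples can be sketched as a simple search: try each delay and keep the one with the highest correlation. This is an illustration, not a description of how any particular plug-in works. The helper `best_delay` is hypothetical, and the test signal is a pure 100 Hz tone whose right channel runs 5 ms (240 samples at an assumed 48 kHz) ahead of the left. With real music the result is far less clear-cut, so the final judgment should always be made by ear.

```python
import math

SAMPLE_RATE = 48000  # samples per second (assumed)
FREQ = 100.0         # test tone in Hz (assumed)


def correlation(left, right):
    """Normalized correlation of two channels at zero lag, between -1 and +1."""
    cross = sum(l * r for l, r in zip(left, right))
    energy = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return cross / energy if energy > 0.0 else 0.0


def best_delay(left, right, max_delay):
    """Delay (in samples) for the right channel that maximizes the correlation."""
    best_d, best_c = 0, correlation(left, right)
    for d in range(1, max_delay + 1):
        # Delaying right by d samples lines right[n - d] up with left[n];
        # only the overlapping part of the two channels is compared.
        c = correlation(left[d:], right[:-d])
        if c > best_c:
            best_d, best_c = d, c
    return best_d, best_c


left = [math.sin(2 * math.pi * FREQ * n / SAMPLE_RATE) for n in range(4800)]
right = [math.sin(2 * math.pi * FREQ * (n + 240) / SAMPLE_RATE) for n in range(4800)]

before = correlation(left, right)
delay, after = best_delay(left, right, 480)
print(round(before, 3), delay, round(after, 3))  # -1.0 240 1.0
```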

Mono your low end
Since very low frequencies (e.g. sub bass) are non-directional when played back, you should always keep them mono. Bass signals in stereo are particularly prone to phase-cancellation issues, so always make sure that your bass is not unnecessarily wide.

Avoid extremely wide panning
The further left or right a signal is panned in the stereo mix, the better competing sources are separated by the additional layer of width. In mono, this separation is gone, and widely panned sources also lose the most level. If you listen to your mix in mono and realize that one of your sources suddenly disappears, try panning it closer to the center in the stereo mix, or give each source more space in the (stereo) mix by spectral mixing.

Use stereo-widening layers
Instead of spreading out a signal (e.g. the vocal), you may want to try to keep the main signal mono and add additional layers (e.g. background vocals) that are panned to the left and right.