sonible plug-ins: Artificial Intelligence inside

May 11, 2022 | Know-how

Have you ever asked yourself how and why your sonible smart plug-ins are powered by AI? sonible’s co-founder Alexander Wankhammer gives you answers in this short excursion into the history of sonible and explains what we are aiming for in developing audio production tools with intelligent processing.

Note: We replaced smart:EQ 2 with smart:EQ 3! The intelligent equalizer now sports two AI-powered functions and has been extensively overhauled. The following insights into the processing and functionality, however, apply to smart:EQ 2 and smart:EQ 3 alike.

How and when did sonible come up with the idea to incorporate AI into audio plug-ins?

Alex: When Peter, Ralf and I founded sonible, AI was already on our minds. The idea of using machine learning (ML) to enhance the performance of existing algorithms intrigued us. We saw what was possible at that time in other fields by using ML and wanted to make this processing power accessible to music producers.

The starting point for fully delving into AI technologies was the development of frei:raum, released in January 2015. Its smart:EQ layer is enhanced with “classical” statistics-based ML algorithms that automatically correct spectral deficiencies. From working with ML, we have since advanced to incorporating Deep Learning (DL) into our latest plug-in releases.

What does Deep Learning do in an audio plug-in?

Alex: How DL can work in an audio plug-in is best explained with an example. Let’s take smart:EQ 3. At its core is a system trained primarily with the help of huge amounts of data. This system has learned how to transform “poor data”, characterized by spectral deficiencies, into “good data”, characterized by spectral balance. The development of smart:EQ 3 included building a specialized convolutional neural network architecture. By showing it a spectro-temporal representation of the “poor data” and defining “good data” as the target for its output – and doing so thousands of times – the network learned how to correct such issues. From then on, the trained network works on its own.
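To make the “poor data” vs. “good data” idea concrete, here is a toy numpy sketch of what such a training pair looks like and what the network effectively has to learn. This is purely illustrative – it is not sonible’s actual architecture or data, and all band counts, values and variable names are invented for the example.

```python
import numpy as np

# Toy training pair: a "good" (spectrally balanced) magnitude spectrogram
# and a "poor" version of the same material with a mid-band deficiency.
rng = np.random.default_rng(0)
n_bands, n_frames = 32, 100

good = np.abs(rng.normal(1.0, 0.1, (n_bands, n_frames)))  # balanced target
dip = np.ones(n_bands)
dip[10:20] = 0.4                                          # scooped mids
poor = good * dip[:, None]                                # deficient input

# A network trained on thousands of such pairs learns to map poor -> good.
# The ideal per-band correction it would have to learn, averaged over time:
correction_db = 20 * np.log10(good.mean(axis=1) / poor.mean(axis=1))
print(round(correction_db[15], 2))  # the scooped bands need roughly +8 dB
```

The unaffected bands come out at 0 dB correction, while the deficient bands require a boost of about 8 dB – exactly the kind of spectral fix the paragraph above describes.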

Since this process is purely data-driven, factors such as the selection of high-quality data, data pre-processing and quality control have an immense impact on the performance of the overall system. That’s why generating profiles for a smart plug-in is quite time-consuming. Finding the best combination of parameters, the right way to evaluate the results sonically, re-tuning and so on took us years – and it is something we continue to improve.

To further fine-tune the automatically generated results, we combine the network with model-based approaches.

Why do you combine data-driven and model-driven methods?

Alex: The data-driven method is a kind of black box. While its results are excellent for technical aspects of mixing that come down to “right vs. wrong” or “good vs. bad” – such as source separation or instrument recognition – many mixing tasks involve crucial decisions that require a formal understanding of audio mixing. That’s where model-driven methods come into play. These are also data-based to a certain extent, but the algorithms are fully comprehensible and, more importantly, can be optimized manually. We have found that combining the two is the right way to go for us.
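The division of labor between the two stages can be sketched as follows. This is a hypothetical illustration, not sonible’s implementation: random values stand in for a network’s per-band gain suggestions, and the model-based stage applies two hand-tunable rules – clamping extreme gains and smoothing the curve across neighboring bands.

```python
import numpy as np

# Stage 1 (data-driven, black box): per-band EQ gain suggestions in dB.
# Here faked with random values standing in for a network's output.
rng = np.random.default_rng(1)
raw_gains_db = rng.normal(0.0, 6.0, 24)

# Stage 2 (model-driven, fully comprehensible and manually tunable):
MAX_GAIN_DB = 6.0  # rule 1: never boost or cut by more than +/- 6 dB
clamped = np.clip(raw_gains_db, -MAX_GAIN_DB, MAX_GAIN_DB)

kernel = np.array([0.25, 0.5, 0.25])  # rule 2: smooth across neighbor bands
smoothed = np.convolve(clamped, kernel, mode="same")
```

The point of the design is visible in the code: the network’s suggestions can be arbitrary, but every rule in the second stage is a plain, inspectable line that an engineer can re-tune by hand.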

What is sonible’s approach when it comes to AI and audio production?

Alex: Today, there are generally three directions in which AI can be integrated into tools for music production: assistive mixing, which focuses mainly on fixing technical issues while the creative process remains fully in the hands of the producer; generative AI, which automatically suggests melodies or chord progressions; and AI effects, which extend the realm of possibilities for generating new sounds. These three options can also be combined.

We’ve chosen the assistive mixing approach for our plug-ins in order to help professionals save time on repetitive tasks and to support ambitious beginners in getting started more quickly. In the end, our goal is to help people through all the basic, iterative groundwork so they can quickly get into the “creative flow”. I like to see our plug-ins as a kind of personal mixing assistant – a second, very experienced mixing engineer supporting you in your work.