Figuring out: Gain Staging

What is it?

Gain staging is all about managing the audio levels at the different stages of an audio system. In other words, when you need to make something louder, good gain staging means knowing where in the signal chain it's best to do so.

I will focus this article on mix and post-production work in Pro Tools, since this is what I do daily, but these concepts can be applied to any other audio-related situation, like recording or live sound.

Pro Tools Signal Chain

To start with, let's have a look at the signal chain in Pro Tools:

[Diagram: the Pro Tools signal chain, from clip gain and clip effects through inserts, fader and sends to the sub mix bus]

Knowing and understanding this chain is very important when setting your session up for mixing. Note that other DAWs vary in their signal chains. Cubase, for example, offers pre- and post-fader inserts, while in Pro Tools every insert is always pre-fader except for the ones on the master channel.

Also, I've added a Sub Mix Bus (an auxiliary track) at the end of the chain because this is how mixing templates are usually set up, and it's important to keep in mind when thinking about signal flow.

So, let's dive into each of the elements of the chain and see their use and how they interact with each other.

Clip gain & Inserts

As I was saying, in Pro Tools, inserts are pre-fader. It doesn't matter how much you lower your track's volume: the audio clip always hits the plugins at its "original" level. This makes clip gain very handy, since we can use it to control clip levels before they hit the insert chain.

You can use clip gain to make sure you don't saturate the input of your first insert and to keep the level consistent between different clips on the same track. This last use is especially important when audio is going through a compressor, since you want roughly the same amount of signal being compressed across all the different clips on a given channel.
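
As a rough illustration of that second use, here is a minimal sketch (plain numpy, nothing Pro Tools specific) of how you could calculate the clip gain offset needed to bring every clip on a track to the same RMS level; the target level and the test signals are made up.

```python
import numpy as np

def rms_db(clip):
    """RMS level of a mono clip (float samples in -1..1), in dBFS."""
    rms = np.sqrt(np.mean(clip ** 2))
    return 20 * np.log10(max(rms, 1e-12))

def clip_gain_to_target(clip, target_db=-20.0):
    """dB of clip gain needed so the clip's RMS sits at the target level."""
    return target_db - rms_db(clip)

# Two hypothetical clips of the same source recorded at different levels
sr = 48000
t = np.linspace(0, 1, sr, endpoint=False)
quiet_clip = 0.05 * np.sin(2 * np.pi * 220 * t)
loud_clip = 0.40 * np.sin(2 * np.pi * 220 * t)

for name, clip in [("quiet clip", quiet_clip), ("loud clip", loud_clip)]:
    print(f"{name}: apply {clip_gain_to_target(clip):+.1f} dB of clip gain")
```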

So what if you want a post-fader insert? As I said, you can't directly change an insert to post-fader, but there is a workaround. If you want to affect the signal after the track's volume, you can always route that track (or tracks) to an auxiliary track and put the inserts on that aux. In this case, these inserts would be post-fader from the audio channel's perspective, but don't forget they are still pre-fader from the aux channel's own perspective.

Signal flow within the insert chain

Since the audio signal flows from the first to the last insert, when choosing the order of these plugins it's always important to think about the goal you want to achieve. Should you EQ first? Compress first? What if you want a flanger: should it be at the end of the chain or maybe at the beginning?

I don't think there is a definitive answer and, as I was saying, the key is to think about the goal you have in mind and whichever order makes conceptual sense to you. EQ and compression order is a classic example of this.

The way I usually work is that I use EQ first to reduce any annoying or problematic frequencies, usually with a high-pass filter as well to remove unnecessary low end. Once this is done, I use the compressor to control the dynamic range as desired. The idea behind this approach is that the compressor only works on the part of the signal I actually want to keep.

I sometimes add a second EQ after the compressor for further enhancements, usually boosting frequencies if needed. Any other special effects, like a flanger or a vocoder, would go last in the chain.

Please note that, if you use the new Pro Tools clip effects (which I do use), these are applied to the clip before the fader and before the inserts.

Channel Fader

After the insert chain, the signal goes through the channel fader, or track volume. This is where you usually do most of the automation and levelling work. Good gain staging makes working with the fader much easier: you want to be working close to unity, that is, close to 0 dB.

This means that, after clip gain, clip effects and all inserts, you want the signal to be at your target level when the fader is hovering around 0. Why? This is where you have the most control, headroom and comfort. If you look closely at the fader you'll notice it has a logarithmic scale. A small movement near unity might mean 1 or 2 dB, but the same movement further down could be a 10 dB change. Mixing close to unity makes subtle and precise fader movements easy and comfortable.
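
For reference, the fader is just a gain stage expressed in dB, so every fader move is a multiplication of the signal. A tiny sketch of that maths (generic numpy, not anything Pro Tools specific):

```python
import numpy as np

def fader_gain(signal, fader_db):
    """Apply a fader-style gain, specified in dB, to a signal."""
    return signal * 10 ** (fader_db / 20)

sig = np.ones(4)
print(fader_gain(sig, 0.0))    # unity: signal unchanged
print(fader_gain(sig, -6.0))   # roughly half the amplitude
print(fader_gain(sig, -20.0))  # one tenth of the amplitude
```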

Sends

Pro Tools sends are post-fader by default, and this is the behaviour you want most of the time. Sending audio to a reverb or delay is probably the most common use for a send, since you want to keep 100% of the dry signal and just add some wet, processed signal that changes in level as the dry signal does.

Pre-fader sends are mostly useful for recording and live mixing (sending a headphone mix is a typical example) and I don't find myself using them much in post. Nevertheless, a possible use in a post-production context could be when you want to work with 100% of the wet signal regardless of how much of the dry signal is coming through. Examples of this could be special effects and/or very distant or echoey reverbs where you don't want to keep much of the original dry signal.

Channel Trim

Trim effectively gives you a second volume lane per track. Why would this be useful? I use trim when I already have an automation curve that I want to keep but I just want to make the whole thing louder or quieter in a dynamic way. Once you finish a trim pass, both curves coalesce into one. This is the default behaviour, but you can change it under Preferences > Mixing > Automation.

VCAs

VCAs (Voltage Controlled Amplifiers) are a concept that comes from analogue consoles, and they allow you to control the level of several tracks with a single fader. On a console, they do this by controlling the voltage reaching each channel, but in Pro Tools, VCAs are a special type of track that has no audio, inserts, inputs or outputs. VCA tracks just have a volume lane that can be used to control the volume of any group of tracks.

So, VCAs are something you usually use when you want to control the overall level of a section of the mix as a whole, like the dialogue or sound effects tracks. In terms of signal flow, VCAs just change a track's level via the track's fader, so you could say they act as a third fader (the second being trim).

Why is this better than just routing the same tracks to an auxiliary track and changing the volume there? Auxiliaries are also useful, as you will see in the next section, but if the goal is just level control, VCAs have a few advantages:

  • Coalescing: After every pass, you are able to coalesce your automation, changing the target tracks' levels and leaving your VCA track flat and ready for your next pass.

  • More information: When using an auxiliary instead of a VCA track, there is no way to know if a child track is being affected by it. If you accidentally move that aux fader you may go crazy trying to figure out why your dialogue tracks are all slightly lower (true story). VCAs, on the other hand, show you a blue outline with the actual volume curve that would result after coalescing both lanes, so you can always see how a VCA is affecting a track.

  • Post-fader workflow: Another problem with using an auxiliary to control the volume of a group of tracks is that if you have post-fader sends on those tracks, you will still send that audio away regardless of the parent auxiliary's level. This is because the send happens before the signal reaches the auxiliary. VCAs avoid this problem by directly affecting the child track's volume and thus also affecting how much is sent post-fader.

Sub Mix buses

This is the final step of the signal chain. After all inserts, faders, trim and VCAs, the resulting audio signals can be routed directly to your output, or you may consider using a sub mix bus instead. This is an auxiliary track that sums all the signals from a specific group of channels (like dialogue tracks) and allows you to control and process each sub mix as a whole.

These are the type of auxiliary tracks I was talking about in the VCA section. They may not be ideal for controlling the levels of a sub mix, but they are useful when you want to process a group of tracks with the same plugins or when you need to print different stems.

An issue you may run into when using them is finding yourself "fighting" for a sound to be loud enough. You feel that pushing the fader more and more doesn't really help and you barely hear the difference. When this happens, you've probably run out of headroom. Pushing the volume doesn't seem to help because a compressor or limiter further down the signal chain (that is, acting as a post-fader insert) is squashing the signal.

When this happens, you need to go back and give yourself more headroom, either by making sure you are not over-compressing or by lowering every track's volume until you are working at a manageable level. Ideally, you should be metering your mix from the start so you know where you are in terms of loudness. If you mix to a loudness standard like EBU R128, that should give you a nice and comfortable amount of headroom.
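
If you want to check where a bounce of your mix sits, a quick way is to measure its integrated loudness in LUFS. A minimal sketch, assuming the third-party pyloudnorm and soundfile Python packages and a hypothetical file name:

```python
import soundfile as sf        # pip install soundfile
import pyloudnorm as pyln     # pip install pyloudnorm

# Hypothetical bounce of the mix; substitute your own file
data, rate = sf.read("mix_printmaster.wav")

meter = pyln.Meter(rate)                    # ITU-R BS.1770 meter
loudness = meter.integrated_loudness(data)  # integrated loudness in LUFS

print(f"Integrated loudness: {loudness:.1f} LUFS")
print(f"Offset to hit the EBU R128 target of -23 LUFS: {-23.0 - loudness:+.1f} dB")
```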

Final Thoughts

Essentially, mixing is about making things louder or quieter to serve the story that is being told. As you can see, it's important to know where in the audio chain the best place to do this is. If you keep your chain in order, from clip gain to the sub mix buses, making sure levels are optimal every step of the way, you'll be in control and have a better idea of where to act when issues arise. Happy mixing!

Dear Devs, 7 Reasons why your Game may need Audio Middleware.

There are several audio middleware programmes on the market. You may have heard of the two main players: FMOD and Wwise. Both offer free licenses for smaller-budget games and paid licenses for bigger projects.

So, what is Audio Middleware? Does your game need it?

Audio middleware is a bridge between your game's engine and the game's music and sound effects. Although it's true that most game engines offer ready-to-use audio functionality (some of which overlaps with the features explained below), middleware gives you more power and flexibility for creating, organizing and implementing audio.

Here are the seven main reasons to consider using middleware:

1. It gives Independence to the Audio Team.

Creating sound effects and music for a game is already a good amount of work, but that is barely half the battle. For these assets to work, they need to be implemented in the game and be connected to in-game states and variables like health or speed.

This connection will always need some collaboration between the audio team and the programming team. With middleware, this is a much easier process: once the variables are created and hooked up, the audio team is free to tweak how the gameplay affects the audio without needing to go into the code or the game engine.
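
To make that division of labour concrete, here is a toy sketch (plain Python, not FMOD or Wwise API code; all names are hypothetical) of the idea: the game code only reports a raw variable, while the audio team owns the curve that turns it into audio behaviour.

```python
# The audio team authors and tweaks this mapping without touching game code.
def health_to_lowpass_hz(health):
    """Map player health (0..100) to a low-pass cutoff: muffled when hurt."""
    health = max(0.0, min(100.0, health))
    return 500.0 + (20000.0 - 500.0) * (health / 100.0)

class HeartbeatEvent:
    """Stand-in for a middleware event instance."""
    def set_parameter(self, name, value):
        print(f"[audio] {name} -> {value:.0f}")

# The programming team only pushes the raw game variable.
event = HeartbeatEvent()
for health in (100, 60, 15):
    event.set_parameter("lowpass_hz", health_to_lowpass_hz(health))
```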

2. Adaptive Music.

Music is usually linear and predictable, which is fine for linear media like movies. But in the case of video games, we have the power to make music adapt and react to the gameplay, giving the player a much more compelling experience.

Middleware plays an important role here because it gives the composer non-linear tools to work with, letting them think about the soundtrack not in terms of fixed songs but of different layers or fragments of music that can be triggered, modified and silenced as the player progresses.

3. Better handling of Variation and Repetition.

Back when memory was a limiting factor, games had to get by with just a few sounds, which usually meant repetition, a lot of repetition. Although repetition is certainly still used to give an old-school flavour, it's not very desirable in modern, more realistic games.

When something happens often enough in a game, the associated sound effect can get boring and annoying pretty fast. Middleware offers tools to avoid this, like randomly selecting the sound from a pool of different variations or randomly altering the pitch, volume or stereo position of the audio file each time it's triggered. When all these tools are combined, we end up with an audio event that is different each time but cohesive and consistent, offering the player a more organic and realistic experience.
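
A minimal sketch of that idea in plain Python (the file names and randomization ranges are made up, and this isn't any specific middleware's API):

```python
import random

FOOTSTEP_FILES = ["footstep_01.wav", "footstep_02.wav", "footstep_03.wav"]

def trigger_footstep():
    """Pick a random variation and nudge pitch, level and pan each time."""
    return {
        "file": random.choice(FOOTSTEP_FILES),
        "pitch_semitones": random.uniform(-1.0, 1.0),  # small pitch offset
        "volume_db": random.uniform(-2.0, 0.0),        # small level offset
        "pan": random.uniform(-0.2, 0.2),              # slight stereo drift
    }

for _ in range(3):
    print(trigger_footstep())
```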

4. Advanced Layering.

Layering is how we sound designers usually build sounds: we combine different, modified individual sounds to create a new and unique one. Instead of mixing this combination of sounds down to a single file, middleware allows us to import all these layers into different tracks so we can apply different effects and treatments to each sound separately.

This flexibility is very important and powerful. It helps us better adapt the character and feel of a sound event to the context of the gameplay. For example, a sci-fi gun could have a series of layers (mechanical, laser, alien hum, low-frequency impact, etc.) and having all these layers separated allows us to vary the balance between them depending on things like ammo, distance to the source or damage to the weapon.
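
Here is a rough sketch of what that per-layer balancing could look like, using distance as the driving variable; the layer names come from the example above and the curves are purely illustrative:

```python
def layer_gains(distance_m):
    """Illustrative per-layer gains (0..1) for the sci-fi gun example."""
    close = max(0.0, 1.0 - distance_m / 20.0)  # close-up detail fades quickly
    far = min(1.0, distance_m / 50.0)          # low-end thump carries further
    return {
        "mechanical": close,
        "laser": 1.0,                  # core layer always present
        "alien_hum": 0.5 + 0.5 * close,
        "low_impact": 0.4 + 0.6 * far,
    }

for d in (2, 15, 60):
    print(f"{d:>2} m:", {k: round(v, 2) for k, v in layer_gains(d).items()})
```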

5. Responsive Audio tools.

Sound effects are usually created using linear audio software like Pro Tools, Nuendo or Reaper. These are also called DAWs (Digital Audio Workstations). The tools we find in DAWs allow us to transform and shape sounds: equalization, compression and other effects are the bread and butter of audio people. Most of the modern music, sound effects and movies you've ever heard came out of a DAW.

But the issue is that once you bounce or export your sound it's kind of set in stone: that's how it will sound when you trigger it in your game. Middleware not only gives us the same tools you can find in a DAW; more importantly, it also gives us the ability to make them interact with variables and states coming from the game engine.

How about a monster whose voice gets deeper as it gets larger? Or music and sound effects that get louder, brighter and more distorted as time runs out?

6. Hardware & Memory Optimization.

Different areas of a game compete for processing power, and audio is usually not the first priority (or even the second). That's why it's very important to be able to optimize and streamline a game's audio as much as possible.

Middleware offers handy tools to keep the audio tidy, small and efficient. You can customize things like reverbs and other real-time effects and also choose how much quality you want from the final compression algorithm for the audio.

7. Platform flexibility & Localization.

If you need to prepare your game for different platforms, including PC, consoles or mobile phones, middleware makes this very easy. You can compile a different version of the game's audio for each platform. Memory or hardware requirements may differ for each of them, and you may need to simplify sound events, bake in effects or turn a surround mix into a stereo one.

You can also have a different version per language, so the voice acting would be different but the overall sound effects and treatment of the voices would be consistent.


I hope this gave you a bit of a taste of what middleware is capable of. When in doubt, don't hesitate to ask us, audio folks!
Thanks for reading.

Exploring Sound Design Tools: Mammut

[Screenshot: the Mammut interface]

Mammut is a strange and unpredictable piece of software. It basically does a Fast Fourier Transform (FFT) of a sound file but unlike Paulstretch, which uses slices of the sound, Mammut uses the whole thing at once, creating more drastic results.

Mammut is not (in any way) a commercial tool but more of an experimental one, so I won't go into detail about what it's doing under the hood. Instead, I will focus on how it can be used to create interesting and cool sound design. If you want to follow along, you can download it here.

Software Features

Mammut has many processing tabs but I will only cover some of the most interesting ones.

Loading & Playing sounds.

Mammut works as standalone software. You need to load a sound (using the browse button) to be able to start fooling around. The "Duration Doubling" section adds extra space (technically, FFT zero padding) after the sound. This extra space gives some of the effects (like stretching) more time to develop and evolve.
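
If you want to picture what that duration doubling does, it's essentially appending silence before the single whole-file FFT. A tiny numpy equivalent (assuming a hypothetical mono file and the soundfile package):

```python
import numpy as np
import soundfile as sf   # pip install soundfile

data, rate = sf.read("input.wav")         # hypothetical mono file
padded = np.pad(data, (0, len(data)))     # "duration doubling": append silence
spectrum = np.fft.rfft(padded)            # one FFT over the whole (padded) file
```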

You can play and stop the sound in the Play section, where there is also a timeline of sorts. Now that our sound is loaded, let's see what we can do with it.

Stretch

This creates a non-linear frequency stretch with frequency-sweep effects. All frequencies are raised to the power of the selected exponent, so small changes are enough to produce very different results. Because of the frequency sweeps, it sounds quite sci-fi, like the classic Star Wars blaster sound. Here are some examples at different exponents:

As you can hear, as the values get further away from 1, the effect is more pronounced and it also starts sooner. Here are some results with values higher than 1:

And here are some interesting results with a servo motor sound.

These sounds remind me a bit of Japanese anime or video games; maybe this could be one of the steps towards achieving that kind of style from scratch.
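
Based on that description ("all frequencies are raised to the power of the selected exponent"), here is my rough numpy approximation of what the Stretch tab might be doing; it is a guess for experimentation, not Mammut's actual code.

```python
import numpy as np

def mammut_style_stretch(data, rate, exponent=1.02):
    """Whole-file FFT, then move each bin's frequency f to f ** exponent."""
    n = len(data)
    spectrum = np.fft.rfft(data)
    freqs = np.fft.rfftfreq(n, d=1.0 / rate)

    out = np.zeros_like(spectrum)
    new_freqs = freqs ** exponent                      # the frequency warp
    new_bins = np.round(new_freqs * n / rate).astype(int)
    valid = (new_bins >= 0) & (new_bins < len(spectrum))
    np.add.at(out, new_bins[valid], spectrum[valid])   # accumulate moved bins

    return np.fft.irfft(out, n=n)

# e.g. processed = mammut_style_stretch(mono_samples, 48000, exponent=0.98)
```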

Wobble

This stretches and contracts the frequency spectrum following a sinusoidal transfer function. You can control the frequency and amplitude of this change.

This one is weird (no surprise) and it doesn't really do what I was expecting. It tends to create sounds that are increasingly dissonant and "white noise like" as you go to more extreme parameters. Here are some examples:

Threshold

Quite cool. This removes all the frequencies below a certain intensity threshold, which means you can kind of "extract" the fundamental timbre or resonance of a sound. Used on ambiences (third example below), it sounds dissonant at first and then, once you remove almost all frequencies, kind of dreamy and relaxing.
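
The underlying idea is simple enough to sketch in numpy: transform the whole file, zero every bin whose magnitude falls below a threshold, and transform back. A minimal approximation (not Mammut's actual code; here the threshold is expressed as a "keep the top X%" ratio):

```python
import numpy as np

def spectral_threshold(data, keep_ratio=0.1):
    """Keep only the strongest FFT bins and zero the rest."""
    spectrum = np.fft.rfft(data)
    mags = np.abs(spectrum)
    cutoff = np.quantile(mags, 1.0 - keep_ratio)   # keep the top 10% by default
    spectrum[mags < cutoff] = 0.0
    return np.fft.irfft(spectrum, n=len(data))
```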

Block Swap

This one basically divides the frequency spectrum into chunks and interchanges their halves a given number of times. It's hard to wrap your head around, but it produces interesting results. First, the number of swaps seems to make the sound more "blurry" and abstract, as you can hear:

Then, the block size seems to create different resonances around different frequencies as you increase it.

Mirror

Simple but hard to predict. It reflects the whole spectrum around the specified frequency. The problem with this is that when you flip the spectrum around a low frequency, everything ends up below it and is mostly lost. On the other hand, if you use a higher frequency, too much of the energy ends up in the harsh 5-15 kHz area.
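
Again, here is my best guess at the gist of it in numpy terms: reflect every bin around the bin of the chosen frequency and discard whatever lands outside the spectrum (an approximation, not Mammut's actual implementation).

```python
import numpy as np

def spectral_mirror(data, rate, mirror_hz=2000.0):
    """Reflect the whole spectrum around mirror_hz."""
    n = len(data)
    spectrum = np.fft.rfft(data)
    n_bins = len(spectrum)
    k0 = int(round(mirror_hz * n / rate))      # bin index of the mirror axis
    src = 2 * k0 - np.arange(n_bins)           # reflected source bin for each bin
    valid = (src >= 0) & (src < n_bins)
    out = np.zeros_like(spectrum)
    out[valid] = spectrum[src[valid]]
    return np.fft.irfft(out, n=n)
```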

A couple of examples:

Keep Peaks

[Screenshot: Mammut's Keep Peaks tab]

This one doesn't even have controls, or an explanation in the documentation. It seems to extend the core timbre of the sound across time, which can be pretty useful. When using this option, the duration doubling function is especially handy.

Conclusions

Mammut is certainly original and unique. Since it only works standalone and is rather unpredictable and unstable, I don't feel it would be very easy to include in someone's workflow. Having said that, it's definitely a nice wild card to have whenever you need something different.

Figuring out: Dolby Atmos

Figuring out: About this series
They say the best way to really learn about something is to force yourself to explain it to someone. That is the goal of this series: I will delve into a topic that I feel I don't know enough about and explain my findings. Hopefully, we'll both learn something useful!


More than a gimmick?

Up until some months ago, Dolby Atmos was to me mostly about having speakers on the ceiling in the hope of attracting people back to the cinemas. After getting to know Atmos a little better, I wanted to see what it has to offer and if it is really going to be the new standard in professional audio. Consider this a 101 introduction on Dolby Atmos.

Surround Systems

Before Atmos, let's start with something familiar. Surround systems have been used for decades to offer a more interesting audio experience to the listener. 5.1 and 7.1 are the most common formats for both cinemas and home setups.

Something important to understand about these systems is that they are channel-based. For example, a theatrical 7.1 system would offer us the following channels:

  • Left, Centre and Right (the screen channels)
  • Left Side Surround and Right Side Surround
  • Left Rear Surround and Right Rear Surround
  • Low Frequency Effects (LFE), the ".1"

As you can see, these channels can be composed of just one speaker (like the centre channel) or of several of them (like the left surround channel). We can send audio to any channel independently, but we have no control over how much is sent to each of the individual speakers that form that channel.

That is basically how all surround systems work; the only thing that varies is the number of channels.

Dolby Atmos brings two innovations to the table. Firstly, it uses an object-based approach on top of the previous channel-based system. Secondly, it expands the surround feel by adding speakers to the ceiling, unlocking 3D sound. Let's look at both of these features:

Object-based

Dolby Atmos allows for 128 channels in total. We can use a certain number of those for traditional channel-based stems and the rest for the new sound objects.

Think of these sound objects as individual mono sounds that you can place and move around the room. If you place a sound object in a specific location, Dolby Atmos will play the sound at that location, addressing the nearby speakers individually as needed, regardless of how big the room is or how many speakers there are.

In other words, you are telling Atmos the coordinates of the sound instead of how much of the sound feeds each channel. This allows you to place sounds with great precision in big rooms, but at the same time the mix will translate well to smaller rooms or even headphones, since Atmos just uses the coordinates of each sound object in 3D space.
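
To picture the difference, here is a toy sketch of an object renderer: the mix stores the object's coordinates, and the playback system turns them into per-speaker gains for whatever layout happens to be in the room. This is nothing like the actual Dolby renderer (the maths here is a made-up distance weighting), just an illustration of the concept.

```python
import numpy as np

# Rough speaker positions (x, y, z) for a hypothetical 7.1.4 room,
# with z = 1 for the four overhead speakers. The LFE is left out.
SPEAKERS_7_1_4 = {
    "L": (-1, 1, 0), "C": (0, 1, 0), "R": (1, 1, 0),
    "Lss": (-1, 0, 0), "Rss": (1, 0, 0),
    "Lsr": (-1, -1, 0), "Rsr": (1, -1, 0),
    "Ltf": (-1, 0.5, 1), "Rtf": (1, 0.5, 1),
    "Ltr": (-1, -0.5, 1), "Rtr": (1, -0.5, 1),
}

def render_object(position, speakers=SPEAKERS_7_1_4):
    """Distance-weighted gains for one sound object at (x, y, z)."""
    pos = np.array(position, dtype=float)
    weights = {name: 1.0 / (np.linalg.norm(pos - np.array(xyz)) + 1e-3) ** 2
               for name, xyz in speakers.items()}
    total = sum(weights.values())
    return {name: round(w / total, 3) for name, w in weights.items()}

print(render_object((0.0, 0.5, 1.0)))   # a sound overhead, slightly to the front
```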

3D Sound

The second innovation is probably the flashiest.

If you think about it, stereo is one-dimensional: sound moves along a horizontal line. Surround audio is 2D: the soundscape is around you, on a horizontal plane. 3D is the next step: sound lives in a cube or a sphere around you.

Before Atmos, some surround 9.1 systems tried to achieve this by placing two speakers on top of the front speakers in order to give some "height" to some elements of the mix.

Dolby Atmos goes one step further by adding speakers to the ceiling itself. Elements like ambiences, FX or music can now be placed overhead, opening up the third dimension for the listener.

In theatres, these ceiling speakers usually go in two rows. There are also some extra surround speakers on the walls to make panning smoother when transitioning sounds between onscreen and offscreen. In total, up to 64 individual speakers are allowed on a theatrical Atmos installation.

At home, two or four overhead speakers are usually used, so you'll see configurations like 5.1.2 or 7.1.4. Note how the third number denotes the number of ceiling speakers. Up to 22 speakers are allowed in home setups.

Since installing ceiling speakers may not always be very practical in a home setting, sound is sometimes "fired" at the ceiling so that it bounces back down to the listener, giving the impression that it comes from above.

Crafting a soundscape with Atmos in mind

Knowing that a project will be mixed in Atmos changes the approach in terms of sound design and mixing, giving us more tools and challenges to achieve a compelling soundtrack.

For example, building ambiences now has an additional dimension. Imagine a scene inside a car while it's raining: you could have different layers for the car engine and the city exterior, and then the sound of the rain falling on the roof featured in the overhead speakers. A forest ambience could have discrete mono birds chirping above and around you, some of them static, some of them moving throughout the 3D space.

It's also worth noting that Atmos setups usually include one or more extra subwoofers close to the surround and overhead speakers. Although low frequencies are not very directional, it still makes a difference in terms of sound placement to use the surround subwoofer instead of the one behind the screen.

Additionally, the Atmos standard makes sure that all surround speakers offer the same sound pressure level and frequency response as the onscreen ones. This means that when designing sound objects with a wide frequency range, like a fighter jet going by overhead, we have the whole spectrum at our disposal. This wasn't the case with previous systems, since the surround speakers did not have enough power and were best suited for simple atmospheric and background sounds.

Atmos makes you think more about where you want the audio to be in 3D space than about which channels and speakers to feed the audio to. It turns the mix into a full-frequency canvas on which to position your elements.

Encoding for Dolby Atmos.

When preparing audio for Atmos, there are two distinct uses we can give to each of the 128 available channels. We can have sound objects, as discussed above, and we can also have channel-based submixes (beds). These beds can be created in any traditional channel-based configuration like 5.1 or 7.1 and are mapped to individual speakers or arrays of speakers the old-fashioned way. In contrast, objects are not mapped to any speaker but saved with metadata that describes their coordinates over time.

This double approach (beds + objects) makes Atmos backwards compatible since we are also creating a traditional channel-based version when creating the masters.
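
Conceptually, you can think of an Atmos master as beds plus objects-with-metadata. Here is a tiny data-structure sketch of that idea; the field names and file names are illustrative, not the actual Dolby master file format.

```python
from dataclasses import dataclass, field

@dataclass
class Bed:
    name: str
    layout: str          # e.g. "7.1"
    audio_file: str

@dataclass
class SoundObject:
    name: str
    audio_file: str
    # (time_seconds, (x, y, z)) keyframes; a renderer interpolates between them
    position_keyframes: list = field(default_factory=list)

master = {
    "beds": [Bed("Dialogue", "7.1", "dx_bed.wav"),
             Bed("Music", "7.1", "mx_bed.wav")],
    "objects": [SoundObject("fighter_jet", "jet.wav",
                            [(0.0, (-1.0, 1.0, 1.0)), (4.0, (1.0, -1.0, 1.0))])],
}
```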

To put all this information together we use a renderer. I won't go into too much detail here, but Dolby basically offers two ways of doing this:

Dolby Mastering Suite + RMU:
This is the most advanced option; it is used for theatrical applications and Dolby-certified rooms. It combines the Dolby Mastering Suite software with the Dolby Rendering and Mastering Unit (RMU), a dedicated Dell server that communicates with Pro Tools via MADI and processes all the Atmos information while compensating for any delays in the system.

The RMU can be used for monitoring, authoring and recording Dolby Atmos print masters. It is also used for creating and loading room calibrations and configurations.

Note that the Dolby Mastering Suite software runs only on dedicated hardware (the RMU), while we still need a different software package for any Pro Tools systems involved in the Atmos workflow. This would be the Dolby Production Suite, which I explain below. The Dolby Mastering Suite includes three copies of the Dolby Production Suite, but you can also buy the latter separately.

The mighty RMU

Dolby Production Suite:
This is the package that should be installed on the Pro Tools machines. It basically includes the renderer itself, a monitoring application and all the necessary Pro Tools plugins. In case you are using an RMU, this package will allow you to connect with it. If you are not, it will allow you to play, edit and record any Atmos mixes all within the same Pro Tools system.

While the Dolby Atmos Production Suite includes the ability to render Atmos objects, just like you can with the RMU, it has significant limitations. The software is an "in the box" renderer that runs on the same system as your Pro Tools session, so if your project is large you may not be able to run it. Also, the software won't be able to compensate for any delays produced in the system.

Having said that, the Dolby Production Suite may be powerful enough for Blu-ray, streaming and VR projects, with a limit of 22 monitor outputs. For larger and/or theatrical projects an RMU is necessary, which is capable of up to 64 outputs.

Dolby Atmos Everywhere

Atmos in home theatres is not rendered the same way as in cinemas because of limited bandwidth and processing power. Nearby objects and speakers are clustered together, conserving any relevant panning metadata. This simplified Atmos mix can then be played through a home Atmos setup, like a 7.1.2.

Since ceiling speakers are cumbersome, home setups are becoming more accessible with the inclusion of sound bars and upward-firing speakers.

Blu-rays can carry an Atmos soundtrack, and some broadcasting and streaming companies like Sky or Netflix are starting to offer Atmos content. The 2018 Winter Olympics were the first live event offered in Atmos.

In the world of video games, Dolby Atmos could be especially promising, enhancing the player's experience with immersive and expressive 3D audio. Currently, Xbox One, PC and, to some extent, the PS4 offer Dolby Atmos options via either an AV receiver or headphones (behind a paywall). There are a handful of titles ready for Atmos, like Overwatch, Battlefield 1 or Star Wars: Battlefront.

Any Atmos mix can be scaled down to a pair of headphones. You don't need surround headphones for this: the Dolby algorithms convert all the Atmos channels into a stereo binaural signal that sounds like it's around you in 360°. Some phones and tablets are starting to support this already.

Final Thoughts

It seems like Dolby Atmos is here to stay and become the new standard the same way stereo and surround sound replaced their older counterparts.

In my opinion, the key qualities of Atmos are its object-based technology and its scalability. Overhead 3D audio is very cool, but it may not be game-changing enough or very accessible for the average user. It remains to be seen whether binaural headphone technology and upward-firing speakers will be good enough to recreate the 3D feel that theatres can currently provide.