Here are my notes about the world of Ambisonics. This is a new area to me so, following this blog’s phylosophy, I will try to learn by explaining. Take this as an introduction to the subject.

The basic idea

We usually think about audio formats in terms of channels. Mono and stero being the most basic and used ones. If we open up the 2D space even more, we get surround audio like 5.1 and 7.1. Finally, the last step is to use the full 3D space and that’s where Ambisonics comes in.

The more complexity and channels we have, the harder is to make systems compatible between each other. In order to solve this, Ambisonics trascends the idea of channels and uses the concept of sound fields which represent planes of audio in 3D space.

This then allows to keep the aural information in a “speaker arrangement agnostic format” that can be decoded into any amount of speakers at the time of reproduction.

M/S Format

These planes of audio are represented in a special format called B-Format. You can think of this format as a natural extension of the M/S format so let’s start with that.

To get a M/S recording, we first use a figure of eight microphone facing sideways to the source (this is the “side”). This microphone will pickup the stereo information. At the same time, we use a cardioid microphone facing the source (this is the “mid”).

Once we want to decode these signals into stereo, we just need to sum mid and side to obtain the stereo left and then sum mid and the side with polarity reversed to obtain the right side. If you think about this, you realize that the “side” signal is basically a representation of the difference between left and right.

But why would we want to record things this way? Why not just record in stereo with a X/Y technique or similar? Recording using M/S has a few advantages. Firstly, we get automatic mono compatibility since we have the mid signal which we can use without fear of the phase cancellations that would happen if we sum the channles from an X/Y format. Additionally, since we can decode the M/S recording into stereo after the fact, we can control how wide we want the resulting stereo signal to be by just adjusting the balance between mid and side during decoding.

B-Format

Amisonics takes this concept and pushes to the next dimension, making it 3D by using additional channels to represent height and depth. B-format is then built with the following channels:

W: Contains the sound pressure information, similar to the mid signal in M/S. This is recorded with a omnidirectinal microphone.
X: Contains the front minus back pressure gradient. Recorded by a figure of eight microphone.
Y: Contains the left minus right pressure gradient, similr to the side signal in M/S. Recorded by a figure of eight microphone.
Z: Contains the top minus bottom pressure gradient. Recorded by a figure of eight microphone.

Note: A-Format is how we would call the raw audio from an ambisonic recording, that is, the individual signals from each microphone, while B-Format is used when we have already combined all these signals into a unique set.

Ambisonic Orders

Using the B-format described above works but comes with some drawbacks. The optimal listener position would be quite small and results won’t be very natural outside it. Also, diagonal information is not very accurate, since it has to be inferred from the boundary between planes.

A solution to these issues is to increase resolution by adding more selective directonal components which instead of using traditional polar patterns would use other specific ones resulting in a signal set that contains denser aural information.

There is really no theoretical limit in how many additional microphones we can add to improve the resolution but of course there are clear prctical limits. For example, a third order ambisonics set, would use 16 tracks so is easy to see how hard drive space and microphone placement can quickly become a problem.

Decoding B-Format

Regardless of the number of ambisonics orders we use, the important thing to keep in mind is that the resulting recording will be not channel dependent. We can build the sonic information of any point in a 3D sphere by just knowing the angles to this point.

This allows us to create virtual microphones in this 3D space with which we can then match with any number of speakers. This is very powerful because once we have an ambisonics recording we can then play it on any speaker configuration preserving as much as the spatial information as the reproduction system allows.

If the final user is using headphones, a binaural signal would the result of the decoding while the same source files can be used to decode a 3D Dolby Atmos mix in cinemas.

Nowadays, you can find a big selection of ambisonic plugins for your DAW so you can play around with B-Format files including coding and decoding them in any other mutlichannel format you can imagine.

Ambisonics suite for Reaper https://www.ambisonictoolkit.net/documentation/reaper/

Use in media

Ambisonics was created in the 70s but has never been used much in mainstream media. This is now changing with the advent of VR experiences where 3D audio makes a lot of sense since the user can move around the scene focusing on different areas of the soundscape.

In the area of cinematic experiences ambisonics achieves a similar result as Dolby atmos or Auro-3D but using different methods. See my article about atmos to know more about this.

Google’s Resonane Audio allows you ti use ambisonics in Fmod

Regarding videogames or just generally interactive audio, ambisonics is a great fit. You can implement B-Format files in middleware like Fmod or Wwise and also game engines like Unity. This gives you the most flexibility since the ambisonics format will be decoded in real time into whatever the user is using to reproduce audio and this decoding will react in real time to their position and direction which is particularly awesome for VR.

In closing

There is much more to learn about this so I hope i get the chance to work with Ambisonics soon. I’m sure there are many details to keep in mind once you are hands on working with this formats and will try to document what I learn on this blog as I go.

Blog

Figuring out: Ambisonics