Common Myths Surrounding Microphone Arrays and Beamforming (Part 1)


Recently, it feels like we see more and more companies coming out with microphone array solutions for the conferencing room environment. However, how good are they? What kind of performance can we expect from them? Which buzz words should impress us, and which should we ignore? This article is an attempt to shed some light on the subject. If after reading this article, you get the feeling that this topic is complicated, then I achieved one of my goals, which is to demonstrate just that. The topic of beamforming is a very complex one. These types of solutions have been researched for many years and show a lot of promise, but they don’t always deliver when used in real life situations. When you add to the equation the acoustics of the room, the imperfections of the microphone elements used in the process, and psychoacoustics, you end up with a very complicated puzzle that takes many years of trial and error to resolve.

Before we move forward, I’d like to touch a little bit upon my personal involvement with the subject. Decades ago, when I was three years into my career as an Electronic Engineer, I was assigned to develop a DSP-based 96 element beamformer. This was an underwater hydrophone (the underwater equivalent of a microphone) array assigned to listen, detect, and track noise sources from large distances. My entire professional career beyond that assignment has revolved around designing and developing beamformers. Everything from time domain beamforming to frequency domain, adaptive, delay and sum, super directional, constant beam-width, broadside, and end-fire. You name it, we’ve done it.

Looking back at my design approach towards my lifelong companion, the beamformer, it’s pretty evident that I’ve gone full circle. I started with incorporating super aggressive high-performance methods. After which, I mellowed them down and even abandoned them for a while, only to incorporate them back in with a more aggressive approach. This time, I was armed with loads of experience.

The reasoning behind these transitions was that traditionally these algorithms were used in underwater applications where sound waves are much more “behaved” and predictable. Advanced and aggressive algorithms worked well in this environment. The implementation of these advanced techniques received a boost in the early eighties with the appearance of the DSP, a processor optimized and dedicated for the kind of mathematical operations repeatedly used with these algorithms.

In the late eighties, several companies (our team included) took the concept out of the water and into the air to develop beamformers for military applications. These were used to detect and find the direction of firearms, rockets, and mortars, as well as for surveillance purposes. A few of my colleagues and I took a personal leap of faith and started a company that adapted these efforts for civil applications, including hearing aids, dictation, conferencing, and many more. Several years into this era, we came to the realization that the performance we managed to obtain was excellent for some applications, such as surveillance and, to a smaller degree, automotive. Conferencing, however, proved disappointing and inadequate. At that point, we took a step back and started to focus on other audio enhancement techniques that were more likely to be accepted. We've now come full circle and are back to developing large array solutions. I believe our approach is more mature, more experienced, and the results are more encouraging.

So, what are beamformers? What are they good for? What are they not good for? What are the differences? Which of the buzz words you run into should impress you? Which are less meaningful?

Wikipedia defines Beamforming as “…a signal processing technique used in sensor arrays for directional signal transmission or reception. This is achieved by combining elements in a phased array in such a way that signals at particular angles experience constructive interference while others experience destructive interference.”

To illustrate: a microphone picks up sound waves without any distinction, both wanted ones (we'll call these signals) and unwanted ones (we'll refer to these as noises). A more sensitive microphone will pick up the same sound waves as a less sensitive one. While it will produce higher-level electrical signals, it will still pick up the same mixture of good and bad sound waves. Our goal is to eliminate as much noise and keep as much signal as possible. One way to do this is to limit our listening to a single direction, just like cupping a palm behind our ear to hear better.

Myth – A More Sensitive Microphone Will Improve the Pickup Range

Not true. Well, there is a little truth to it, but very little. A more sensitive microphone produces higher-level electrical signals, so it's less susceptible to poorly designed amplification. However, you can obtain the same pickup range with a less sensitive microphone. More important parameters are a microphone's noise floor and dynamic range, but even these are mostly critical in professional studio applications, where the surrounding noise is minimal and can fall below the microphone's self-noise.

Before getting to the "how", let's talk about the "what for". In other words, let's assume for a minute that we know how to create a device that listens in one direction and eliminates all the sound waves arriving from any other direction. Theoretically, utilizing this technology, we would pick up the wanted voice and eliminate all the sources coming from all other directions. This includes noises, but also reflections and reverberations arriving from different directions. Sounds great! However, there's a catch: some interference and noise originating in unwanted directions will be reflected and will arrive at the array from the very direction we're listening to.

If this sounds a little confusing, let's try a different way to illustrate it. In an open space, free of any reflections, sound waves propagate from the source to the target in a straight line. If we listen in that direction, that's what we'll hear. In a more confined environment, full of reflections, the waves propagate from the source to the target through many paths, losing their directional characteristics. In this case, the direction we're listening to loses its correlation to the direction of the source. By listening in a certain direction, we'll hear sources from many directions. The ratio between the level of the direct-path signal and the indirect-path noises penetrating our beam worsens as the range to the wanted source increases. In other words, the efficiency of the beamforming decreases with distance. It's still better than no beamforming, but it loses some of its advantage with range.
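
The trend described above can be sketched numerically. The snippet below is a simplified illustration, not a room measurement: it assumes the direct-path level falls 6 dB per doubling of distance (inverse-square law) while the diffuse reverberant level stays roughly constant. The two level constants are made-up values chosen only to show the shape of the curve.

```python
import math

# Illustrative assumptions, not measurements from any particular room:
DIRECT_AT_1M_DB = 70.0   # assumed direct-path level at 1 m from the talker
REVERB_LEVEL_DB = 55.0   # assumed diffuse reverberant level in the room

def direct_to_reverb_ratio_db(distance_m):
    """Approximate direct-to-reverberant ratio at a given talker distance.

    The direct path obeys the inverse-square law (-20*log10(distance));
    the reverberant field is modeled as constant throughout the room.
    """
    direct_db = DIRECT_AT_1M_DB - 20 * math.log10(distance_m)
    return direct_db - REVERB_LEVEL_DB

# The ratio drops ~6 dB per doubling of distance, eventually going negative:
# beyond that point, the beam admits more reverberation than direct signal.
for d in (1, 2, 4, 8):
    print(d, "m:", round(direct_to_reverb_ratio_db(d), 1), "dB")
```

The distance at which the ratio crosses 0 dB is the room's "critical distance"; past it, even a perfectly steered beam is dominated by energy that arrives via reflections.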

Myth – Beamforming Increases the Pickup Range

True, but the beamformer's efficiency drops as the range increases, because the ratio of direct signal to reverberation falls and the beam admits proportionally more reverberant energy.

So, how is the beamforming achieved? Sound propagates through air as waves. If we line up an array of microphones, the sound wave will hit the microphone closest to the sound source first, then the other microphones in order of their distance from the source. If the array is positioned perpendicular to the wave's direction of propagation, all the microphones are effectively at the same distance from the source, so the sound wave hits them at the same time. When we sum up the signals received by the microphones, the components that arrive at all the microphones simultaneously add up coherently and are emphasized. Other signals, originating from other directions, hit the different microphones at different times and diminish or even cancel each other out. To sum it up, the array creates a listening beam in the direction perpendicular to the array, which we call broadside.
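
The summation described above can be sketched in a few lines of code. This is a minimal illustration, assuming a hypothetical 8-element line array with 4 cm spacing and a single-frequency plane wave; the numbers are chosen for clarity, not taken from any real product.

```python
import math

NUM_MICS = 8            # hypothetical line array
SPACING = 0.04          # meters between adjacent microphones (assumed)
FREQ = 1000.0           # Hz, a mid-band speech frequency (assumed)
SPEED_OF_SOUND = 343.0  # m/s in air

def array_response(angle_deg):
    """Normalized summed amplitude for a unit plane wave from angle_deg.

    0 degrees = broadside (perpendicular to the array). Each microphone
    sees the wave delayed in proportion to its position projected onto
    the arrival direction; summing without any steering delays therefore
    favors broadside arrivals.
    """
    angle = math.radians(angle_deg)
    omega = 2 * math.pi * FREQ
    re = im = 0.0
    for m in range(NUM_MICS):
        delay = m * SPACING * math.sin(angle) / SPEED_OF_SOUND
        re += math.cos(omega * delay)
        im += math.sin(omega * delay)
    return math.hypot(re, im) / NUM_MICS  # 1.0 = fully coherent sum

# A broadside wave hits every microphone at once and sums coherently;
# a wave 60 degrees off-axis arrives at each microphone at a different
# time, so the contributions partially cancel.
print(array_response(0))   # broadside: coherent sum
print(array_response(60))  # off-axis: noticeably attenuated
```

Sweeping `angle_deg` from -90 to 90 traces out the array's beam pattern: a main lobe at broadside, with attenuation growing toward the sides.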

What are the parameters that determine the sharpness, or width, of this listening beam? The most important are the total aperture of the array (the distance between its two ends) and the frequency of the signal. The wider the array, the narrower the beam for a given frequency. Likewise, for a given aperture, the higher the frequency, the narrower the beam.

Today we’ve discussed the topics of microphone sensitivity and pickup range, and also touched on the basic principle behind beamforming. Look for part two of this article later this week, where we’ll dive into wider arrays and how to increase performance.


About Author

Joseph Marash is the founder and CEO of Phoenix Audio Technologies, a leading developer of audio conferencing equipment. He has an MSc degree in Electronic Engineering and 35 years of experience leading the development of DSP-based audio solutions. Joe is a serial entrepreneur, Phoenix being his third endeavor. He is an authority in digital signal processing for audio, the writer of numerous patents in the field, and has served as a consultant to a number of companies seeking advice on product development.
