In the December 1996 ‘Oops’ I wrote about audio level metering and how “0” on the meter relates to actual voltage levels within your system. In May 1999 the classic VU meter celebrated its 60th birthday. It served the industry well, and when properly interpreted it can still be a useful tool to indicate loudness.
But today’s digital recording processes have brought us around to taking a hard look at the inadequacies of both the traditional VU meter and the peak meters for level management. Since nearly all audio production today involves at least one digital stage, we’ll focus this month on some aspects of audio level management in the digital domain, some headroom concepts, and how loudness and audio level are related (and how they’re not).
What’s a VU?
The Volume Unit meter was originally designed to help broadcast engineers keep the overall program level consistent between speech and music. There’s a well-defined standard for the mechanical rise-and-fall response characteristics of the pointer, one with which few of today’s VU-looking meters comply.
A standard VU meter actually responds a little too fast to represent loudness accurately, but it’s fast enough to show movement between syllables, making the pointer movement look good when responding to speech. It was easy to tell with a glance at the meter when something wasn’t working, and engineers learned that brief excursions up to the +3 dB top end of the scale rarely caused distortion and didn’t result in any significant change in perceived loudness.
How is audio level measured?
Once sound is converted to electricity, we can represent the audio level by measuring electrical voltage. The audio signal alternates rapidly between positive and negative values of voltage. A symmetrical signal such as a sine wave—a single-pitched note with no overtones or distortion—spends as much time on the positive side of 0 volts as the negative.
The numerical average of the positive and negative voltages over many cycles is zero—not a very useful measurement when we want to know how loud a sound that voltage represents.
A measurement that corresponds closely to how we perceive loudness is called the RMS (root mean square) average, which expresses mathematically the amount of energy in the waveform. The RMS value is calculated by squaring the voltage at each point on the waveform (remember, the square of a negative number is a positive number), summing the squares, dividing by the number of points measured (the more points, the more accurate the calculation), then taking the square root of that quotient. For a sine wave the RMS value is 0.707 times the peak value. (Figure 1A)
You’ve probably heard this numeric relationship (it’s the square root of 2) between peak and RMS before. Understand, though, that it holds true only for a pure sine wave.
Look at the string of narrow pulses in Figure 1B. The voltage spends much more time at zero than at some measurable value. Even though the peak value is the same as the sine wave above it, the RMS value of this waveform is lower. It will read lower on a VU meter, and would sound quieter than the sine wave.
Since a square wave (Figure 1C) spends all its time either at peak positive or peak negative level, its RMS and peak amplitudes are equal.
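The RMS recipe described above is easy to try for yourself. Here’s a short Python sketch, with illustrative waveforms of my own choosing standing in for Figures 1A, 1B, and 1C, that computes the RMS value of a sine wave, a string of narrow pulses, and a square wave, all with the same peak value:

```python
import math

def rms(samples):
    """Root mean square: square each point, average the squares, take the square root."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

N = 10000  # points per cycle; the more points, the more accurate the calculation
sine   = [math.sin(2 * math.pi * i / N) for i in range(N)]
square = [1.0 if i < N // 2 else -1.0 for i in range(N)]
# Narrow pulses: at the peak only 5% of the time, at zero the rest of the time
pulses = [1.0 if (i % (N // 10)) < (N // 200) else 0.0 for i in range(N)]

print(round(rms(sine), 3))    # 0.707, i.e. 0.707 times the peak value
print(round(rms(square), 3))  # 1.0: RMS equals peak for a square wave
print(round(rms(pulses), 3))  # well below the sine's 0.707, despite the same peak
```

The three printouts mirror the article’s point: same peak, very different energy, and it’s the energy that tracks loudness.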
The classic VU meter
Today we primarily use a VU (or pseudo-VU) meter to indicate an impending overload, but that was never part of its original function.
Headroom—that comfort zone between normal level and unpleasant distortion—was always designed into a professional system, so a meter wasn’t necessary to tell us when it was running out. Typically, an analog recorder is calibrated so that 0 VU corresponds to a signal that produces the reference fluxivity, and 10 dB above that level produces a known but acceptable level of total harmonic distortion, around 1% for professional machines.
For certain types of music it’s better to calibrate the recorder for more or less headroom, but that’s the engineer’s choice. We don’t, however, watch a recorder’s VU meter to tell how loud our recording is, but rather to assure that we don’t exceed the available headroom.
Is the VU meter actually a good headroom indicator? Only if you’re working with program material that’s already consistent in level and doesn’t require a generous allowance for surprise peaks.
Just look at the scale (Figure 2). While the meter scale has a range of 23 dB, fully half of it (the top half) represents only 6 dB. That’s good resolution when you’re reading steady tones, but pretty wasteful when working with a recording medium capable of handling a dynamic range of better than 90 dB.
There’s virtually no usable resolution below -10 VU, so if we take it on faith that we have at least 10 dB of headroom above 0 VU even though we can’t see it on the meter, we have an indication of only about a 13 dB range.
Why such a compressed scale? Practicality. Loudness perception is roughly logarithmic. While “twice as loud” is more subjective than “twice as many volts,” it takes considerably more than twice the signal voltage for something to sound twice as loud (a commonly quoted figure is about 10 dB, more than three times the voltage).
In order to make a loudness indicator with a linear scale, a logarithmic amplifier ahead of the meter would be required. That’s a piece of cake with digital signal processing, but in the 1930s it required more electronics than could be stuffed inside what was supposed to be a simple panel meter. Besides, watching audio on a linear scale dB meter just doesn’t look right. (Remember, the VU meter response was designed to look good.)
It’s no surprise that a modern, highly compressed song will shoot the meter pointers up very close to 0 VU, and they’ll stay right there until the fadeout. On uncompressed material there’s plenty of good, perfectly audible stuff down below the -20 mark, but the inexperienced engineer trusts the meter rather than his ears and tends to regard anything that barely moves the meter as being too soft. He either boosts the volume or compresses to get the meter back up to the top half (6 dB range) of the scale.
Today there are a great number of meters, both mechanical and of the LED or LCD ladder style, that look like VU meters but don’t meet the VU standard. They’re useful for establishing steady-state calibration levels when setting up a system, but they don’t represent either loudness or headroom accurately. Ladder meters are often found on digital equipment, but it’s rarely specified whether they’re indicating a measurement of analog voltage or the number of bits used.
In either case, the garden variety meter doesn’t provide the dynamic response of a real VU meter. It can show you what’s happening, but not much about apparent loudness.
A digital meter is calibrated with 0 dB all the way at the top. This represents full digital level—all the bits on, regardless of how many bits the system uses. We refer to this maximum level as 0 dBFS (dB relative to full scale). It’s up to you, the engineer, to decide where to set the analog-to-digital converter’s input gain so that the signal level to the converter never exceeds what would be digitized to 0 dBFS.
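As a quick sanity check on the dBFS idea, here’s a small Python sketch (the 16-bit word length and the function name are my own illustrative choices) that converts a sample value to dB relative to full scale:

```python
import math

def dbfs(sample, bits=16):
    """dB relative to full scale for a signed integer sample of the given word length."""
    full_scale = 2 ** (bits - 1)  # 32768 steps for 16-bit audio
    return 20 * math.log10(abs(sample) / full_scale)

print(dbfs(32767))  # essentially 0 dBFS: all the bits on
print(dbfs(16384))  # about -6 dBFS: half the voltage costs 6 dB
print(dbfs(3277))   # about -20 dBFS: a common nominal operating level
```

Note that every 6 dB of level corresponds to one bit of the converter’s range, a relationship that comes up again below.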
Loudness doesn’t care if the source is analog or digital, but we view headroom and operating range differently in the two worlds. In the analog world, there’s a range over which distortion increases gradually, and we include at least a portion of that range in our useful working range. In the digital world, distortion is nearly independent of level until all the available bits are used up, and at that point it’s (literally) all over.
Early DAT recorders metered on the analog side of the converter, turning on the “Over” indicator when the input level exceeded the converter’s full-scale input level. That warns you of a problem while you’re recording, but since the recording itself can never exceed the maximum level, “overs” metered on the analog side will never show up on playback.
Modern digital recorders and all software DAW meters monitor the actual digitized value, which as we know can never exceed the number of bits available in the converter. Any input voltage that exceeds the 0 dBFS level will be digitized as 0 dBFS, leaving a nice flat topped waveform looking quite unlike the original (Figure 3).
The digital level never exceeds 0 dBFS, but clearly something is wrong, and we’d like to be warned about it. So the conventional way of detecting a potential digital overload is to count the number of consecutive samples at 0 dBFS and turn on the “Over” indicator when that count exceeds a preset threshold.
The Sony PCM-1630, the original “official” digital standard recorder from which all CD glass masters used to be cut, indicates an Over any time it sees three consecutive full scale samples. This is reasonable: we can assume that the first of a consecutive string of 0 dBFS samples occurred while the waveform was on the way up and the last occurred on the way down, so the waveform must have tried to go over the 0 dBFS level some time in between.
Three consecutive 0 dBFS samples for an Over is pretty stringent. It could mean that only one sample tried to go over the limit (the ones on either side being exactly full scale).
At 44.1 kHz, one sample is too brief to be detected audibly by most listeners on most forms of program material. Outboard digital level meters, whether hardware or software, generally offer a choice of the number of contiguous full scale samples required to turn on the Over indicator, so you can be more or less conservative, depending on the music you’re recording.
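The sample-counting scheme is simple enough to sketch in a few lines of Python. The function name and the hard-coded 16-bit full-scale value are my own illustrative choices, not anything from a particular meter:

```python
def count_overs(samples, full_scale=32767, threshold=3):
    """Flag an 'over' wherever `threshold` or more consecutive samples sit at full scale."""
    overs = 0
    run = 0
    for s in samples:
        if abs(s) >= full_scale:
            run += 1
            if run == threshold:  # count each qualifying run once, when it first reaches the threshold
                overs += 1
        else:
            run = 0
    return overs

clipped = [1000, 32767, 32767, 32767, 500, 32767, 32767, -200]
print(count_overs(clipped))               # 1: only the first run reaches three samples
print(count_overs(clipped, threshold=2))  # 2: with a stricter setting the two-sample run counts too
```

Lowering the threshold makes the meter more conservative, exactly the adjustment the outboard meters described above offer.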
Digital Over indicators are all about statistics and perception, though. While it might not be musically interesting, you can record a full scale square wave with complete accuracy, and since its flat tops are nothing but consecutive full scale samples, the Over indicator will never go out.
The influx of 96 kHz A/D converters raises an interesting question: should we double the number of FS samples required to indicate an “over,” since that would represent a flat-topped waveform of the same duration as three samples at 48 kHz? The jury’s still out on this one. (Actually, it’s more like nobody ever asked.)
While all DAT, CD, and MDM recorders have Over indicators, curiously some outboard A/D converters do not. The AES/EBU and S/PDIF standards have no way of sending the indication of an “over” across the interface, so if you want to watch an indicator you have to depend on whatever is present on the recorder. Usually a 0 dBFS sample-counting Over indicator on the recorder will function on incoming data from an external digital source, but it’s a good idea to check yours out before relying on it to keep you out of trouble.
Something to be aware of with the on-screen metering provided by DAW software is that refreshing the meter on the screen is usually a low priority task for the program. Some DAW programs’ meters only work when not actually recording. They’re useful for setting levels initially, but they freeze or disappear once the “tape” starts rolling. Those that function while recording often lag far enough behind the A/D conversion to let you get into trouble if you’re working close to the edge.
When depending on an on-screen meter or built-in hardware meter for level monitoring, consider keeping your peaks at -1 dBFS. It won’t hurt the signal-to-noise ratio enough to worry about, and if your recording will be tweaked by a mastering engineer, it’ll leave a little breathing room for digital level adjustment or equalization.
Many video production houses have adopted a more stringent convention, primarily to assure that they don’t run out of working space when mixing various elements of a project. They want the nominal level to be -20 dBFS, but rather than allowing peaks up to 0 dBFS, they don’t want to see anything higher than -10 dBFS. This tosses away about a bit and a half of resolution and raises the noise floor slightly, but that isn’t terribly important considering that most playback is from a standard television set speaker, and it absolutely assures no “overs.”
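The “bit and a half” figure comes straight from the arithmetic: each bit of converter resolution is worth about 6 dB of level. A quick check in Python:

```python
import math

db_per_bit = 20 * math.log10(2)  # each bit doubles the voltage range: about 6.02 dB
ceiling_drop = 10                # peaks held to -10 dBFS instead of 0 dBFS

print(round(db_per_bit, 2))                 # 6.02
print(round(ceiling_drop / db_per_bit, 2))  # 1.66 bits of resolution given up
```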
In the early days of CD manufacturing, if your digital master tape contained any Overs, the lab simply rejected it, telling you to fix your problems before cutting the glass master. They didn’t want to be blamed for your distortion.
Now that controlled distortion is an accepted production tool in popular music, judicious application of digital limiting and compression can keep the Over light from coming on while effectively holding the level within a fraction of a dB below full scale throughout the entire song.
What’s the difference between that and digital clipping? Control. If you don’t like how the limiter sounds, you can change it. But nobody uses digital clipping as an effect or a leveler.
But how LOUD is it?
Loudness is a function of both the recorded level and how far the listener (and you as the engineer) has the volume turned up. In the past few years, commercial recordings (and, following in their footsteps, independently produced ones) have been in a race to make each one sound louder than the last when played at the same volume setting. A whole segment of our industry has sprung up as a result of our inability to train the user to get up and adjust the volume, but that’s just a fact of life today.
Wouldn’t it be nice if there was a standard for loudness, just as there are standards for electrical operating level for audio signals? In the film industry there is. While many theaters don’t have their playback system properly calibrated, the Society of Motion Picture and Television Engineers (SMPTE) has established a standard relating electrical audio level and loudness.
Simply stated, pink noise at the nominal operating level will play back at a level of 85 dBC SPL (sound pressure level with “C” weighting). They expect that the playback system will provide at least 20 dB of headroom above this, and it’s plenty loud. [Judging by how painfully loud movies are these days, I suspect that they’re measuring with the SPL meter at the back of the theater and the pink noise in the front speakers behind the curtain!—NB]
But what’s the nominal operating level? Most DAT recorders have a little mark on the scale somewhere between -20 and -12 dBFS, and that’s the recommended operating level.
Dolby Labs, the household word in theater sound, has established the 85 dB SPL point on a digital system to be -20 dBFS, and this works out well in practice. In a theater with a properly calibrated playback system, one that actually plays back -20 dBFS at 85 dB, most of the audience can hear quiet dialog and don’t get blown out of their seats when the car chase ends in an explosion.
With the advent of home theater systems, Dolby recognized that sound mixed for 85 dB plus headroom in a loud theater doesn’t translate well to small speakers and small rooms. 20 dB is a lot of dynamic range, and 105 dB SPL is more than most listeners can tolerate, so they recommend lowering the monitoring level to 79 dB.
Of course nobody wants to admit to turning anything down, so let’s play with the numbers a little and see if we can make up that level. If we keep the standard Dolby calibration of -20 dBFS = 85 dB SPL, we reach 79 dB SPL at a digital level of -26 dBFS, and mixing around -26 dBFS with 20 dB peaks would top out at only -6 dBFS. Just on general principles, we want to be sure to hit 0 dBFS some time.
So suppose we raise our nominal operating level from -20 to -14 dBFS. This jacks everything up by 6 dB, getting us back to a peak level of 0 dBFS. Now our nominal level still corresponds to 85 dB SPL, but instead of being able to go 20 dB more before running out of bits, we can only go up 14 more dB.
Okay, so we’ve reduced our available headroom by 6 dB. How do we deal with that? By reducing the ratio of peak to average level, so that the average can be higher without the peaks trying to exceed the maximum possible level. A 20 dB peak-to-average ratio is typical for uncompressed music of almost any genre, which is why we start out with a nominal operating level of -20 dBFS.
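The SPL bookkeeping in the last few paragraphs is just addition, and a two-line Python helper (the function name and defaults are mine, chosen to match the Dolby calibration above) makes it easy to check:

```python
def spl(dbfs_level, ref_dbfs=-20, ref_spl=85):
    """SPL of a digital level, given a calibration point (e.g. -20 dBFS plays at 85 dB SPL)."""
    return ref_spl + (dbfs_level - ref_dbfs)

print(spl(-20))              # 85: the calibration point itself
print(spl(0))                # 105: full scale under the theater calibration
print(spl(0, ref_dbfs=-14))  # 99: raising the nominal level to -14 dBFS trims 6 dB of headroom
```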
When judging apparent loudness, the ear responds to the average level rather than the peak level. In a typical pop song, the snare drum will be the loudest individual sound, and its peaks will ride a few dB above the rest of the mix. This is what sounds right to our ears, and it’s those peaks that determine the maximum recording level. Let’s say that at several points in the song when the drummer is feeling particularly enthusiastic the peak level hits 0 dBFS.
Now let’s do another mix, but this time we’ll sit on that snare with a limiter or compressor so that we don’t have any hits sticking way up. You’ll find that the meter no longer gets up to 0 dBFS, but the mix doesn’t sound any quieter either.
Now that we’ve taken care of those “outliers,” we can boost the master level by a few dB so that we’re back to peaking at 0 dBFS. Since boosting the master level raises the average level, the mix appears louder without exceeding the digital limit. Voila! Instant mastering!
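Here’s a toy Python illustration of the limit-then-boost trick. A hard cap stands in for a real limiter, which is a gross simplification, but it shows the average (RMS) level rising while the peaks stay at full scale (1.0 here representing 0 dBFS):

```python
import math

def rms(samples):
    """Root mean square average of a list of sample values."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

# Toy mix: steady material at a quarter of full scale, with occasional snare hits at full scale
mix = [0.25] * 100
for i in range(0, 100, 10):
    mix[i] = 1.0

limited = [min(s, 0.5) for s in mix]  # sit on the peaks (a hard cap standing in for a limiter)
gain = 1.0 / max(limited)             # boost the master so the tallest peak is back at full scale
remastered = [s * gain for s in limited]

print(round(rms(mix), 3))         # average level of the original mix
print(round(rms(remastered), 3))  # higher average level, peaks still exactly at full scale
```

The second RMS figure comes out well above the first, which is the whole point: the mix appears louder without a single sample exceeding the digital limit.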
A new meter and an experiment
In a presentation at the October 1999 Audio Engineering Society convention, mastering engineer Bob Katz of Digital Domain described the concept of monitoring at a calibrated level. He’s proposed a new type of meter scale that essentially displays headroom over calibrated monitor sound pressure level. His meter scale looks like a VU meter in that it’s calibrated above 0 VU as well as below, but unlike a VU meter it has a linear scale.
With Bob’s system (at this point dubbed the “K-System”), when mixing for maximum dynamic range the scale goes up to +20 VU with 0 VU representing a monitor level of 85 dB SPL. For typical pop music, full scale is +14 VU, while for heavily compressed “broadcast quality” music, full scale is +12 VU.
It may seem odd that the mix we want to appear the loudest, the broadcast mix, has the lowest full scale level. But it is in fact the loudest.
Since 0 VU still corresponds to 85 dB SPL, mixing around 0 VU on the scale with the smallest range between 85 dB and maximum keeps the average level higher than in the full dynamic range mix, which allows 20 dB of headroom above 0 VU. If this sounds a bit too much like smoke and mirrors, you can read Bob’s full proposal (he’s continually updating it) on his web page (www.digido.com) and fill in some of the mental blanks.
I thought I’d give this a try at home to see how this works in practice. Feeding my DAT pink noise from my trusty Neutrik Minirator (reviewed 6/99), I set the record level so that my DAT’s meters read -20 dBFS. Using my Radio Shack sound level meter set for C weighting, I adjusted the control room monitor level for a reading of 85 dB at my normal listening position.
Then I put in a tape of a pop music mix and just about jumped out of my chair from the volume (though I didn’t blow anything and the system still had headroom—that’s good). A tape of an orchestral recording was much more civilized, with crescendos being a bit loud but in general a healthy level.
I then re-calibrated the monitor level pot so that -20 dBFS pink noise gave me Bob’s recommended “compressed pop” 77 dB SPL, played the pop music tape again, and darn if it didn’t sound just about right: good and loud but not window-rattling—about as loud as the crescendos of the classical music, but right up there all the time, not just on loud sections.
An experienced engineer working with a calibrated monitor system can pretty much set levels by ear and doesn’t need to look at the meter other than to verify his or her own “calibration.” Some of us have been doing this for a long time subconsciously.
An engineer I worked with in the mid ‘80s called the process of getting accustomed to the normal volume level of the monitors “Tennessee Voicing.” Dave Moulton has written about “Kentucky Voicing,” which has to do with muskets that you aimed by compensating for their offsets.
There really aren’t any secrets to mastering. But thinking this through may help you to gain some insight into what goes into making a recording “sound loud”—while keeping your levels under control.
Mike Rivers (talkback@recordingmag .com) is on the road again.