Problem number 1:
You have a modest home studio in one room of your apartment. With all the good gear available at bargain prices these days, you have the potential to make and mix good-sounding recordings.
But your monitoring setup is only so-so: a pair of decent near field monitors driven by an okay amp in a room with the usual acoustic problems. You don’t have the money for a major speaker upgrade, and even if you did there’s still the room. Major acoustical renovations are out of the question (you’re renting, after all) and you can’t afford to move.
Last week you heard one of your mixes played back in the new mastering suite of a local studio. It was great to hear your stuff in such a good room, but also depressing—problems with the mix that you hadn’t even been aware of were plainly audible over their big monitors.
Problem number 2:
You’ve been hired to record the local choir in their church. Your regular monitors are too cumbersome to lug on location, so you borrow a pair of highly regarded near field monitors.
You set them up in the control room you’ve made out of a room adjacent to the nave. Unfortunately, the room’s acoustics are so abominable that the quality of the loudspeakers is practically irrelevant; you simply can’t hear what you are doing. The recording session begins in 30 minutes.
Since this article is about using headphones as monitors, you should be able to guess the solution to each problem: use headphones. However, a good many people might object to using headphones as critical monitors, and not without reason. They have probably encountered...
Problem number 3:
You’ve been hired to record a group of musicians on location. There is no possibility of setting up a separate control room; you will be in the same room with the musicians.
You monitor the session on headphones. Your mix sounds pretty good on the phones. You go home and listen to the recording on loudspeakers.
It doesn’t sound anything like you thought it would. What happened?
The idea of using headphones as critical monitors is controversial, to say the least. Yet there are respected engineers and producers who do just that, sometimes out of necessity, sometimes even out of preference.
It is a practice that is especially common among engineers who record classical music for a living. That could be explained by the nature of the job: most orchestras, chamber groups, etc., record on location.
But the art of setting up temporary control rooms at such sessions has been successfully practiced for a long time by all the major record companies. Clearly, some engineers see real advantages to monitoring with headphones, apart from their obvious convenience when on location.
I have long counted myself among those who use headphones for monitoring out of necessity. But after thinking about it a bit, I realized that I have often chosen to use them even when I could have used loudspeakers; I just feel I can hear what’s going on better, and I have learned to adjust to the ways they differ from speakers—for the most part.
It is one thing to know this intuitively and quite another to explain it rationally. I needed to clarify my thinking, so I called up Jerry Bruck, an engineer who makes a good part of his living recording classical music. As I expected, he has used headphones as monitors a lot, and he had much to say on the subject.
So with Jerry on hand as our resident expert, let’s dive in.
I’ve got those interaural crosstalk blues
First the bad news: headphones really sound different from loudspeakers. Let us count the ways.
“The principal problem with headphone listening is you have no mixture of left and right channels; they’re piped directly into the respective ears. Because of that it is very hard to judge the width of your stereo stage, the amount of center that you’re getting, as opposed to left and right. You tend to misjudge that on headphones.”
The fancy name for what Jerry is describing is interaural crosstalk. Loudspeakers have lots of it (if your right ear can’t hear the left loudspeaker, something is wrong), but headphones have none. Mr. Bruck continues:
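In signal terms, the difference Jerry describes can be sketched quite simply: with loudspeakers, each ear also hears an attenuated, slightly delayed copy of the opposite channel; with headphones it does not. The following is a minimal illustration in Python, not a calibrated acoustic model. The function name `crossfeed` and the attenuation and delay values are my own illustrative assumptions:

```python
import numpy as np

def crossfeed(left, right, sample_rate=44100, atten_db=-3.0, delay_ms=0.3):
    """Mix an attenuated, delayed copy of each channel into the
    opposite ear, roughly as loudspeakers do and headphones do not.
    The attenuation and delay values are illustrative guesses."""
    gain = 10 ** (atten_db / 20)                 # dB to linear gain
    delay = int(sample_rate * delay_ms / 1000)   # delay in samples
    pad = np.zeros(delay)
    # Delay each channel, then feed it (attenuated) to the other ear.
    left_delayed = np.concatenate([pad, left])[: len(left)]
    right_delayed = np.concatenate([pad, right])[: len(right)]
    out_left = left + gain * right_delayed
    out_right = right + gain * left_delayed
    return out_left, out_right
```

With this kind of processing, a signal panned hard right still reaches the left ear, as it would in a room; "crossfeed" circuits in some headphone amplifiers work on the same principle.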
“A sometimes very serious misjudgement on headphones is to think that it’s wetter than it really is. You find on loudspeakers that you are much closer in perspective than you thought you would be listening on phones.”
The reasons for this are a bit complex, but they probably have a lot to do with aural masking—that is, with the tendency of one sound to mask another. In this case we are dealing with direct sound (a lead vocal, for instance) and reflected sound (as in the reverberation stimulated by that vocal). To quote another authority:
“Angular separation between direct and reflected sounds has a minor effect on the audibility of the reflections, except when the two are coincident, when the reflection tends to be masked by the direct signal.” —Ron Streicher and F. Alton Everest, The New Stereo SoundBook, 2nd edition.
In other words, if the reverb is coming from the same direction as the vocal or very close to it, it will tend to be masked by the direct sound. Here is how I think this applies to the headphones vs. speakers question:
With typical stereo loudspeaker monitoring the angle between speakers is about 60 degrees, or 30 degrees to each side of our listening position. This is a small portion of the 360 degrees over which we hear. It is safe to say most of the sound approaches from in front of us.
With headphones the earcups encompass an angle of 180 degrees relative to our ears, equivalent to speakers placed 90 degrees to each side. This is a big change, and it allows for greater angular separation between at least some of the direct and reflected sounds in a recording. A portion of the reverberation will literally be unmasked, and so the recording will sound subjectively wetter.
Experimental data on angular separation and masking imply that this difference is equivalent to increasing the reflected sound level by 5–10 dB. This would certainly be an audible difference, and it bears out Jerry Bruck’s observation (and my own experience). But...
The binaural difference
There is one circumstance where “it sounds wetter on headphones” does not apply. This is when you are making a binaural or quasi-binaural recording.
In its purest form, binaural recording uses small-diaphragm microphones placed in the ears of an anatomically correct dummy or, less conveniently, in the ears of a real person. Then there are quasi-binaural microphone arrays like the Crown SASS that mimic the ear spacing and acoustic shadowing effects of the human head.
In either case, recordings made with these systems and reproduced with headphones preserve the spatial cues of the recording environment with considerable accuracy. This allows our brain to sort out the direct and reflected sound as it does in real life, with a gain in realism and clarity that can be really startling.
When you play the same binaural recording over loudspeakers the spatial cues are altered. What had been all around you is now placed in front of you, as if viewed through an open window, and—masking effects or not—it tends to sound more distant.
This is, of course, a special case, as most recordings use conventional microphone arrays that do not preserve spatial cues accurately, regardless of the playback method. So lacking a realistic acoustic model to work with, our brain resorts to other methods of analysis, and things like masking effects assume greater importance.
The lowdown on bass
So far we’ve learned that headphones create a weird stereo image and do strange things to reverberation. What else have we got to look forward to from our potential critical monitors? Let’s have Jerry Bruck talk about low frequencies:
“Low frequencies are only partially a question of what you are hearing with your ears. A lot of it is, in fact, called ‘skin effect,’ borrowing a term from another discipline. No matter what the frequency response of the headphones turns out to be, no headphone ever gives you the sense of low frequencies that you get from speakers.”
This is why Hollywood doesn’t use headphones when it wants to make you feel an earthquake or an explosion.
If headphones differ from loudspeakers in such significant ways, can we still use them as critical monitors? The answer, I think, is a qualified yes. Let’s look at the advantages that might make us say yes, along with the qualifications.
The case for headphones as monitors
There are many situations where headphones have the potential to offer a higher quality of sound than loudspeakers:
• They are not dependent on room acoustics, which can vary tremendously. As a sound reference that is consistent from venue to venue, headphones are a uniquely practical solution.
• Most practical high-quality loudspeakers use more than one driver for each channel and need a crossover network of some sort. The potential for sonic problems with this arrangement has always been a challenge for speaker designers. Headphones can bypass such problems entirely by using a single driver for each channel.
• There is no agreement as to the ideal polar radiation pattern for a loudspeaker. A variety of approaches are found in both consumer and professional settings, and they all interact with the listening room in different ways. With headphones this is not a consideration, let alone a problem.
• The very characteristic that makes the most difference in the way headphones sound—the lack of interaural crosstalk—makes them revealing of details in a recording to a degree that no conventional loudspeaker setup can match. This is one reason why many classical music engineers use them: if you need to catch things like that smudged entrance in the second violins or a fluffed note by the bassoonist, headphones will tell you about it much quicker than loudspeakers will.
• This same precision in rendering detail makes headphones superior for editing stereo program material. They reveal what is really going on at the splice point much more readily than loudspeakers—most of the time. (I’ll let Jerry Bruck tell you about the exceptions in just a bit.)
There are many practical advantages to using headphones as well:
• They are lightweight and portable.
• Dynamic-element phones have no need for large, powerful, expensive amplifiers. Suitable headphone amps are already built into a lot of recording equipment, and separate headphone amps are usually small and cheap compared to speaker amplifiers.
• Closed-back headphones provide some isolation from your surroundings, making it possible to monitor where it would otherwise be difficult or impossible.
• Some headphones are capable of deep bass response normally found only in very large full-range loudspeakers or in subwoofers. If you record pipe organs for a living, you may find this useful.
• Did I mention that they are lightweight and portable?
Even the more expensive top quality headphones offer more “bang for the buck” than loudspeakers. For example, the Sennheiser HD 580 has a street price of about $250. This gets you dynamic element headphones that are considered to be near-equivalent to the most esoteric electrostatic models. Loudspeakers with equivalent sonic performance could easily cost five times as much, or more.
A fly in the ointment: the editing bug
I said that my ‘yes’ to using headphones as critical monitors was qualified. Jerry Bruck explains one small but important problem that can happen when editing stereo program material:
“I love using headphones for editing because of the precision. If there’s some vestige of something you don’t want in a splice that’s, say, coming in or going out, the headphones will nail it in a way that loudspeakers never do.
“But—and this is a big but—a real danger exists, because I have many times had the experience, as others have had, of making a splice in music, and on headphones it is totally inaudible. You sit there congratulating yourself on what a wonderful splice you have just made, and then take the headphones off and turn up the speakers, and suddenly it’s glaring. It’s there, and you can hear it, and you go, ‘how can that be? Why can I hear it on speakers and not on headphones?’
“I don’t really have an explanation, but I will venture one: that it has to do with the phase relationships between the channels. Again, it’s the interaural mixture that occurs with speakers and doesn’t occur with headphones. Once that happens the two channels have phase relationships that are a give-away, that something has happened here that could never happen in real life.
“So I would more than caution anyone, I would actually warn anyone who attempts editing in headphones: check your work on speakers. It doesn’t happen every time; it happens like one time out of 20, but that one time will really amaze you when it occurs, and you’ll have to go back and re-do it. Otherwise everyone is going to hear it.”
Interaural crosstalk blues II: why we need both headphone and loudspeaker monitors
A good case can be made for always using both headphones and loudspeakers to monitor our work. The two different monitoring methods each tell us different, useful things—stuff that we really need to know. (There is also the issue of the increasing use of headphones by our listeners with all those millions of portable music players out there.)
Once we decide to work this way we can reap some real advantages from it. For example, if I have really good headphones I have less need for big, expensive loudspeakers. The things I need speakers for—checking the stereo image, the direct to reflected sound balance, etc.—can be reproduced just fine by smaller, less expensive loudspeakers (provided their basic sound quality is adequate). Other things—fine balances between instruments in the mix, little details of performance, tonal colorations, etc.—are better checked on top quality headphones.
Jerry Bruck has taken this idea to its logical conclusion for some of his location monitoring. On some of his smaller scale jobs, where he doesn’t want to lug around a whole van-full of equipment, he brings along the Cambridge SoundWorks Model Twelve speaker system.
You’ve probably seen the ads for it: “stereo in a suitcase.” Two small satellite speakers and a small 3-channel amp fit into a medium sized carrying case that also has a built-in woofer. Unpack the amp and the satellites and the case becomes the subwoofer of a powered 3-piece system. According to Jerry, the sound quality of the Model Twelve is quite good—certainly accurate enough to tell him what he needs to know. Then for the fine details, on go the headphones.
[Editor's note: at the time of this writing, the Cambridge Soundworks system was, if not unique, unusual and worthy of mention in the industry. Nowadays there are a lot of fairly portable 2.1 speaker systems with tiny satellites and a reasonably portable subwoofer that might serve as alternatives to headphones.]
I’ll leave you with this thought before we consider how to evaluate different headphones:
“It is an axiom of any recording technique that the final result is only as good as the monitoring system used when making the recording. The more accurately the recording engineer can hear throughout the process, the better the final result will be.” —Streicher and Everest
Interaural blues III: evaluating headphone sound quality
How do you choose, from all the models on the market, the headphones that will work for you?
To start with, we could try measuring headphone performance in a laboratory setting, much as any manufacturer does. This is both easy and difficult. The easy part is placing the headphones on the ears of a measuring dummy (such as the Head And Torso Simulator by Bruel & Kjaer) or onto a specially designed coupler or artificial ear (the B & K type 4153, for example). You then run test signals through the headphones, they are picked up by the microphone(s) in the dummy or coupler, and you have your test data.
The hard part is deciding what the test data mean.
One problem: the dummy ears or coupler are meant to simulate average human ears. Who is average? No one. Does it make a difference? Yes. It’s like trying to determine how a loudspeaker will perform in one room by measuring it in a different one.
Now, speaker designers do that all the time; they measure loudspeakers in anechoic chambers, where no one in their right mind listens to music. It is a useful exercise but it does not tell the whole story.
Measuring a headphone with a coupler has similar limitations. Just as the anechoic chamber will not tell you much about room effects, the coupler will not tell you much about the variable effects of real human ears interacting with headphones.
Another problem: there is not complete agreement as to what equalization curve constitutes “flat” response when a headphone is measured on a coupler or dummy. One choice is free-field (sound arriving with no reflections), another is diffuse-field (sound arriving with many random reflections).
A strong argument for diffuse-field equalization is that it better matches real-world listening conditions. Several “diffuse-field equalized” headphones have been introduced over the past decade, with models available from AKG, beyerdynamic, and Sennheiser.
They do not all sound alike. Apparently there is no consistent standard for implementing diffuse-field equalization in headphone designs. While my own attempts to locate such a standard have produced no results, that does not mean it does not exist. If anyone out there does know about, say, an ISO standard for diffuse-field equalization of headphones, feel free to email me about it: firstname.lastname@example.org.
I do not have a testing laboratory, but I still need to evaluate headphones. So I use my ears. I listen to test signals and to music.
First, the test signals. I listen to two types of signals: warble tones and pink noise. The warble tones are sine waves that continually vary in frequency over a range of about 1/3 octave. This prevents resonances from building up at any one frequency in the test environment (usually listening rooms, but it works for ears covered by headphones too). I use the warble tones to get some idea of the bass extension of the phones under test. At some point it becomes necessary to dramatically boost the signal to hear anything at all, and this is usually a good indication of the useful limit of bass response.
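If you don't have a test CD handy, a warble tone of this sort is easy to synthesize. Here is a minimal sketch in Python: a sine wave whose frequency sweeps roughly ±1/6 octave around a center frequency (about 1/3 octave overall), with the phase integrated so there are no clicks. The function name and parameter values are my own illustrative choices, not a published test standard:

```python
import numpy as np

def warble_tone(center_hz, duration_s=2.0, warble_rate_hz=4.0,
                sample_rate=44100):
    """Sine wave whose frequency sweeps about 1/3 octave overall
    (roughly +/- 1/6 octave around center_hz) at warble_rate_hz.
    Parameter values are illustrative, not a published standard."""
    t = np.arange(int(duration_s * sample_rate)) / sample_rate
    # Frequency deviation in octaves: sinusoidal, +/- 1/6 octave.
    octave_dev = np.sin(2 * np.pi * warble_rate_hz * t) / 6.0
    inst_freq = center_hz * 2.0 ** octave_dev
    # Integrate instantaneous frequency to get phase (avoids clicks
    # that a stepped frequency change would produce).
    phase = 2 * np.pi * np.cumsum(inst_freq) / sample_rate
    return np.sin(phase)
```

Played back at a series of descending center frequencies, a signal like this serves the same purpose as the warble-tone tracks on a test disc: the frequency never sits still long enough to excite a single resonance.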
Pink noise is good for assessing overall tonal balance and showing up colorations—midrange humps, upper bass dips, or whatever. These show up as tonal changes in the pink noise.
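Pink noise, too, can be generated rather than taken from a disc. What makes it useful for judging tonal balance is its -3 dB/octave power slope, i.e. equal energy per octave, which matches the roughly logarithmic way we hear. A common way to approximate it, sketched below in Python under my own naming and normalization choices, is to shape white noise in the frequency domain by scaling each bin by 1/sqrt(f):

```python
import numpy as np

def pink_noise(n_samples, seed=0):
    """Approximate pink noise by shaping white noise in the
    frequency domain: scaling each bin's amplitude by 1/sqrt(f)
    yields the -3 dB/octave power slope (equal energy per octave)
    that makes pink noise useful for judging tonal balance."""
    rng = np.random.default_rng(seed)
    white = rng.standard_normal(n_samples)
    spectrum = np.fft.rfft(white)
    freqs = np.fft.rfftfreq(n_samples)
    freqs[0] = freqs[1]                  # avoid dividing by zero at DC
    spectrum /= np.sqrt(freqs)           # -3 dB per octave power slope
    pink = np.fft.irfft(spectrum, n=n_samples)
    return pink / np.max(np.abs(pink))   # normalize to +/- 1
```

Any midrange hump or upper-bass dip in a headphone's response shows up as a tonal coloration of this otherwise even-sounding hiss, which is exactly how the listening test above uses it.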
The real test, though, is music. It is important to pick music recordings that have the right characteristics. Most commercial recordings, especially those of pop music, are disqualified from this test because we do not know what was done to them during recording and post-production.
We can listen to two different monitors with a given recording and say, for example, that one sounds brighter than the other. But which one is the more accurate monitor? What does an AKG C-12 tube microphone, nine inches on axis from a particular singer, put through a compressor, a parametric eq, and a Studer analog multitrack tape machine, really sound like? You tell me.
There are a couple of ways around this situation. One is to use recordings that you make yourself with simple techniques, no processing, and microphones considered to be accurate. I do this myself using the Crown SASS stereo microphone. I do not think the SASS is a perfect microphone, but its deviations from accuracy occur mostly at the frequency extremes. It also helps that it is a quasi-binaural array. So if I listen to a recording made with the SASS through particular headphones and it sounds more like I’m actually there, I figure I’m on the right track.
A second solution is to seek out commercial recordings made with simple techniques, relatively accurate microphones, and no processing. These do exist. Check the sidebar for some suggestions. One hint: almost any of Jack Renner’s recordings for Telarc would qualify.
Listen to such a recording on, say, two different headphones. If with one pair you hear midrange colorations or boomy bass while the other pair sounds open and well balanced, it is likely that the better sounding headphones really are better.
And I’m not going to leave you just with those guidelines, useful as I hope they are; we’ll review some good headphone candidates in Part 2, also available online at recordingmag.com.
Robert Auld is an audio engineer and theatre sound designer who works in New York City. He would like to thank Jerry Bruck for his contribution to this article. You can write to Robert at email@example.com.
Evaluating Headphones: Some Commercially Available Recordings
There are various CDs available with pink noise, warble tones, etc. The one I use is a French import: Compact Test Demonstrations, Pierre Verany, PV-784031 (issued 1984).
Music recordings that more or less fit my criteria for accurate pickup:
Charles Tomlinson Griffes: Goddess of the Moon. Perspectives Ensemble, Newport Classics NPD85634. Exotic chamber music, recorded in live performance by engineer Joe Stanko with the Crown SASS stereo microphone.
The King James Version: Harry James and his Big Band. Sheffield Lab 10068-2-F. Recorded direct-to-disk in 1976. Engineer Ron Hitchcock used a single stereo microphone (a modified AKG C-24), augmented by spot mics on the piano and bass. The CD reissue is still one of the most realistic recordings of a jazz big band ever made.
Paquito D’Rivera: Portraits of Cuba. Chesky JD145. In 1996, 20 years later, Bob Katz used virtually the same methods (four microphones, with the band on location in a church) to record this latin-tinged big band led by arranger Carlos Franzetti. This record won a Grammy for best jazz album.
Edvard Grieg: Lyric Pieces; Emil Gilels, piano. Deutsche Grammophon 429 749-2. Recorded in a church in Berlin, 1974. Emil Gilels was one of the great concert pianists of our time, with a kind of tonal control of the instrument that most pianists can only dream about. The recording is a good, straightforward job, probably done with two microphones.
P.D.Q. Bach: 1712 Overture & Other Musical Assaults. Telarc CD-80210. As is their custom, Telarc lists most of the equipment used to make this recording. The microphones were small diaphragm condenser models from Schoeps and Sennheiser and “the signal was not passed through any processing device (i.e., compression, limiting or equalization) at any step during production.” Engineer Michael Bishop did a superb job of recording the large orchestra used for most of the selections, and was also responsible for the imaginative sonic collages that accompany Prof. Peter Schickele’s introductions.
Michael Murray: The Great Organ at the Cathedral of St. John The Divine, New York City. Telarc CD-80169. One of the world’s great pipe organs, recorded in 1987 in the world’s largest gothic-style cathedral. Producer and engineer Robert Woods used Bruel & Kjaer 4006 omni microphones for his pickup. As you might expect, the bass extension of this recording is impressive.
Walton: Belshazzar’s Feast. Atlanta Symphony Orchestra & Chorus, Robert Shaw, Telarc CD-80181. One of engineer Jack Renner’s best recordings. The Walton is a big, flashy piece using a large chorus, baritone soloist, organ, and two antiphonal brass choirs, in addition to full symphony orchestra. I find the opening a cappella male chorus section useful for revealing midrange colorations in monitors. And about 16 minutes in there is a very complex section (“Praise Ye”) that separates the men from the boys. It takes really superior gear to do it justice.
Mahler: Symphony No. 8, “Symphony of a Thousand.” Atlanta Symphony Orchestra & Chorus, Robert Shaw, Telarc CD-80267. Jack Renner has written that the sessions for this recording were the three most difficult days of his life. The logistics had a lot to do with it: when you give nearly a thousand performers a break, it can take them close to a half hour just to get off the stage! Renner used a dozen or so microphones (Telarc’s usual arsenal of Schoeps, Sennheiser and B & K mics) to cover this huge ensemble, which he mixed down on the spot to a custom-built two channel A/D converter. (He must be a very brave man.) I have minor quibbles with some of the balances here and there, but there are other parts of the work that, heard on first rate equipment, have enormous impact.