In the vast majority of popular music, the vocals may well be the single most important element. They outline the catchy melody as well as the lyrics. It is the nature of the expression of the words in a song that provides the greatest emotional impact. That’s why, after 20 articles laying groundwork, this series will now explore how to make great vocal recordings. Previous installments of TCRM have covered vocal miking and the influence of acoustics (#s 12 and 9 respectively) as well as good levels and a clean signal path (#6). This month’s article will provide specific suggestions and an overview of the whole process.
Elements of great vocals
The single most important factor in achieving a great recording is… (drum roll please)…:
a great performance.
While it’s true that there are many studio “tricks” to make a singer sound better, these can still only take you so far… a mediocre performance generally becomes a “slicked-up” mediocre performance (and even this result requires time and good audio engineering skills). On the other hand, if you start with a wonderfully inspired performance, musical magic is bound to ensue; mixing is much less of a chore. To perform well in the studio, a singer needs four basic things: talent, preparation, confidence and comfort.
Talent itself, though God-given, is not enough; it must be honed. Practice and discipline can amplify natural talent. There’s always room for improvement.
Before entering the studio, it is important that a singer prepare both physically and mentally. A large part of this comes from getting enough rest (for mind, body and vocal cords), and from practicing all songs to be recorded until they flow naturally. All of the words should be memorized to aid in performance. Music stands and hanging lyric sheets tend to cause noise and interference problems in the studio. In addition, the act of reading the words and keeping track of lines, verses and place on the page saps mental energy and distracts the performer. Singers should be well rehearsed, but take at least one day off from singing before hitting the studio.
As part of their rehearsal preparations, singers should practice both with, and without, a microphone. In the studio, the microphone is an important part of a vocalist’s sound, and should be considered a part of their instrument. They should give careful consideration to how the mic, their movements and vocal techniques interact to create a particular sound. Vocalists should listen over closed-back headphones, not a PA or monitor speakers.
Of course, the engineer/producer also needs to do some prep work. They should become familiar with the songs to be recorded as well as the singer’s style and tone. Engineers should discuss the particular sound desired by the singer, consider mic choices, placements and vocal treatments (compression, eq and/or effects to be used). Such preproduction practices will also help bolster the vocalist’s confidence. Speaking of which…
Because vocal sounds come directly from the human body, the tension and stress caused by a lack of confidence translate directly into a singer’s sound. A lack of confidence can be seen, heard, and felt. Talent and preparation are good first steps toward a singer gaining confidence in themselves. In the studio, however, singers must also be sure of the gear and the engineer.
Engineers/producers who are knowledgeable, prepared, confident, genuinely interested and encouraging can quickly help singers gain confidence and composure in the studio. During preproduction, play a bit of your best work for them. Ask them to bring in CDs of vocalists they like and discuss the recordings.
On the day of the session, the basic gear (mic, cables, signal path, water bottles) should be set up and ready to go when the talent arrives at the studio. Take time to set up correctly and get a good cue mix and vocal sound (with compression, EQ, and subtle reverb). The talent must sound good in the headphones before recording actually begins. The first few passes should be used to find exact settings for the gear and to allow the talent to get comfortable in their surroundings (more on this in a moment). To perform at their best, singers must be inspired by what they hear in the cue. Similarly, never let a vocalist hear their mic or tracks soloed. They almost never like such a naked truth! (The real-life version of the classic public-speaking-without-pants dream….)
In the cue mix do not use pitch correction, chorus, phase, flange, tremolo, vibrato, or overly heavy reverb on the voice. These effects should be saved for final mixdown. During tracking they will obscure a singer’s natural sense of intonation. When this happens, vocalists are more likely to give an out-of-tune performance. Furthermore, the lack of precise pitch reference feedback can actually undermine a performer’s confidence, if what they hear in their head and achieve with their vocal cords does not match what they hear in the headphones.
Most singers will also be more comfortable in a smaller space than a larger one. Dim the lights and restrict visual distractions - especially people. Remove all but the absolutely essential personnel from the control room or adjoining tracking spaces. Performing and recording vocals in the studio is an intimate, personal, and highly revealing process. Even seasoned pros get nervous if there are too many people sitting around critiquing them during tracking. Declare it a “closed” session.
It is more likely that a person will perform well if they are comfortable. Because the human throat itself is the sound source, it should be kept moist and free from undue tension. The throat and vocal cords are subject to fatigue over time. Singers must take a break every once in a while. How often this is necessary depends on the individual, but is influenced by both psychological and physical stressors, session duration, musical dynamics, tone, range, and the studio’s air quality.
Environmental issues must also be considered. The ambient temperature is important, as anything too cold can cause the throat and lungs to tighten. Try to keep the temperature between 69 and 74 degrees Fahrenheit at 35-45% relative humidity. This is the best range for relaxed breathing and personal comfort.
A space to sit and relax during short (or not so short) breaks is a must. It should have magazines as well as a video game system, TV and DVD or Blu-ray player with a selection of titles. There should be a supply of both room-temperature and cooled spring water. Lukewarm tea with a touch of honey is also preferred by some singers to revitalize their throats. Note: alcohol is not generally good for the voice or intonation. Often, booze only makes people think they are performing well. (Still, there are always exceptions….)
Breaks should be taken regularly before any throat fatigue is either heard or felt. Once it becomes obvious, it is usually too late. Only longer amounts of time and rest will revive overworked vocal cords. At that point you must either break for a meal or a movie, or just call it a day and resume tomorrow. In my experience, continuing to sing with a tired throat can start the session spiraling down a dangerous and debilitating path. The singer pushes and tries harder to make up for the fatigue. This quickly causes further loss of tone. The ensuing frustration and tension can undermine confidence and adversely color the rest of the recording process….
Recording and Mixing Techniques
Of course, starting with a good performance does not guarantee a good recording. The recording process can make vocals sound noisy, distorted, boomy, sibilant (full of grating “s”, “ch”, “f” and “t” sounds), too dynamic, overly compressed, nasal, comb filtered, or just plain miscolored. So, now that we’ve explored ways to get the best performance, it’s time to get down to the nitty-gritty of recording and mixing it.
The best mic for vocals…
In professional studios, the large diaphragm condenser (LDC) is by far the most common type of mic used on vocals. LDCs offer a clean sound, with a transient response range that works well up close on the human voice. Small diaphragm condensers tend to accentuate transients and sibilance in a way that is distracting and overbearing to the listener. While dynamic mics are generally less expensive and commonly used for vocals in live sound reinforcement, many have a more limited and/or uneven frequency response and a slower transient response.
…most of the time
Now that I’ve made these sweeping generalizations, let’s take a step back; the right mic is the one that sounds the best with a particular singer in a particular situation. Again, singer psychology can also be a factor here. For instance, whenever prospective clients come to check out my studio I tidy up, cue one of my better past projects, and hang my Lawson L47MP microphone in the booth. When they see the inspiring sight of that classically styled, large, gold-plated tube mic… many get excited and book time right away. The problem then is that, when they come in for their recording session, they do not even want to consider other microphones. In fact, they think they are getting screwed if I don’t use the Lawson!
Fortunately, the L47MP is a very versatile mic, with infinitely variable polar patterns that also modify the frequency response curves. Taking advantage of this, and through judicious use of eq, I can get a good sound on most singers. To be sure I get the best sound possible, however, I often employ a second mic (of my own choosing) hung right next to the Lawson. This allows me to choose between the two later (the client is usually none the wiser) or to mix the two together to create a cool stereo or double-tracked effect.
The exact frequency response contour of a microphone is an important factor in how it matches with a singer’s voice. The various accentuations and deaccentuations of particular frequency ranges may (or may not) complement a singer’s voice, style, or the overall mix. A boost in the 2 to 8 kHz range is of particular interest for vocals and is known as the presence peak. This area is naturally important in determining vocal tone and intelligibility through vowel production and formants. Which frequencies are most important depends on physiological factors such as gender, stature, and vocal technique.
With a large diaphragm microphone, a plosive (pop) filter should be placed from 2 to 6 inches in front of the capsule. This will help reduce rumbles caused by air (breath) blowing across the mic diaphragm. Normally, the performer stands from 4 to 18 inches away from the front of the mic (on axis). At this distance, directional microphones will cause an increase in bass response called proximity effect. This can be offset by moving the singer back from the mic, changing to an omnidirectional pattern, or using a high pass filter. Of course, moving the performer away from the mic or changing to an omnidirectional pattern will both cause an increase in the amount of room tone captured.
So now it’s time to consider just how much room tone is desired in the recording. The farther the performer is from the microphone, the more room acoustics are picked up. At the same time, more gain will be needed later to compensate for distance. This adds noise. With vocals it is often best to get a close, clean recording with minimal room sound and add artificial reverb during mixdown. Reverb is easy enough to add when more is needed, but nearly impossible to remove if too much was initially recorded.
Also be mindful of the effects of first order reflections. If vocal sound bounces off a sidewall and then into the mic, it can interfere with the direct sound and cause comb filtering. Use gobos or careful mic/performer positioning, along with careful consideration of the mic’s polar pattern, to reduce early reflections. Angling can bounce early reflections away from the mic or into its less sensitive spots. Gobos, or other absorptive materials, can help reduce the energy of reflections.
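To put a rough number on the distance trade-off, here is a minimal Python sketch. It assumes an idealized point source in a free field, where the direct sound falls about 6 dB for every doubling of distance. A real singer in a real room will not follow this exactly, and the function name is only illustrative.

```python
import math

def direct_level_change_db(near_in, far_in):
    """Change in direct-sound level when moving from near_in to far_in inches.

    Assumes an ideal point source in a free field (inverse-distance law),
    which is only an approximation for a singer in a real room.
    """
    return -20.0 * math.log10(far_in / near_in)

# Moving the singer from 6 inches back to 12 and then 18 inches
for far in (12, 18):
    change = direct_level_change_db(6, far)
    print(f"6 in -> {far} in: {change:+.1f} dB direct level, "
          f"so about {-change:.1f} dB more gain to compensate")
```

Whatever gain is added to make up the difference also raises the room tone and the noise floor by the same amount, which is the reason for favoring a close, clean recording.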
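For a feel of where comb filtering lands, the following Python sketch estimates the first few cancellation frequencies from a single reflection. The speed of sound (343 m/s), the same-polarity reflection, and the 0.5 m path difference are all assumptions chosen for illustration, not measurements from any session.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at roughly 20 degrees C

def comb_notches_hz(extra_path_m, count=4):
    """First few cancellation frequencies for one reflection.

    extra_path_m is how much farther the reflected sound travels than the
    direct sound. Assumes the reflection arrives with the same polarity.
    """
    delay_s = extra_path_m / SPEED_OF_SOUND_M_S
    return [(2 * k + 1) / (2 * delay_s) for k in range(count)]

# Hypothetical case: the sidewall bounce travels 0.5 m farther than the direct sound
for f in comb_notches_hz(0.5):
    print(f"notch near {f:.0f} Hz")
```

A shorter path difference pushes the notches higher and spreads them farther apart. In practice the reflection is weaker than the direct sound, so the notches are dips rather than complete cancellations.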
Generally, it is best for the performer to sing standing. This allows greater expansion of the lungs and increased breath support. Unlike stage performance, the singer should be encouraged to stay relatively still, not swaying to the sides or moving their torso around too much. Minor swaying and bobbing is OK to get in the mood, but a few centimeters are all that should be required for groovy. The singer’s mouth should remain at approximately the same distance from the mic and aimed towards it throughout the recording.
Even though many performers are used to handheld microphones for the stage, it is generally best to avoid these in the studio. They often have frequency contours, SPL ratings, off-axis rejection, S/N ratios, and ruggedness specifically designed to suit live sound reinforcement concerns. In addition, when holding a microphone, a performer tends to move it around too much, creating air noise, thumps, and other handling noise.
Another, often overlooked, aspect of recording vocals is the performer’s attire. Articles of clothing or jewelry that we normally don’t give a second thought can become track-wrecking nuisances in a quiet vocal booth. It is not uncommon for singers to come to the studio wearing bracelets, jackets, necklaces, timepieces, assorted leather products, or hair beads. I even had a singer come to a session once with a mini sleigh-bell in his nose (no joke)! Be aware of these issues at the start of the session. All of the above articles, as well as any other possible noisemakers, should be removed (or otherwise silenced) before tracking starts.
Use of dynamics
Compression is usually a highly necessary part of vocal recording and mixing. In fact, it is even commonplace to use double or triple compression. This is due, in part, to the technical and historical generation of that particular sonic aesthetic. In order to maximize the signal-to-noise ratio of analog recordings, while keeping the tracks from distorting, compression was used to reduce the dynamic range of audio before it was recorded to tape. Further compression was added by the analog tape medium itself. Finally, compression was used in mixdown to shape the vocal sound and help it stand out in the overall mix. This is not only a sound that’s pleasing to the listener; it’s a sound we have all become accustomed to.
Recording initial vocal tracks with printed compression (a.k.a. compressing “to tape”) was good form on analog tape and even 16-bit digital systems, partly to increase the signal-to-noise ratio. On many current DAW platforms, tracking compression is not as easily done; much of the processing is done either non-destructively, or printed destructively after the initial recording has occurred. In either of these situations, there is no increase in signal-to-noise; in fact, there is a reduction. Fortunately, this is much less of a concern with a clean signal path and good 24-bit converters. Though there is no longer the same technical need for this initial compression stage, it can still be desirable for its contribution to the vocal sound. In fact, an initial tracking compression can help inspire the singer in his/her cue mix. It can be printed to the tracks, or added non-destructively.
This initial effect is usually on the gentler side, with a compression ratio of 4:1 or less. Set the threshold so that compression is triggered on the loudest notes, but not the softer ones. The gain reduction should max out at -6 to -8 dB or so, getting there only sporadically. Since the visual metering of this reduction can vary greatly, don’t forget to rely on your ears. The effect should not be blatant or awkward.
In general, the attack time should be from 5 to 25 ms depending upon the desired effect. The release time setting is generally around 100 to 150 ms. However, the tempo, musical feel, desired sound and singer’s performance style can all affect the exact settings of attack and release times. A comparison of the vocal styles of the Red Hot Chili Peppers’ “Give It Away” versus “Under the Bridge” offers a great example of how different tunes call for different attack and release times. The vocals on “Give It Away” are very staccato. The words often both begin and end abruptly. This calls for a shorter attack time and can also accommodate a shorter release than on “Under the Bridge.” “Under the Bridge” is more legato with held notes, requiring a longer release. The beginnings of many lines have long breaths or vocal scoops that are important to capturing the intimacy of the song. A longer attack time allows these initial bits to be accentuated.
Mixdown compression can be much more aggressive than initial compression. Ratios range from 2:1 to 8:1 or more. Be careful, however, to avoid the distortion or lifelessness of overcompression or the well-known “pumping” or “breathing” of inappropriate attack or release times. If a held note drops below the threshold, a shorter release time may bring the level back up before the note is over, causing a conspicuous swell. Similarly, a long attack time used on a sustained scream of “yeah” might decrease the level right in the middle of the held vowel. Very uncool.
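To see why bit depth changes the picture, here is a small Python sketch of the textbook figure for an ideal converter: roughly 6.02 dB per bit plus 1.76 dB for a full-scale sine wave. This is a theoretical ceiling only. Real converters and analog front ends deliver less, and the function name is just for illustration.

```python
def ideal_dynamic_range_db(bits):
    """Theoretical signal-to-quantization-noise ratio for a full-scale sine."""
    return 6.02 * bits + 1.76

for bits in (16, 24):
    print(f"{bits}-bit: about {ideal_dynamic_range_db(bits):.1f} dB")
```

The gap between the two figures is the headroom that makes compressing on the way in a matter of taste rather than a technical necessity.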
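The relationship between threshold, ratio and gain reduction can be sketched in a few lines of Python. This assumes a simple hard-knee compressor with no attack or release smoothing. The -18 dBFS threshold and the input levels are hypothetical values picked so that only the loudest note reaches the 6 to 8 dB range described above.

```python
def gain_reduction_db(level_db, threshold_db, ratio):
    """Hard-knee static curve: returns the gain change in dB (0 or negative)."""
    over = level_db - threshold_db
    if over <= 0:
        return 0.0            # below threshold: signal passes unchanged
    return over / ratio - over

# Hypothetical settings: threshold -18 dBFS, ratio 4:1
for level in (-30.0, -18.0, -12.0, -8.0):
    reduction = gain_reduction_db(level, -18.0, 4.0)
    print(f"input {level:6.1f} dBFS -> {reduction:5.1f} dB gain change")
```

With these settings a peak 10 dB over the threshold is pulled down by 7.5 dB, while anything at or below the threshold is left alone. Real compressors add knee shapes and time constants on top of this, so the meter will not track these numbers exactly.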
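To illustrate how attack and release times shape the gain change, the Python sketch below smooths a requested gain reduction with a one-pole filter, using one coefficient when the reduction is increasing (attack) and another when it is recovering (release). This is one common textbook approach, not a model of any particular compressor, and the 48 kHz rate, the test signal and the function names are assumptions for illustration.

```python
import math

def smoothing_coeff(time_ms, sample_rate=48000):
    """One-pole coefficient; after time_ms the gap to the target shrinks to ~37%."""
    return math.exp(-1.0 / (time_ms * 0.001 * sample_rate))

def smooth_gain(target_db, attack_ms, release_ms, sample_rate=48000):
    """Smooth a per-sample list of requested gain-reduction values (dB, <= 0)."""
    a_att = smoothing_coeff(attack_ms, sample_rate)
    a_rel = smoothing_coeff(release_ms, sample_rate)
    out, g = [], 0.0
    for t in target_db:
        coeff = a_att if t < g else a_rel   # more reduction requested -> attack
        g = coeff * g + (1.0 - coeff) * t
        out.append(g)
    return out

# Hypothetical test signal: 6 dB of reduction requested for 200 ms, then none
sr = 48000
target = [-6.0] * int(0.2 * sr) + [0.0] * int(0.3 * sr)
for attack in (5, 25):
    g = smooth_gain(target, attack, 120, sr)
    print(f"attack {attack} ms: reduction 10 ms into the note = "
          f"{g[int(0.010 * sr)]:.2f} dB")
```

The longer attack leaves the first few milliseconds of the note closer to its original level, which is why it lets breaths and scoops come through, while the shorter attack clamps down sooner on an abrupt, staccato entrance.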
More and more often these days, engineers are using multiband compression on lead vocals. This allows more complex, thorough and balanced compression regardless of vocal range (tessitura) or dynamic level.
When mixing, it may also be necessary to use a de-esser if the vocals exhibit too much sibilance. Though the “auto” function and presets can work well on some sources, de-essers with user-definable frequencies are the most flexible. Refer to parts 17 and 18 for more detailed discussions of compressors and their use.
The use of eq on vocals can be broken down into three areas: eq-ing for tone, eq-ing for intelligibility, and eq-ing for the mix. First, to sculpt vocal tone, there are several common important frequency areas. 60-80 Hz can be boosted for greater low-end presence, bringing out natural chest resonances. If the vocals sound too “nasal” in quality (or like someone with a head cold), cutting somewhere from 250-400 Hz can work wonders. It is often said that the 8 to 15 kHz range controls “air,” “sparkle,” or “brilliance.” Dial up more or less of this as needed.
Frequencies in the 2-5 kHz range can be brought up to increase presence and intelligibility. Unfortunately, it can sometimes also bring out sibilance, so choose the range and gain amount carefully by ear. In tunes with thick guitar sounds, taking some of the energy out of this range in the guitars can help make way for the vocals, improving intelligibility as well. This is especially helpful when the tone of the vocals would be compromised by boosting their presence areas any further. This series also discussed eq in great detail in parts 14 and 15.
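As a sketch of the presence boost and the complementary guitar cut, the Python below builds a peaking filter using the widely published Audio EQ Cookbook biquad formulas and checks its response at a few frequencies. The 3.5 kHz center, the 3 dB amounts, the Q of 1 and the 48 kHz sample rate are hypothetical starting points, not recommended settings. Choose the real values by ear.

```python
import cmath
import math

def peaking_eq(f0_hz, gain_db, q, sample_rate=48000):
    """Biquad peaking-filter coefficients (Audio EQ Cookbook form)."""
    a_lin = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * f0_hz / sample_rate
    alpha = math.sin(w0) / (2 * q)
    b = (1 + alpha * a_lin, -2 * math.cos(w0), 1 - alpha * a_lin)
    a = (1 + alpha / a_lin, -2 * math.cos(w0), 1 - alpha / a_lin)
    return b, a

def response_db(b, a, freq_hz, sample_rate=48000):
    """Magnitude of the filter at one frequency, in dB."""
    z1 = cmath.exp(-1j * 2 * math.pi * freq_hz / sample_rate)  # z^-1
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20 * math.log10(abs(num / den))

# Hypothetical settings: +3 dB at 3.5 kHz on the vocal, -3 dB on the guitars
vocal = peaking_eq(3500, 3.0, 1.0)
guitar = peaking_eq(3500, -3.0, 1.0)
for f in (500, 3500, 10000):
    print(f"{f:>5} Hz: vocal {response_db(*vocal, f):+.2f} dB, "
          f"guitar {response_db(*guitar, f):+.2f} dB")
```

Splitting the move between a small vocal boost and a small guitar cut opens up the same amount of separation as one larger boost, with less risk of exaggerating sibilance.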
Effects and other treatments
The first type of processing everyone needs in their studio these days is automatic pitch-correction. Be aware that these work better the more information you supply them. Many allow the user to input a tuning reference number, key, and even to input a score to follow when necessary. Do not assume that they will work perfectly merely by inserting them on a mix channel.
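To show why the tuning reference and key matter, here is a minimal Python sketch of the decision a pitch corrector has to make: which note a sung frequency should be pulled toward. It assumes equal temperament and a major scale, and it is not how any particular plug-in works. All names here are hypothetical.

```python
import math

MAJOR_SCALE = (0, 2, 4, 5, 7, 9, 11)  # semitone offsets from the key's root

def nearest_scale_note(freq_hz, root_pc=0, a4_hz=440.0, scale=MAJOR_SCALE):
    """Return (target_hz, cents_off) for the closest note in the key.

    root_pc is the pitch class of the key (0 = C, 2 = D, ... 8 = A-flat).
    """
    midi = 69 + 12 * math.log2(freq_hz / a4_hz)      # fractional note number
    allowed = {(root_pc + step) % 12 for step in scale}
    center = round(midi)
    candidates = [n for n in range(center - 2, center + 3) if n % 12 in allowed]
    target = min(candidates, key=lambda n: abs(n - midi))
    target_hz = a4_hz * 2 ** ((target - 69) / 12)
    return target_hz, 100 * (midi - target)

# The same slightly sharp A (450 Hz) interpreted in two different keys
for name, root in (("C major", 0), ("A-flat major", 8)):
    hz, cents = nearest_scale_note(450.0, root_pc=root)
    print(f"{name}: pull toward {hz:.1f} Hz ({cents:+.0f} cents off)")
```

In C major the note is treated as a sharp A and nudged down. In A-flat major, where A is not in the scale, the same note is treated as a flat B-flat and pushed up instead. Feed the processor the wrong key and it will confidently correct to the wrong note.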
Effects such as delay and chorus can be used to thicken up the vocal line, as can layering multiple takes or separate tracks dedicated to a single take of a multi-mic setup. Distortion, especially analog emulations, can really make the vocals jump out when used sparingly. On occasion, or for more aggressive musical styles, heavier distortion makes for a cool effect.
Stereo exciters work well, as does the use of two or more tracks, hard-panned to separate stereo locations, treated differently with EQ, distortion, or delay. Pitch shifters and harmonizers can also work well in stereo. Subtle sub-octaves are a cool effect for the chorus.
The single most common mistake with all effects is to use too many or to mix them in too heavily. Be creative, but be careful not to go overboard. Effects and their usage were outlined in depth in the last two installments of this series (TCRM 19 and 20).
Let your voice be heard
Hopefully, this has given you some useful ideas to aid in the process of recording and mixing this most finicky musical device. Now that we’ve gone over so many guidelines, be aware that there are always exceptions; some singers will perform best only with a handheld mic, in an open space, with live open-air monitoring and with the entourage hanging around…. Be prepared but be ready to be flexible and do whatever it takes to get the take!
Next time we’ll start an in-depth look into the current state of DAWs.
John Shirley is a recording engineer, composer, programmer and producer. He’s also a Professor in the Sound Recording Technology program at the University of Massachusetts Lowell and chairman of their music department. You can check out some of his more wacky tunes on his Sonic Ninjutsu CD at http://www.cdbaby.com/cd/jshirley.
Supplemental Media Examples
Previously in this series there have already been some 33 samples of how to record and treat vocals, each one focusing on a single idea or technique. The samples included here are intended to add to that collection but not necessarily to duplicate them. Though there’s a lot to take in here, going back to previous articles may be well worth the effort….
TCRM 12, .wavs 33-45, contain further examples of mic choice and placement.
TCRM 15, .wavs 34-40, offer more examples of how to eq vocals.
Refer to TCRM 16, .wavs 1 and 2, for how eq can be used on vocals for more extreme effects.
TCRM 18, .wavs 1-4, offer examples of the use of dynamics for vocals.
Finally, TCRM 20, .wavs 27-36, offer further examples of effects usage on vocals.
Now let’s get down to the new samples…. Don’t forget, the pictures bundled with this article all come from the generation of these example soundfiles so check them out while listening.
Now, a Mojave MA-200 at the same distance: TCRM21_2.wav
Here’s what it sounds like to use both tracks and pan them hard to opposite sides: TCRM21_3.wav
For the next ten examples, we’ll mute the R121 track and focus exclusively on the MA-200 and some possible treatments. First, let’s simply add a plate reverb: TCRM21_4.wav
Now, without the reverb, let’s try adding a bit of eq: TCRM21_5.wav
The addition of the eq may overstress the sibilance, so let’s try adding a de-esser after the eq: TCRM21_6.wav
Two compressors are now added; one to decrease dynamic range and smooth out levels, and one to accentuate attacks and releases of the vocals making them sound even more present and close: TCRM21_7.wav
Now that all of that’s done, let’s try adding a reverb back onto the vocals: TCRM21_8.wav
Again removing the reverb, let’s try adjusting the highs in the eq again: TCRM21_9.wav
To that we’ll add some harmonic distortion: TCRM21_10.wav
Another variation on the distortion effect: TCRM21_11.wav
Now let’s hear what that distortion sounds like when we add the reverb back: TCRM21_12.wav
Finally, let’s hear what adding a short, single-tap, modulating delay before the reverb sounds like: TCRM21_13.wav
OK, now let’s see what happens if we choose just the R121 track to work with. Here’s the original dry track again: TCRM21_14.wav
Now with eq: TCRM21_15.wav
Next, let’s add just the leveling compression: TCRM21_16.wav
Now that the compression is there, maybe we should reconsider the eq settings? TCRM21_18.wav
Finally, adding the reverb used in example 12: TCRM21_19.wav
Now, let’s add some reverb to the guitar for good measure: TCRM21_20.wav
To be fair, let’s add reverb to the guitar and listen to the MA-200 track with basic eq and dynamics, but not the distortion or delay: TCRM21_21.wav
Finally, you may have heard the air conditioning rumble and noise kicking in on these tracks, especially evident towards the end. Let’s apply a high-pass filter set at around 80 Hz and see if we can reduce this: TCRM21_22.wav
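For readers curious what an 80 Hz high-pass does to rumble versus the vocal range, the Python sketch below runs two test tones through a second-order high-pass built from the Audio EQ Cookbook biquad formulas. The filter order, the Q of about 0.707, the 48 kHz sample rate and the test tones are assumptions for illustration. The filter used on the actual example file may differ.

```python
import math

def highpass_coeffs(cutoff_hz, sample_rate=48000, q=0.7071):
    """Second-order high-pass biquad (Audio EQ Cookbook form), normalized."""
    w0 = 2 * math.pi * cutoff_hz / sample_rate
    alpha = math.sin(w0) / (2 * q)
    cos_w0 = math.cos(w0)
    a0 = 1 + alpha
    b = ((1 + cos_w0) / 2 / a0, -(1 + cos_w0) / a0, (1 + cos_w0) / 2 / a0)
    a = (-2 * cos_w0 / a0, (1 - alpha) / a0)
    return b, a

def filter_signal(samples, b, a):
    """Run the biquad over a list of samples (direct form I)."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def peak_db(samples):
    """Peak level in dB relative to a full-scale value of 1.0."""
    return 20 * math.log10(max(abs(s) for s in samples))

sr = 48000
b, a = highpass_coeffs(80, sr)
for freq in (40, 200):   # stand-ins for rumble and for low vocal content
    tone = [math.sin(2 * math.pi * freq * n / sr) for n in range(sr)]
    settled = filter_signal(tone, b, a)[sr // 2:]   # skip the start-up transient
    print(f"{freq} Hz tone: {peak_db(settled):+.1f} dB after filtering")
```

With these settings the 40 Hz tone comes out roughly 12 dB down while the 200 Hz tone is nearly untouched, so the filter can take a bite out of low rumble without thinning the voice much. Broadband air-handler noise above the cutoff will, of course, remain.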
Now let’s listen to two final examples, using a different track, which showcase two different mics. First, the mix uses the R121 on vocals: TCRM21_23.wav
Our last example uses the ubiquitous Neumann U87 on the vocals: TCRM21_24.wav
Special thanks to Bernie Mack at Flashpoint for supplying the raw female vocal tracks to create these examples as well as to Claire Stalienbacker for lending her enviable talents on both voice and guitar. Thanks also to Bernie for supplying the last two mixes comparing the R121 and the U87 on the same tracks.