Optimizing your DAW’s performance through system setup and personal workflow.
Welcome back. Last time we got a bit nerdy with all that computer stuff. Now, let’s finish it off with a short discussion of drivers and plug-in formats and then get down to the nitty-gritty of DAW use, including how to optimize performance through your approach to mixing. I’ll discuss using plug-ins, give editing tips, and outline good session hygiene.
Driving me mad
A driver is a piece of software that enables the operating system and computer programs to communicate with a specialized piece of hardware. Drivers are necessary to properly integrate the seemingly infinite number of variations in hardware, software, and operating system configurations possible.
In an attempt to simplify things, some basic audio devices do not require the user to install any specialized driver or undergo any long installation process. For example, most removable media devices like USB flash drives or external hard drives do not require extra software and/or effort to get the operating system to recognize and install them. When you plug in such a device, the computer automatically determines what it is and what it’s for. These devices are called plug-and-play. It is not that they do not require a driver, but that they can use a generic driver already included with the computer’s operating system.
While plug-and-play capabilities make installation and setup quick and convenient, this is often at the expense of power, flexibility, and/or thoroughness of implementation. Furthermore, communication between the audio software and the sound card (or other interface) is often slowed considerably, since it is routed indirectly through a bulky operating system. This can create greater amounts of latency (delay) in audio signals than is acceptable for professional-level performance. For these reasons, more professional and comprehensive hardware usually requires installation of a specialized third-party driver.
Generic plug-and-play drivers are just that: they generically support basic functionality that is common to devices in a specific class. For example, a sophisticated DSP card’s generic drivers might allow simple .wav playback, but would not recognize or be able to use the more advanced functions of the card. These features rely on the manufacturer’s drivers to work properly. Generic drivers will also not be “tuned” for optimal performance with a specific device, which is really a necessity in DAW systems.
For Windows there are several common and distinct driver types: WDM (Windows Driver Model), WDF (Windows Driver Foundation), DirectX, and ASIO (Audio Stream Input/Output). Each of these driver types is a framework of specifications; the first three are defined by Microsoft, the fourth by Steinberg. While WDM, WDF, and DirectX might offer more functionality than generic audio drivers, the Microsoft frameworks aren’t always the best for performance. In general, if you have ASIO drivers, selecting them will be your best bet for optimal performance on your DAW. This is not to say that the Microsoft-based drivers are inherently poor; it’s just that all drivers are not created equal. Go with ASIO if you can.
That said, specialized WDM and/or WDF drivers can sometimes offer a few more advanced (or device-specific) features than most plug-and-play-only devices. These drivers are written by (and supplied by) the manufacturers of audio sound cards (or other interfaces). Since they are more device-specific than generalized drivers, their code may have been streamlined, making for less latency in the signal.
DirectX is an application programming interface from Microsoft that was created to help develop and implement multimedia for Windows on the PC. Though the primary purpose of DirectX was initially for use in gaming, it has found a significant place in multitrack audio production. Specialized DirectX drivers allow greater power than most plug-and-play-only or WDM devices. These DirectX drivers also tend to exhibit less latency than WDMs, but there are many external factors that can influence the performance of either. When it comes to reducing latency as much as possible, the job is best suited for ASIO drivers.
ASIO is a software protocol designed specifically for professional multi-channel audio, MIDI, and synchronization applications. ASIO drivers require more work to install and configure than simple plug-and-play designs, but offer much greater power with lower latency. They accomplish this by being specific to the hardware and offering a more direct link between the audio software and hardware (essentially bypassing the bulk of the OS). ASIO drivers allow the user to address all features of their hardware both more independently and flexibly.
Though originally developed by Steinberg, ASIO is supported by many leading audio manufacturers. Generally, ASIO will offer the most comprehensive professional feature set as well as the best performance with most Windows audio applications.
Core Audio is a driver spec for audio and MIDI that all interfaces and applications must use under Mac OS X. Core Audio is generally stable, reliable, and fast – which is important, since you don’t have any other choices when setting up audio hardware on a Mac! (ProTools TDM systems, and the new HDX systems, are notable exceptions to this rule.)
Control panels or setup (configuration) software
More complex audio devices often also require another software application, called a control panel, for configuration and setup. In a number of cases, this software allows for device settings that cannot be achieved via the device’s front panel.
Plug-ins are audio processes of all kinds that can be added to a track or audio channel and processed in real-time (usually non-destructively).
Generally, they are inserted into a DAW project using channel inserts or by auxiliary send/return methods or subgroup (sometimes fx) channels. Effects such as reverb, guitar amp modeling, modulation, pitch correction, and dynamics are all available as plug-ins. On some systems, even basic EQ and/or virtual instruments are accomplished through plug-ins.
Currently there are a number of plug-in formats available. These include MAS, RTAS, AU, DirectX, VST, TDM, and the new AAX format for ProTools 10 systems. Which format is right for you depends on what DAW software you are using as well as which operating system. Some hosts are even able to run two or more of these formats simultaneously.
As mentioned previously in this series, plug-ins can be quite demanding of processing power (TCRM 20 discussed plug-ins at length). Since several articles have also discussed technological approaches to increasing or maximizing your processing power, let’s now look at how mixing methods can affect this as well.
Maximizing performance: the judicious use of effects
The flexibility of the computer DAW can cause a lot of confusion about how to best allocate its resources. Users who first learned to mix on a computer often didn’t benefit from one of the most important lessons a non-computer studio can teach you: how to conserve tracks and effects! As a result, the novice computer recordist piles on tracks and plug-ins until the DAW coughs up a “CPU overload,” “DSP error” or “buffer underrun” error – at which point he or she assumes they have to upgrade the computer, not realizing that it’s the way of working that’s bogging down the computer.
A common problem is the overuse of inserts rather than sends for effects like reverb, which can be shared. The novice inserts a new reverb on every track that needs it, leading to a dozen or more reverb plug-ins running at once – a recipe for disaster. Instead, place one reverb on an aux return (or fx channel on some DAWs) and use aux sends from each channel to send varying amounts of signal to the reverb. In this way, all drum tracks can share a reverb and seriously reduce the demand on the CPU. This has the added benefit of putting all tracks in the same reverberant space, for a more coherent-sounding mix. (See TCRM28_pic1 and pic2.jpg)
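The savings are easy to see with some back-of-the-envelope arithmetic. This Python sketch is purely illustrative; the track count and per-instance CPU cost are made-up numbers, not measurements from any real plug-in:

```python
# Hypothetical comparison: one reverb inserted per track vs. one shared
# reverb on an aux return. The cost figure below is an assumption made
# only for illustration.
TRACKS_NEEDING_REVERB = 12
COST_PER_REVERB = 5.0  # percent of CPU per reverb instance (illustrative)

insert_style = TRACKS_NEEDING_REVERB * COST_PER_REVERB  # a reverb on every track
send_style = 1 * COST_PER_REVERB                        # one shared reverb on an aux

print(f"Insert on every track: {insert_style:.0f}% CPU")  # 60% CPU
print(f"Shared aux send:       {send_style:.0f}% CPU")    # 5% CPU
```

However rough the numbers, the shape of the result holds: the insert approach scales with the track count, while the shared send stays constant.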
Of course, there are circumstances where inserting effects on individual channels makes the most sense. Dynamics and EQ are commonly used to sculpt the sound of individual tracks (though they are also common on subgroups and on the master bus). However, any other effects that are only needed for a single track should be inserted separately as well.
Another common misuse of DSP power occurs with the reliance on numerous tracks to create a single good take. Again, this stems from the way in which the material is recorded and edited.
For example, it is a common DAW technique to record numerous passes of the lead vocals, each to its own track. Later, the best performance of each phrase (or even note) is chosen from these to create a version that’s better than any one take alone. The traditional way to do this is to combine all of the desirable pieces onto a single track, called a comp track – similar to the way it was done before DAWs came along. But with modern editing and automation, there’s a temptation to leave all of the tracks running, and mute or fade them in and out as needed to create a comp take. This is an enormous drain on resources, as you need the same set of plug-ins on every take/track!
A better approach is to assemble your comp on a single track, either by editing or by bouncing (i.e. mixing and recording the edited takes to a single track; more below), thereby committing the comp to a single track that takes up fewer resources. Then you can disable, and maybe hide, all of the original takes and their plug-ins to free up CPU power.
Note that after assembling a comp track, simply muting the original scratch tracks/channels may not free up their drain on the CPU. Actually removing the tracks/channels from the session seems to be the only surefire method of regaining these needed resources. This goes against the nature of the common track-hoarder who relies on the security of knowing he/she can always reconsider past decisions and go back to the earlier material for reworking.
Fortunately, the solution may not need to be so extreme, depending on the exact DAW software and system configuration. Virtual (or layered) tracks, of the kind found more often on stand-alone DAWs, work more effectively and predictably at releasing DSP power when moved out of the top (active) layer. On some of the DAW systems I have worked on, muting a channel did not decrease the load on the CPU, but muting the individual effects on the channel did. When effects-muting does not work, stripping the effects from unused channels will certainly reduce a lot of the load. Another approach that may do the trick is turning a track’s voice off, or assigning a channel to no output.
To determine which method works best, I’d recommend both referring to the manual and running some tests. Many DAWs now include a meter to show the load on the CPU. If yours does not, have no fear… your computer does. On a Mac it is called the Activity Monitor, and the Task Manager in Windows. Keep an eye on the level meter or usage percentages and try muting a CPU-draining effect like reverb (surround if you have it) or a series of effects while playing. If the level drops dramatically, you can keep all of your precious tracks. If the usage does not drop try channel muting, voice removal/turning the output off, and effects removal… in that order, until one works. If none of these manages to significantly reduce the load, then you may be forced to delete tracks/channels to release processing resources.
A note on CPU load metering: as with audio levels, it is a good idea to leave some headroom for unexpected spikes.
Another lesson to be taken from the days of fixed track counts is knowing how and when to print effects (or even submixes). Printed effects are written to audio files and, therefore, do not take any extra DSP power during playback. These effects are added by either of two methods: highlighting the region to be altered in the edit window and selecting from a menu of effects, or by bouncing tracks with plug-in effects. Bouncing is when one or more tracks are recorded to a new track in order to permanently add some effects or mix moves (usually to free up tracks or effects resources). Bouncing can be done by bussing the desired tracks to a new record-enabled track or by soloing those tracks and performing an audio mixdown. The audio file created from the mixdown can then be inserted on a new track. Remember, after bouncing it will be necessary to silence the original tracks and do what is necessary with them to regain some DSP power (as discussed above).
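At its core, a bounce is a sample-by-sample sum of the source tracks into one new track. This minimal Python sketch is a hypothetical helper, not any DAW’s actual code; it assumes each track is a plain list of float samples in the range -1.0 to 1.0:

```python
# A minimal "bounce": sum several tracks into one new track, hard-clipping
# to the legal range. A real mixdown would scale levels to leave headroom
# rather than clip.
def bounce(tracks):
    length = max(len(t) for t in tracks)
    mix = []
    for i in range(length):
        # sum whichever tracks still have audio at this sample index
        s = sum(t[i] for t in tracks if i < len(t))
        mix.append(max(-1.0, min(1.0, s)))
    return mix

# Two short "tracks" summed into one:
print(bounce([[0.5, 0.5], [0.25, -0.25]]))  # [0.75, 0.25]
```

The hard clip in the sketch is exactly the overload you want to avoid in practice, which is why engineers pull levels down before bouncing.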
Another powerful CPU-saver that several hosts now support is a function usually called freeze. Freezing allows you to individually render a track with all of its virtual instruments, plug-ins and automation into a new audio file, substituted on the original track. The system then ignores the original audio and its associated processor-intensive tasks. A frozen track sounds exactly like its pre-frozen form but takes far fewer resources to play back. If you want to edit it, or change any of the plug-in parameters, you’ll have to unfreeze it temporarily, make the changes, and then freeze it again. Since it’s done with a single button, it is certainly a very easy way of reducing CPU overhead on your DAW.
A note on using the buffer
Most computer-based DAW software allows you to adjust the size of the RAM buffer it uses to aid in signal processing as well as in the recording and playback of audio files. The size is referenced either in samples (64, 128, 256, 512, etc.) or as latency time, expressed in milliseconds (3 ms, 6 ms, 12 ms, etc.). While choosing a larger size can increase usable track counts and the number of simultaneously available plug-ins, it also increases latency. As mentioned in previous installments of this column, that can be especially bothersome for monitoring during overdubs or when bouncing tracks. It is not a solution without drawbacks.
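The conversion between those two ways of stating buffer size is simple arithmetic: the buffer length in samples divided by the sample rate. A quick sketch (the function name is my own, not from any DAW):

```python
def buffer_latency_ms(buffer_samples, sample_rate_hz):
    """One-way buffer latency in milliseconds: samples / rate * 1000."""
    return buffer_samples / sample_rate_hz * 1000.0

# At 44.1 kHz, a 128-sample buffer adds roughly 2.9 ms,
# while a 1024-sample buffer adds roughly 23.2 ms.
print(round(buffer_latency_ms(128, 44100), 1))   # 2.9
print(round(buffer_latency_ms(1024, 44100), 1))  # 23.2
```

Note that this is only the one-way buffer delay; the round-trip latency you actually hear while overdubbing also includes the input buffer and converter delays, so real-world figures will be higher.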
When viewing a recorded waveform in most DAWs, the display shows positive going energy on top, negative going energy on the bottom, and a horizontal line down the middle to show the zero that exists between them. For illustrative purposes, think of the waveform as a graph of the movement of a speaker. The zero line represents the resting state of the speaker cone. Below that is the travel of the cone while it’s further into the cabinet. Above is the motion closer to the listener. While the curve is moving up, the speaker cone is moving towards you (sitting in front of the speaker); while the curve is moving down, the speaker is moving away from you.
Now consider editing a waveform. When you trim the side of an audio region, it is easy to leave the edge at a non-zero part of the waveform (see pictures). During playback, this will cause a click sound… the audible giveaway of a bad edit. This unwelcome noise is caused by asking the DAW to create a move that rarely happens acoustically: having matter or energy jump instantaneously from one state to another.
To avoid this problem, it is best to make all edits at a point where the waveform crosses the zero. Some DAWs allow you to do this automatically by setting it as a preference. When done automatically, however, there still may be some clicks as other unrealistic waveforms can be created, especially those that too quickly change direction (See TCRM28_pic14 and wav6). Listen carefully for these, but also get used to quickly zooming in and out of the edit point to double check. Sometimes clicks you didn’t catch in your mixdown session can pop up on other listening systems and environments.
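Conceptually, what the DAW’s automatic mode does is search for the sign change nearest the point where you want to cut. A minimal Python sketch of that search (a hypothetical helper, treating the audio as a plain list of float samples):

```python
def nearest_zero_crossing(samples, index):
    """Return the sample index of the zero crossing nearest to `index`.
    A crossing is a sample at zero, or a sign change between neighbors."""
    best = None
    for i in range(len(samples) - 1):
        if samples[i] == 0.0 or samples[i] * samples[i + 1] < 0:
            if best is None or abs(i - index) < abs(best - index):
                best = i
    return best

wave = [0.5, 0.2, -0.1, -0.4, 0.3]
print(nearest_zero_crossing(wave, 0))  # 1 (crossing between 0.2 and -0.1)
print(nearest_zero_crossing(wave, 4))  # 3 (crossing between -0.4 and 0.3)
```

This also shows why the automatic mode can move your edit: the nearest crossing may be some distance from where you clicked.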
Though it certainly can be helpful in some circumstances, I often find automatic zero-crossing editing annoying for more reasons than the possible inadvertent bad waveforms. First, this mode has a habit of not letting you make the edit where you want because there’s a non-zero value due to noise or DC offset issues. This can be solved manually using very quick fades (2 ms or faster). Second, stereo tracks can be extra problematic, since the two sides are rarely at zero simultaneously. This means either there’s a bad edit on one side, or the zero-crossed edit may be very far from where you’d really like the edit to be. Again, the solution is making the edit manually and using a quick fade to deal with the non-zero transition.
Another way to avoid these clicks, and smooth the transitions between regions, is to use crossfades. Crossfades are moments where two sounds overlap as one is faded out and the other faded in. On many DAW systems, this can be done by overlapping regions, highlighting the shared area, and choosing crossfades (or just fades) from a menu, by right-clicking, or by using a hotkey. While some systems perform the fade in real time, most actually create short soundfiles that represent the requested fade. These latter types require less of both the CPU and the drive, but extra care must be taken to ensure that the numerous extra files are kept with the session. Of course, the crossfade effect can also be accomplished between audio on two separate tracks using volume automation.
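Under the hood, a crossfade is just two complementary gain ramps applied to the overlapping samples. This sketch shows the simplest linear version; the function name and list-of-floats representation are my own assumptions, and real DAWs typically offer equal-power and other curve shapes as well:

```python
def linear_crossfade(a_tail, b_head):
    """Linear crossfade over two equal-length overlapping sample lists:
    the outgoing audio fades 1 -> 0 while the incoming fades 0 -> 1."""
    n = len(a_tail)
    out = []
    for i in range(n):
        g = i / (n - 1) if n > 1 else 1.0  # fade-in gain for the incoming audio
        out.append(a_tail[i] * (1.0 - g) + b_head[i] * g)
    return out

# Fading from a region sitting at 1.0 into silence over three samples:
print(linear_crossfade([1.0, 1.0, 1.0], [0.0, 0.0, 0.0]))  # [1.0, 0.5, 0.0]
```

Because the two gains always sum to 1.0, the transition never jumps, which is precisely why a crossfade removes the click of a bad edit.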
A note on timing
When placing regions (especially individual samples) to match up with a specific metric pattern, it helps to trim the start point to match the beginning of the attack of the sound. Then, especially in grid mode (aka snap, see TCRM 26), you can easily place these regions in the desired metric/rhythmic pattern. What may come as a surprise, however, is that many instrumental (and vocal) timbres may sound slightly late when using this approach. That is because musicians have learned to perform so that the peak of the attack arrives on the beat, not the onset of the attack. It makes the beat stronger. To offset this you can use the following method: First, insert all of your similar-sounding regions as described above, then select them all by shift-clicking and nudge or drag them all to the left until the attacks are properly aligned. First it’s made fast, then it’s made right.
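The alignment trick above amounts to finding each region’s attack peak and nudging the region earlier by that offset. A small sketch under the same assumptions as before (hypothetical helper; samples as a list of floats):

```python
def peak_offset(samples):
    """Offset, in samples, from the region start to the loudest sample
    (the attack peak). Nudging the region earlier by this amount puts
    the peak, rather than the onset, on the beat."""
    return max(range(len(samples)), key=lambda i: abs(samples[i]))

# A snare-like shape: quiet onset, peak at sample 2, then decay.
print(peak_offset([0.1, 0.4, 0.9, 0.5]))  # 2
```

In practice you would measure this once per sound (all hits of the same sample share the same attack) and nudge the whole selection by that amount.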
Organization and Documentation (Session Hygiene)
Two of the most fundamentally important elements of a recording or mixing session are organization and documentation. They help ensure that your work is efficient and does not get bogged down by confusion or excessive redundancy. Musicians quickly become uninspired while waiting for an engineer to locate misplaced files or wrestle to create a suitable monitoring environment. It is also not uncommon for people to discover they have lost material from earlier sessions that they had moved or backed up incorrectly.
While the computer-based DAW has the ability to make highly organized sessions (due to its comprehensive and structured nature), many users instead exhibit very sloppy documentation and file structures. They mistakenly rely on the “all knowing, infallible computer” to keep track of their work. This is a major, yet common, mistake.
To avoid problems like these, consider the following guidelines to help organize things at the start of every new session (or even before):
On your audio recording drive, create a folder for the artist with individual project folders for the various songs.
Name the session file something meaningful (like “artist_song…”) and save it to the appropriate predetermined folder on the audio drive. Save often during the session with the “save as” command. This can be used to create a set of progressive backups that can also be used as “undo” sessions. Give each a similarly progressive and incremental designation (numerical, alphabetic, temporal, etc….)
Be sure all session, fade, and both imported and recorded audio files are saved directly to the correct project folder. (Do this by specifying the pathways at the beginning of the session, rather than trying to hunt everything down and move it later.) When insufficient attention is paid to this issue, a session’s supporting files can wind up in multiple locations, even across various media. Though the computer does keep track of these for the moment, if they are moved later it may not know where to find them. In addition, if you do not know where everything is located you can easily misplace or mistakenly erase important files.
All tracks and channels should be titled before recording begins. On most systems, audio files are automatically named based on their track names and the order in which they are recorded. Sound files should never be labeled with names like “track-1” or “audio-01” (which happens when the user does not specify titles before recording). When editing and mixing, it is easy to confuse takes (and even instruments) when the hundreds of onscreen waveforms are distinguished only by numerical designations.
Create session templates for common project types you record. These empty session files should include track names, bussing structures, initial level settings (an opportunity to grab some much needed headroom), monitor controls, and even basic effects.
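The progressive “save as” numbering suggested in these guidelines can even be automated. A small Python sketch; the underscore-and-two-digits pattern is just one possible convention, not a standard:

```python
def next_session_name(existing, base):
    """Return the next numbered session name in a base_01, base_02, ...
    sequence. `existing` is a collection of session names already saved."""
    n = 1
    while f"{base}_{n:02d}" in existing:
        n += 1
    return f"{base}_{n:02d}"

print(next_session_name([], "artist_song"))                       # artist_song_01
print(next_session_name(["artist_song_01", "artist_song_02"],
                        "artist_song"))                           # artist_song_03
```

Whatever the exact scheme, the point is that each save produces a distinct file, giving you a trail of session-level “undo” points.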
Well, that’s it for our five-part look at sequencers and DAWs. TCRM 29 will look further into the art of the mixdown stage and the various approaches to this daunting process.
- John Shirley is a recording engineer, composer, programmer and producer. He holds a PhD in music composition from the University of Chicago and is a Professor in the Sound Recording Technology program at the University of Massachusetts Lowell where he serves as chairman of their music department. You can check out some of his more wacky tunes on his Sonic Ninjutsu CD at http://www.cdbaby.com/cd/jshirley.
Supplemental Media Examples
Drum samples edited together using a grid. TCRM28_1.wav
Even if following the zero-crossing edit rule, audible artifacts can be heard if the waveform is edited in a way that doesn’t match cycle length or causes an abrupt change in direction. (See TCRM28_pic14.jpg) TCRM28_6.wav
Another method to avoid the “click” of a bad edit is to do a quick fade. Here, quick fades (no more than a millisecond) are used to remove the clicks caused by the bad edits in TCRM28_7.wav (See TCRM28_pic17 & 20.jpg). TCRM28_10.wav