IT'S IN THE MIX: OPTIMIZING AUDIO TRACKS

Have you ever visited the control room of an audio studio? 
Gadgets everywhere. Patch bays, wires, racks populated with cold,
Equalizer - 
This box is a fancy tone control. With it you can reduce one or 
more sound frequencies or boost one or more frequencies. Since the 
world is filled with sounds you want to hear and others that get in 
the way, by using an equalizer you can filter out the unwanted 
squawking crows, blowing wind, or humming fans while boosting the 
sounds of a narrator's voice or a bass fiddle.
Equalizers come in two types: the graphic equalizer which has a 
number of sliders, each slider representing a different frequency to 
be boosted or cut, and the parametric equalizer with a knob that dials 
the frequency you wish to adjust and another knob that determines 
whether that frequency is reduced or increased. Audio mixers often 
have several parametric equalizers allowing you to dial one frequency 
for boosting, another for reduction, and a third for whatever. 
Equalizers aren't exact: you can't select 60 Hz on the 
equalizer and excise audio hum from a bad recording without touching 
the 59 Hz and the 61 Hz frequencies around it. Equalizers generally 
affect a range of frequencies and for this reason are used delicately 
in the music recording business. Slicing one particular frequency 
range can leave a musical instrument sounding a bit odd.
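To see how a 60 Hz cut spills onto its neighbors, here is a minimal
Python sketch of a digital notch filter. It is only an illustration:
the function name, the 48 kHz sample rate, and the Q (sharpness)
setting of 5 are assumptions, not settings taken from any particular
equalizer.

```python
import cmath
import math

def notch_gain(freq_hz, center_hz=60.0, q=5.0, sample_rate=48000.0):
    """Gain (1.0 = untouched, 0.0 = removed) of a notch filter at freq_hz."""
    w0 = 2 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2 * q)
    # Notch coefficients from the widely used "Audio EQ Cookbook" formulas.
    b = [1.0, -2 * math.cos(w0), 1.0]
    a = [1.0 + alpha, -2 * math.cos(w0), 1.0 - alpha]
    # Evaluate the filter's frequency response at the frequency of interest.
    z1 = cmath.exp(-1j * 2 * math.pi * freq_hz / sample_rate)
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return abs(num / den)

for f in (59, 60, 61, 1000):
    print(f, "Hz gain:", round(notch_gain(f), 3))
```

With these assumed settings the 60 Hz hum vanishes, but 59 Hz and
61 Hz are cut heavily too, while 1000 Hz passes almost untouched. A
higher Q narrows the notch, but never to a single frequency.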
To the average videographer, equalizers are helpful for reducing 
the burble of wind when a microphone is used outdoors. A low cut 
filter (also called a high pass filter) reduces the low frequencies 
where you find the sounds of wind rumble. Although using such a 
filter will reduce the normal bassiness of a man's voice, the effect 
is minor and leaves plenty of intelligibility to the narrator's words. 
Since speaking voices generally range between 150 Hz and 2000 Hz, 
there are a lot of highs and lows that can be cut without losing the 
words. Don't cut them unless you have to, however, because you'll be 
removing fidelity and ambiance from the sound. You don't want your 
narrator's voice to sound like it came over a telephone (telephones 
also have a narrow audio bandwidth). In short, unless you have a 
nasty noise in the background that you simply must remove so that it 
doesn't distract from your desired audio, leave the sound spectrum 
alone and deactivate your equalizer.
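The low cut filter described above can be sketched in a few lines of
Python. This is a simple one-pole design chosen for brevity, not the
circuit found in any specific mixer. The names low_cut, tone, and
peak, and the 150 Hz cutoff used in the usage note, are assumptions
made for the illustration.

```python
import math

def low_cut(samples, cutoff_hz, sample_rate):
    """One-pole high pass (low cut) filter; returns a new list of samples."""
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = []
    prev_in = 0.0
    prev_out = 0.0
    for x in samples:
        # Pass along changes in the input, let steady (low) content leak away.
        y = alpha * (prev_out + x - prev_in)
        out.append(y)
        prev_in, prev_out = x, y
    return out

def tone(freq_hz, sample_rate, seconds=1.0):
    """A full-strength test tone (peak level 1.0)."""
    n = int(sample_rate * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

def peak(samples):
    """Peak level measured after the filter has settled (second half only)."""
    return max(abs(s) for s in samples[len(samples) // 2:])
```

Feeding it a 20 Hz "wind rumble" tone and a 1000 Hz "voice" tone at a
150 Hz cutoff, for example peak(low_cut(tone(20, 8000), 150, 8000)),
shows the rumble reduced to a small fraction of its level while the
voice tone comes through nearly intact.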
Equalizers are sometimes handy for "tuning" a room's public 
address system. If you have set up your mikes and speakers for a 
speech or a play and you hear a ringing or whoop or wail as feedback 
envelops the room, it may be that certain frequencies are bouncing 
back to your microphone stronger than others. If you can detect which 
frequencies those are and reduce them a little, you can reduce the 
feedback without lowering the volume and effectiveness of your 
loudspeaker system.
I particularly enjoy the chest shaking depth and power that 
comes from heavy duty bass, but there are times when low frequencies 
are our enemies. Low frequencies involve huge amounts of power. They 
saturate (over magnetize) audio tape leaving no room for the other 
frequencies. Bass frequencies sometimes suck the power from your 
amplifier, also leaving little room for the other frequencies, either 
drowning or distorting them. They pin our VU meters (push the needles 
off scale) sometimes without adding much useful to the mix. You don't 
want to waste precious audio resources, be it wattage, magnetism, or 
digits, on noise or thumping or rumble or hum of equipment, wind, foot 
stomping, and other non-melodic sounds. Cardioid and other 
directional microphones tend to exaggerate bass sounds of people 
speaking into them, and often turn percussive consonants like p's and 
b's into mini explosions. Fine mesh screens in front of the mikes 
(called pop filters) may be our primary defense against such phonetic 
fireworks, but our second line of defense is the equalizer or low cut 
filter. In short, cut down on the bass frequencies unless there's 
something useful happening there.
Next consider the musical bass frequencies you are recording; 
these are the legitimate low frequency sounds of various instruments. 
You may wish to record these frequencies, but not at their full volume 
to avoid saturating the tape or exceeding the ceiling on your digital 
recorders.
If you are recording or retransmitting a mix of sounds, you
again may wish to reduce the low frequencies. Low frequencies tend to 
"muddy up" the sound mix, reducing the clarity of the other 
instruments. If you have recorded an instrument with full fidelity 
and are now mixing the tracks together, you may at this time wish to 
reduce some of the low frequencies to keep the final sound from 
becoming too weighty, ponderous, or muddled.
Low frequencies are a bit unfriendly to reverbs (described 
shortly). Again, they confuse the sound space, trampling on the 
higher frequencies of other instruments.
At the other end of the spectrum are troublesome sizzles,
whistles, and hisses, often associated with the letter "s". Some 
sibilant speakers and some sensitive mikes conspire to emphasize "s" 
sounds. Screeching s's are about as appealing as fingernails on a 
chalkboard, so a little high frequency trimming may help. There are 
also specialized devices called "de-essers" that, like audio 
antibodies, seek out and vanquish just that one consonant alone. 
Equalizers go upstream of most other audio processors. In other 
words, once your mike signal is preamplified to mixer level or line 
level, you pass it through the equalizer before sending the signal to 
reverbs, delays, and other gadgets. This is because those other 
devices are more likely to respond well to "tailored" sound, sound 
they can manipulate easily without dealing with troublesome chaff such 
as booming bass and sibilant highs. The entire signal is fed through 
the equalizer, treated, and output to the next device, unlike the 
signal path applied to reverbs and delays. For reverbs and delays, 
part of the sound passes untouched to its destination while another 
part of the sound is sidetracked and passed through the reverb or 
delay device, and recombined with the original sound. We'll hear more 
about this shortly.
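The two signal paths just described can be sketched as follows. The
function names inline_chain and send_return and the wet_level
parameter are hypothetical, chosen only to show the difference in
routing. Each "processor" or "effect" is simply a function that takes
a list of samples and returns a processed list.

```python
def inline_chain(samples, processors):
    """Equalizer-style routing: the whole signal passes through each box."""
    for process in processors:
        samples = process(samples)
    return samples

def send_return(samples, effect, wet_level):
    """Reverb/delay-style routing: the original passes untouched, and a
    processed copy is added back to it at wet_level."""
    wet = effect(samples)
    return [dry + wet_level * w for dry, w in zip(samples, wet)]
```

In the first function nothing of the original survives unprocessed.
In the second, the original sound is always present and wet_level
decides how much of the effect is mixed in beside it.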
Variable gain amplifier -
Variable gain amplifiers (VGAs) modify the dynamic range, the 
ratio between loudness and softness of sounds. Examples of VGAs are 
noise gates, expanders, limiters, and compressors. 
A variable gain amplifier is like a genie with his hand on your 
volume control. As the genie hears the sound coming into the VGA, he 
raises or lowers the volume of the output sound nearly 
instantaneously, and only for a moment. Depending upon the settings, 
the genie could lower the volume for a millisecond to take the edge 
off a certain percussive sound. Working just the opposite, the genie 
could raise the volume for just a millisecond, increasing the punch of 
the percussive sound. In the first case, the VGA can tame the 
hardness of a drumbeat. Adjusted to an extreme, the VGA can turn the 
tap-tap of a snare drum into the chug-chug of a steam engine. In the 
second case, the VGA accentuates the drumbeat bringing it to the 
forefront of your audio making it more pronounced.
Adjusting the VGA another way, the genie would ride audio like
any audio technician, lowering the volume slightly whenever it peaked 
above a certain level, and raising it slowly when incoming audio 
volumes seemed a bit low. This is what we have grown to know as 
automatic volume control. And just like the real audio person, the 
circuit can be "fooled". If you were to speak very softly for a 
minute or so, the automatic volume controls would rise bringing your 
voice to the right volume. The increased amplification would also 
increase the strength of ambient sounds in the room such as echoes, 
hum, hiss, shuffling of paper, whatever. Then if you suddenly 
sneezed, the loud sound would knock the volume control down instantly 
(but not soon enough to save the listeners from a blast from their 
speakers), and now the volume would be so low that your voice would be 
barely audible. Slowly the automatic control would allow the volume 
to creep back up to normal.
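The sneeze scenario can be played out with a rough Python sketch of an
automatic volume control. It works on a list of loudness readings
(one per short slice of time) rather than on real audio, and the
names ride_gain, target, rise, and max_gain are assumptions made for
the illustration.

```python
def ride_gain(levels, target=1.0, rise=1.05, max_gain=8.0):
    """Return the output level heard for each incoming level reading."""
    gain = 1.0
    heard_levels = []
    for level in levels:
        heard = level * gain          # what the listener hears right now
        heard_levels.append(heard)
        if heard > target:
            # Too loud: knock the gain down at once (too late for this slice).
            gain = target / level
        else:
            # Quiet: let the gain creep upward, up to a ceiling.
            gain = min(max_gain, gain * rise)
            if level > 0:
                gain = min(gain, target / level)   # do not overshoot
    return heard_levels

# A minute of soft speech, one sneeze, then soft speech again.
levels = [0.1] * 100 + [2.0] + [0.1] * 10
heard = ride_gain(levels)
```

In this run the soft voice is slowly brought up toward normal, the
sneeze goes out at many times the target level before the gain drops,
and the speech right after the sneeze is quieter than it was before
the control ever started helping.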
Automatic gain controls are handy when you don't have enough 
hands. You simply switch them on and then go ahead to do other 
things. They don't do too bad a job; most home VCRs have automatic 
gain controls in them to adjust sound volume. A little more subtle, 
however, is the limiter. The limiter is a VGA that affects only high 
volumes. Its only job is to keep the surprise sneezes from blowing 
everyone's ears out. It also keeps an occasional verbal outburst from 
distorting badly. It doesn't crank down the entire volume control 
leaving you with semi-silence after an outburst; it just cuts off the 
peaks leaving the valleys and mid-mountains untouched and natural.
Noise gates work at the other end of the spectrum. When things 
are quiet, they turn the volume down more. This is great for reducing 
hiss and unwanted background sounds. When music is playing and people 
are speaking continuously, you can't hear these weak noises. When the 
music or narrator is silent, however, these sounds become audible to 
your ears. By turning down the volume during the silent moments, you 
don't miss any of your wanted sound, but the noise is reduced during 
the silent passages so that it cannot be heard during that special 
time when your ears are most sensitive to noise. Noise gates have to 
work quickly to turn themselves back up the instant there is 
legitimate sound. Perhaps you have heard a noise gate that was over 
used. Someone would be talking and between every word there would be 
an uncanny silence, and as they spoke, you would hear the whoosh of 
some other background sound mixed with their words.
In short, noise gates make quiet sounds disappear. They also
can accelerate the decay of reverberation changing how you perceive 
the size of a room. Used in reverse, they can increase the attack of 
sound making drums and musical instruments sound more percussive. 
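A bare-bones noise gate might look like the Python sketch below. The
threshold, hold, and floor_gain parameters are assumed names for the
illustration. Real gates add smooth fades so the opening and closing
are less abrupt.

```python
def noise_gate(samples, threshold=0.05, hold=200, floor_gain=0.0):
    """Turn the volume down once the signal has stayed below threshold
    for `hold` samples in a row; reopen the instant it rises again."""
    out = []
    quiet_run = hold                 # start with the gate closed
    for x in samples:
        if abs(x) >= threshold:
            quiet_run = 0            # legitimate sound: open at once
        else:
            quiet_run += 1
        gate_open = quiet_run < hold
        out.append(x if gate_open else x * floor_gain)
    return out

# Hiss, then a loud passage, then hiss again.
signal = [0.01] * 300 + [0.5, -0.5] * 50 + [0.01] * 300
gated = noise_gate(signal)
```

The hiss before the loud passage is silenced, the loud passage passes
untouched, and the hiss after it lingers only for the hold time before
the gate shuts. Set the threshold too high or the hold too short and
you get the chopped, between-every-word silence described above.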
Compressors compress the dynamic range of sound. If a sound 
were to double in loudness, a compressor could make it increase in 
volume by only 50%. Weak sounds are hardly touched at all. Loud 
sounds and extremely loud sounds end up having nearly the same volume. 
Compressors make it possible for sounds that vary excessively in 
volume, like the voice of someone shouting, to remain within the range 
of what the mixer, recording media, radio transmitters, and 
loudspeakers will bear. A typical example of highly compressed sound 
is one of those annoying pitchmen on AM radio; he seems to be shouting 
in your ear, but the VU meters would show that his actual volume level 
rarely exceeds 1 dB.
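What a compressor does to loudness can be sketched as a simple rule
in Python. The threshold and ratio controls are assumptions here:
they are the usual knobs on a compressor but are not named in the
text, and the default values are arbitrary.

```python
def compress_db(level_db, threshold_db=-20.0, ratio=4.0):
    """Output level for a given input level, both in dB."""
    if level_db <= threshold_db:
        return level_db              # weak sounds are left alone
    # Above the threshold, every `ratio` dB of input yields 1 dB of output.
    return threshold_db + (level_db - threshold_db) / ratio
```

With the defaults, a sound 20 dB over the threshold comes out only
5 dB over it. Push the ratio very high (say 1000) and the same rule
behaves like the limiter described earlier: nothing gets meaningfully
past the threshold.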
Compressors are good for keeping control of exuberant narrators
and holding the lid on singers and musical instruments that range from 
a whisper to a blast. Compressors are also used on wireless 
microphones to assure that loud outbursts don't overdrive the 
transmitter circuits.
Back to narrators for a moment --- amateur narrators tend to 
speak in monotones --- very dull to listen to. Listen to some 
professional announcers and you'll notice how they constantly vary the 
pitch and volume of their speech to keep the message exciting. This 
kind of speech pattern pushes VU meters all over the scale, requiring 
constant attention from the audio person. Enter the compressor, 
squeezing the whispers and the screams into a 4 dB range. Now the 
announcer sounds excited and animated, but the audio levels are tame. 
Incidentally, some announcers can mimic animated conversation and keep 
within the 4 dB volume range without a compressor. But boy do they 
sound weird when they do this at the dinner table!
If compressors are like bondage for sound, expanders are the 
opposite, letting sounds leap to life. Like compressors, expanders 
leave a weak sound alone, but they take a loud sound and make 
it louder, adding more emphasis to it. Expanders often work in 
cooperation with compressors; compressed sound is a little unnatural, 
but is the price we pay for squeezing widely varying volumes through 
recording devices and transmitters that have limited range. Once we 
are past the recording devices and transmitters, and the sound is 
ready to be amplified and fed to a loudspeaker, an expander can 
decompress a sound returning it to its full dynamic range.
Describing this another way, say you had some sounds that ranged 
between 0 and 100 dB in signal strength. Say that your recorder 
distorts when it hears anything over 60 dB. You don't want to just 
turn down the volume of your recording, that would hurt the weak 
sounds too much. Instead, you use a compressor which hardly touches 
the low volume sounds between 0 and 40 dB. The incoming sounds 
between 40 and 80 dB get squashed down to fit in the range between 40 
and 50 dB at the compressor's output. The incoming sounds between 80 
and 100 dB get squashed into the range between 50 and 55 dB. Thus, 
sounds that ranged from 0 to 100 dB going into the compressor came out 
between 0 and 55 dB, well within the range of a digitizer, recorder, 
wireless microphone transmitter, or whatever. We could stop there or 
we could attempt to restore the sound to its original dynamic range 
once we have passed our equipment bottlenecks. With an expander, the 
incoming barely-compressed sounds between 0 and 40 dB would remain 
untouched. The louder sounds between 40 and 50 dB would be amplified 
and come out in the range between 40 and 80 dB. The heavily 
compressed sounds between 50 and 55 dB would be expanded to the range 
80 to 100 dB, thus reconstituting the full dynamic range of the 
original sounds.
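The arithmetic of this example can be checked with a short Python
sketch. The breakpoints are the ones given above. The function names
and the straight-line behavior between breakpoints are assumptions
made for the illustration.

```python
# (input dB, output dB) breakpoints from the example in the text.
COMPRESS_POINTS = [(0, 0), (40, 40), (80, 50), (100, 55)]

def _map(level, points):
    """Straight-line interpolation between neighboring breakpoints."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= level <= x1:
            return y0 + (level - x0) * (y1 - y0) / (x1 - x0)
    raise ValueError("level is outside the range of this example")

def compress(level_db):
    return _map(level_db, COMPRESS_POINTS)

def expand(level_db):
    # The expander simply runs the same breakpoints backwards.
    inverse = [(y, x) for x, y in COMPRESS_POINTS]
    return _map(level_db, inverse)
```

Here compress(60) gives 45, compress(100) gives 55, and expand(45)
returns the original 60, so a trip through the compressor and then
the expander lands each sound back where it started.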
Like equalizers, variable gain amplifiers are usually connected 
early in the sound chain, before signals are sent to reverbs, echo, 
delay, and other processors. Also like equalizers, VGAs have all of 
the signal pass through them; they don't sample a bit of the signal, 
change it, and then add it back to the sound stream. 
Delay -
A delay is a repeat of the original sound just milliseconds 
after the original sound. It generally adds thickness and richness to 
a sound, and can also be used to artificially construct a mental image 
of the room where the sound was generated. Imagine for a moment that 
a person were speaking to you while standing on a mile tall pedestal. 
Presume it's a calm day and you are up there with him. Ninety percent 
of the vibrations from his voice will go other places than towards 
your ear. The small amount of sound that strikes your ear will seem 
thin and strange. Anechoic (echo-free) rooms and "dead" (sound-absorbing) 
audio studios also sound this way. Music and voice both sound 
unnatural because, in the real world, we always have a floor and walls 
around us reflecting delayed strains of the original sound back to our 
ears.
Now imagine someone with a tall crane hoisting a wall and
positioning the wall behind the person speaking. If the wall is 
merely a foot behind the speaker, you will hear his original words 
plus a weaker delayed repeat of those words about 2 milliseconds later. 
His voice will sound stronger and more realistic, but you won't quite 
know why. Two milliseconds is such a quick delay that your ear cannot 
tell that it's hearing the sound twice. If a wall is moved slowly 
away from the speaker, the delay becomes greater. Between 5 and 15 
feet, the wall will continue to reinforce the person's voice making 
him sound natural and normal, but the sound will change character as 
the wall increases its distance. When the wall is 15 feet away from 
the speaker, the sound is delayed about 30 milliseconds and your ear 
begins to perceive it as a discernible delay or a repeat. Moving the 
wall further eventually turns the delay into your classic 
echo...classic echo.
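The link between wall distance and delay time is simple arithmetic,
sketched here in Python. It assumes sound travels about 1130 feet per
second in room-temperature air and that the reflected sound makes a
round trip, out to the wall and back. The function name is
hypothetical.

```python
SPEED_OF_SOUND_FT_PER_S = 1130.0   # approximate, room-temperature air

def reflection_delay_ms(wall_distance_ft):
    """Delay of the reflection relative to the direct sound, in ms."""
    extra_path_ft = 2.0 * wall_distance_ft      # to the wall and back
    return 1000.0 * extra_path_ft / SPEED_OF_SOUND_FT_PER_S
```

Under these assumptions a wall 15 feet back gives a delay between 26
and 27 ms, in line with the rounded figure of about 30 ms quoted
above.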
When a delay time is less than 5 ms, some of the delay's 
electrical vibrations cancel some of the original vibrations. Certain 
frequencies will be nullified while others remain (a condition called 
comb filtering). The result is a hollow, through-a-pipe sound, a bit 
like the voice of Darth Vader. Somewhere between the zone of 
cancellation and the zone of distinct delays is the desired 
reinforcement zone where the delays are constructive and thicken the 
sound. These are the delay amounts you will find most useful for 
normal audio sweetening.
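Which frequencies get nullified depends on the delay time. As a rough
Python sketch, assuming the delayed copy is as strong as the
original, the cancelled frequencies fall at odd multiples of one half
divided by the delay time. The function name and the max_hz limit are
assumptions.

```python
def comb_null_frequencies(delay_ms, max_hz=20000.0):
    """Frequencies cancelled when a sound is mixed with an equally
    strong copy of itself delayed by delay_ms."""
    delay_s = delay_ms / 1000.0
    nulls = []
    k = 0
    while True:
        f = (2 * k + 1) / (2.0 * delay_s)
        if f > max_hz:
            break
        nulls.append(f)
        k += 1
    return nulls
```

A 1 ms delay puts nulls at 500, 1500, 2500 Hz and so on, a few wide
gaps right in the voice range, which is why it sounds hollow. Longer
delays pack the nulls much closer together, and the ear no longer
hears them as a change in tone.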
In the real world, we generally have more than one wall, so 
there are several sound reflections picked up by the microphone and 
recorded. By combining several delays together, one can recreate the 
perception that a room of nearly any size exists. If the delays are 
all short, a voice can sound as though it were coming from within a 
vehicle, a hallway, or a small room. If the short delays are strong 
(i.e., nearly as loud as the original sound), the sound will appear to 
be coming from a bathroom. Increase the delay time and the room 
becomes larger. Strengthen the delay volume and the walls become 
"harder" like in a gym or parking garage.
How can you use this in the video world? Say you made a 
recording of a person speaking in a room. If while editing you need 
to dub in new lines, you could bring that person back to that room and 
record them, thus maintaining the "room sound". If the room isn't 
available and you bring that person back to the studio or have someone 
else record the person in another town, naturally the character of the 
sound will be different. You would try to use the same kind of 
microphone at the same distance from the person in order to match as 
many variables as possible, but often the subtle difference in room 
reverberation will make your inserted audio sound glaringly different 
from the rest of the recording. Here is where you can run the signal 
through a delay (or several) and attempt to sonically recreate the 
missing walls.
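Running a signal through several delays at once, one per imaginary
wall, can be sketched in Python as follows. The function name and the
(delay in ms, level) pairs are assumptions made for the illustration.
Matching a real room would still take trial and error by ear.

```python
def multi_tap_delay(samples, taps, sample_rate):
    """Mix the original sound with delayed copies of itself.
    taps is a list of (delay_ms, level) pairs, one per 'wall'."""
    longest = max(int(sample_rate * ms / 1000.0) for ms, _ in taps)
    out = list(samples) + [0.0] * longest       # room for the tail
    for ms, level in taps:
        offset = int(sample_rate * ms / 1000.0)
        for i, x in enumerate(samples):
            out[i + offset] += level * x
    return out
```

Short delays with high levels suggest a small, hard room. Longer
delays push the walls outward, as described earlier.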
For a more creative example, we are always trying to make sets 
seem like the real thing on the TV screen. Sound carries some of the 
subconscious message to the listener, so if you can make your tiny set 
sound like a big room, or the metal interior of a submarine, you can 
transfer the audience to a place that never existed. Some inexpensive 
cardboard and paint and a few delayed sounds can become a convincing 
spaceship or foxhole without the cost or cramped quarters.
When working with music, delays are useful for creating the 
impression that there is more than one instrument. This trick is a 
flimsy one, however. In the real world no two instruments play 
exactly the same note at the same frequency in the same phase all the 
time. One voice cannot be simply duplicated into a chorus of singers.
In the real world, some voices will vary above and below pitch
slightly. To more closely approximate the real world, there's a 
button found on delays and music synthesizers called chorus. The 
chorus is a swept delay which is a combination of one short fixed 
delay and one changing delay. The changing delay is like having a 
wall coming toward you, and then moving away from you. Electronically 
the swept delay is being adjusted from a small amount of delay to a 
larger amount and then back to the smaller amount. As the delay 
changes, the pitch of the delayed sound drifts a little above and 
below the original pitch. Other 
adjustments on your processor allow you to vary the modulation (rate 
of sweep) of the swept delay creating a slow waaahhhhaa sound or maybe 
a quick fluttering wowowowo sound. If you listened to it, it would 
sound a little like vibrato in a singer's voice, or the sound of a 
train whistle as it goes by you. To make the chorus sound even more 
natural, the sweep rate is varied, perhaps making a woowowoooowowwo 
sound. The random or fake-random sweep is similar to what you hear 
when two voices are singing the same note.
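A swept delay of this kind can be sketched in Python. This is a
minimal version with a steady sweep, not the randomized sweep of a
fancier chorus, and the parameter names (base_ms, depth_ms, rate_hz,
mix) are assumptions rather than the labels on any real unit.

```python
import math

def chorus(samples, sample_rate, base_ms=20.0, depth_ms=5.0,
           rate_hz=0.5, mix=0.5):
    """Add a copy of the sound whose delay sweeps slowly back and forth."""
    out = []
    for n, dry in enumerate(samples):
        # The delay swings between base_ms - depth_ms and base_ms + depth_ms.
        sweep = math.sin(2 * math.pi * rate_hz * n / sample_rate)
        delay = (base_ms + depth_ms * sweep) * sample_rate / 1000.0
        pos = n - delay                # where the delayed copy comes from
        if pos < 0:
            wet = 0.0                  # nothing has arrived yet
        else:
            i = int(pos)
            frac = pos - i
            nxt = samples[i + 1] if i + 1 < len(samples) else samples[i]
            # Blend neighboring samples so the sweep moves smoothly.
            wet = (1 - frac) * samples[i] + frac * nxt
        out.append(dry + mix * wet)
    return out
```

Raising rate_hz moves from the slow "waaahhhhaa" toward the quick
flutter, and depth_ms sets how far the imaginary wall travels.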
Reverberation -
Reverberation is the reflection of sound in a defined space. It 
is the sound you hear when singing in the bathroom, in a stairwell, or 
in a cavern. The word is used interchangeably with echo, which is 
technically something different. Echo means decaying repeats of a 
sound....sound...sound. Echo is what you hear when you scream "hello" 
to a canyon wall. Echo makes an interesting sound effect, but will 
turn a voice into cacophony. Guitarists can use echo to turn one 
pluck of the string into many, adding complexity to the music. For 
the most part, echo is simply used as an effect, and reverberation is 
the preferred flavoring for sound. Incidentally, many mixers have 
controls on them called echo send. This is a mixer circuit that takes 
the microphone's preamplified signal, and sends it out of the mixer 
for further processing (by an echo, delay, reverb, or whatever), then 
takes the result back into the mixer and recombines it with the 
original sound. You would vary the amount of the two signals with a 
knob on the "echo send" part of the mixer.
Reverberation used to be made with mechanical devices such as 
metal plates and steel springs. Audio signals would be transduced 
into physical vibrations causing the spring to bounce around or the 
plate to vibrate. At the other end of the spring or on a parallel 
plate would be another transducer that changed the mechanical 
vibrations into an audio signal again. Today most reverbs are digital 
audio devices which sample a sound, convert it into numbers, 
manipulate (perhaps repeating) the numbers, then convert the data back 
to sound. Reverbs are often souped-up delays that feed their own 
delayed sounds back into themselves in complex manners. Where a delay 
represents one repeat of a sound (like from that wall behind the guy 
on the pedestal), reverb is many repeats blended together (as you'd 
expect from a real multi-walled room).
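The idea of a delay feeding its own output back into itself can be
sketched in Python. This is a toy in the spirit of classic
comb-filter reverb designs, not the algorithm of any commercial unit.
The function names, delay lengths, and feedback amount are
assumptions.

```python
def feedback_delay(samples, delay_samples, feedback=0.6, tail=0):
    """One delay line that feeds its output back into its own input."""
    padded = list(samples) + [0.0] * tail       # leave room for the decay
    out = []
    for n, x in enumerate(padded):
        echoed = out[n - delay_samples] if n >= delay_samples else 0.0
        out.append(x + feedback * echoed)
    return out

def simple_reverb(samples, delays=(149, 211, 263), feedback=0.6, tail=2000):
    """Blend several feedback delays of different lengths together."""
    lines = [feedback_delay(samples, d, feedback, tail) for d in delays]
    return [sum(vals) / len(lines) for vals in zip(*lines)]
```

One feedback_delay alone gives evenly spaced, fading repeats, which
is an echo. Blending several with unrelated delay lengths smears the
repeats together, which starts to sound like a room.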
Reverberation creates a room ambiance, much like delay does. 
You can think of delay as a couple walls (usually the closest and most 
important ones) and think of reverb as the entire room. There would 
be a direct reflection from the nearest wall and perhaps the floor, 
followed by weaker more numerous reflections from the rear wall of the 
room, ceiling, and other places. Reverberation forms mental space so 
that you can tell whether something was recorded in a living room or a 
gymnasium. Reverb adds natural depth and excitement to music and 
plays an important role in gluing together independent sounds that 
have been mixed together from separate recordings. You could feed 
many microphones into a mixer or have many tracks in an audio 
recording, and each of them may sound singular and independent even 
though they are mixed together. By using reverberation, the sounds 
blend more naturally.
Reverberation devices may have a control called diffusion which 
determines whether the sound reflections are highly defined or mixed 
in a fuzzy way. In the physical world, parallel walls would give low 
diffusion and high definition to a sound because the delayed 
reflections come back to your ears fairly intact. Non-parallel walls 
give high diffusion; there are no audible repeats of the original 
sound, just the amorphous ring of the sound in the background. 
Depth is another knob you may find on a reverb and it controls 
the perception of where you are sitting relative to the speaker or 
musical instrument being played. Listening to a trumpet from the 
front row of an auditorium sounds different than listening from the 
back row. In the front row, you hear an immediate, strong, original 
sound followed by almost instantaneous direct reflections from the 
floor and backdrop, followed by a weak reverberation of the sound from 
the back wall or ceiling. If you sit in the back row, the reflections 
hit you at the same time as the original sound, and the sound is weak 
relative to those reflections. In short, the music "sounds" far away. 
By turning up the depth control on a reverb unit, and weakening the 
original sound that is mixed back with the reverb once the two have 
passed through your mixer, you can create the illusion of a distant 
voice or musician.
Faking things to sound real -
In real life, we are surrounded by natural reverberations. The 
high frequency reverberations decay first leaving just mid and low 
tones. Although this is natural, it doesn't sound good. When strong, 
low frequencies are fed into a reverb along with high frequencies, the 
low sounds cloud the mixture, sounding muddy. For better results, 
audio technicians generally run the sound through an equalizer first, 
cutting the low frequencies, then send the result to the reverb. This 
is called an equalized reverb and it sounds brighter than reverb with 
the full sound spectrum. In fact reverb settings sometimes use the 
words dark and bright to describe the tonal character of the reverb.
All reverbs allow you to set the decay time, the length of time
it takes for the reverb sound to trail off. Fancier reverbs have a 
low and high frequency decay adjustment to again give the high 
frequencies an edge over their low frequency brothers. You might 
adjust the low frequency decay to be quick and the high frequency 
decays to be slow. Again, this makes up for how low frequencies muddy 
up the sound of a reverb.
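For a reverb built from a delay that feeds back into itself, decay
time follows from two numbers: the delay length and how much of each
repeat is fed back. The Python sketch below assumes the common
convention of measuring decay as the time to fall by 60 dB. The
function name and that 60 dB figure are assumptions, not something
the controls on your reverb will necessarily state.

```python
import math

def decay_time_s(delay_ms, feedback, drop_db=60.0):
    """Seconds for the repeats to fade by drop_db, given the delay
    between repeats and the fraction fed back each time (0 to 1)."""
    loss_per_repeat_db = -20.0 * math.log10(feedback)
    repeats = drop_db / loss_per_repeat_db
    return repeats * delay_ms / 1000.0
```

With a 30 ms delay and half the signal fed back each time, the tail
lasts about three tenths of a second. Feed back more, or lengthen the
delay, and the decay stretches out.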
Reverb, like salt, should be used in moderation when flavoring 
music. If you have a voice and an instrument, for instance, one of 
them should be recorded dry (without reverb). This will help one 
sound stand out from the other. If you have a lot of production in 
your sound track (i.e., many instruments or many things going on), less 
reverberation is better; too much muddies up the sound. If you have 
sparser production, such as a single voice or only a few instruments 
(e.g., a single saxophone, singer, or flute) more reverb may give the 
desired dramatic effect. Slow songs allow you to use a long decay 
time in your reverb while fast songs need a fast decay so that the 
reverb doesn't get in the way of the next note. For the highest 
impact, you may decide to have a small section of music with no reverb 
at all. This builds excitement through comparison and avoids 
monotony.
Equalizers, VGAs, delays and reverbs are just a few of the 
little black boxes you will find in the hands of audio technicians. 
Videographers can also use these gadgets to season their sound and 
create a sonic space that transports the listener into the world that 
you have created. Like lighting, camera angles, and the use of color, 
the sonic space you contrive sends a subconscious message that draws 
your viewers into the program and captures their minds.
NOTE:
First Light Video Publishing (800-777-1576) markets five 
excellent videotapes, the "Shaping Your Sound Series" ($329), hosted 
by engineer and producer Tom Lubin. The tapes contain graphic 
animations, live music examples, and clear demonstrations of recording 
techniques and equipment operation. Two of the tapes specialize in 
reverb, delay, equalizers, and gates.