The next 10 pages begin!
For those who were uncertain, my publishing schedule is 10 pages per quarter.
It coincides with my teaching schedule.
Many of you know that I keep myself WAY too busy. This is a problem for me. I have too many things I want to do, and by trying to do them all I sometimes feel that none of them get finished.
Add to that my amazing hand/wrist issues of the last 3 months, plus the fact that my daughter is 2 years old and I’ve been sick for the better part of the calendar year, and you know it’s been tough to finish things.
This having been said, I have two projects that should be out soon (by July):
1) Tim Price’s book, Big in Japan, for which I made the cover and produced 29 interior illustrations (I will add one to this post). Big in Japan is a YA novel about a band called “Vinyl Crush” that you may not have heard of because they’re big in Japan. They also fight giant Japanese monsters in a big robot called “the Duke”.
2) My wife’s first disc, “The Circle”. I played bass on a few tracks, mandolin on a couple, and produced and arranged all the vocal tunes on it. We should have it finished in a couple of weeks.
Hopefully, I’ll get back on track with my web comic (Still trying to finish 40 pages this year), the film/animation stuff I’m working on with Ben Thompson (we should complete at least one more short this year, although we have bigger works in development), and I have a couple of experimental stories I think will be available early next year.
Meanwhile, this year I’ll be producing a “Big in Japan” CD with Tim Price, a series of tutorial videos with Tom Knight for “Independent Media Production” (my instructional site/iBook in development) and a couple other things I can’t talk about yet.
SOUND is the consequence of changing air pressure over time. A more technical definition:
Sound is mechanical energy in the form of pressure variances in an elastic medium. These pressure variances propagate as waves from a vibrating source.
Changes in air pressure (air being a propagating medium) can be represented by a WAVEFORM, which is a graphic representation of a sound. In reality, sound waves propagate through the air in LONGITUDINAL WAVES (and not TRANSVERSE WAVES):
Although sound waves through air are indeed longitudinal, it is more practical to represent them graphically as TRANSVERSE waves, as in this simple sinusoid (sine wave):
Sound travels through dry air at approximately 1100 ft. per second (approx. 760 miles per hour). See Speed of Sound through Air.
See Acoustics Animations.
See SOUND in Barry Truax’s Handbook for Acoustic Ecology. Sound consists of BOTH physical or ACOUSTIC aspects AND psychological or PERCEPTUAL aspects.
See Basic Acoustics for Electronic Musicians (St. Olaf College).
In representing a sound graphically on paper or digitally on a computer, it is important to understand the various characteristics of a WAVEFORM. See Parts of a Wave to understand: peaks, troughs, amplitude, cycles per second (cps), Hertz (Hz), positive amplitude, negative amplitude, wavelength.
ACOUSTICS is the science and study of sound.
See ACOUSTICS in Barry Truax’s Handbook for Acoustic Ecology. The study of ACOUSTICS is the study of the PHYSICS OF SOUND.
See the Acoustical Society of America – Career Opportunities in Acoustics
See Acoustics Animations.
The curving line in this diagram (Fig. 1) represents the displacement of the medium through which the WAVE is PROPAGATING, for example, the air. This displacement corresponds to the energy in the wave and, consequently, to how loud it appears to be. The energy of the WAVE is called the AMPLITUDE, and this is related to the PERCEIVED LOUDNESS.
AMPLITUDE can change over time (the sound can get louder or softer):
There are many ways of measuring AMPLITUDE; since it relates to the size of the pressure variations in the air, it can be measured in units of pressure. More often we talk about decibels (dB), which measure amplitude on a logarithmic scale relative to a standard sound. [A DECIBEL is 1/10th of a BEL, where one BEL (named after Alexander Graham Bell) represents the difference in sound level between two intensities, where one is 10 times greater than the other.] The dB scale is useful since it maps directly to the way that humans perceive loudness. However, PERCEIVED LOUDNESS is more complicated than just measuring amplitude.
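To make the logarithm concrete, here is a minimal Python sketch (the function names are my own; the 10-to-1 intensity ratio per bel and the factor of 20 for pressure/amplitude are the standard decibel conventions):

```python
import math

# An intensity ratio of 10:1 equals 1 bel, i.e. 10 decibels.
def intensity_db(intensity, reference):
    """Level difference in dB between two sound intensities."""
    return 10 * math.log10(intensity / reference)

# Amplitude (pressure) is squared to get intensity, hence the factor of 20.
def amplitude_db(amplitude, reference):
    """Level difference in dB between two sound pressures/amplitudes."""
    return 20 * math.log10(amplitude / reference)

print(intensity_db(10, 1))   # 10.0 dB -- one bel
print(amplitude_db(0.5, 1))  # about -6.0 dB -- halving the amplitude
```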
Try this experiment: Using SYD, set up a simple sine wave patch with a frequency of 1000 Hz and an amplitude of 50% (.5). Synthesize the patch and then play it. Now change the frequency to 100 Hz but leave the amplitude the same. Most people would agree that the patch played at 100 Hz sounds ‘softer’ than the patch at 1000 Hz even though the amplitudes are the same. Now play the patch again at 1000 Hz. Change the frequency to 3000 Hz but leave the amplitude the same. Most people would agree that the patch played at 3000 Hz sounds ‘louder’ than the patch at 1000 Hz even though the amplitudes are the same.
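If you don’t have SYD handy, here is a rough Python stand-in for the experiment (a sketch only: the file names, NumPy, and the standard-library wave module are my own choices, not part of SYD):

```python
import wave
import numpy as np

RATE = 44100  # samples per second

def write_sine(filename, freq, amp=0.5, seconds=2.0):
    """Write a mono 16-bit WAV file containing a sine wave."""
    t = np.arange(int(RATE * seconds)) / RATE
    samples = (amp * 32767 * np.sin(2 * np.pi * freq * t)).astype(np.int16)
    with wave.open(filename, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)   # 16-bit samples
        f.setframerate(RATE)
        f.writeframes(samples.tobytes())

# Same amplitude, three frequencies -- listen for the loudness difference.
for freq in (100, 1000, 3000):
    write_sine(f"sine_{freq}hz.wav", freq)
```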
In the realm of PSYCHO-ACOUSTICS, perceived loudness is a function of both frequency AND amplitude (Fletcher-Munson Curves: the Equal Loudness Contour). At low intensities tones have the same loudness when they are equally detectable, whereas at high intensities they match in loudness when they have the same intensity. See Hearing. See Intensity. See Loudness. See Mastering FAQ; How loud is it?
See AMPLITUDE in Barry Truax’s Handbook for Acoustic Ecology.
See Amplitude (St. Olaf’s College) — An excellent and easy to understand discussion of decibels.
See DECIBEL in Barry Truax’s Handbook for Acoustic Ecology.
See an excellent article on “What is a DECIBEL?”
See CYCLE in Barry Truax’s Handbook for Acoustic Ecology.
PHASE refers to the point in a CYCLE where a wave begins. For example, here is the same WAVE starting 1/4 of the way through a CYCLE. The PHASE is .25 or 90 degrees.
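A minimal sketch in plain Python: a phase offset is just an extra term inside the sine function (the 440 Hz frequency is an arbitrary choice for illustration):

```python
import math

def sine_sample(t, freq=440.0, phase=0.25):
    """One sample of a sine wave; phase is a fraction of a cycle (0.25 = 90 degrees)."""
    return math.sin(2 * math.pi * (freq * t + phase))

print(sine_sample(0.0))  # starts 1/4 cycle in: sin(90 degrees) = 1.0
```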
See PHASE in Barry Truax’s Handbook for Acoustic Ecology.
A PITCHED SOUND is one that has a repetitive CYCLE or PERIOD. PITCH is determined by the number of CYCLES per second and is called FREQUENCY. The term HERTZ (Hz) refers to the physicist Heinrich Hertz and is used to represent the number of cycles per second. The RANGE OF AUDIBLE FREQUENCY FOR THE HUMAN EAR is between approximately 16 Hz and 20,000 Hz (20 kHz). The note ‘A’ above Middle C (C4) has 440 cycles per second (Hz). How many Hertz (Hz) is the following sound? Would this be a low sound or a high sound? Why?
NOISE has no repetitive cycle or period.
Sound with a frequency above the range of human hearing (20,000 Hz) is called ULTRASONIC. Sound below the range of human hearing (16 Hz) is called INFRASONIC.
See FREQUENCY in Barry Truax’s Handbook for Acoustic Ecology.
CONSTRUCTIVE and DESTRUCTIVE INTERFERENCE.
The concepts of PHASE and FREQUENCY are particularly useful for describing the interactions which occur when two sounds are combined. If two sounds have the same FREQUENCY and have the same PHASE they are said to be “in PHASE.” The resultant AMPLITUDE is a simple addition of the two respective amplitudes (they are added together) and produce an overall LOUDER sound. This phenomenon is called CONSTRUCTIVE INTERFERENCE.
If a sound begins when another sound of the same frequency is half a CYCLE ahead (180 degrees), then the crests of one wave will coincide with the troughs of the other. In this case, the two sounds are described as being 180 degrees “out of PHASE.” The AMPLITUDE of one sound will be subtracted from the AMPLITUDE of the other sound, resulting in a phenomenon called DESTRUCTIVE INTERFERENCE. If the amplitudes of the two sounds are equal, then the two sounds will cancel each other out totally and there will be no sound (zero amplitude).
Where the wave peaks and troughs coincide, their amplitudes are summed: Constructive interference
Where the wave peaks and troughs are in opposition, their amplitudes are canceled: Destructive interference
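Here is a minimal NumPy sketch of both cases (the 440 Hz tone and the one-second length are arbitrary choices for illustration):

```python
import numpy as np

RATE = 44100
t = np.arange(RATE) / RATE          # one second of time points
wave_a = np.sin(2 * np.pi * 440 * t)

in_phase = wave_a + np.sin(2 * np.pi * 440 * t)              # same phase
out_of_phase = wave_a + np.sin(2 * np.pi * 440 * t + np.pi)  # 180 degrees apart

print(np.max(np.abs(in_phase)))     # ~2.0: amplitudes add (constructive)
print(np.max(np.abs(out_of_phase))) # ~0.0: amplitudes cancel (destructive)
```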
See INTERFERENCE in Barry Truax’s Handbook for Acoustic Ecology.
All naturally produced (non-electronic) pitched sounds (a clearly defined musical note) have a SPECTRUM consisting of the NATURAL HARMONIC SERIES. This is “an ordered set of frequencies which are integer multiples of a FUNDAMENTAL” frequency. For example, if the FUNDAMENTAL is 100 Hz, then the 2nd harmonic (the 1st “overtone”) is 200 Hz, the 3rd harmonic is 300 Hz, etc.; if the FUNDAMENTAL is 440 Hz, then the 2nd harmonic (the 1st “overtone”) is 880 Hz, the 3rd harmonic is 1320 Hz, etc.
In addition, in naturally produced sounds the amplitudes of the various harmonics in the SPECTRUM are inversely proportional to their position in the series. For example, if the amplitude of the FUNDAMENTAL is 1/1 (100% or 1.0), then the amplitude of the 2nd harmonic is 1/2 (50% or .5), the amplitude of the 3rd harmonic is 1/3 (33.33% or .33), etc.
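A small Python sketch of both rules, using the 100 Hz fundamental from the example above (real instruments only approximate the idealized 1/n falloff):

```python
def harmonic_series(fundamental, count=6):
    """Frequencies and idealized 1/n amplitudes of the natural harmonic series."""
    return [(n, fundamental * n, 1.0 / n) for n in range(1, count + 1)]

for n, freq, amp in harmonic_series(100):
    print(f"harmonic {n}: {freq:.0f} Hz at amplitude {amp:.2f}")
# harmonic 1: 100 Hz at amplitude 1.00
# harmonic 2: 200 Hz at amplitude 0.50
# harmonic 3: 300 Hz at amplitude 0.33 ...
```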
See HARMONIC SERIES in Barry Truax’s Handbook for Acoustic Ecology.
The SPECTRUM of a sound is a graphic representation of the HARMONICS (see Harmonic Series above) or RESONANCE of the sound. All naturally produced (non-electronic) pitched sounds (a clearly defined musical note) have a SPECTRUM consisting of the NATURAL HARMONIC SERIES.
3-D Spectrum of a RECORDER showing the HARMONICS changing over time
See a great article at What is a Sound Spectrum?
See SPECTRUM in Barry Truax’s Handbook for Acoustic Ecology.
When two sounds with a slightly different frequency are combined, the listener will have the impression that there is a single sound, but its loudness (amplitude) will pulsate as the contributing sounds (waveforms) are alternately in phase and out of phase. For example, consider the combination of a sound with a frequency of 441 Hz and a sound with a frequency of 440 Hz:
Two sounds with slightly different frequencies:
The sounds begin together and then interfere constructively. After one-half second has elapsed, however, the first sound has vibrated 220.5 times, while the second sound has only vibrated 220 times. The two sounds are a half-cycle apart, or 180 degrees out of phase. At this point, they interfere destructively, and the combined loudness is diminished. By the end of one full second, however, both sounds have completed full cycles: the 441st cycle for the first sound, the 440th cycle for the second sound. The two sounds are in phase again, and their combined loudness is greater. This alternating pattern of constructive and destructive interference will continue as long as the two sounds are combined. Once each second the loudness will rise, and then it will diminish. If the difference between the two frequencies is 2 cycles per second, then the loudness will rise and diminish twice each second. This pulsation of loudness, produced by the combination of two sounds of nearly the same frequency, is called BEATING.
Combined result of two sounds with slightly different frequencies:
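Here is a minimal NumPy sketch of the 441 Hz / 440 Hz example; measuring the pulsation with short-window peaks is my own simplification:

```python
import numpy as np

RATE = 44100
t = np.arange(RATE * 2) / RATE                # two seconds
mix = np.sin(2 * np.pi * 440 * t) + np.sin(2 * np.pi * 441 * t)

# Peak level in each tenth-of-a-second window: it rises and falls once per second.
for i in range(0, len(mix), RATE // 10):
    window = mix[i:i + RATE // 10]
    print(f"{i / RATE:.1f}s  peak {np.max(np.abs(window)):.2f}")
```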
See BEATS in Barry Truax’s Handbook for Acoustic Ecology.
NOISE can be defined in two ways: (1) any unwanted sound (as in CLIPPING, QUANTIZATION NOISE, or ALIASING); (2) sound which has no discernible repetitive wave pattern or period and resembles the static sometimes heard on a radio or television. This second type of NOISE has RANDOM amplitude and frequency. Here is the waveform for “white noise”:
White Noise: random amplitude and frequency
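Generating this kind of noise digitally takes one line with NumPy’s random number generator (the one-second length and the uniform distribution are arbitrary choices):

```python
import numpy as np

RATE = 44100
# White noise: every sample is an independent random amplitude.
noise = np.random.uniform(-1.0, 1.0, RATE)  # one second of samples
print(noise[:5])  # no repeating cycle or period anywhere in here
```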
The 2nd type of NOISE might be referred to as “good noise” because it is essential in synthesizing certain types of sounds. For the purposes of this class, NOISE will generally refer to the 2nd type and not the 1st type.
Other types of NOISE are PINK NOISE and BROWN NOISE. See: Noise.
See NOISE in Barry Truax’s Handbook for Acoustic Ecology.
TIMBRE (pronounced “tam-ber”) is the tone quality of a pitched sound. For example, a piano has a different timbre than a flute. TIMBRE is determined by RESONANCE, which is the combination of frequencies at specific amplitudes within an individual cycle. For example, below are the waveforms of a piano and a flute:
The small changes in AMPLITUDE within the individual CYCLES of the WAVEFORM are indications of the different frequencies which color the sound and make it uniquely distinct. This complex array of FREQUENCIES along with their specific AMPLITUDES is called the HARMONIC SPECTRUM, or RESONANCE.
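As a sketch, two different “instruments” can be faked by summing the same harmonics with different amplitude weights (the weights below are invented for illustration, not measured from a real piano or flute):

```python
import numpy as np

RATE = 44100
t = np.arange(RATE) / RATE

def additive_tone(fundamental, weights):
    """Sum harmonics of a fundamental, one amplitude weight per harmonic."""
    return sum(w * np.sin(2 * np.pi * fundamental * n * t)
               for n, w in enumerate(weights, start=1))

bright = additive_tone(440, [1.0, 0.5, 0.33, 0.25])  # strong upper harmonics
mellow = additive_tone(440, [1.0, 0.1, 0.02])        # fundamental dominates
# Same pitch, roughly the same loudness -- different HARMONIC SPECTRUM, different timbre.
```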
See TIMBRE in Barry Truax’s Handbook for Acoustic Ecology.
Another important aspect of any sound is its ENVELOPE: that is, its ATTACK, SUSTAIN (also called STATIONARY SOUND), and DECAY. Attack is the point in time when the sound begins and is the single most important part of the sound for the ear in determining the individuality of the sound. For example, the attack of a piano waveform is the most important part of the sound for the ear to determine that the sound was actually a piano and not a flute. Sustain is that part of the sound waveform which maintains a more or less constant amplitude. Decay is that part of the sound waveform which decreases in amplitude because of a loss of energy.
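As a rough sketch, a simple attack/sustain/decay ENVELOPE can be imposed on any waveform by multiplying it by a slowly changing gain curve (the segment lengths here are arbitrary; real instrument envelopes are far more complicated):

```python
import numpy as np

RATE = 44100

def asd_envelope(attack, sustain, decay):
    """Gain curve: ramp up (attack), hold (sustain), ramp down (decay); times in seconds."""
    return np.concatenate([
        np.linspace(0.0, 1.0, int(RATE * attack)),  # ATTACK: silence up to full level
        np.ones(int(RATE * sustain)),               # SUSTAIN: roughly constant amplitude
        np.linspace(1.0, 0.0, int(RATE * decay)),   # DECAY: the energy dies away
    ])

env = asd_envelope(0.05, 0.5, 0.3)
t = np.arange(len(env)) / RATE
tone = np.sin(2 * np.pi * 440 * t) * env  # a 440 Hz tone shaped by the envelope
```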
See ENVELOPE in Barry Truax’s Handbook for Acoustic Ecology.
A digital representation of a sound is called a SAMPLE. Sampling a sound is a similar process to recording a movie using film (not video tape). With film, a camera takes 24 frames (pictures) per second, and this is enough information to fool the brain into seeing continuous and uninterrupted motion. With sound, the brain needs much more information. A good SAMPLING RATE is in excess of 8000 frames (samples) per second, that is, 8 kilohertz (8 kHz). A better sampling rate is 22 kHz. The best sampling rates are 44.1 kHz and above. 44,100 Hz (44.1 kHz) is the standard sampling rate for CD production. At this sampling rate the extremes of sound frequency, from the lowest pitches to the highest pitches which the ear is physically capable of hearing, can be accurately represented.
Another aspect of sampling is the SAMPLE RESOLUTION of the sample. That is, how accurately an individual sample is stored digitally, for example, 8 bit samples, 16 bit samples, 32 bit samples, etc. The higher the bit resolution, the more accurately the sample is represented digitally. Sixteen bit samples are generally considered high enough resolution to accurately represent an individual sound for the human ear.
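A quick sketch of what bit resolution buys you (the roughly-6-dB-per-bit figure follows from each extra bit doubling the number of available amplitude levels):

```python
import math

for bits in (8, 16, 32):
    levels = 2 ** bits                       # distinct amplitude values available
    dynamic_range = 20 * math.log10(levels)  # rough dynamic range in dB
    print(f"{bits}-bit: {levels} levels, ~{dynamic_range:.0f} dB dynamic range")
# 8-bit: 256 levels, ~48 dB; 16-bit: 65536 levels, ~96 dB; ...
```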
An important factor which enters into decisions regarding rate and resolution of sampling is hardware limitations. High sampling rates combined with high resolutions require large amounts of memory and/or disk storage. If hardware capabilities are limited, then certain tradeoffs will have to be considered when deciding on sample rate and resolution. When a sound is sampled by a computer or sampler (digitizer), the resulting waveform can be saved as a computer file called a SOUNDFILE. Some common soundfile formats are:
- AIFF (Audio Interchange File Format), a standard file format supported by applications on Macintosh and Windows computers;
- µlaw, an 8-bit sound encoding that offers better dynamic range than standard (linear) 8-bit coding;
- WAVE (.WAV), a standard sound format for the Windows platform;
- .au, an audio file format that is popular on Sun and NeXT computers, as well as on the Internet.
See DIGITAL RECORDING in Barry Truax’s Handbook for Acoustic Ecology.
See Digital Sounds and Sampling Rate (St. Olaf’s College)
The engineer Harry Nyquist proved conclusively that accurate reproduction of any sound, no matter how complicated, requires a sampling rate no higher than twice the highest frequency to be captured (the NYQUIST LIMIT). This means that high fidelity requires a sampling rate of only about 30 kHz, and even the best of human ears needs only 40 kHz. CDs sample at 44.1 kHz. For example, if a sampling rate of 11.025 kHz were used, then the highest frequency that could be accurately represented digitally would be only 5.5125 kHz. The highest note on a standard piano keyboard, C8, is about 4.186 kHz (4186 Hz), but MIDI can specify notes above that; the note G8, for example, is 6.272 kHz (6272 Hz). Consequently, this pitch, if sampled at only 11.025 kHz, would not be accurately represented and would actually have a different timbre (waveform). In order to sample a pitch of 6.272 kHz you would need a sampling rate of 6.272 x 2, or 12.544 kHz. SoundEdit 16 only supports three sampling rates on Macintosh hardware, so you would need to use the sampling rate of 22.050 kHz in order to accurately record (sample) the pitch G8.
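A quick sketch of the arithmetic (assuming, as above, that the three supported rates are 11.025, 22.05, and 44.1 kHz):

```python
def minimum_rate(highest_freq_hz):
    """Nyquist: the sampling rate must be at least twice the highest frequency present."""
    return 2 * highest_freq_hz

supported = [11025, 22050, 44100]  # the three classic Macintosh rates
needed = minimum_rate(6272)        # the pitch G8
print(needed)                      # 12544 Hz
print([r for r in supported if r >= needed])  # [22050, 44100]
```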
See Digital Sounds and Sampling Rate (St. Olaf’s College)
See Filter Basics: Anti-Aliasing for a discussion of the Nyquist Limit, Aliasing, etc.
See Consequences of Nyquist Theorem for Acoustic Signals Stored in Digital Format from Proceedings from Acoustic Week in Canada 1991.
Common Problems with Digitized Sound (unwanted DIGITAL NOISE):
Frequencies higher than one-half the sampling rate falsely appear as lower frequencies. This phenomenon is called ALIASING.
Sound with a frequency of 1500 Hz sampled at 1000 Hz: the sample points are not enough to accurately represent the waveform, and the resulting frequency is LOWER than the actual sampled frequency:
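Here is a rough NumPy demonstration of the 1500 Hz / 1000 Hz example (the small phase offset and the zero-crossing count are my own quick-and-dirty choices for showing the effect numerically):

```python
import numpy as np

true_freq = 1500.0  # Hz, the sound being sampled
rate = 1000.0       # Hz, the (far too low) sampling rate

# A small phase offset, since at exactly 1500 Hz / 1000 Hz every sample
# would otherwise happen to land precisely on a zero of the sine wave.
t = np.arange(0, 1, 1 / rate)
samples = np.sin(2 * np.pi * true_freq * t + 0.5)

# Estimate the stored waveform's frequency by counting zero crossings.
crossings = np.sum(np.abs(np.diff(np.sign(samples))) > 0)
print(crossings / 2)  # ~500: the 1500 Hz tone falsely appears as 500 Hz
```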
What is the highest frequency that could be sampled with a sampling rate of 11.025 kHz?
QUANTIZING (QUANTIZATION NOISE) occurs when a sound is digitized (sampled). The amplitude of the sample is restricted to only integer values within a limited range. For example, an 8-bit sample assigns an integer value between 0 – 255 to represent the amplitude of each of the samples. Quantization gives the reconstructed waveform a staircase shape, compared with the original sound’s smooth, continuous waveform shape. As the sampling resolution goes down, the amount of noise due to quantization increases. With 16-bit samples, the amount of quantization error is barely audible.
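A sketch of quantization in NumPy; quantizing a pure sine to 3 bits exaggerates the staircase so the error is obvious (real 8- or 16-bit audio works the same way, just with many more steps):

```python
import numpy as np

t = np.arange(0, 1, 1 / 44100)
smooth = np.sin(2 * np.pi * 440 * t)  # the original smooth, continuous-looking wave

bits = 3
levels = 2 ** bits                    # only 8 amplitude values allowed
quantized = np.round(smooth * (levels / 2 - 1)) / (levels / 2 - 1)

noise = smooth - quantized            # the quantization error itself
print(np.max(np.abs(noise)))          # clearly audible "staircase" noise at 3 bits
```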
CLIPPING occurs when the amplitude of a sample exceeds the quantization range. A clipped waveform appears to be cut off at the top and bottom, contains more sharp corners, and sounds “rougher” (is noisier) than the original waveform.
Original Waveform with high amplitude:
Clipped waveform resulting from the quantization range not being high enough to accurately represent the extremes of amplitude:
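And a sketch of clipping (the 1.5 amplitude against a representable range of ±1.0 is an arbitrary choice to force the effect):

```python
import numpy as np

t = np.arange(0, 1, 1 / 44100)
loud = 1.5 * np.sin(2 * np.pi * 440 * t)  # amplitude exceeds the representable range

clipped = np.clip(loud, -1.0, 1.0)        # tops and bottoms are simply cut off
print(np.mean(clipped == 1.0))            # fraction of samples stuck at the ceiling
```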
CLIPPING and QUANTIZATION NOISE can be reduced or eliminated by:
- reducing the input volume (no RED on the VU Meter).
- using a MIXER for multiple inputs and then adjusting the MIXER’s Master Volume level.
- reducing the output gain control (if available) when combining (mixing) individual SOUNDFILES.
See PEAK CLIPPING in Barry Truax’s Handbook for Acoustic Ecology.
Reverberation and Echo
Reverberation is a result of multiple REFLECTIONs. A SOUND WAVE in an enclosed or semi-enclosed environment will be broken up as it is bounced back and forth among the reflecting surfaces. Reverberation is, in effect, a multiplicity of ECHOes whose speed of repetition is too quick for them to be perceived as separate from one another. W.C. Sabine established the official period of reverberation as the time required by a sound in a space to decrease to one-millionth of its original strength (i.e. for its intensity level to change by -60 dB).
An echo is a repetition or a partial repetition of a sound due to REFLECTION. REVERBERATION is also reflected sound, but in this case, separate repetitions of the original sound are not distinguishable. For a repetition to be distinct from the original, it must occur at least 50 ms afterwards without being masked by either the original signal or other sounds. In practice, an echo is more likely to be audible after a 100 ms delay. See: PRECEDENCE EFFECT regarding echo suppression.
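As a sketch, a single digital echo is just a signal plus a delayed, attenuated copy of itself (the 100 ms delay follows the audibility figure above; the 0.5 gain is an arbitrary choice):

```python
import numpy as np

RATE = 44100
t = np.arange(RATE) / RATE
dry = np.sin(2 * np.pi * 440 * t) * np.exp(-5 * t)  # a short, decaying tone

delay = int(0.100 * RATE)          # 100 ms: long enough to hear as a distinct echo
wet = np.copy(dry)
wet[delay:] += 0.5 * dry[:-delay]  # add the delayed, quieter repetition
```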
See REVERBERATION in Barry Truax’s Handbook for Acoustic Ecology.
See ECHO in Barry Truax’s Handbook for Acoustic Ecology.
Be able to define the following terms:
Equal Loudness Contour
Range of human hearing for high and low pitched sounds
CD-quality sample rate
What is the lowest sampling rate which should be used to sample sounds with the following frequencies: 20,000 Hz; 16,500 Hz; 26,000 Hz?
Last month, while I was working on illustrations for “Big In Japan”, my wrist stopped working.
Believe it or not, there was a time when no one really knew what pitch was; they didn’t know scales or chords. Someone just made all this stuff up. Actually, pitch was defined about 500 BCE (give or take a hundred years), scales were defined about 500 CE, and chords (actually, triads, as we understand them today) about 1300 CE. We didn’t put it all together with standard performance practices, including rhythm, melody, accompaniment patterns, and all the things that make music work easily, until around 1600 CE. The reason we study Bach, or better yet, Palestrina, is because it is from these musicians that standards of practice were derived. If you’ve ever taken a high school music theory class and learned ‘figured bass’, the idea for such practices originated with people trying to solve the problem of how to communicate music quickly. Palestrina and Bach are examples of composers who solved a lot of performance problems in their day, so the rules about music executed in their day come from people studying their decisions. When I took those music theory courses in high school (and in college), these rules about music were presented to me by people who didn’t really understand them. I was taught, for example, that parallel perfect intervals were a “no-no” in music. That just isn’t true. Ask Claude Debussy or Eddie Van Halen, who both use parallel perfect fifths a lot. What is true is that parallel perfect intervals were a “no-no” for the Baroque style of music, and learning the rules of a style is very important. Learning the rules of a style helps one decide what questions to ask in order to execute another style.
Music Theory isn’t a bunch of rules that define what you can and can’t do with music. Music Theory is the math that explains how music was created and organized into the language it is today by the people who defined it. Saying one doesn’t need to know ‘theory’ to write music is like saying one doesn’t need to know English to write a book or short story. While it is possible to write prose without a solid grasp of the English language, the scope of what may be accomplished becomes limited. It’s difficult to write a sentence well if one doesn’t know which words are nouns or verbs, but simply relies on one’s ear to ‘hear’ when it sounds good.
The first music theoretician was actually Pythagoras, who went on to curse many 7th graders with his crazy ideas about how a² + b² = c². At least we think it was the same guy. Truth is that Pythagoras was a cult leader, maybe not on the same level as Charles Manson, but the Pythagoreans believed that everything in the world was made up of numbers. It was a popular trend at the time (500ish BCE) to attribute anything one did to one’s leader. So it is possible that it was actually one of Pythagoras’ followers who was the true first music theoretician. Here’s how it started:
Pythagoras was walking down the road near a blacksmith’s shop and he observed something that surprised him. Each of the blacksmiths produced a different musical note when they hit their anvils. “Why?” he wondered. After very careful observation, Pythagoras concluded that the various pitches must be the result of the difference in musculature between the blacksmiths. This observation tells me two things: 1) No matter how smart you are, you’re bound to be wrong at least occasionally, and 2) It was likely a female follower of Pythagoras who made this observation (just kidding). No, the differing notes were the result of the variety in hammer size between the blacksmiths. See, when the blacksmiths struck an anvil with their hammers, the hammer would resonate, creating a musical note.
Eventually, Pythagoras (or his minion) started to recognize the precise correlation between density or size in materials that created musical sounds, and further defined those musical sounds by examining harmonics.
What makes a sound musical? Well, this doesn’t work for everything (especially not a drum), but we generally say that the distinction between a noise and a musical sound is a repeating waveform.
Well, think about tossing a pebble in a lake. The water ripples with tiny waves outward from the spot where the pebble entered the water, right? It works the same way in the air when we make a noise. Noise is the result of moving air. If the waveform or ripples in the air repeat, it sounds like a musical note. The faster those ripples occur, the higher the note sounds; slower waves sound low. All of this was quantified in the 19th century by a guy named Heinrich Hertz, and as a result we measure all kinds of waves in units bearing his name, abbreviated “Hz”.
I know a lot of people in music who don’t think this is important, but it really is important for even the most basic audio production. Stick with me and I’ll prove it!
Now, back to Pythagoras! He was so mesmerized by this ability of hammers to produce musical sounds that he performed a series of experiments to explain and quantify what musical notes existed. Over time he defined the overtone system as the guide for pitch in the western world.
What the hell is an overtone system?
OK, imagine that some sounds are very simple and some sounds are very complex. What makes one sound simple and another sound complex, mathematically? The number of overtones apparent in that sound. For example, the cash register at the grocery store beeps using a sound known as a ‘sine wave’. Sine waves don’t exist in nature; they are too simple. On the other hand, a flute isn’t much more complex. What makes it more complex is that in addition to any note one plays on a flute, one can also hear a few more notes far above the ‘fundamental’ or main note produced by the instrument. It is the level or loudness of each of these ‘extra notes’ that is in part responsible for the tone of the flute, and is partially responsible for us being able to tell the difference between a note played on a flute and the same note played on a violin. Every instrument or voice produces harmonics in the same order; it is the severity, the loudness, of these harmonics that distinguishes one timbre from another. There are other factors (specifically: attack, decay, sustain, and release), but we’ll discuss them another day.
The overtones one can hear in any given natural sound are:
1) the fundamental (f)
2) one octave above the fundamental (2f)
3) an octave plus a perfect fifth above the fundamental (3f)
4) two octaves above the fundamental (4f)
And onward up.
Again, more on this later.
When Pythagoras defined these notes he stated that he could observe 12 distinct pitches. Today we define these pitches as:
A – A#/Bb – B – C – C#/Db – D – D#/Eb – E – F – F#/Gb – G – G#/Ab
Some see them as notes on a keyboard.
Some see them as frets on a guitar.
Some simply hear them as notes to sing.
How you visualize them or remember them is unimportant. That you know this order of notes without pausing is incredibly important for making music like the music you have heard for most of your life.
Now we know the 12 notes we must consider, and we know that these 12 notes repeat in different registers. In other words, a bass player might play the note ‘G’ very low, while a piccolo player may play the same note, ‘G’, very high. These notes are connected by name, but separated by register.
The low ‘B’ on a five-string bass guitar resonates with a fundamental frequency of about 31 Hz. The B an octave above it resonates at about 62 Hz. The next B up resonates at about 123 Hz, then again at 247, 494, and 988 (get the pattern yet?). The pitch is defined as the same because of its clearly definable relationship to pitches bearing the same name. We say that each of these notes is separated by an octave. In scientific pitch notation, the ‘B’ at 31 Hz is called ‘B0’ (pronounced “bee zero”). Then, of course, ‘B1’, ‘B2’, etc.
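A tiny sketch of the doubling pattern, starting from the exact equal-tempered value for that low B:

```python
LOW_B = 30.87  # Hz: B0, the low B string on a five-string bass

# Each octave doubles the frequency: B0, B1, B2, ...
for octave in range(6):
    print(f"B{octave}: {LOW_B * 2 ** octave:.2f} Hz")
# B0: 30.87 Hz, B1: 61.74 Hz, B2: 123.48 Hz, ...
```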