CSC108H Assignment 1

Due Date

Tuesday October 11, 10:00 pm


The purpose of this assignment is to give you practice writing your own Python code that uses variables, assignments, if-statements and functions in a fun problem domain: digital sound processing. To work locally, click here to download a complete zip archive of the assignment.


Sounds are waves of air pressure. When a sound is generated, a sound wave consisting of compressions (increases in pressure) and rarefactions (decreases in pressure) moves through the air. This is similar to what happens if you throw a stone into a pond: the water rises and falls in a repeating wave.

When a microphone records sound, it takes a measure of the air pressure and returns it as a value. These values are called samples and can be positive or negative, corresponding to increases or decreases in air pressure. Each time the air pressure is recorded, we are sampling the sound. Each sample records the sound at an instant in time; the faster we sample, the more accurate our representation of the sound. The sampling rate refers to how many times per second we sample the sound. Sampling rates of 11025 (low quality; e.g. for VoIP conversations), 22050, and 44100 (CD quality) samples per second are common; the higher the sampling rate, the better the sound quality.

For sounds recorded in mono, a sample is simply a positive or negative integer that represents the amount of compression in the air at the point the sample was taken. For sounds recorded in stereo (which we use in this assignment), a sample is actually made up of two integer values: one for the left speaker and one for the right.

The sound module (part of Pygraphics) contains functions for working with sound files. Using that module, you will first write a function to remove vocals from a music sound file (karaoke time!). Then, you'll write functions to add fade-in and fade-out to sounds, and pan a sound from left to right.

Some of the sound files in this assignment are modified versions of sounds from

Using the sound Module

When you install Pygraphics, you're actually installing two modules: media (for pictures) and sound (for sound). You can test that the sound module is working by typing import sound at the Python shell. If you receive an error, please make sure that you have installed Pygraphics.

During the 108 labs and lectures, you should have already encountered all the functions you need from the sound module. Here is a summary.

sound.load_sound(filename)      Returns a sound from file filename
sound.create_sound(length)      Returns a silent sound of length samples
sound.copy(snd)                 Returns a copy of sound snd
len(snd)                        Returns the number of samples in sound snd
sound.get_left(samp)
sound.get_right(samp)           Returns the left (or right) channel of Sample samp
sound.set_left(samp, value)
sound.set_right(samp, value)    Sets the left (or right) channel of Sample samp to value
sound.get_index(samp)           Returns the index of Sample samp
sound.get_sample(snd, index)    Returns the Sample found at index index in sound snd

You can also use the looping syntax for sample in snd to iterate over all samples in snd.

For this assignment, you are not allowed to use the crop function. Other than crop, all the functions in the sound module are available for your use. However, it's not necessary to go searching for other functions. You can write a perfect solution using only the functions in the table above!

If you want to save any of your sounds as wav files, you can do so using the sound.save_as function. However, make sure you don't use this function in the code that you hand in.

You may be more familiar with MP3 files than wav files. The major difference between the two is that wav files don't use any compression, whereas MP3 files use lossy compression. Lossy compression results in MP3 files that are poorer quality, but much smaller, than wav files.

What You Will Submit

Please put all of the code for this assignment (i.e. all three parts below) in a file called When you're finished, you will submit only your file.

Part 1 - Removing Vocals

Take a listen to this wav file with vocals. The function you write in this part will be able to take that file and produce a wav file with vocals removed. (You should be able to click these links in your browser to hear the sounds. If that doesn't work, save them to your computer and then open them with music software like QuickTime, iTunes, Winamp, Windows Media Player, etc. You'll also find it convenient to play sounds from within Python itself.)

def rem_vocals (snd):

The rem_vocals function takes a sound object snd as a parameter, and creates and returns a new sound with vocals removed using the algorithm described below. The new sound has the same number of samples as snd. (The original sound snd is not modified.)

The algorithm works like this. Any given sample in snd contains two integer values, one for the left channel and one for the right. Call these values left and right. For each sample in snd, compute (left - right) / 2.0, and use this value for both the left and right channels of the corresponding sample in the new sound you are creating.

Here's an example. Let's say that snd contains the following three samples, each composed of two values: (1010, 80), (1500, -4200), (-65, 28132). Your program will produce a new sound consisting of the following three samples: (465, 465), (2850, 2850), (-14098, -14098).

If you do the math, you'll notice that the values in the third sample should have both been the fractional number -14098.5; but, as we know, sample values must be integers. You therefore cannot set a sample value to -14098.5 (you'll get an error if you try), and must convert that number to an integer before storing it in a sound. Keep this in mind for all of the functions you write in this assignment.

Note: when dividing by 2, you must use 2.0 and not 2. If you divide by 2, Python uses integer division, converting the result to an integer and losing precision. You must not do any integer conversion until the very last step when you store the sample value in the sound.
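
The arithmetic in the example above can be checked without the sound module. The sketch below is only an illustration: it represents each sample as a plain (left, right) tuple, and the helper name remove_vocals_from_pairs is hypothetical; it is not part of Pygraphics or of the functions you must write.

```python
def remove_vocals_from_pairs(pairs):
    """Return new (left, right) pairs with (left - right) / 2.0 in both channels.

    pairs is a list of (left, right) integer tuples standing in for samples.
    This is an illustration only, not the sound-module API.
    """
    result = []
    for left, right in pairs:
        # Divide by 2.0 (a float) so no precision is lost before the last step.
        value = (left - right) / 2.0
        # Convert to an integer only at the very end; int() drops the fraction.
        value = int(value)
        result.append((value, value))
    return result


example = [(1010, 80), (1500, -4200), (-65, 28132)]
new_pairs = remove_vocals_from_pairs(example)
# new_pairs is [(465, 465), (2850, 2850), (-14098, -14098)]
```

The input list is left untouched, mirroring the requirement that the original sound is not modified.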

Why Does This Algorithm Work?

For the curious, a brief explanation of the vocal-removal algorithm is in order. As you noticed from the algorithm, we are simply subtracting one channel from the other (and then dividing by 2 to keep the volume from getting too loud). So why does subtracting the right channel from the left channel magically remove vocals?

When music is recorded, it is sometimes the case that vocals are recorded by a single microphone, and that single vocal track is used for the vocals in both channels. The other instruments in the song are recorded by multiple microphones, so that they sound different in both channels. Subtracting one channel from the other takes away everything that is "in common" between those two channels which, if we're lucky, means removing the vocals.
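
A small numeric sketch of this cancellation, using made-up values: suppose each channel at one instant is the sum of a shared vocal value and a channel-specific instrument value. All names and numbers below are hypothetical.

```python
# Hypothetical values for a single instant in time.
vocal = 1200          # identical in both channels (one microphone)
guitar_left = 300     # the instrument as heard in the left channel
guitar_right = -150   # the instrument as heard in the right channel

left = vocal + guitar_left
right = vocal + guitar_right

# The shared vocal term cancels in the subtraction; only the difference
# between the two instrument parts remains.
difference = (left - right) / 2.0
instrument_only = (guitar_left - guitar_right) / 2.0
```

Both difference and instrument_only come out the same, whatever value vocal has.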

Of course, things rarely work so well. Try your vocal remover on this badly-behaved wav file. Sure, the vocals are gone, but so is the body of the music! Apparently, some of the instruments were also recorded "centred", so that they are removed along with the vocals when channels are subtracted. When you're tired of that one, try this harmonized song. Can you hear the difference once you remove the vocals? Part of the harmony is gone!

Part 2 - Fade-in and Fade-out

As with Part 1, none of the functions in this part should modify an existing sound object. They should all create and return a new sound object.


def fade_in (snd, fade_length):

This function takes a sound object and an integer indicating the number of samples to which the fade-in will be applied. For example, if fade_length is 88200, the fade-in should not affect any sample numbered 88200 or higher. (The first sample in a sound is numbered 0.)

Before we discuss how to accomplish fade-in, let's get acquainted with some fading-in. Listen to this monotonous sound of water bubbling. The volume is stable throughout. Now, with the call fade_in (water, 88200) (where water is a sound object loaded with the water sound), we get water with a short fade-in. Notice how the water linearly fades in over the first two seconds, then remains at maximum volume throughout. (88200 corresponds to two seconds, because we're using sounds recorded at 44100 samples per second.) Finally, with the call fade_in (water, len(water)), we get water with a long fade-in. The fade-in is slowly and linearly applied over the entire duration of the sound, so that the maximum volume is reached only at the very last sample.
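
To pick a fade length for a given duration, multiply the duration in seconds by the sampling rate. The helper below is a hypothetical convenience, assuming the 44100 samples-per-second rate mentioned above.

```python
SAMPLE_RATE = 44100  # samples per second, as used by the sounds in this assignment


def seconds_to_samples(seconds, rate=SAMPLE_RATE):
    """Return the number of samples that span the given number of seconds."""
    return int(seconds * rate)


two_seconds = seconds_to_samples(2)    # 88200
ten_seconds = seconds_to_samples(10)   # 441000
```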

To apply a fade-in to a sound, we multiply successive samples by larger and larger fractional numbers between 0 and 1. Multiplying samples by 0 silences them, and multiplying by 1 (obviously) keeps them the same. Importantly, multiplying by a factor between 0 and 1 scales their volume by that factor.

Here's an example. Assume fade_length is 4, meaning that the fade-in is applied over the first four samples (samples numbered 0 to 3). Both channels of those samples should be multiplied by the following factors to generate the fade-in:

Sample Number    Multiply By...
0                0.0
1                0.25
2                0.5
3                0.75
>3               Do not modify the sample

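The factors in the table are consistent with multiplying sample number i by i / fade_length. The sketch below applies that formula to a plain list of integers standing in for one channel. The function name fade_in_values and the list representation are assumptions made for illustration, and it assumes fade_length is no larger than the list.

```python
def fade_in_values(values, fade_length):
    """Return a new list with a linear fade-in over the first fade_length items.

    Assumes fade_length <= len(values); values stands in for one channel.
    """
    result = list(values)  # work on a copy so the input is not modified
    for i in range(fade_length):
        factor = i / float(fade_length)      # 0.0, 0.25, 0.5, 0.75 for length 4
        result[i] = int(result[i] * factor)  # convert to int only when storing
    return result


original = [1000, 1000, 1000, 1000, 1000, 1000]
faded = fade_in_values(original, 4)
# faded is [0, 250, 500, 750, 1000, 1000]; original is unchanged
```
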

def fade_out (snd, fade_length):

This function again takes a sound object and an integer indicating the length of the fade. However, this time, the fade is a fade-out (from loud to quiet), and the fade-out begins fade_length samples from the end of the sound rather than from the beginning. For example, if fade_length is 88200 and the length of the sound is samp samples, the fade-out should only affect samples numbered samp-88200 up to samp-1.

Let's use a raining sound to demonstrate. As with the water bubbling above, the volume is stable throughout. Now, with the call fade_out (rain, 88200) (where rain is a sound object loaded with the rain sound), we get rain with a short fade-out. The first few seconds of the rain are as before. Then, two seconds before the end, the fade-out starts, with the sound progressing toward zero volume. The final sample of the sound has value 0.

The multiplicative factors for fade_out are the same as for fade_in, but are applied in the reverse order. For example, if fade_length were 4, the channels of the fourth-last sample would be multiplied by 0.75, the channels of the third-last sample would be multiplied by 0.5, the channels of the second-last sample would be multiplied by 0.25, and the channels of the final sample in the sound would be multiplied by 0.0.
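
As a sketch of the index arithmetic, the fade-out starts at position len - fade_length, and the k-th faded item (counting from 0) uses the factor (fade_length - 1 - k) / fade_length. The function name fade_out_values and the plain-list representation are hypothetical, and fade_length is assumed to be no larger than the list.

```python
def fade_out_values(values, fade_length):
    """Return a new list with a linear fade-out over the last fade_length items.

    Assumes fade_length <= len(values); values stands in for one channel.
    """
    result = list(values)  # copy, so the input list is left alone
    start = len(result) - fade_length  # index of the first faded item
    for k in range(fade_length):
        # Same factors as the fade-in, in reverse: 0.75, 0.5, 0.25, 0.0 for 4.
        factor = (fade_length - 1 - k) / float(fade_length)
        result[start + k] = int(result[start + k] * factor)
    return result


faded_out = fade_out_values([1000, 1000, 1000, 1000, 1000, 1000], 4)
# faded_out is [1000, 1000, 750, 500, 250, 0]
```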


def fade (snd, fade_length):

This one combines both fading-in and fading-out. It applies a fade-in of fade_length samples to the beginning of the sound, and applies a fade-out of fade_length samples to the end of the sound. Don't be concerned about what to do when the fades would overlap; don't do anything special to try to recognize or fix this.
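
One way to picture the combination, sketched on a plain list of integers: compute a single factor per position and convert to an integer once. The names fade_factor and fade_values are hypothetical. Multiplying the two factors where the fades overlap is only an assumption about what "nothing special" produces, not a stated requirement.

```python
def fade_factor(index, length, fade_length):
    """Return the combined fade-in/fade-out factor for one position."""
    factor = 1.0
    if index < fade_length:
        # Inside the fade-in region at the start.
        factor *= index / float(fade_length)
    if index >= length - fade_length:
        # Inside the fade-out region at the end.
        factor *= (length - 1 - index) / float(fade_length)
    return factor


def fade_values(values, fade_length):
    """Return a new list faded in at the start and out at the end."""
    length = len(values)
    return [int(values[i] * fade_factor(i, length, fade_length))
            for i in range(length)]


both = fade_values([1000] * 10, 4)
# both is [0, 250, 500, 750, 1000, 1000, 750, 500, 250, 0]
```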

Here's yet another sound file for you to try. This one has a particularly abrupt beginning and end, which your fade function should be able to nicely finesse. This is a large file and can take a minute or two to process on a slow computer; test with smaller files first.

Part 3 - Panning from Left to Right

As usual, the function in this part should not modify the sound object it is passed; it should create and return a new sound object.

def left_to_right (snd, pan_length):

This function takes a sound object and an integer indicating the number of samples to which the pan will be applied. For example, if pan_length is 88200, the pan should not affect any sample numbered 88200 or higher.

Let's listen to what panning sounds like. Here's an airplane sound. The entire sound is centred, and does not move in the stereo field as you listen. Now, with the call left_to_right (airplane, len(airplane)) (where airplane is a sound object loaded with the airplane sound), we get this airplane panning from left to right sound. The sound starts completely at the left, then slowly moves to the right, reaching the extreme right by the final sample.

Getting a sound to move from left to right like this requires a fade-out on the left channel and a fade-in on the right channel.

Here's an example. Assume pan_length is 4. The following table indicates the factors by which the channels of these samples should be multiplied:

Sample Number    Multiply Left Channel By...    Multiply Right Channel By...
0                0.75                           0.0
1                0.5                            0.25
2                0.25                           0.5
3                0.0                            0.75
>3               Do not modify the sample       Do not modify the sample

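The table is consistent with a left factor of (pan_length - 1 - i) / pan_length and a right factor of i / pan_length for sample number i. The sketch below applies those factors to plain (left, right) tuples. The names pan_factors and pan_pairs are hypothetical, and pan_length is assumed to be no larger than the list.

```python
def pan_factors(i, pan_length):
    """Return the (left, right) multipliers for sample number i."""
    left_factor = (pan_length - 1 - i) / float(pan_length)   # fades out
    right_factor = i / float(pan_length)                     # fades in
    return left_factor, right_factor


def pan_pairs(pairs, pan_length):
    """Return new (left, right) pairs panned over the first pan_length items."""
    result = list(pairs)  # copy; tuples themselves are immutable
    for i in range(pan_length):
        left, right = result[i]
        left_factor, right_factor = pan_factors(i, pan_length)
        # Each channel is scaled independently; nothing is copied across.
        result[i] = (int(left * left_factor), int(right * right_factor))
    return result


panned = pan_pairs([(1000, 1000)] * 5, 4)
# panned is [(750, 0), (500, 250), (250, 500), (0, 750), (1000, 1000)]
```
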
If you run left_to_right on only a prefix of a sound (i.e. you use a pan_length that is less than the length of snd), you'll get strange (though expected) results. For example, if you pan the first 441000 samples of love.wav, you'll hear it pan from left to right over the first ten seconds, then you'll hear a click followed by the remainder of the song played in the centre.

To understand how this function works, it might help to think of changing the volume using two volume controls: one for the left channel and one for the right. To make the sound seem like it's moving from left to right, you slowly lower the volume in the left ear and raise the volume in the right ear. There is no copying going on between the two channels. And for the record, this technique only works when corresponding samples of both channels are the same: experiment with this dog and lake sound to see what happens when channels contain different sounds.

No Input or Output!

You must not produce any output to the screen (with print), acquire any input from the keyboard (with raw_input), or create any wav files (with sound.save_as) in your file. You may include an if __name__ == ... section, but do not include any top-level code that runs when we import your file. We will call your functions to test them, and such top-level code will cause our tests to fail.
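
A minimal sketch of the usual Python idiom for such a guarded section: code under the guard runs only when the file is executed directly, never when it is imported. The function halve is a hypothetical stand-in for your own functions.

```python
def halve(value):
    """Return half of value as a float (hypothetical helper for illustration)."""
    return value / 2.0


if __name__ == '__main__':
    # This block runs only when the file is executed directly,
    # not when another module imports it.
    assert halve(930) == 465.0
```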


We are providing some tests that exercise the very basics of your code. Download the following file and place it in the same directory in which you are working on your assignment:

To run these tests, open that file in Wing and click Run. The final line of output tells you the number of tests that failed. Earlier output tells you exactly which of your functions failed; for example, the following output indicates a problem with your rem_vocals function:

FAIL: test_rem_vocals (__main__.TestCases)
test rem_vocals.
Traceback (most recent call last):
File "", line 136, in test_rem_vocals
self.assertEqual (student, sol)
AssertionError:  != 

If all tests pass, don't start dancing quite yet. The test code checks that you have declared your functions in the right places, returned the proper values, and done the right thing according to a single test case. We leave it up to you to do further, more comprehensive testing on your own. We will run each of your functions on our own tests, and it is up to you to make sure that all of our tests will pass.


These are the aspects of your work on which we will focus in the marking:

What to Hand In

Submit the following file:

Remember that spelling of filenames, including case, counts: your file must be named exactly as above.