Learning Resources

GameSoundCon: A bootcamp to learn game audio skills

Interactive Audio Tutorials

Game Development Tutorials: no programming knowledge required!

Slidecasts and slideshares

Podcasts (also contains resources referenced in podcasts)

Music Note to MIDI conversion chart

Data Sheets: Simple explanations and definitions of common terms for those with no knowledge of sound synthesis or computer game sound technology.

  • Sheet 1: An introduction to sound waves: Covers sound waves (frequency, amplitude, wavelength), wave forms (square, sawtooth, sine, triangle wave and noise), and programmable sound generators.
  • Sheet 2: An introduction to sampling: Covers bit, bit depth, sample rate, digital-to-analog converters (DACs), pulse code modulation (PCM), ADPCM, and pulse width modulation (PWM).
  • Sheet 3: An introduction to synthesis: Covers additive, subtractive, FM, wavetable/linear arithmetic and granular synthesis as they apply to games audio.
  • Sheet 4: An introduction to MIDI and MOD: Covers MIDI, GM, GS, and MOD format.
  • Sheets 5, 6: Game System Specs: 8-bit, 16-bit: Just the specs for common consoles and home computer systems.
  • Sheet 7: Functions of games audio: Covers functions of games audio in some detail. See also below.

Glossary of Game Audio (by Karen Collins). Elements taken from my book chapter, “An Introduction to the Participatory and Non-Linear Aspects of Video Games Audio” (see research for details and citation information). Students: Please cite correctly!

Dynamic Audio is any audio which is designed to be changeable, encompassing both interactive and adaptive audio. Dynamic audio, therefore, is sound which reacts to changes in the gameplay environment and/or in response to a user.

  • Adaptive Audio occurs in the game environment, reacting to gameplay, rather than responding directly to the user. As Todd Fay indicates, “in many ways, it is like interactive audio in that it responds to a particular event. The difference is that instead of responding to feedback from the listener/player, the audio changes according to changes occurring within the game or playback environment” (Fay et al. 2004, 6). An example is Super Mario Bros. (Nintendo 1985), where the music plays at a steady tempo until the time begins to run out, at which point the tempo doubles. [Fay, Todd M.; Selfon, Scott & Fay, Todor J. 2004: DirectX 9 Audio Exposed: Interactive Audio Development. Wordware Publishing, Texas.]
  • Interactive Audio refers to sound events occurring in reaction to gameplay, which can respond to the player directly. For instance, if a player presses a button, the character on screen swings their sword and makes a “swooshing” noise. Pressing the button again will cause a recurrence of this sound. The “swoosh” is an interactive sound effect.
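
The difference between the two kinds of dynamic audio can be sketched in code. The Python sketch below is an illustration only, not how any actual game implements its audio: the class `AudioDirector`, its method names, the base tempo, and the 100-second threshold are all hypothetical choices made for this example. The adaptive path derives the tempo from game state (the level timer), while the interactive path plays a sound in direct response to player input.

```python
from dataclasses import dataclass, field


@dataclass
class AudioDirector:
    """Toy model of dynamic audio; every name and value here is hypothetical."""
    base_tempo_bpm: float = 100.0
    hurry_threshold_s: float = 100.0  # assumed point at which time "begins to run out"
    played: list = field(default_factory=list)

    def tempo_for(self, time_left_s: float) -> float:
        # Adaptive audio: driven by game state (the timer), not by player input.
        if time_left_s <= self.hurry_threshold_s:
            return self.base_tempo_bpm * 2
        return self.base_tempo_bpm

    def on_button_press(self, button: str) -> None:
        # Interactive audio: each press directly triggers the sound again.
        if button == "attack":
            self.played.append("swoosh")


director = AudioDirector()
print(director.tempo_for(250.0))    # plenty of time left: steady tempo
print(director.tempo_for(99.0))     # time running out: doubled tempo
director.on_button_press("attack")
director.on_button_press("attack")  # pressing again repeats the sound
print(director.played)
```

Note that the player never calls `tempo_for` directly: the tempo change follows from the timer, which is what makes it adaptive rather than interactive.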

Diegetic sounds (“source music” or “real sounds”) are sounds that occur in the character's space:

  • In non-dynamic diegetic audio, the sound event occurs in the character’s space, but the character has no direct participation in it. These sounds of course occur in cut-scenes, but also take place in gameplay. For instance, in the underground hideout in Grim Fandango, Eva (a member of the resistance) is fiddling with a radio trying to tune in a particular station. Manny (the player’s character) has no contact with the radio: Its sound is diegetic, but non-dynamic.
  • Adaptive diegetic audio occurs in the character’s space and reacts to changes in the game rather than to the player directly. In Legend of Zelda: Ocarina of Time, at dawn we hear a rooster crow, and in the “day” sequences of Hyrule Field, we hear pleasant bird sounds. When the game’s timer changes to night-time, we hear a wolf howl, crickets chirp, and various crows cawing. These sounds are diegetic and adaptive.
  • Interactive diegetic sounds occur in the character’s space, and the player’s character can directly interact with them. The player instigates the audio cue, but does not necessarily affect the sound of the event once the cue is triggered. In Grim Fandango, there is a scene in the Calavera Café in which grease-monkey Glottis is playing a piano in the bar. If the player gives Glottis a VIP pass to the local racetracks, Glottis leaves the piano open. If the player then chooses, the main character Manny can sit down at the piano and play, triggering a pre-selected cue. More commonly, interactive diegetic sounds are sound effects, for instance, the sound Link’s sword makes when cutting (in Zelda: Ocarina), or the footsteps of characters in games.

Non-diegetic sound ("background" music and sound effects):

  • Adaptive non-diegetic sounds are sound events occurring in reaction to gameplay, but which are unaffected by the player’s direct movements, and are outside the diegesis. The music in Zelda: Ocarina of Time fades out at dusk and stops altogether during the night. At dawn, a quick “dawn theme” is played, followed by a return to the area’s main theme music. The player cannot re-trigger these events (except by waiting for another day to pass).
  • Interactive non-diegetic sounds, in contrast, are sound events occurring in reaction to gameplay, which can react to the player directly, but which are also outside of the diegesis. In Zelda: Ocarina, the music changes in reaction to the player approaching an enemy. If the player backs off, the music returns to the original cue. If the player manages to find the trigger point in the game, it is possible to hear both cues at the same time in the midst of a cross-fade. The player, then, controls the event cue, and can repeatedly trigger the cue, by, in this case, running back and forth over the trigger area.
  • Non-dynamic linear sounds and music: audio found most frequently in the introductory movies or cut-scenes in games. In these cases, the player cannot interrupt the music (short of resetting or turning off the game). In the introduction to Zelda: Ocarina of Time, for instance, a short dream sequence movie is played, explaining the plot. If the player does not start the game (leading to further cut-scenes), the entire introduction sequence loops. Similar plot advancement movies are spliced into Grim Fandango. At key points in the game, a pre-set cut-scene movie loads, leading us to the next stage in the plot. For instance, Manny meets with Salvador, the revolutionary, in his underground hideout to conspire to expose the inequities of the D.O.D. When Manny gives Salvador the moulded impression of his teeth (necessary for access to the building), a cut-scene ends that stage of the game (El Marrow) and leads us to the next location (the Petrified Forest). The music during this intermission cut-scene begins with the theme for the hideout, and then changes to that of the new location without the player’s input: It is, in other words, linear, non-dynamic, non-diegetic music.
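
The cross-fade mentioned in the interactive non-diegetic example above can be sketched as a pair of gains computed from the player's distance to an enemy. This is a minimal sketch under stated assumptions, not the method used in Zelda: Ocarina of Time: the function name `crossfade_gains`, the inner and outer radii, and the equal-power curve are all hypothetical choices for illustration.

```python
import math


def crossfade_gains(distance: float, inner: float = 5.0, outer: float = 15.0):
    """Return (area_gain, battle_gain) for a hypothetical distance-based cross-fade.

    Beyond `outer` only the area theme is audible; within `inner` only the
    battle cue is audible; in between, both play at once (the trigger area).
    """
    if outer <= inner:
        raise ValueError("outer radius must exceed inner radius")
    # Normalise distance to 0.0 (at or inside inner) .. 1.0 (at or beyond outer).
    t = min(1.0, max(0.0, (distance - inner) / (outer - inner)))
    # Equal-power curve: the squared gains always sum to 1.
    area_gain = math.sin(t * math.pi / 2)
    battle_gain = math.cos(t * math.pi / 2)
    return area_gain, battle_gain


# Far from the enemy, inside the trigger area, and right next to the enemy.
for d in (20.0, 10.0, 2.0):
    area, battle = crossfade_gains(d)
    print(f"distance={d:5.1f}  area={area:.2f}  battle={battle:.2f}")
```

In this sketch, a player running back and forth across the band between the two radii repeatedly passes through the midpoint, where both cues are audible together.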

Kinetic gestural interaction can occur in both diegetic and non-diegetic sound, in which the player (as well as the character, typically) bodily participates with the sound on screen. At its simplest level, a joystick or controller could be argued to be kinetically interactive in the sense that a player can, for instance, play an ocarina in Legend of Zelda: Ocarina of Time by selecting notes through pushing buttons on a controller; but more significantly, here I refer to when a player may physically, gesturally mimic the action of a character, dancer, musician, etc. in order to trigger the sound event. In other words, the player must physically play a drum in Donkey Konga (Namco 2003), or play a guitar in Guitar Hero (RedOctane 2005), for instance. These types of games have typically required the purchase of additional equipment beyond the traditional joystick/controller that is included with the game’s platform, although this will change with the release of Nintendo’s Wii controller in 2006, which will make kinetic gestural interaction with sound much more common. With the Wii controller, in the latest Zelda game, The Legend of Zelda: Twilight Princess (Nintendo 2006), the player must literally swing the controller to elicit a sword movement in the game, resulting in the sword swooshing sound.