• Beginning Audio Programming

  • Completed 2003

This project was a book review I completed for Premier Press. The full text of the article is reproduced on the remainder of this project page. Feel free to email me with any comments or questions!

Book Review

Joe Bertolami, Review of Beginning Game Audio Programming, by Mason McCuskey, Premier Press 2003

Computer audio programming has gained enormous momentum in recent years with the arrival of professional grade hardware at consumer prices. As a result, a surging opportunity has replaced the once desolate and antiquated field of audio software development.

Thanks to the audio community, modern sound cards can handle high quality audio and perform complex operations in real-time. Games now exhibit high quality 3D surround sound with dynamic effects such as reverb, obstruction, and reflection. As the game industry awakens to the power of audio to enhance the realism of a gaming experience, the rare audio programmer has become priceless.

Beginning Game Audio Programming provides students with the ability to implement a modern sound engine. It meticulously guides the reader through basic sound concepts, common API usage, and sound engine design. Along the way it covers a myriad of audio topics, such as DirectPlay Voice, audio scripting, and visualization, to familiarize the novice with the diverse facets of the field. This work is the ultimate tool for any aspiring sound programmer.

The book is organized into two sections, "Audio Engine Basics" and "Advanced Audio Engine Functionality." The first section teaches the fundamentals of DirectAudio, various audio formats, and a basic design for an audio engine. Although this section contains new concepts and information, the author avoids confusion through concise explanation. The goal of this section is to make the reader comfortable writing basic sound software in DirectX that can perform the common operations expected of a sound engine, such as mixing and volume control.
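To give a concrete sense of what mixing and volume control involve, here is a minimal sketch in Python. It is not taken from the book and does not use DirectX; the function names `db_to_gain` and `mix` are hypothetical, and the sketch assumes 16-bit PCM samples held in plain lists at a shared sample rate.

```python
def db_to_gain(decibels):
    """Convert a volume change in decibels to a linear amplitude multiplier."""
    return 10 ** (decibels / 20)


def mix(tracks, gains, peak=32767):
    """Mix several 16-bit PCM tracks (lists of ints) into one track.

    Each track is scaled by its linear gain, the scaled samples are summed,
    and the sum is clamped to the 16-bit range so it cannot wrap around.
    """
    length = max((len(track) for track in tracks), default=0)
    mixed = []
    for i in range(length):
        total = 0.0
        for track, gain in zip(tracks, gains):
            if i < len(track):  # shorter tracks contribute silence past their end
                total += track[i] * gain
        mixed.append(int(max(-peak - 1, min(peak, round(total)))))
    return mixed
```

Clamping is the simplest way to handle overflow. It distorts loud passages, so a real engine would more likely lower the gains before the sum reaches the limit.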

The second section covers more advanced topics such as sound effects, audio scripts, DirectPlay, OpenAL, and dynamic music. This portion of the book is dedicated to the practical application of a sound engine. The book describes specific technologies such as EAX and Dolby that build upon the principles of the previous chapters.

McCuskey's effort has much to commend it. From the outset, a solid history of sound programming and his general approach serve as an excellent motivator for the rest of the book. By the second chapter, the reader is well versed in the basics of sound, common sound formats, and the pre-existing sound engines available. One of the best aspects of the book is its casual and non-intimidating tone: "(DirectX is) complex because it's just so darn big. DirectMusic is one of those all-encompassing libraries. It plays wave files. It streams waves from disk. It plays MIDI. It plays dynamic music. Shoot, it even composes music. There are very few things DirectMusic can't do." The lecture-like tutorial format works well to convey complex concepts with a vividness enhanced by well-chosen examples. The relaxed pace keeps the novice programmer from becoming discouraged, as do the author's pithy maxims: "Big libraries make the difficult stuff easy, but the easy stuff difficult." The author recognizes the hurdles that novice programmers encounter with DirectX and overcomes them by focusing on the most crucial information and providing references for the details. The approach easily maintains the reader's interest throughout.

McCuskey cuts through relatively dense topics such as DirectMusic, DirectSound, and DirectAudio and more than adequately explores various internal objects such as segments, audio paths, bands, parts, performance objects, and buffers. The result is that the reader is given the vital information needed to get started programming a sound engine, but without being hampered by the extraneous details of this intricate library.

A useful teaching device employed by the author is the description of a sound engine developed seamlessly from one chapter to the next. He orchestrates the incremental buildup of the engine, starting with a brief introduction to the engine and its design, and then slowly adds features such as WAV, MP3, and MIDI loading and playback. This allows the reader to become comfortable with the various audio topics gradually, with each being followed by concrete examples and implementations. Complex features such as buffer management and 3D sound are gracefully integrated into the now familiar sound engine. By the end of the book, the reader has been progressively introduced to a complex audio engine, one they are completely comfortable with because they have been given sufficient time to digest and assimilate each new concept. It is beneficial when an author lets the reader marinate in a new idea for a bit before moving on and building on it.

This book also provides an excellent chapter on CD audio playback. McCuskey covers the MCI in proper detail and thoroughly explains the benefits and drawbacks of using this method for game audio.

In the second section of the book, McCuskey details more advanced topics of audio programming such as dynamic music, 3D sound, and audio visualization. The chapter on dynamic music covers a large number of concepts in great detail. The author provides insight into the theories behind dynamic music, as well as in-depth coverage of DirectX support for this type of audio. The single most important piece of advice from the author is a warning against overusing audio technologies in a game. "A word of caution: use effects sparingly. Like lens flares and graphic violence, effects work best when applied strategically and for a specific purpose. They are often applied with great subtlety. A common mistake is to look at all these cool new effects and start blanketing your game's audio with them. Resist the temptation to do this. Instead, apply effects with restraint and precision, and your game's audio will truly sound great." This advice is crucial to implementing a professional sound engine that adequately and appropriately balances the various facets of sound technology.

The chapters on 3D sound teach the reader how to implement this powerful new technology in DirectAudio and OpenAL. The author clearly and eloquently covers the basic principles of listeners, buffers, sources, and audio roll-off.
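As an illustration of audio roll-off, the sketch below computes how a source's gain falls with its distance from the listener. It is my own minimal example, not code from the book. It assumes a clamped inverse-distance model of the kind OpenAL offers, and the function and parameter names are hypothetical.

```python
def inverse_distance_gain(distance, reference=1.0, rolloff=1.0,
                          max_distance=float("inf")):
    """Return the linear gain of a source at `distance` from the listener.

    Assumed model (clamped inverse distance): the gain is 1.0 at the
    reference distance and falls off as the source moves away. Distances
    are clamped to [reference, max_distance] before the formula is applied.
    """
    d = min(max(distance, reference), max_distance)
    return reference / (reference + rolloff * (d - reference))
```

With the default parameters the gain halves each time the distance doubles. A larger `rolloff` value makes sounds fade more quickly.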

Another important facet of audio programming, audio visualization, is explained in the final chapter of the book. In this chapter, critical audio concepts such as fast and discrete Fourier transforms, Doppler shifts, and spectrum analysis are explained. This chapter offers the reader a taste of some of the intermediate level topics now possible using a sound engine.
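For readers unfamiliar with spectrum analysis, the following sketch shows the idea behind a visualizer's frequency bars. It is a hedged illustration rather than the book's implementation: it evaluates the discrete Fourier transform directly from its definition, which takes time proportional to the square of the block size. A fast Fourier transform produces the same values far more quickly. The function names are hypothetical.

```python
import cmath


def dft(samples):
    """Discrete Fourier transform computed directly from the definition."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def magnitude_spectrum(samples):
    """Return the normalized magnitude of each bin from 0 up to n // 2.

    For real-valued input the upper half of the transform mirrors the
    lower half, so only the first n // 2 + 1 bins are kept.
    """
    n = len(samples)
    return [abs(value) / n for value in dft(samples)[: n // 2 + 1]]
```

Feeding it one block of samples at a time and drawing each bin's magnitude as a bar height gives a basic spectrum display.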

I recommend this book for any developer seeking an introductory course in audio programming. It provides a solid foundation of sound principles, API usage, file formats, effects, and sound engine design that are crucial to the field of game audio development.

This book carefully avoids becoming overly complex or didactic, offering descriptive explanations of the core concepts of audio programming. This refreshing outlook provides a student with a solid foundation from which to take large successive steps into the audio world.

This book is truly an excellent resource for anyone wishing to learn more about the blooming field of computer audio programming.