Digital Audio Primer

It seems as though there are more digital music players on the market now than grains of sand on the beach. Flash memory continues to drop in price and the electronics necessary to make a portable digital music player are increasingly commoditized. The result is a flood of affordable digital music devices from brands both large and small. This year will see the release of a slew of new digital media adapters – devices that pull audio, video, and pictures off your PC and display them on your TV. Some will even be built into DVD players or TiVo-like video recorders. With all these new devices for playing digital music, it looks like the compact disc's days are numbered.

Even if we don't throw out our CDs, it's more likely than ever that we won't actually listen to them. Instead, we'll pull the songs off, storing them on our PC or portable music player as compressed digital audio files. As the years roll on, more and more music will be purchased online in digital form. Most of us already have a lot of digital audio in our lives (beyond simple CDs), but the amount of it is set to explode.

That's why this is the perfect time to go over digital audio terms and technologies. Most of us know a bit about the different formats and terms, but the world of digital audio is enormous and ever-evolving. A detailed explanation of all the technology would fill volumes. Allow us instead to present this 20-minute primer, a list of common and useful terms, their definitions, and a description of some of the more popular and interesting digital audio formats. There's a lot of ground to cover so let's jump right in.

When you come across two geeks talking about digital music, it can sound an awful lot like they're speaking another language (much like listening to baseball fans). There's a sea of terminology to wade through, but much of it is purely academic and not really useful to typical consumers. Here are a few of the more common terms you should understand to get the most out of your digital audio experience.

Codec: Any time digital audio is mentioned, this word gets thrown around. It's a shortening of compressor/decompressor – an algorithm used to compress data and then decompress it again. Some codecs are implemented in software, some in hardware, and some are limited in their functionality. A portable music player may have a "codec" that only decompresses data, for example. However, it's common in the digital media world to call any algorithm that deals with the compression and decompression of audio data a codec.

The process isn't necessarily symmetrical--it often takes longer to compress a digital music file than to decompress it for playback. Decompression is less CPU intensive, which is why tiny players with very low power processors are fine for playing back compressed music.

Compression Ratio: Simply put, this is the ratio between the size of the original uncompressed audio clip and its compressed version. If an audio clip is 20MB in its native uncompressed form and 1.8MB as a 128k MP3 file, then that file has a compression ratio of 11:1 (typical of 128k MP3 files).
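To make the arithmetic concrete, here is a minimal Python sketch of the calculation using the 20MB and 1.8MB figures from the example above. The function name is our own invention for illustration.

```python
def compression_ratio(original_size, compressed_size):
    """Return how many times smaller the compressed file is than the original.

    Both sizes must be in the same unit (megabytes here).
    """
    if compressed_size <= 0:
        raise ValueError("compressed size must be positive")
    return original_size / compressed_size


# The example from the text: a 20MB clip stored as a 1.8MB 128k MP3.
ratio = compression_ratio(20.0, 1.8)
print(f"{ratio:.1f}:1")  # a little over 11:1
```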

Lossy & Lossless: Digital sound compression comes in two types. Lossy compression actually removes some information in order to make the file easier to compress. It is by far the more popular of the two, because it allows for much smaller file sizes. Virtually all lossy compression schemes, whether for audio or video, work by a principle called "perceptual coding." This is the process of removing parts of the original data that the listener probably won't perceive anyway. The trick is to remove as little information as possible from the original audio sample and to make sure that what is removed is the hardest to hear (frequencies above the range of human hearing, for example).

Lossless compression is just what it sounds like – a way of compressing music into a file that, when played back, is absolutely identical to the original. Not just "sounds the same," but bit-for-bit identical. It was once only a viable option for professionals seeking to archive large volumes of audio, but now that large hard drives are cheap and drive-based portable players with lots of storage are abundant, lossless codecs are increasingly relevant to end users who want the best possible audio fidelity.

The following image shows the difference between lossy and lossless compression. We took a clip from the Fellowship of the Ring soundtrack and encoded it in two formats: Windows Media 9 Lossless and MP3 at 128 kilobits per second. The original clip was 10.3MB, which was cut down to 6MB with the lossless codec--not even a 2:1 reduction, but still a good space savings. A clip that is not quite so musically complex would compress further. The 128kbit MP3 cut the file down to 983 kilobytes, roughly an 11:1 compression ratio.

Then we took a look at a spectrum analysis of the resulting files. Note how lossless compression produces the exact same graph, while the MP3 file loses signal dramatically as frequency increases and drops off entirely just over 16kHz. Increasing the bit rate can improve this, but the only way to get an exact copy of the music is to use a lossless compression scheme.

Bit Rate: The bit rate of a digital file is how many bits it uses up in a given interval of time. (For audio files, it is almost always measured in "kilobits per second.") Typically, the higher the bit rate at which music is encoded, the better the sound is. A rate of 128 kilobits/sec is extremely popular in online music downloading – legal and otherwise – because it offers a good compromise between sound quality and download time. Here's a little chart to give you an idea of how different bit rates compare to regular CD audio.

Bit rate (kbps)  Seen in                                                            Compression ratio (vs. CD audio)  Three-minute songs per 650MB CD
1411             CD audio                                                           1:1                               20
192              High-quality MP3, WMA, or AAC                                      7.3:1                             154
128              Most downloaded music (legal or otherwise)                         11:1                              231
64               High-quality streaming music; small memory-based portable devices  22:1                              462
8                Streaming voice; Internet talk radio                               176:1                             3,697
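For the curious, the chart's numbers can be reproduced with a few lines of Python. This is only a sketch: the function names are ours, and the unit convention (1MB = 1,024 kilobytes, 1 kilobit = 1,000 bits) is an assumption we chose because it matches the chart's song counts, not something stated anywhere official.

```python
CD_BIT_RATE_KBPS = 1411       # uncompressed CD audio, from the chart
SONG_SECONDS = 3 * 60         # a three-minute song
CD_CAPACITY_KB = 650 * 1024   # assumption: 650MB disc, 1MB = 1,024 kilobytes


def ratio_vs_cd(bit_rate_kbps):
    """Compression ratio relative to uncompressed CD audio."""
    return CD_BIT_RATE_KBPS / bit_rate_kbps


def songs_per_cd(bit_rate_kbps):
    """Whole three-minute songs that fit on one 650MB data CD."""
    song_kilobits = bit_rate_kbps * SONG_SECONDS
    capacity_kilobits = CD_CAPACITY_KB * 8  # 8 bits per byte
    return capacity_kilobits // song_kilobits


for rate in (1411, 192, 128, 64, 8):
    print(f"{rate:>5} kbps  {ratio_vs_cd(rate):6.1f}:1  {songs_per_cd(rate):>5} songs")
```

Halving the bit rate doubles the number of songs, which is why the 64kbps row holds exactly twice as many as the 128kbps row.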

Bit rates lower than 128kb/sec are generally not suitable for CD or hard drive-based devices, but rates between 64 and 128 are great for devices with only 64 or 128MB of memory where you want to pack on more songs, or even for streaming internet radio if the listeners have broadband connections. Very low bit rates (below 64kbps) are almost totally unsuitable for music but compress voice fairly well and can be used for online voice chat or streaming talk/news radio. Of course, there's always the option of using a variable bit rate.

VBR & CBR: These common acronyms stand for Variable Bit Rate and Constant Bit Rate, respectively. Constant Bit Rate audio files are the most common--they use up the exact same amount of data from one moment to the next. If you have a 128kb/s CBR music file, it will use 128 kilobits to describe the audio in each second of the song, regardless of what sounds are playing that second or the complexity of the audio stream at the time. A Variable Bit Rate is a bit smarter. A VBR music file will use a lower bit rate in areas of the song that are simpler to compress accurately, and then higher bit rates in parts that require more bits to describe accurately. VBR audio files are often made with a certain quality in mind, rather than a certain bit rate, but it's almost always true that, all things being equal, a VBR sound file will sound better than a CBR file of the same size. The problem with VBR is that it's harder to stream over the Internet, because the amount of data that needs to come over your net connection is constantly changing from one moment to the next.
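Here is a rough Python sketch of the size arithmetic behind CBR and VBR. The per-second bit rates in the VBR example are made-up numbers for illustration, not measurements from a real encoder, and the function names are our own.

```python
def cbr_size_kilobytes(bit_rate_kbps, seconds):
    """Size of a constant bit rate stream: the same number of bits every second."""
    return bit_rate_kbps * seconds / 8  # 8 bits per byte


def vbr_size_kilobytes(per_second_kbps):
    """Size of a variable bit rate stream, given the rate used in each second."""
    return sum(per_second_kbps) / 8


# Hypothetical six-second clip: quiet intro, busy middle, quiet ending.
vbr_rates = [64, 96, 192, 224, 128, 64]

average_kbps = sum(vbr_rates) / len(vbr_rates)
print("VBR average bit rate:", average_kbps, "kbps")
print("VBR size:", vbr_size_kilobytes(vbr_rates), "KB")
print("CBR size at 128 kbps:", cbr_size_kilobytes(128, len(vbr_rates)), "KB")
```

In this made-up clip both files come out the same size, but the VBR version spends its bits where the music is busiest. The data needed per second also swings between 64 and 224 kilobits, which is the streaming headache described above.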

Sample Rate & Bit Depth: These are the most basic specifications of all digital audio files, compressed or not. Sample rate refers to how many times per second the original waveform is translated into digital form. CD audio, for instance, is sampled at 44.1kHz. That means that the left and right channels are each sampled 44,100 times per second. Sampled into what? That's where bit depth comes in. This is how many bits are used to describe each of those samples. The more bits used to encode the file, the more accurate the sample. CD audio is sampled at 16 bits, so there is a 16-bit number to describe the amplitude of the sound wave for each of the 44,100 samples every second. No wonder CD audio files are so big.
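Multiply those numbers together and you get the bit rate of uncompressed CD audio. A quick Python sketch, with function names of our own choosing:

```python
def pcm_bit_rate(sample_rate_hz, bit_depth, channels):
    """Bits per second of uncompressed audio."""
    return sample_rate_hz * bit_depth * channels


def pcm_size_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Bytes needed to store that many seconds of uncompressed audio."""
    return pcm_bit_rate(sample_rate_hz, bit_depth, channels) * seconds // 8


# CD audio: 44,100 samples per second, 16 bits each, left and right channels.
cd_rate = pcm_bit_rate(44_100, 16, 2)
print(cd_rate, "bits per second")  # 1411200, i.e. about 1,411 kilobits/sec

# A three-minute song at that rate.
print(pcm_size_bytes(44_100, 16, 2, 180), "bytes")
```

That three-minute song works out to a little under 32 million bytes before any compression is applied.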
