Show understanding of how sound is represented and encoded

1.2 Multimedia: Sound Representation and Encoding

This section explores how sound is represented digitally for use in multimedia applications. It covers the principles of sound waves, their conversion to digital signals, and common encoding methods.

Sound Waves: The Physical Basis

Sound is a mechanical wave that propagates through a medium, typically air, as variations in pressure. These variations create compressions and rarefactions. The characteristics of a sound wave that we perceive as pitch and loudness are related to the frequency and amplitude of the wave, respectively.

Analog to Digital Conversion (ADC)

To represent sound digitally, an analog-to-digital converter (ADC) is used. The conversion involves two steps: sampling the continuous analog sound wave at regular intervals, and then quantizing the amplitude of each sample.

The key parameters in ADC are:

  • Sampling Rate (Fs): The number of samples taken per second, measured in Hertz (Hz). A higher sampling rate captures higher frequencies and follows the original waveform more closely, at the cost of more data to store.
  • Bit Depth (n): The number of bits used to represent each sample. The bit depth determines the number of possible amplitude levels. A higher bit depth provides greater dynamic range and lower quantization noise.

The bit depth alone determines the number of possible amplitude levels: with $n$ bits per sample there are $2^n$ levels. The sampling rate and bit depth together determine how much data is produced per second of audio.
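
The two parameters above can be tied together with a little arithmetic. The following is a minimal sketch: the function names are hypothetical, and the CD-style values (44,100 Hz, 16 bits, 2 channels) are illustrative assumptions rather than figures taken from these notes.

```python
def amplitude_levels(bit_depth: int) -> int:
    """Number of distinct amplitude levels available with n bits per sample (2^n)."""
    return 2 ** bit_depth


def uncompressed_size_bytes(sample_rate_hz: int, bit_depth: int,
                            channels: int, duration_s: int) -> int:
    """Size of raw, uncompressed audio data in bytes (no file header)."""
    total_bits = sample_rate_hz * bit_depth * channels * duration_s
    return total_bits // 8


print(amplitude_levels(8))    # 256
print(amplitude_levels(16))   # 65536

# One minute of stereo audio at assumed CD-style settings.
print(uncompressed_size_bytes(44_100, 16, 2, 60))  # 10584000
```

Doubling either the sampling rate or the bit depth doubles the amount of data, which is why uncompressed audio files grow large so quickly.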

Quantization

Quantization is the process of mapping the continuous range of amplitude values to a finite set of discrete levels. The amplitude of each sample is rounded to the nearest available quantization level. This introduces quantization error, which is the difference between the original analog signal and the digital representation.
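
The rounding step described above can be sketched in a few lines. This is an illustrative example only: it assumes samples are normalised to the range -1.0 to 1.0 and uses evenly spaced levels, and the function names are hypothetical.

```python
def quantize(sample: float, bit_depth: int) -> int:
    """Map a sample in [-1.0, 1.0] to the nearest level index in 0 .. 2^n - 1."""
    levels = 2 ** bit_depth
    clipped = max(-1.0, min(1.0, sample))        # keep the sample inside the valid range
    return round((clipped + 1.0) / 2.0 * (levels - 1))


def dequantize(level: int, bit_depth: int) -> float:
    """Convert a level index back to an amplitude in [-1.0, 1.0]."""
    levels = 2 ** bit_depth
    return level / (levels - 1) * 2.0 - 1.0


original = 0.3
level = quantize(original, bit_depth=3)          # only 8 levels are available with 3 bits
restored = dequantize(level, bit_depth=3)
error = restored - original                      # the quantization error for this sample

print(level)  # 5
print(abs(error) <= 1 / 7)  # True: the error is at most half the spacing between levels
```

With 3 bits the levels are spaced 2/7 apart, so the error can never exceed 1/7. Each extra bit halves (approximately) the spacing and therefore the worst-case error.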

Encoding Formats

Several encoding formats are used to store and transmit digital audio. These formats vary in terms of compression, quality, and file size.

Uncompressed Formats

These formats store every sample exactly as it was captured, without any compression. No audio information is lost, but the files require significant storage space.

Format       | Description                                     | File Size | Quality
WAV (.wav)   | Uncompressed PCM audio.                         | Large     | Excellent
AIFF (.aiff) | Uncompressed PCM audio, commonly used on macOS. | Large     | Excellent
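
To make "uncompressed PCM" concrete, the sketch below builds a small WAV file in memory using Python's standard `wave` module. The function name, the 440 Hz tone, and the 8,000 Hz sampling rate are assumptions chosen to keep the example small.

```python
import io
import math
import struct
import wave


def sine_wav_bytes(freq_hz: float = 440.0, duration_s: float = 1.0,
                   sample_rate: int = 8000) -> bytes:
    """Return the bytes of a mono, 16-bit PCM WAV file containing a sine tone."""
    n_frames = int(sample_rate * duration_s)
    # Each sample is quantized to a signed 16-bit integer (little-endian).
    frames = b"".join(
        struct.pack("<h", round(32767 * math.sin(2 * math.pi * freq_hz * i / sample_rate)))
        for i in range(n_frames)
    )
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)         # mono
        wav_file.setsampwidth(2)         # 2 bytes = 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(frames)
    return buffer.getvalue()


data = sine_wav_bytes()
print(data[:4])    # b'RIFF'
print(len(data))   # 16044
```

The file is a 44-byte header followed by 8,000 samples of 2 bytes each, so its size is fixed by the sampling rate, bit depth, and duration, not by what the audio sounds like.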

Compressed Formats

These formats use compression techniques to reduce the file size of audio data. Compression can be either lossy (discarding some audio information) or lossless (preserving all audio information).

Lossy Compression

Lossy compression formats achieve smaller file sizes by discarding audio data that is considered less perceptually important. Examples include:

  • MP3: A widely used lossy compression format.
  • AAC: Another popular lossy compression format, often considered to offer better quality than MP3 at the same bitrate.
  • Ogg Vorbis: A free and open-source lossy compression format.

Lossless Compression

Lossless compression formats reduce file size without discarding any audio information. The original audio data can be perfectly reconstructed from the compressed data. Examples include:

  • FLAC: A popular lossless compression format.
  • ALAC: Apple Lossless Audio Codec, used by Apple devices.
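
The defining property of lossless compression, perfect reconstruction, can be demonstrated with a short sketch. Note the assumption: this uses Python's general-purpose `zlib` module as a stand-in, not FLAC or ALAC, which use audio-specific techniques. The function name and test tone are hypothetical.

```python
import math
import struct
import zlib


def lossless_round_trip(raw: bytes) -> tuple[bytes, bytes]:
    """Compress raw bytes and decompress them again; returns (compressed, restored)."""
    compressed = zlib.compress(raw)
    restored = zlib.decompress(compressed)
    return compressed, restored


# One second of a 440 Hz tone as 16-bit PCM samples at an assumed 8,000 Hz sampling rate.
samples = [round(32767 * math.sin(2 * math.pi * 440 * i / 8000)) for i in range(8000)]
raw = struct.pack(f"<{len(samples)}h", *samples)

compressed, restored = lossless_round_trip(raw)
print(restored == raw)              # True: every byte is recovered exactly
print(len(compressed) < len(raw))   # True for this repetitive test tone
```

A lossy format such as MP3 would fail the first check, because the decoded samples are only an approximation of the originals.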

Digital Audio Characteristics

The digital representation of sound has specific characteristics:

  • Sampling Rate and Frequency Response: The Nyquist-Shannon sampling theorem states that the sampling rate must be greater than twice the highest frequency component of the sound wave to avoid aliasing.
  • Bit Depth and Dynamic Range: The bit depth determines the dynamic range of the audio signal, which is the difference between the loudest and quietest sounds that can be represented. Each additional bit adds roughly 6 dB of dynamic range.
  • Audio Channels: Sound can be represented in mono (one channel) or stereo (two channels). Stereo audio provides a more realistic soundstage.

Suggested diagram: Illustrates the process of converting an analog sound wave to a digital signal through sampling and quantization. Shows the sampling rate, bit depth, and the resulting digital waveform.
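
The link between bit depth and dynamic range can be estimated with a standard approximation: the ratio between the largest and smallest representable amplitude is about $2^n$, which in decibels is $20 \log_{10}(2^n)$. The sketch below assumes this idealised model, and the function name is hypothetical.

```python
import math


def dynamic_range_db(bit_depth: int) -> float:
    """Approximate dynamic range in decibels for n-bit samples: 20 * log10(2^n)."""
    return 20 * math.log10(2 ** bit_depth)


for bits in (8, 16, 24):
    print(bits, round(dynamic_range_db(bits), 1))
# 8 48.2
# 16 96.3
# 24 144.5
```

Each extra bit doubles the number of levels and so adds about 6 dB.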

Aliasing

Aliasing occurs when the sampling rate is insufficient to accurately represent the frequencies present in the original sound wave. This results in spurious frequencies appearing in the digital audio signal, which can be perceived as distortion.

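To prevent aliasing, the sampling rate must be greater than twice the highest frequency component of the sound. For example, audio containing frequencies up to 20 kHz needs a sampling rate above 40 kHz, which is why CD audio is sampled at 44.1 kHz.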
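
The spurious frequency produced by aliasing can be predicted. The sketch below is an illustration for a single pure tone, with a hypothetical function name: it folds the tone's frequency back into the range from 0 to half the sampling rate, then checks that the two tones really do produce the same samples.

```python
import math


def alias_frequency(signal_hz: float, sample_rate_hz: float) -> float:
    """Frequency that a pure tone appears to have after sampling."""
    nyquist = sample_rate_hz / 2
    folded = signal_hz % sample_rate_hz
    # Frequencies above the Nyquist limit are mirrored back below it.
    return sample_rate_hz - folded if folded > nyquist else folded


print(alias_frequency(3000, 10000))  # 3000: below half the sampling rate, so unchanged
print(alias_frequency(7000, 10000))  # 3000: a 7 kHz tone masquerades as 3 kHz

# Sampled at 10 kHz, a 7 kHz cosine and a 3 kHz cosine give the same sample values.
same = all(
    math.isclose(math.cos(2 * math.pi * 7000 * k / 10000),
                 math.cos(2 * math.pi * 3000 * k / 10000), abs_tol=1e-9)
    for k in range(100)
)
print(same)  # True
```

Once the samples are identical, no later processing can tell the two tones apart, which is why frequencies above half the sampling rate are normally filtered out before sampling.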