Compression is a fundamental technique in computer science used to reduce the size of data. This is crucial for efficient storage and transmission of information. It involves representing data using fewer bits than the original representation, thereby saving space and bandwidth.
Data compression is essential for several reasons, chiefly lower storage requirements and faster transmission over limited bandwidth.
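Data compression can be broadly classified into two main types: lossless compression, which allows the original data to be reconstructed exactly, and lossy compression, which discards some information to achieve a smaller size.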
Lossless techniques are commonly used for text files, program code, and other data where accuracy is paramount.
Technique | Description | Example |
---|---|---|
Run-Length Encoding (RLE) | Replaces consecutive occurrences of the same character with a single instance of the character and the number of repetitions. | Images with large areas of the same color. |
Huffman Coding | Assigns shorter codes to more frequent characters and longer codes to less frequent characters. | ZIP archives and PNG images (as part of DEFLATE). |
Lempel-Ziv (LZ77, LZ78) | Replaces repeated sequences with references to earlier occurrences in the data (LZ77) or to entries in a dictionary built during compression (LZ78). | ZIP archives and GZIP (as part of DEFLATE). |
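The run-length encoding idea from the table can be sketched in a few lines. This is a minimal illustration, not a production format: the function names `rle_encode` and `rle_decode` and the choice of `(character, count)` pairs as the output representation are assumptions made for this example.

```python
from itertools import groupby


def rle_encode(text: str) -> list[tuple[str, int]]:
    """Collapse each run of identical characters into a (character, count) pair."""
    return [(char, sum(1 for _ in run)) for char, run in groupby(text)]


def rle_decode(pairs: list[tuple[str, int]]) -> str:
    """Expand (character, count) pairs back into the original string."""
    return "".join(char * count for char, count in pairs)


encoded = rle_encode("WWWWBBBWW")
print(encoded)  # [('W', 4), ('B', 3), ('W', 2)]

# Lossless: decoding must give back exactly what was encoded.
assert rle_decode(encoded) == "WWWWBBBWW"
```

RLE only helps when the data contains long runs. On a string such as "ABCD", every run has length one, so the encoded form is larger than the input.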
Lossy techniques are used for multimedia data (images, audio, video) where some data loss is acceptable in exchange for significant compression.
Technique | Description | Example |
---|---|---|
Discrete Cosine Transform (DCT) | Transforms data into frequency components, allowing for the discarding of high-frequency components (which are often less noticeable). | JPEG image format. |
Wavelet Transform | Similar to DCT, but decomposes data into components localized in both frequency and position, which tends to represent sharp edges with fewer blocking artifacts. | JPEG 2000 image format. |
Perceptual audio coding (MP3, AAC) | Removes audio components that are masked by louder sounds and are therefore hard to hear. | MP3 and AAC audio files. |
Video coding (MPEG, H.264, H.265) | Exploits temporal redundancy (similarity between frames) and spatial redundancy (similarity within a frame) to reduce video data size. | Video stored in container formats such as MP4, AVI, and MKV. |
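To make "discarding high-frequency components" concrete, here is a minimal one-dimensional sketch in pure Python. It is not how JPEG itself is implemented: JPEG applies a two-dimensional DCT to 8×8 blocks and quantizes the coefficients rather than simply zeroing them. The function names `dct` and `idct`, the sample values, and the choice to keep four coefficients are all assumptions made for illustration.

```python
import math


def dct(signal):
    """Unnormalized DCT-II: express the samples as a sum of cosine frequencies."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(signal))
        for k in range(n)
    ]


def idct(coeffs):
    """Inverse of dct() above (a DCT-III scaled by 2/n)."""
    n = len(coeffs)
    return [
        (2 / n) * (
            coeffs[0] / 2
            + sum(coeffs[k] * math.cos(math.pi / n * (i + 0.5) * k) for k in range(1, n))
        )
        for i in range(n)
    ]


samples = [52, 55, 61, 66, 70, 61, 64, 73]  # hypothetical pixel intensities
coeffs = dct(samples)

# Lossy step: keep the 4 lowest-frequency coefficients and zero the rest.
kept = coeffs[:4] + [0.0] * (len(coeffs) - 4)
approx = idct(kept)

max_error = max(abs(a - b) for a, b in zip(approx, samples))
print("reconstructed:", [round(v, 1) for v in approx])
print("largest per-sample error:", round(max_error, 2))
```

Only half of the coefficients need to be stored, and the reconstruction is close to the input but not identical to it. That irreversible difference is what makes the method lossy.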
The compression ratio is a measure of how much a file is reduced in size after compression. It is calculated as:
$$ \text{Compression Ratio} = \frac{\text{Original File Size}}{\text{Compressed File Size}} $$

A higher compression ratio indicates better compression.
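As a worked example, a 1000-byte file that compresses to 250 bytes has a compression ratio of 1000 / 250 = 4, often written 4:1. The sketch below measures the ratio for real data using Python's standard `zlib` module. The helper name `compression_ratio` and the sample input are assumptions for illustration.

```python
import zlib


def compression_ratio(original: bytes, compressed: bytes) -> float:
    """Original size divided by compressed size; higher means more reduction."""
    return len(original) / len(compressed)


# Highly repetitive input, so a lossless compressor should shrink it a lot.
data = b"ABABABABAB" * 100
compressed = zlib.compress(data)

ratio = compression_ratio(data, compressed)
print(f"{len(data)} bytes -> {len(compressed)} bytes, ratio {ratio:.1f}:1")

# zlib is lossless, so decompressing recovers the input exactly.
assert zlib.decompress(compressed) == data
```

The ratio depends heavily on the input. Data that is already compressed, or that is close to random, may yield a ratio near or even below 1.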
Compression involves trade-offs. Lossless compression preserves data integrity but typically achieves lower compression ratios. Lossy compression achieves higher compression ratios but permanently discards some of the original data. The choice of technique depends on the application and on how much data loss is acceptable.