Image compression uses four stages to reduce file sizes: 1) transforming RGB pixels to the YCbCr color space, 2) applying discrete cosine transformation to concentrate pixel data into a few matrix elements, 3) quantizing pixel values to reduce their range, and 4) using Huffman coding to assign shorter bit codes to more common pixel values. This allows JPEG images to be compressed to smaller file sizes for storage and transmission while still maintaining good visual quality.
2. Introduction
The images we see and use every day are compressed
before they are stored or transmitted over a
network.
Consider an uncompressed color image of 1024 x 1024
pixels. Each pixel has 3 channels of 8 bits (1 byte)
each, so the image needs 1024 x 1024 x 3 bytes,
which is 3 MB.
That is a large file, and now more than ever, with
smartphones and cameras, people take pictures of
every moment, meal, and place.
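The 3 MB figure can be checked with a few lines of Python. This is only a sketch of the arithmetic; the variable names are illustrative, and 1 MB is taken here as 2**20 bytes.

```python
# Uncompressed size of the example image from the text.
width, height = 1024, 1024      # pixels
channels = 3                    # R, G, B
bytes_per_channel = 1           # 8 bits per channel

size_bytes = width * height * channels * bytes_per_channel
size_mb = size_bytes / (1024 * 1024)   # assuming 1 MB = 2**20 bytes

print(size_bytes, "bytes =", size_mb, "MB")   # 3145728 bytes = 3.0 MB
```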
3. Stages
Image compression has four stages.
Compression is achieved in three of these
stages, and the level of compression depends
on how efficiently each stage is done and how
much image clarity is sacrificed.
The four stages are explained below.
4. Stage 1: Mapper
This stage achieves no effective compression, but it
does the necessary setup required for compression.
A computer image is composed of pixels, and each pixel
is made of three colors: Red, Green, Blue (RGB). Each
color value of a pixel is represented by a byte (0-255).
The mapper converts the RGB color space to the YCbCr
color space. Y is the luminance (brightness), and Cb
and Cr are the blue and red chroma components.
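The mapping can be sketched as below. The slides do not say which coefficients are used, so this sketch assumes the full-range conversion commonly used with JPEG/JFIF files; the function name rgb_to_ycbcr is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to YCbCr (JFIF-style full range, assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round to integers and clamp to the 0-255 byte range.
    return tuple(max(0, min(255, round(v))) for v in (y, cb, cr))

print(rgb_to_ycbcr(255, 255, 255))  # white    -> (255, 128, 128)
print(rgb_to_ycbcr(128, 128, 128))  # mid-gray -> (128, 128, 128)
```

For gray pixels, all the information ends up in Y, while Cb and Cr sit at the neutral value 128. No data is removed at this point, which is consistent with this stage achieving no compression by itself.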
5. Stage 2: Transformation
The next stage applies the Discrete Cosine
Transform (DCT) to each 8x8 subimage of the YCbCr
channels.
In MATLAB, use dct2(). The transform concentrates
most of the image information from the 64 matrix
elements into the first few (low-frequency) elements
of the matrix.
That way, depending on the image clarity needed, the
first few elements of the matrix can be kept and
the rest can be ignored.
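To see the concentration effect without MATLAB, here is a minimal pure-Python sketch of an orthonormal 2-D DCT-II on one 8x8 block. The function name dct2_8x8 and the sample block are made up for illustration, and the direct four-loop form is chosen for readability, not speed.

```python
import math

N = 8  # block size used in the text

def dct2_8x8(block):
    """Orthonormal 2-D DCT-II of an 8x8 block given as a list of lists."""
    def c(k):
        # Scaling factors that make the transform orthonormal.
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for i in range(N):
                for j in range(N):
                    total += (block[i][j]
                              * math.cos((2 * i + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * j + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * total
    return out

# A smooth horizontal ramp: values change slowly across the block.
block = [[100 + 2 * j for j in range(N)] for _ in range(N)]
coeffs = dct2_8x8(block)

total_energy = sum(x * x for row in coeffs for x in row)
top_left_energy = sum(coeffs[u][v] ** 2 for u in range(2) for v in range(2))
print(f"energy share of the 2x2 top-left corner: {top_left_energy / total_energy:.4f}")
```

For a smooth block like this one, nearly all of the energy lands in the top-left corner of the coefficient matrix, which is why the remaining elements can be dropped with little visible loss.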
6. Stage 3: Quantization
The pixels have values ranging from 0 to 255, but the
human eye notices little difference between nearby
values.
Taking advantage of this, we can quantize the values:
for example, any value from 11 to 20 is treated as
15, any value from 21 to 30 as 25, and so on. The
easiest way to do this is an integer division by 16.
That way, (0-255) / 16 = (0-15), which fits in 4 bits.
Hence, instead of 8 bits, you now need 4 bits. This is
where most of the compression occurs for images. In
JPEG, the same idea is applied to the DCT
coefficients from Stage 2 rather than to raw pixels.
When you decode, you multiply by 16 to get an
approximation of the original value. The remainder
discarded by the division cannot be recovered, which
is what makes this stage lossy.
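The divide-by-16 scheme can be sketched in a few lines. The names quantize, dequantize and STEP are illustrative only.

```python
STEP = 16  # quantization step from the text

def quantize(value):
    """Map an 8-bit value (0-255) to a 4-bit level (0-15)."""
    return value // STEP

def dequantize(level):
    """Reconstruct an approximate 8-bit value from a 4-bit level."""
    return level * STEP

for original in (0, 37, 200, 255):
    level = quantize(original)
    restored = dequantize(level)
    print(original, "->", level, "->", restored, "error", original - restored)
```

The restored value is never more than 15 below the original. A decoder could add STEP // 2 after multiplying to center the error, at the cost of no longer matching the plain "multiply by 16" rule.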
7. Stage 4: Huffman Coding
The basic idea is that if you take a typical image and
plot a histogram of its values, you will often find
that a few values dominate: one value may account
for about 50% or more of the pixels, another for
about 25%, and so on.
The idea is to give the value with the highest
probability a single-bit code (0 or 1), the value with
the second-highest probability a two-bit code (such
as 10), and so on, choosing the codes so that none of
them is the beginning of another.
The key idea is to reduce the number of bits spent on
frequently repeated values (this goes back to why
repetition is good and why we split into the YCbCr
space).
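A minimal sketch of building such a code table with Python's heapq module follows. The histogram is hypothetical, chosen to match the 50% / 25% example, and huffman_codes is an illustrative name rather than part of any JPEG library.

```python
import heapq

def huffman_codes(freqs):
    """Build a prefix-free code table {symbol: bitstring} from {symbol: count}."""
    # Heap entries: (count, tie_breaker, {symbol: code_so_far}).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freqs.items())]
    heapq.heapify(heap)
    if len(heap) == 1:  # a single symbol still needs one bit
        return {sym: "0" for sym in freqs}
    next_id = len(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)  # the two least frequent subtrees
        c2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in t1.items()}
        merged.update({s: "1" + code for s, code in t2.items()})
        heapq.heappush(heap, (c1 + c2, next_id, merged))
        next_id += 1
    return heap[0][2]

# Hypothetical histogram of quantized values (value: number of pixels).
histogram = {15: 50, 14: 25, 3: 15, 0: 10}
codes = huffman_codes(histogram)
for value, code in sorted(codes.items(), key=lambda kv: len(kv[1])):
    print(value, code)

total = sum(histogram.values())
avg_bits = sum(histogram[v] * len(codes[v]) for v in histogram) / total
print("average bits per value:", avg_bits)   # 1.75
```

With this histogram the most common value gets a 1-bit code and the rarest values get 3-bit codes, so the average drops to 1.75 bits per value instead of the fixed 4 bits left after quantization.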
8. Decoding
Once you have the compressed image data, it is a
matter of writing it to a file that can be decoded
properly and reconstructed.
There is always some overhead from the file
structure and tags, so your file format has to be
efficient.
The image is reconstructed by reversing each of the
steps, in the opposite order.
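As a small illustration of reversing the steps, the sketch below undoes the last two stages: it reads a bitstream with a prefix-free code table and then multiplies each level by 16. The code table, the bitstream and the name huffman_decode are hypothetical; a real file format would store the table alongside the data.

```python
def huffman_decode(bits, codes):
    """Decode a bitstring using a prefix-free table {symbol: bitstring}."""
    lookup = {code: sym for sym, code in codes.items()}
    symbols, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:  # a complete code word has been read
            symbols.append(lookup[current])
            current = ""
    if current:
        raise ValueError("bitstream ends in the middle of a code word")
    return symbols

# Hypothetical code table and quantization step, for illustration only.
codes = {15: "0", 14: "10", 0: "110", 3: "111"}
STEP = 16

levels = huffman_decode("0100110111", codes)
pixels = [level * STEP for level in levels]  # undo the divide-by-16 step
print(levels)  # [15, 14, 15, 0, 3]
print(pixels)  # [240, 224, 240, 0, 48]
```

The inverse DCT and the YCbCr-to-RGB conversion would follow in the same way, each undoing one earlier stage.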
9. Learn More
in IIT Kharagpur's First Online Certificate Course on
Image and Video Communication
[Refer: http://goo.gl/hMyYWa ; courses@wiziq.com]