This post is a simple yet illustrative application of K-means clustering technique. Using K-means clustering, we will perform quantization of colours present in the image which will further help in compressing the image.

In a coloured image, each pixel is of size 3 bytes (RGB), where each colour can have intensity values from 0 to 255. Following combinatorics, the total number of colours which can be represented are 256*256*256. Practically, we are able to visualize only a few colours in an image. Shown below is an image of 1280 x 720 pixels taking 1.71 MB in PNG format. PNG is a lossless compression technique for images. Our objective is to compress the image further using colour quantization, though the compression will be lossy.

tiger

K-means clustering

It is basically an optimization algorithm to find ‘k’ clusters in the given set of data points. Initially, it randomly assigns k-cluster centers and then on the basis of some distance metric (for example, euclidean distance) it aims to minimize within cluster sum of squared distance of the data points from the cluster center. There are two steps in k-means clustering algorithm:

a) Assignment step – Each data point is assigned to the cluster whose center is nearest to it.
b) Update step – New means (centroids) are calculated from the data points assigned to the new clusters.

Just to give you an idea of clustering data points, Below picture has been taken from internet to depict data points before and after K-means clustering.

image
K-means clustering with K=3

In our problem of image compression, K-means clustering will group similar colours together in ‘k’ clusters (say ‘k’ = 128). Therefore, the centroid of each cluster is representative of the 3 dimensional colour vectors (RGB) falling in the respective cluster. By now you might have understood what are we trying to do. These ‘k’ centroids will replace all the colour vectors in their clusters, thereby keeping only ‘k’ colour combinations for the whole image. Thus, we need to keep only the label of each pixel in the image that tells about the cluster in which that pixel falls. Also, we keep the ‘k’ centroids as codebook which are the only colours seen in the compressed image.

Compression

We will write a simple python code to compress the image and store the compressed image along with the code book. The compressed image saved here is nothing but the cluster label of each pixel of the original image. Codebook is the fancy name given to the list of cluster centers (3-d RGB) achieved after running k-means algorithm. Afterwards, both the arrays (the cluster labels and the codebook) are saved in data type ‘unsigned integer’ as the range of intensity values (0-255) and value of ‘k’ is always going to be less than 255 . The code given below does all this.

from skimage import io
from sklearn.cluster import KMeans
import numpy as np

image = io.imread('tiger.png')
io.imshow(image)
io.show()

rows = image.shape[0]
cols = image.shape[1]
 
image = image.reshape(image.shape[0]*image.shape[1],3)
kmeans = KMeans(n_clusters = 128, n_init=10, max_iter=200)
kmeans.fit(image)

clusters = np.asarray(kmeans.cluster_centers_,dtype=np.uint8) 
labels = np.asarray(kmeans.labels_,dtype=np.uint8 )  
labels = labels.reshape(rows,cols); 

np.save('codebook_tiger.npy',clusters)    
io.imsave('compressed_tiger.png',labels);

We can select ‘k’ sufficient enough to represent the colours of image well. Here, ‘k’ has been chosen as 128. This means all the colour combinations in the original image have been quantized to 128 distinct colours only. These colours will be present in reconstructed image (after decompression) and it should be visually similar to original image.

Decompression

We also need to decompress the image in order to visualise the reconstructed image which is obviously an outcome of lossy compression performed. Below code does the decompression by assigning the 3-d colours from the code book to the each pixel depending upon its label.

from skimage import io
import numpy as np

centers = np.load('codebook_tiger.npy')
c_image = io.imread('compressed_tiger.png')

image = np.zeros((c_image.shape[0],c_image.shape[1],3),dtype=np.uint8 )
for i in range(c_image.shape[0]):
    for j in range(c_image.shape[1]):
            image[i,j,:] = centers[c_image[i,j],:]
io.imsave('reconstructed_tiger.png',image);
io.imshow(image)
io.show()

We can see the reconstructed image after decompression below. Though the reconstructed image has lost a lot of pixel colour information but still you won’t find any major difference visually.

reconstructed_tiger

Also, you can visualise these 128 colours found in the reconstructed image by viewing the colours in the codebook separately (may be by displaying mono-coloured square box). These colours are the centroids of clusters formed after performing k-means on original image.

Caution

  1. If you will try to compress a ‘jpeg’ image in exactly the same way as followed in the blog-post, you will incur errors as jpeg does a lossy compression. Compression algorithm of jpeg changes the intensity values of the pixel, so the pixels in the compressed image containing the label may become more than ‘k’ which leads to error.
  2. K-means algorithm is an optimization problem of finding the clusters in the given data-set. Execution time increases as the image dimensions increases or ‘K’ increases. So, initially you can start with a lesser value of ‘k’ in order to quickly get results.
  3. There is a trade off between the execution time and the number of colours represented in reconstructed image. Higher ‘k’ will produce better quality of compressed image but will take longer to execute.

Conclusion

You can check the disk space taken by the images which were considered in the blog-post here. I have posted the snapshot of working directory. The original png image was of 1757 KB (tiger.png) whereas the compressed tiger image and codebook are of only 433 KB all together. The reconstructed image is also taking less space because png runs its own compression algorithm. As there are only 128 unique colours now, png is able to get compression ratio of more than 2.

spaces

One can conclude that the compression applied here is done only by reducing the number of colours in the image which is also called as Colour Quantization. We have not reduced either the size of image or the intensity ranges of pixels.

Hope it was easy to follow this blog-post. The full python implementation of image compression with K-means clustering can be found on Github link here.

If you liked the post, follow this blog to get updates about the upcoming articles. Also, share this article so that it can reach out to the readers who can actually gain from this. Please feel free to discuss anything regarding the post. I would love to hear feedback from you.

Happy machine learning 🙂