UNet-based Lossless Image Compression

Overview

This project implements a lossless image compression system built around a neural network. The core of the system is a UNet with an attention mechanism, which outputs per-pixel probabilities for an arithmetic coder: the better the model predicts each pixel, the fewer bits the coder needs to store it.
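To make "per-pixel probabilities" concrete, here is a rough sketch of the interface, assuming 8-bit pixels modeled as a 256-way categorical distribution per pixel (the shapes and stand-in tensors below are illustrative, not the project's actual code). The ideal compressed size is the negative log-probability the model assigns to the true pixel values, and an arithmetic coder gets within a few bits of that total.

    import torch

    # Illustrative assumption: 8-bit pixels, one 256-way distribution per pixel.
    x = torch.randint(0, 256, (1, 1, 64, 64))      # an 8-bit grayscale image
    logits = torch.randn(1, 256, 64, 64)           # stand-in for the UNet's output
    probs = torch.softmax(logits, dim=1)           # per-pixel P(value = v)

    # Ideal code length is -log2 of the probability given to the true values;
    # an arithmetic coder gets within a few bits of this total.
    p_true = probs.gather(1, x)                    # probability of each true pixel value
    bits = (-torch.log2(p_true)).sum()
    print(f"~{bits.item():.0f} bits ({bits.item() / x.numel():.2f} bpp)")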

Key Features

How It Works

Encoding Process

  1. The input image is passed through a conv layer N times, until it is small enough to fit the model’s core resolution. I do this so the model can handle input images of any size while still keeping a fixed core resolution for the attention mechanism.
  2. N DoubleConv layers, where a DoubleConv layer is (conv, rms_norm, relu) x2 (see the sketch after this list).
  3. One last DoubleConv at the bottleneck.
  4. N upconv (nn.ConvTranspose2d) layers.
  5. Encode the image to bits using the model’s output probabilities and an arithmetic coder.
  6. Store the bottleneck output, z, and the encoded image bits in a file.
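Below is a minimal sketch of steps 1 and 2, assuming a stride-2 downsampling conv, a core resolution of 64, and a simple channel-wise RMS norm. These are placeholders for illustration, not the project's actual hyperparameters or modules, and the attention block is omitted.

    import torch
    import torch.nn as nn

    class ChannelRMSNorm(nn.Module):
        """RMS normalization over the channel dim of an NCHW tensor
        (a simple stand-in for whichever rms_norm the project uses)."""
        def __init__(self, channels, eps=1e-6):
            super().__init__()
            self.weight = nn.Parameter(torch.ones(1, channels, 1, 1))
            self.eps = eps

        def forward(self, x):
            rms = x.pow(2).mean(dim=1, keepdim=True).add(self.eps).sqrt()
            return self.weight * x / rms

    class DoubleConv(nn.Module):
        """(conv, rms_norm, relu) x2, as described in step 2."""
        def __init__(self, in_ch, out_ch):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=1), ChannelRMSNorm(out_ch), nn.ReLU(),
                nn.Conv2d(out_ch, out_ch, 3, padding=1), ChannelRMSNorm(out_ch), nn.ReLU(),
            )

        def forward(self, x):
            return self.net(x)

    # Step 1: repeatedly apply a stride-2 conv until the input fits the
    # core resolution (64 here is an assumed value, not the project's).
    CORE = 64
    shrink = nn.Conv2d(1, 1, 3, stride=2, padding=1)
    x = torch.randn(1, 1, 300, 500)
    while max(x.shape[-2:]) > CORE:
        x = shrink(x)
    print(x.shape)  # spatial dims are now <= 64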

Decoding Process

  1. Read the bottleneck output, z, and the encoded image bits from the file.
  2. Run the N upconv (nn.ConvTranspose2d) layers on z to recompute the per-pixel probabilities.
  3. Use those probabilities and an arithmetic coder, together with the encoded image bits, to decode the original image exactly (sketched below).
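The reason this works: the per-pixel probabilities depend only on z, which is stored in the file, so the decoder can recompute exactly the same distributions the encoder used for coding. Here is a hedged sketch of that step, where decoder, ArithmeticDecoder, and the 256-way output are placeholders for illustration rather than the project's actual classes.

    import torch

    def decompress(z, bitstream, decoder, ArithmeticDecoder):
        """Sketch: rebuild the per-pixel distributions from z, then let the
        arithmetic decoder turn the stored bits back into pixel values."""
        with torch.no_grad():
            probs = torch.softmax(decoder(z), dim=1)   # same probs the encoder used
        h, w = probs.shape[-2:]
        flat = probs.permute(0, 2, 3, 1).reshape(-1, probs.shape[1])  # (pixels, 256)
        dec = ArithmeticDecoder(bitstream)
        pixels = [dec.decode_symbol(p) for p in flat]  # one value per pixel
        return torch.tensor(pixels, dtype=torch.uint8).reshape(h, w)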

Technical Details

Model Architecture

(coming soon)

Arithmetic Coding

(coming soon, but there is a great explanation in Matt Mahoney's Data Compression Explained)
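Until then, here is a toy illustration of the core idea: each symbol narrows an interval in proportion to its probability, and any number inside the final interval identifies the whole message. This uses floating point and binary symbols for clarity; real coders work on integer ranges with renormalization, and this project codes pixel values rather than single bits.

    def encode(bits, p_one):
        """Narrow [low, high) once per symbol; return a number inside the final interval."""
        low, high = 0.0, 1.0
        for b, p in zip(bits, p_one):
            mid = low + (high - low) * (1.0 - p)  # boundary between '0' and '1'
            if b == 0:
                high = mid
            else:
                low = mid
        return (low + high) / 2

    def decode(code, p_one):
        """Replay the same interval narrowing to recover the symbols."""
        low, high = 0.0, 1.0
        out = []
        for p in p_one:
            mid = low + (high - low) * (1.0 - p)
            if code < mid:
                out.append(0)
                high = mid
            else:
                out.append(1)
                low = mid
        return out

    probs = [0.9, 0.2, 0.7, 0.5, 0.95]   # model's P(bit = 1) at each position
    message = [1, 0, 1, 1, 1]
    code = encode(message, probs)
    assert decode(code, probs) == message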

File Format Specification

(coming soon)

Future Work