Deep Learning-Based Image Semantic Coding for Semantic Communications

TLDR

The study proposes a GAN‑based image semantic coding system designed for semantic exchange rather than symbol transmission, evaluated with perception metrics aligned with semantic communications. The method employs a convolutional encoder, quantizer, conditional SPADE generator, residual coding, and perceptual losses in a coarse‑to‑fine architecture where a base layer fully generates semantic content and an enhancement layer restores fine details. Experimental results show that the model achieves state‑of‑the‑art visually pleasing and semantically consistent reconstruction at extreme low bitrates, outperforming BPG, WebP, JPEG2000, JPEG, and other deep‑learning codecs while allowing perception‑distortion trade‑off tuning.

Abstract

This paper presents the Generative Adversarial Networks (GANs)-based image semantic coding, the goal of which is semantic exchange rather than symbol transmission. State-of-the-art visually pleasing reconstruction and semantic preserving performance are obtained in extreme low bitrate via a rate-perception-distortion optimization framework. In particular, we investigate convolutional encoder, quantizer, conditional SPADE generator, residual coding as well as perceptual losses. In contrast to previous work, we designed a coarse-to-fine image semantic coding model for multimedia semantic communication system. The base layer of the image is fully generated and preserves semantic information while the enhancement layer restores the fine details. We explore the perception and distortion performance trade-off by tuning the rate of base layer and enhancement layer. Different from the existing methods that adopt pixel accuracy as distortion metric, we train and evaluate the proposed image semantic coding model with multiple perception metrics, in line with the purpose of semantic communications. Experimental results demonstrate that our model could achieve visually pleasant and semantic consistent reconstruction, as well as saving times of bitrate, compared to BPG, WebP, JPEG2000, JPEG, and other deep learning-based image codecs.

References

Page 1

	Year	Citations

Page 1