Publication | Closed Access
Low-Complexity Source Coding Using Gaussian Mixture Models, Lattice Vector Quantization, and Recursive Coding with Application to Speech Spectrum Quantization
Citations: 31
References: 13
Year: 2006
Engineering, Quantization Schemes, Recursive Coding, Speech Recognition, Speech Coding, Image Compression, Quantization Framework, Joint Source-channel Coding, Lattice Vector Quantization, Coding Theory, Variable-length Code, Health Sciences, Computer Science, Data Compression, Signal Processing, Quantization (Signal Processing), Image Coding, Speech Spectrum Quantization, Gaussian Mixture Model, Speech Processing
In this paper, we use the Gaussian mixture model (GMM) based multidimensional companding quantization framework to develop two important quantization schemes. In the first scheme, the scalar quantization in the companding framework is replaced by more efficient lattice vector quantization. Low-complexity lattice pruning and quantization schemes are provided for the $E_8$ Gosset lattice. At moderate to high bit rates, the proposed scheme recovers much of the space-filling loss due to the product vector quantizers (PVQ) employed in earlier work, and thereby provides improved performance with a marginal increase in complexity. In the second scheme, we generalize the compression framework to accommodate recursive coding. In this approach, the joint probability density function (PDF) of the parameter vectors of successive source frames is modeled using a GMM. The conditional density of the parameter vector of the current source frame, given the quantized values of the parameter vectors of the previous source frames, is used to generate a new codebook for every current source frame. We demonstrate the efficacy of the proposed schemes in the application of speech spectrum quantization. The proposed scheme is shown to provide superior performance, with a moderate increase in complexity, when compared with conventional one-step linear prediction based compression schemes for both narrow-band and wide-band speech.
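The two building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the lattice pruning, indexing, and codebook generation steps are not shown. The function `nearest_e8` uses the standard coset decomposition $E_8 = D_8 \cup (D_8 + \tfrac{1}{2})$ to find the closest lattice point, and `conditional_gmm` computes the conditional mixture of the current frame given a quantized previous frame. As a simplifying assumption, each frame is treated as a scalar, so every component is a bivariate Gaussian. All function names and the tuple layout are hypothetical.

```python
import math


def nearest_d8(x):
    """Nearest point of D8: integer vectors whose coordinate sum is even."""
    rounded = [math.floor(v + 0.5) for v in x]
    if sum(rounded) % 2 == 0:
        return rounded
    # Odd parity: re-round the coordinate with the largest rounding
    # error in the opposite direction, which restores even parity.
    k = max(range(len(x)), key=lambda i: abs(x[i] - rounded[i]))
    rounded[k] += 1 if x[k] > rounded[k] else -1
    return rounded


def nearest_e8(x):
    """Nearest point of E8, searched over the cosets D8 and D8 + 1/2."""
    a = [float(v) for v in nearest_d8(x)]
    b = [v + 0.5 for v in nearest_d8([v - 0.5 for v in x])]

    def dist(p):
        return sum((xi - pi) ** 2 for xi, pi in zip(x, p))

    return a if dist(a) <= dist(b) else b


def conditional_gmm(components, prev_q):
    """Condition a joint GMM over (previous, current) on previous = prev_q.

    components: list of tuples
        (weight, mean_prev, mean_cur, var_prev, cov_prev_cur, var_cur)
    Returns a list of (weight, mean, variance) for the current frame.
    Scalar frames are an assumption made to keep the sketch short.
    """
    out = []
    for w, mp, mc, vp, c, vc in components:
        # Likelihood of the observed previous value under this component.
        lik = math.exp(-0.5 * (prev_q - mp) ** 2 / vp) / math.sqrt(2 * math.pi * vp)
        mean = mc + (c / vp) * (prev_q - mp)  # conditional mean
        var = vc - c * c / vp                 # conditional variance
        out.append((w * lik, mean, var))
    total = sum(w for w, _, _ in out)
    return [(w / total, m, v) for w, m, v in out]
```

In a recursive coder along these lines, `prev_q` would be the decoded value of the previous frame, so that the encoder and decoder can derive the same conditional mixture without side information.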