Publication | Closed Access
Effects of Packet Losses in Waveform Coded Speech and Improvements Due to an Odd-Even Sample-Interpolation Procedure
216
Citations
20
References
1981
Year
Adaptive Interpolation, Engineering, Iterative Decoding, Waveform Coded Speech, Random Packet Losses, Speech Recognition, Speech Coding, Odd-even Sample-interpolation Procedure, Robust Speech Recognition, Coding Theory, Health Sciences, Speech Synthesis, Computer Engineering, Speech Output, Computer Science, Data Compression, Signal Processing, Speech Communication, Speech Technology, Digital Speech Systems, Speech Processing, Speech Perception, Packet Losses
We have studied the effects of random packet losses in digital speech systems based on 12-bit PCM and 4-bit adaptive DPCM coding. The effects are a function of packet length $B$ and probability of packet loss $P_L$. We have also studied the benefits of an odd-even sample-interpolation procedure that mitigates these effects (at the cost of increased decoding delay). The procedure is based on arranging a $2B$-block of codewords into two $B$-sample packets, an odd-sample packet and an even-sample packet. If one of these packets is lost, the odd (or even) samples of the $2B$-block are estimated from the even (or odd) samples by means of adaptive interpolation. Perceptual considerations indicate that packet lengths most robust to losses are in the range 16-32 ms, irrespective of whether interpolation is used or not. With these packet lengths, tolerable $P_L$ values, which are strictly input-speech-dependent, can be as high as 2 to 5 percent without interpolation and 5 to 10 percent with interpolation. These observations are based on a computer simulation with three sentence-length speech inputs, and on informal listening tests.
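
The odd-even packetization described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`split_odd_even`, `interpolate_missing`, `reconstruct`) are hypothetical, samples are indexed from 0 so the "even" packet holds indices 0, 2, 4, ..., and the paper's adaptive interpolator is replaced here by a plain two-neighbour average, which is an assumption made only to keep the example short.

```python
def split_odd_even(block):
    """Split a 2B-sample block into an even-index and an odd-index packet."""
    if len(block) % 2 != 0:
        raise ValueError("block must contain an even number of samples (2B)")
    return block[0::2], block[1::2]


def merge(even, odd):
    """Re-interleave two B-sample packets into one 2B-sample block."""
    out = [0.0] * (len(even) + len(odd))
    out[0::2] = even
    out[1::2] = odd
    return out


def interpolate_missing(received, lost_is_odd):
    """Estimate the lost packet from the surviving one.

    Assumption: a fixed two-neighbour average stands in for the adaptive
    interpolation of the paper. At the block edge, where only one neighbour
    exists inside the block, that neighbour is repeated.
    """
    b = len(received)
    estimate = []
    for i in range(b):
        if lost_is_odd:
            # Lost sample 2i+1 lies between surviving even samples i and i+1.
            left = received[i]
            right = received[i + 1] if i + 1 < b else received[i]
        else:
            # Lost sample 2i lies between surviving odd samples i-1 and i.
            left = received[i - 1] if i > 0 else received[i]
            right = received[i]
        estimate.append(0.5 * (left + right))
    return estimate


def reconstruct(even, odd):
    """Rebuild the 2B-block; pass None for a packet that was lost."""
    if even is None and odd is None:
        raise ValueError("both packets lost; nothing to interpolate from")
    if odd is None:
        odd = interpolate_missing(even, lost_is_odd=True)
    elif even is None:
        even = interpolate_missing(odd, lost_is_odd=False)
    return merge(even, odd)


if __name__ == "__main__":
    block = [0, 1, 2, 3, 4, 5, 6, 7]       # a 2B-block with B = 4
    even_pkt, odd_pkt = split_odd_even(block)
    print(reconstruct(even_pkt, None))      # odd packet lost
    print(reconstruct(None, odd_pkt))       # even packet lost
```

Because both packets of a $2B$-block must be available (or known to be lost) before the block can be rebuilt, the receiver has to buffer two packets, which is consistent with the increased decoding delay noted in the abstract.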