Publication | Open Access

Deep Scattering Spectrum

Citations: 631

References: 47

Year: 2014

TLDR

A scattering transform provides a locally translation‑invariant, time‑warp‑stable representation. The method extends MFCCs by cascading wavelet convolutions and modulus operators to compute multi‑order modulation spectra, capturing transient events and yielding a frequency‑transposition‑invariant representation via log‑frequency scattering. The approach achieves state‑of‑the‑art accuracy on musical genre classification (GTZAN) and phone classification (TIMIT).

Abstract

A scattering transform defines a locally translation-invariant representation which is stable to time-warping deformations. It extends MFCC representations by computing modulation spectrum coefficients of multiple orders, through cascades of wavelet convolutions and modulus operators. Second-order scattering coefficients characterize transient phenomena such as attacks and amplitude modulation. A frequency-transposition-invariant representation is obtained by applying a scattering transform along log-frequency. State-of-the-art classification results are obtained for musical genre and phone classification on the GTZAN and TIMIT databases, respectively.
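
The cascade described in the abstract (wavelet convolution, modulus, then local averaging, repeated once more for the second order) can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: the Gaussian band-pass filters stand in for proper wavelets, and the names `gabor_bank`, `gaussian_lowpass`, `scattering` and the parameters `xi_max`, `q`, `avg_sigma` are hypothetical choices made for this example. Restricting the second order to filters with lower centre frequency than the first is also a simplifying assumption.

```python
import numpy as np


def gaussian_lowpass(n, sigma):
    """Frequency response of a Gaussian low-pass filter.

    sigma is the bandwidth in cycles per sample (illustrative parameter).
    """
    freqs = np.fft.fftfreq(n)
    return np.exp(-0.5 * (freqs / sigma) ** 2)


def gabor_bank(n, num_filters, q=1.0, xi_max=0.35):
    """Analytic band-pass filters with centre frequencies xi_max * 2^(-j/q).

    Gaussian bumps in frequency are used as a stand-in for wavelets.
    """
    freqs = np.fft.fftfreq(n)
    bank = []
    for j in range(num_filters):
        xi = xi_max * 2.0 ** (-j / q)
        sigma = xi / (2.0 * q)  # constant-Q bandwidth (assumed ratio)
        bank.append((xi, np.exp(-0.5 * ((freqs - xi) / sigma) ** 2)))
    return bank


def filt(x, h_hat):
    """Circular convolution of x with a filter given by its frequency response."""
    return np.fft.ifft(np.fft.fft(x) * h_hat)


def scattering(x, num_filters=6, avg_sigma=0.01):
    """Return zeroth-, first- and second-order scattering coefficients of x."""
    n = len(x)
    phi = gaussian_lowpass(n, avg_sigma)
    bank = gabor_bank(n, num_filters)

    # Order 0: local average of the signal itself.
    s0 = filt(x, phi).real

    s1 = []
    s2 = {}
    for i, (xi1, psi1) in enumerate(bank):
        # Order 1: band-pass, modulus, then average (an MFCC-like envelope).
        u1 = np.abs(filt(x, psi1))
        s1.append(filt(u1, phi).real)
        for k, (xi2, psi2) in enumerate(bank):
            # Order 2: band-pass the envelope u1 to recover its modulations.
            # Only slower filters are applied (assumption of this sketch).
            if xi2 < xi1:
                u2 = np.abs(filt(u1, psi2))
                s2[(i, k)] = filt(u2, phi).real
    return s0, np.array(s1), s2
```

Applied to an amplitude-modulated tone, the first-order rows respond to the carrier frequency, while the second-order entries carry the modulation rate that the averaging in the first order smooths away. Because the sketch uses circular convolutions, shifting the input only shifts the coefficients in time, so their time averages are unchanged.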
