Mid-level representations for Computational Auditory Scene Analysis

Abstract

In this paper we consider representations for use in models of the processing that occurs between the eardrum and our conscious experience of sound. We first list "good" properties for such mid-level representations, then present a framework within which to discuss some examples. We compare in detail two popular schemes, sinusoid tracks and correlograms, and propose a new representation, wefts, which seeks to combine their advantages.

1 Introduction: Mid-level representations

Mid-level representation is a term usually associated with computer vision, particularly the ideas of David Marr [1982]. It has since become accepted by many in the computer audition community [Bregman, 1990; Cooke, 1991; Brown, 1992] as a concept useful to models of hearing as well. Auditory perception may be viewed as a sequence of representations from "low" to "high," where low-level representations are (roughly) those appropriate to describing the sound reaching the cochlea, and high-level representations are those to...
