Publication | Closed Access
SRP-PHAT methods of locating simultaneous multiple talkers using a frame of microphone array data
Citations: 50
References: 7
Year: 2010
Unknown Venue
Engineering, Localization, Acoustic Modeling, Speech Recognition, Large-aperture Microphone Array, Speaker Localization, Audio Analysis, Noise, New Methods, Acoustic Signal Processing, Simultaneous Multiple Talkers, Acoustic Camera, Health Sciences, Single Segment, Multi-channel Processing, SRP-PHAT Methods, Distant Speech Recognition, Signal Processing, Speech Communication, Microphone Array Data, Speech Processing, Speech Perception
Two new methods for locating multiple sound sources using a single segment of data from a large-aperture microphone array are presented. Both methods employ the proven-robust steered response power using the phase transform (SRP-PHAT) as a functional. To cluster the data points into highly probable regions containing global peaks, the first method fits a Gaussian mixture model (GMM), whereas the second one sequentially finds the points with highest SRP-PHAT values that most likely represent different clusters. Then the low-cost global optimization method, stochastic region contraction (SRC), is applied to each cluster to find the global peaks. We test the two methods using real data from five simultaneous talkers in a room with high noise and reverberation. Results are presented and discussed.
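The second method's pipeline (sequentially pick high-valued points that likely belong to different clusters, then refine each with SRC) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say how "different clusters" are decided, so the minimum-separation rule `min_sep` is an assumption. The two-peak `toy_functional` stands in for the real SRP-PHAT functional, the SRC loop is a simplified variant (sample, keep the best points, contract to their bounding box), and every function name and parameter value here is hypothetical.

```python
import math
import random

def toy_functional(p):
    """Stand-in for SRP-PHAT (assumption): two smooth peaks at (1, 1) and (3, 2)."""
    x, y = p
    return (math.exp(-((x - 1.0) ** 2 + (y - 1.0) ** 2) / 0.1)
            + 0.8 * math.exp(-((x - 3.0) ** 2 + (y - 2.0) ** 2) / 0.1))

def pick_seeds(points, values, k, min_sep):
    """Greedily take the highest-valued points that lie at least min_sep apart.

    The separation rule is an assumed criterion for "different clusters".
    """
    order = sorted(range(len(points)), key=lambda i: values[i], reverse=True)
    seeds = []
    for i in order:
        if all(math.dist(points[i], s) >= min_sep for s in seeds):
            seeds.append(points[i])
            if len(seeds) == k:
                break
    return seeds

def src(f, lo, hi, rng, n_samples=100, n_keep=20, tol=0.01, max_iter=50):
    """Simplified stochastic region contraction inside the box [lo, hi]."""
    best = None
    for _ in range(max_iter):
        # Draw random candidates uniformly inside the current box.
        pts = [tuple(rng.uniform(a, b) for a, b in zip(lo, hi))
               for _ in range(n_samples)]
        pts.sort(key=f, reverse=True)
        if best is None or f(pts[0]) > f(best):
            best = pts[0]
        # Contract the box to the bounding box of the best candidates.
        top = pts[:n_keep]
        lo = tuple(min(p[d] for p in top) for d in range(len(lo)))
        hi = tuple(max(p[d] for p in top) for d in range(len(hi)))
        if max(b - a for a, b in zip(lo, hi)) < tol:
            break
    return best

def locate_peaks(f, room_lo, room_hi, k, seed=0):
    """Coarse random search, seed selection, then one SRC run per seed."""
    rng = random.Random(seed)
    coarse = [tuple(rng.uniform(a, b) for a, b in zip(room_lo, room_hi))
              for _ in range(2000)]
    seeds = pick_seeds(coarse, [f(p) for p in coarse], k, min_sep=1.0)
    peaks = []
    for s in seeds:
        # Search a 1 m box around each seed, clipped to the room bounds.
        lo = tuple(max(a, c - 0.5) for a, c in zip(room_lo, s))
        hi = tuple(min(b, c + 0.5) for b, c in zip(room_hi, s))
        peaks.append(src(f, lo, hi, rng))
    return peaks

peaks = locate_peaks(toy_functional, (0.0, 0.0), (4.0, 3.0), k=2)
```

With a real array, `toy_functional` would be replaced by an SRP-PHAT evaluation over candidate source positions, and the sample counts, box size, and separation threshold would need tuning to the room and the number of talkers.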