Automatic Speech Segmentation Using Average Level Crossing Rate Information

Abstract

We explore new methods of determining automatically derived units for segmenting speech. For detecting signal changes, temporal features are more reliable than standard feature-vector-domain methods, since both magnitude and phase information are retained. Motivated by auditory models, we present a method based on the average level crossing rate (ALCR) of the signal to detect significant temporal changes. The technique uses an adaptive level allocation scheme that allocates levels depending on the signal pdf and SNR. We compare its segmentation performance with manual phonemic segmentation and with maximum likelihood (ML) segmentation on 100 TIMIT sentences. The ALCR method matches the best segmentation performance without the a priori knowledge of the number of segments that ML segmentation requires.
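
The idea of a level crossing rate averaged over several levels can be illustrated with a minimal sketch. It is not the method evaluated in the paper: the levels here are uniformly spaced and the averaging window is a fixed rectangular one, both assumptions made for illustration, and the adaptive level allocation based on the signal pdf and SNR is not reproduced. The function names `level_crossings` and `alcr` are hypothetical.

```python
import math


def level_crossings(signal, level):
    """Flag each sample interval in which the signal crosses `level`.

    Returns a list of 0/1 values of length len(signal) - 1.
    """
    return [
        1 if (a < level) != (b < level) else 0
        for a, b in zip(signal, signal[1:])
    ]


def alcr(signal, levels, window):
    """Crossings per sample interval, averaged over levels and a sliding window.

    Assumption: a centered rectangular window of `window` intervals,
    truncated at the signal edges.
    """
    n = len(signal) - 1
    total = [0] * n
    for level in levels:
        for i, flag in enumerate(level_crossings(signal, level)):
            total[i] += flag

    half = window // 2
    rate = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        rate.append(sum(total[lo:hi]) / ((hi - lo) * len(levels)))
    return rate


# Synthetic example: a slow sinusoid followed by a fast one.
slow = [math.sin(2 * math.pi * n / 100) for n in range(200)]
fast = [math.sin(2 * math.pi * n / 10) for n in range(200)]
signal = slow + fast

# Assumed uniform levels; the paper allocates levels adaptively instead.
levels = [-0.75, -0.5, -0.25, 0.0, 0.25, 0.5, 0.75]
rate = alcr(signal, levels, window=41)

print("mean ALCR, slow part:", sum(rate[50:150]) / 100)
print("mean ALCR, fast part:", sum(rate[250:350]) / 100)
```

In this toy signal the ALCR contour rises where the faster sinusoid begins, which is the kind of temporal change a segmenter could mark as a candidate boundary. How the contour is turned into segment boundaries is not described in the abstract and is left out here.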
