A comparison of neural network architectures for the classification of three types of infant cry vocalizations

Abstract

The analysis of infant cry vocalizations has been the focus of a number of efforts over the past thirty years. Since crying is one of the few means an infant has of communicating with its care-giving environment, it is thought that information about the infant's state, such as hunger or pain, can be inferred from cry vocalizations. To date, research groups have found that a number of different cry types can be distinguished by ear, and at least one group has attempted to automate this classification process. This paper presents the results of another attempt at automating the discrimination process, this time using artificial neural networks (ANNs). The input data consist of successive frames of 10 mel-cepstrum coefficients, with each input sequence ranging in length from 0.75 seconds to 1 second. The mel-cepstrum coefficients were extracted from anger, fear, and pain cries. Three ANN architectures were compared: a simple feed-forward network (FF), a recurrent neural network (RNN), and a time-delay neural network (TDNN). The tests conducted to date indicate that ANNs are a useful tool for cry classification and merit further study in this domain.
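As an illustration of how a sequence of 10-coefficient frames might pass through a time-delay network, the following is a minimal sketch in plain Python. It is not the implementation used in the paper. The delay window, hidden-layer size, frame count, averaging over time, random weights, and all function names are assumptions made for the example; only the 10 coefficients per frame and the three cry classes come from the abstract.

```python
import math
import random

N_COEFFS = 10   # mel-cepstrum coefficients per frame (from the abstract)
N_CLASSES = 3   # anger, fear, pain (from the abstract)
DELAY = 3       # assumed: frames seen together by each time-delay unit
HIDDEN = 8      # assumed: number of hidden units
N_FRAMES = 50   # assumed: frames in one cry segment


def time_delay_layer(frames, weights, bias, delay):
    """Apply the same weights to every window of `delay` consecutive frames.

    frames:  list of T frames, each a list of n_in floats
    weights: list of n_out rows, each of length delay * n_in
    bias:    list of n_out floats
    returns: list of T - delay + 1 output vectors, each of length n_out
    """
    outputs = []
    for t in range(len(frames) - delay + 1):
        # Flatten the current window of frames into one input vector.
        window = [x for frame in frames[t:t + delay] for x in frame]
        outputs.append([
            math.tanh(sum(w * x for w, x in zip(row, window)) + b)
            for row, b in zip(weights, bias)
        ])
    return outputs


def classify(frames, params):
    """Return class probabilities for one sequence of frames."""
    hidden = time_delay_layer(frames, params["w1"], params["b1"], params["delay"])
    # Assumed pooling step: average the hidden activations over time.
    pooled = [sum(column) / len(hidden) for column in zip(*hidden)]
    logits = [
        sum(w * h for w, h in zip(row, pooled)) + b
        for row, b in zip(params["w2"], params["b2"])
    ]
    # Numerically stable softmax over the three classes.
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
params = {
    "delay": DELAY,
    # Untrained random weights, for shape illustration only.
    "w1": [[rng.gauss(0, 0.1) for _ in range(DELAY * N_COEFFS)] for _ in range(HIDDEN)],
    "b1": [0.0] * HIDDEN,
    "w2": [[rng.gauss(0, 0.1) for _ in range(HIDDEN)] for _ in range(N_CLASSES)],
    "b2": [0.0] * N_CLASSES,
}

# Synthetic stand-in for one cry segment; real input would be extracted coefficients.
cry = [[rng.gauss(0, 1) for _ in range(N_COEFFS)] for _ in range(N_FRAMES)]
probs = classify(cry, params)
```

Setting `DELAY` to 1 reduces the hidden layer to a feed-forward network applied independently to each frame, which is one way to see what the time-delay structure adds: each hidden unit sees a short span of temporal context rather than a single frame.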
