Using a hidden Markov model to detect inshore Bryde’s whale short pulse calls

As interest in studying cetaceans’ sounds grows, so does the motivation to develop automated sound detection and classification methods. Passive acoustic monitoring (PAM), which was discussed in our previous post, is used extensively to record cetaceans’ sounds over a predetermined period to understand their daily activities within their ecosystem. However, the sound datasets collected with PAM are usually too large to analyse manually.

 

  1. What is the Hidden Markov model?
  2. Models to detect mysticetes’ and cetaceans’ pulse calls
  3. Highlights of the research
  4. Conclusion

 

This is where the hidden Markov model (HMM) becomes particularly useful as a tool to automatically detect and classify these cetaceans’ sounds. However, the performance of an HMM relies heavily on the feature extraction method employed, such as Mel-scale frequency cepstral coefficients (MFCC) or linear predictive coding (LPC).

1. What is the Hidden Markov model?

The HMM is a machine learning tool widely used for speech analysis and recognition. It can be viewed as a stochastic state machine in which each change of hidden state ends with the emission of a symbol. The HMM is flexible and can model and classify both the spectral and the temporal features of a set of sound signals.
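To make the state-machine view concrete, the sketch below decodes the most likely sequence of hidden states with the Viterbi algorithm, a standard HMM decoding method. It is only an illustration: the two states ("noise" and "call"), the "low"/"high" energy symbols, and every probability are hypothetical values chosen for this example and are not taken from the research discussed here.

```python
import math

# HYPOTHETICAL two-state model: every probability below is illustrative,
# not a value from the cited research.
STATES = ("noise", "call")
START = {"noise": 0.9, "call": 0.1}
TRANS = {
    "noise": {"noise": 0.9, "call": 0.1},
    "call": {"noise": 0.2, "call": 0.8},
}
# Each hidden state emits one of two symbols: "low" or "high" frame energy.
EMIT = {
    "noise": {"low": 0.9, "high": 0.1},
    "call": {"low": 0.2, "high": 0.8},
}


def viterbi(observations):
    """Return the most likely hidden state sequence for the observations."""
    if not observations:
        return []
    # Log-probabilities avoid numerical underflow on long sequences.
    scores = {
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
        for s in STATES
    }
    backpointers = []
    for symbol in observations[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            # Best previous state from which to enter state s.
            prev = max(STATES, key=lambda p: scores[p] + math.log(TRANS[p][s]))
            new_scores[s] = (
                scores[prev] + math.log(TRANS[prev][s]) + math.log(EMIT[s][symbol])
            )
            pointers[s] = prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state to recover the full path.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]
```

With these illustrative probabilities, `viterbi(["low", "low", "high", "high", "high", "low", "low"])` labels the three middle frames as "call". In a real detector the symbols would be replaced by feature vectors extracted from the recording, and the probabilities would be learned from labelled calls.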

2. Models to detect mysticetes’ and cetaceans’ pulse calls

In most cases, the more reliable the feature vector extracted from the labelled sounds, the higher the sensitivity of the HMM. Even though feature extraction methods such as MFCC and LPC are widely used, their design is based on filters that require windowing, fast Fourier transforms (FFT), and logarithm operations. Consequently, they increase the computational time complexity of the HMM.

In the research listed below, a selective time-domain feature extraction method is proposed that can be easily adapted to the HMM. This method uses a combination of simple but robust parameters, namely “the mean, relative amplitude and relative power/energy (MAP), which are selected based on empirical observations of the call to be detected”. The performance of the suggested MAP-HMM was verified using an acoustic dataset of continuous recordings of inshore Bryde’s whale short pulse calls collected at a single site in False Bay. Apart from exhibiting a low computational complexity, the proposed MAP-HMM “offers superior sensitivity and false discovery rate performances in comparison to the LPC-HMM and MFCC-HMM” (Ogundile, Usman, Babalola & Versfeld, 2020).
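A rough sketch of what such time-domain features might look like is given below. This post does not spell out the exact definitions used in the paper, so the ones here are assumptions made for illustration: the function name `map_features`, the non-overlapping framing, and the choice to measure "relative" amplitude and power against the whole recording are all hypothetical.

```python
def map_features(signal, frame_len):
    """Compute (mean, relative amplitude, relative power) for each frame.

    ASSUMPTIONS (not from the cited paper): frames do not overlap, and
    "relative" means relative to the peak amplitude and the mean power
    of the whole recording.
    """
    if frame_len <= 0:
        raise ValueError("frame_len must be positive")
    if not signal:
        return []
    # Reference values for the whole recording; fall back to 1.0 for an
    # all-zero signal so that the divisions below stay defined.
    global_peak = max(abs(x) for x in signal) or 1.0
    global_power = (sum(x * x for x in signal) / len(signal)) or 1.0

    features = []
    # Trailing samples that do not fill a whole frame are ignored.
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        mean = sum(frame) / frame_len
        rel_amplitude = max(abs(x) for x in frame) / global_peak
        rel_power = (sum(x * x for x in frame) / frame_len) / global_power
        features.append((mean, rel_amplitude, rel_power))
    return features
```

Note that the sketch needs only additions, multiplications and comparisons per sample, with no windowing, FFT or logarithm step, which is the kind of saving in computation that the time-domain approach is aiming for.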

The research further investigates an empirical mode decomposition (EMD) based HMM approach for the detection of mysticetes’ pulse calls, such as those of Bryde’s whales. The detection ability of the HMM in this approach depends on the feature extraction (FE) technique deployed. In this regard, the EMD, along with its ensemble variant (EEMD), is proposed as a performance-efficient alternative to the Mel-scale frequency cepstral coefficient (MFCC) and linear predictive coefficient (LPC) FE techniques. The proposed EMD-HMM and EEMD-HMM approaches achieved better performance than the MFCC-HMM and LPC-HMM approaches.

 

Figure 1: Time series and spectrogram representation of an inshore Bryde’s whale’s pulse calls.

 

3. Highlights of the research

  • A selective time-domain feature extraction technique is described.
  • The technique is simple but robust and can be easily adapted to the hidden Markov model.
  • The performance of the technique is verified with the short pulse calls of an inshore Bryde’s whale.
  • Results are assessed against the conventional MFCC and LPC methods.
  • The technique offered promising sensitivity and false discovery rate results.
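For readers unfamiliar with the two metrics named in these highlights, the short sketch below computes them from detection counts using their standard definitions: sensitivity is the fraction of real calls that were detected, and false discovery rate is the fraction of detections that were not real calls. The function name and the example counts are hypothetical and are not results from the research.

```python
def detection_metrics(true_positives, false_positives, false_negatives):
    """Return (sensitivity, false discovery rate) from detection counts."""
    actual_calls = true_positives + false_negatives
    detections = true_positives + false_positives
    # Guard against division by zero when there are no calls or no detections.
    sensitivity = true_positives / actual_calls if actual_calls else 0.0
    fdr = false_positives / detections if detections else 0.0
    return sensitivity, fdr


# HYPOTHETICAL counts: 90 calls detected, 10 false alarms, 30 calls missed.
sensitivity, fdr = detection_metrics(90, 10, 30)
print(f"sensitivity = {sensitivity:.2f}, false discovery rate = {fdr:.2f}")
```

A good detector pushes sensitivity towards 1 while keeping the false discovery rate close to 0.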

4. Conclusion

The EMD and EEMD were introduced as performance-efficient alternative FE methods for HMM detection of pulse calls. They offered improved sensitivity and false-positive rate performance while reducing the computational load of the HMM. These approaches can therefore be useful in real-time detection of mysticetes’ pulse calls, such as those of the Bryde’s whale.

The research also suggested a simple but robust FE technique that can be easily adapted to the HMM. This feature extraction method combines simple but robust characteristics of the sound to be detected, such as the mean, relative amplitude and relative power. The detection accuracy and false discovery rate of the proposed MAP-HMM were verified using an acoustic dataset of continuous recordings of inshore Bryde’s whale short pulse calls collected in False Bay. The method proved very effective in comparison to the existing MFCC-HMM and LPC-HMM methods, offering good sensitivity and false discovery rate performance at a reduced computational time and delay complexity. It could therefore be a useful real-time approach to studying the movement of obscure but vocal cetaceans, such as inshore Bryde’s whales, that produce characteristic calls.

 

Read the complete research papers at:

https://www.sciencedirect.com/science/article/abs/pii/S1574954120300376

https://asa.scitation.org/doi/10.1121/10.0000717