Abstract
A modified hierarchical mixtures of experts (HME) architecture is presented for text-dependent speaker identification. A new gating network is added to the original HME architecture so that both instantaneous and transitional spectral information can be used. The statistical model underlying the proposed architecture is presented, and learning is treated as a maximum likelihood problem; in particular, an expectation-maximization (EM) algorithm is proposed for adjusting the architecture's parameters. The architecture is evaluated on a database of isolated digit utterances by 10 male speakers. Experimental results demonstrate that the proposed architecture outperforms the original HME architecture in text-dependent speaker identification. © 1996 IEEE.
Original language | English |
---|---|
Pages (from-to) | 1309-1313 |
Number of pages | 4 |
Journal | IEEE Transactions on Neural Networks |
Volume | 7 |
Issue number | 5 |
Publication status | Published - 1996 |