UBM-based real-time speaker segmentation for broadcasting news

Ting Yao Wu, Lie Lu, Ke Chen, Hong Jiang Zhang

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    Abstract

    This paper addresses the problem of real-time speaker change detection in broadcast news, in which no prior knowledge of the speakers is assumed. Our speaker segmentation is a "coarse to refine" process consisting of two stages: pre-segmentation and refinement. In the pre-segmentation stage, a new approach based on the Gaussian Mixture Model - Universal Background Model (GMM-UBM) is proposed to categorize feature vectors into three sets, i.e. a reliable speaker-related set, a doubtful speaker-related set and an unreliable speaker-related set, in order to enhance the effect of the reliable speaker-related feature vectors. Potential speaker change boundaries are then detected based on a novel distance measure. In the refinement stage, incremental speaker adaptation (ISA), which suits the real-time requirement, is proposed to obtain considerably precise speaker models so that the potential speaker change boundaries can be confirmed and refined. Experimental results demonstrate that our approach yields satisfactory performance.
    Original language: English
    Title of host publication: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
    Publisher: IEEE
    Pages: 193-196
    Number of pages: 3
    Volume: 2
    Publication status: Published - 2003
    Event: 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing - Hong Kong
    Duration: 1 Jul 2003 → …

    Conference

    Conference: 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing
    City: Hong Kong
    Period: 1/07/03 → …
