Statistical Voice Activity Detection Using Low-Variance Spectrum Estimation and an Adaptive Threshold

A. Davis, S.E. Nordholm, Roberto Togneri

    Research output: Contribution to journal › Article › peer-review

    140 Citations (Scopus)

    Abstract

    Traditionally, voice activity detection algorithms are based on any combination of general speech properties such as temporal energy variations, periodicity, and spectrum. This paper describes a novel statistical method for voice activity detection using a signal-to-noise ratio measure. The method employs a low-variance spectrum estimate and determines an optimal threshold based on the estimated noise statistics. A possible implementation is presented and evaluated over a large test set and compared to current modern standardized algorithms. The evaluations indicate promising results with the proposed scheme being comparable or favorable over the whole test set.
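    The general idea in the abstract (an SNR measure computed from a low-variance spectrum estimate, compared against a threshold derived from noise statistics) can be illustrated with a rough sketch. This is not the paper's algorithm: the Welch-style segment averaging, the mean-plus-k-standard-deviations threshold, the assumption that the first few frames contain only noise, and all function and parameter names (`frame_signal`, `low_variance_psd`, `vad`, `n_noise_frames`, `k`) are illustrative assumptions. The threshold here is computed once from the noise frames rather than updated over time.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Split a 1-D NumPy array into overlapping frames (one frame per row)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def low_variance_psd(frame, n_seg=4):
    """Welch-style estimate: average the periodograms of n_seg half-overlapping,
    Hann-windowed sub-segments. Averaging lowers the variance of the estimate
    at the cost of frequency resolution."""
    seg_len = 2 * len(frame) // (n_seg + 1)
    hop = seg_len // 2
    win = np.hanning(seg_len)
    periodograms = [
        np.abs(np.fft.rfft(frame[i * hop: i * hop + seg_len] * win)) ** 2
        for i in range(n_seg)
    ]
    return np.mean(periodograms, axis=0) / np.sum(win ** 2)


def vad(x, frame_len=256, hop=128, n_noise_frames=10, k=3.0):
    """Return (decisions, snr, threshold).

    Assumption: the first n_noise_frames frames contain noise only.
    """
    frames = frame_signal(x, frame_len, hop)
    psd = np.array([low_variance_psd(f) for f in frames])

    # Noise spectrum estimated from the assumed noise-only frames.
    noise_psd = psd[:n_noise_frames].mean(axis=0) + 1e-12

    # Per-frame SNR measure: signal-to-noise power ratio averaged over bins.
    snr = np.mean(psd / noise_psd, axis=1)

    # Threshold set from the statistics of the measure on the noise frames.
    noise_snr = snr[:n_noise_frames]
    threshold = noise_snr.mean() + k * noise_snr.std()
    return snr > threshold, snr, threshold
```

    Calling `vad(x)` on a sampled signal returns a boolean decision per frame along with the SNR measure and the threshold, so the margin between the two can be inspected. A larger `k` lowers the false-alarm rate on noise at the cost of missing low-SNR speech.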
    Original language: English
    Pages (from-to): 412-424
    Journal: IEEE ACM Transactions on Audio, Speech, and Language Processing
    Volume: 14
    Issue number: 2
    DOIs
    Publication status: Published - 2006
