  • Book
  • © 2015

Speech and Audio Processing for Coding, Enhancement and Recognition

  • Offers readers a single-source reference on the significant applications of speech and audio processing to speech coding, speech enhancement and speech/speaker recognition
  • Enables readers involved in algorithm development and implementation issues for speech coding to understand the historical development and future challenges in speech coding research
  • Discusses speech coding methods yielding bit-streams that are multi-rate and scalable for Voice-over-IP (VoIP) Networks
  • Presents an overview of recent developments in conversational speech coding technologies, important new algorithmic advances, and recent standardization activities in ITU-T, 3GPP, 3GPP2, MPEG and IETF that offer a significantly improved user experience during voice calls on existing and future communication systems
  • Presents an overview of ensemble learning efforts based on different machine learning techniques that have emerged in automatic speech recognition in recent years
  • Emphasizes signal processing for efficient time-domain and spectral-domain representations, reduction of noise, channel and session variabilities, extraction of temporal and spectral features for recognition and modeling
  • Informs readers of the latest research and developments in advanced statistical estimation and deep neural networks for speech recognition
  • Presents readers with the architectural framework and key approaches involved in the “hot” research areas of emotion recognition and speaker diarization systems
  • Provides readers with a more enriching view of state-of-the-art research in speech enhancement arising from novel multi-microphone and time-frequency solutions
  • Includes supplementary material: sn.pub/extras

Buy it now

Buying options

eBook USD 79.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book USD 99.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info
Hardcover Book USD 109.99
Price excludes VAT (USA)
  • Durable hardcover edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Table of contents (10 chapters)

  1. Front Matter

    Pages i-x
  2. Overview of Speech and Audio Coding

    1. Front Matter

      Pages 1-1
    2. Challenges in Speech Coding Research

      • Jerry D. Gibson
      Pages 19-39
    3. Recent Speech Coding Technologies and Standards

      • Daniel J. Sinder, Imre Varga, Venkatesh Krishnan, Vivek Rajendran, Stéphane Villette
      Pages 75-109
  3. Review and Challenges in Speech, Speaker and Emotion Recognition

    1. Front Matter

      Pages 111-111
    2. Ensemble Learning Approaches in Speech Recognition

      • Yunxin Zhao, Jian Xue, Xin Chen
      Pages 113-152
    3. Speech Based Emotion Recognition

      • Vidhyasaharan Sethu, Julien Epps, Eliathamby Ambikairajah
      Pages 197-228
    4. Speaker Diarization: An Emerging Research

      • Trung Hieu Nguyen, Eng Siong Chng, Haizhou Li
      Pages 229-277
  4. Current Trends in Speech Enhancement

    1. Front Matter

      Pages 279-279
    2. Maximum A Posteriori Spectral Estimation with Source Log-Spectral Priors for Multichannel Speech Enhancement

      • Yasuaki Iwata, Tomohiro Nakatani, Takuya Yoshioka, Masakiyo Fujimoto, Hirofumi Saito
      Pages 281-317
    3. Modulation Processing for Speech Enhancement

      • Kuldip Paliwal, Belinda Schwerin
      Pages 319-345

About this book

This book describes the basic principles underlying the generation, coding, transmission and enhancement of speech and audio signals, including advanced statistical and machine learning techniques for speech and speaker recognition, with an overview of the key innovations in these areas. Key research undertaken in speech coding, speech enhancement, speech recognition, emotion recognition and speaker diarization is also presented, along with recent advances and new paradigms in these areas.

Editors and Affiliations

  • Dept. of Electrical Engineering, Santa Clara University, Santa Clara, USA

    Tokunbo Ogunfunmi

  • School of EE&C Engineering, The University of Western Australia, Crawley, Australia

    Roberto Togneri

  • Qualcomm Inc., Santa Clara, USA

    Madihally (Sim) Narasimha

About the editors

Tokunbo Ogunfunmi is an Associate Professor of Electrical Engineering and an Associate Dean for Research and Faculty Development at Santa Clara University.

Roberto Togneri is a professor with the School of Electrical, Electronic and Computer Engineering at The University of Western Australia.

Madihally (Sim) Narasimha is a Senior Director of Technology at Qualcomm Inc.
