  • Book
  • © 2017

New Era for Robust Speech Recognition

Exploiting Deep Learning

  • The field of automatic speech recognition has evolved greatly since the introduction of deep learning

  • Covers the state of the art in noise robustness for deep neural-network-based speech recognition

  • Includes descriptions of benchmark tools and datasets widely used in the field


Table of contents (20 chapters)

  1. Front Matter

    Pages i-xvii
  2. Introduction

    1. Front Matter

      Pages 1-1
    2. Preliminaries

      • Shinji Watanabe, Marc Delcroix, Florian Metze, John R. Hershey
      Pages 3-17
  3. Approaches to Robust Automatic Speech Recognition

    1. Front Matter

      Pages 19-19
    2. Multichannel Speech Enhancement Approaches to DNN-Based Far-Field Speech Recognition

      • Marc Delcroix, Takuya Yoshioka, Nobutaka Ito, Atsunori Ogawa, Keisuke Kinoshita, Masakiyo Fujimoto et al.
      Pages 21-49
    3. Multichannel Spatial Clustering Using Model-Based Source Separation

      • Michael I. Mandel, Jon P. Barker
      Pages 51-77
    4. Discriminative Beamforming with Phase-Aware Neural Networks for Speech Enhancement and Recognition

      • Xiong Xiao, Shinji Watanabe, Hakan Erdogan, Michael Mandel, Liang Lu, John R. Hershey et al.
      Pages 79-104
    5. Raw Multichannel Processing Using Deep Neural Networks

      • Tara N. Sainath, Ron J. Weiss, Kevin W. Wilson, Arun Narayanan, Michiel Bacchiani, Bo Li et al.
      Pages 105-133
    6. Novel Deep Architectures in Speech Processing

      • John R. Hershey, Jonathan Le Roux, Shinji Watanabe, Scott Wisdom, Zhuo Chen, Yusuf Isik
      Pages 135-164
    7. Deep Recurrent Networks for Separation and Recognition of Single-Channel Speech in Nonstationary Background Audio

      • Hakan Erdogan, John R. Hershey, Shinji Watanabe, Jonathan Le Roux
      Pages 165-186
    8. Robust Features in Deep-Learning-Based Speech Recognition

      • Vikramjit Mitra, Horacio Franco, Richard M. Stern, Julien van Hout, Luciana Ferrer, Martin Graciarena et al.
      Pages 187-217
    9. Adaptation of Deep Neural Network Acoustic Models for Robust Automatic Speech Recognition

      • Khe Chai Sim, Yanmin Qian, Gautam Mantena, Lahiru Samarakoon, Souvik Kundu, Tian Tan
      Pages 219-243
    10. Training Data Augmentation and Data Selection

      • Martin Karafiát, Karel Veselý, Kateřina Žmolíková, Marc Delcroix, Shinji Watanabe, Lukáš Burget et al.
      Pages 245-260
    11. Advanced Recurrent Neural Networks for Automatic Speech Recognition

      • Yu Zhang, Dong Yu, Guoguo Chen
      Pages 261-279
    12. Sequence-Discriminative Training of Neural Networks

      • Guoguo Chen, Yu Zhang, Dong Yu
      Pages 281-297
    13. End-to-End Architectures for Speech Recognition

      • Yajie Miao, Florian Metze
      Pages 299-323
  4. Resources

    1. Front Matter

      Pages 325-325
    2. The CHiME Challenges: Robust Speech Recognition in Everyday Environments

      • Jon P. Barker, Ricard Marxer, Emmanuel Vincent, Shinji Watanabe
      Pages 327-344
    3. The REVERB Challenge: A Benchmark Task for Reverberation-Robust ASR Techniques

      • Keisuke Kinoshita, Marc Delcroix, Sharon Gannot, Emanuël A. P. Habets, Reinhold Haeb-Umbach, Walter Kellermann et al.
      Pages 345-354
    4. Distant Speech Recognition Experiments Using the AMI Corpus

      • Steve Renals, Pawel Swietojanski
      Pages 355-368

About this book

This book covers the state of the art in deep neural-network-based methods for noise robustness in distant speech recognition applications. It provides insights and detailed descriptions of some of the new concepts and key technologies in the field, including novel architectures for speech enhancement, microphone arrays, robust features, acoustic model adaptation, training data augmentation, and training criteria. The contributed chapters also include descriptions of real-world applications, benchmark tools, and datasets widely used in the field.

This book is intended for researchers and practitioners working in the field of speech processing and recognition who are interested in the latest deep learning techniques for noise robustness. It will also be of interest to graduate students in electrical engineering or computer science, who will find it a useful guide to this field of research.


Editors and Affiliations

  • Mitsubishi Electric Research Laboratories (MERL), Cambridge, USA

    Shinji Watanabe, John R. Hershey

  • NTT Communication Science Laboratories, NTT Corporation, Kyoto, Japan

    Marc Delcroix

  • Language Technologies Institute, Carnegie Mellon University, Pittsburgh, USA

    Florian Metze

Bibliographic Information

  • Book Title: New Era for Robust Speech Recognition

  • Book Subtitle: Exploiting Deep Learning

  • Editors: Shinji Watanabe, Marc Delcroix, Florian Metze, John R. Hershey

  • DOI: https://doi.org/10.1007/978-3-319-64680-0

  • Publisher: Springer Cham

  • eBook Packages: Computer Science, Computer Science (R0)

  • Copyright Information: Springer International Publishing AG 2017

  • Hardcover ISBN: 978-3-319-64679-4 (Published: 10 November 2017)

  • Softcover ISBN: 978-3-319-87849-2 (Published: 24 May 2018)

  • eBook ISBN: 978-3-319-64680-0 (Published: 30 October 2017)

  • Edition Number: 1

  • Number of Pages: XVII, 436

  • Number of Illustrations: 50 b/w illustrations, 26 illustrations in colour

  • Topics: Artificial Intelligence; Signal, Image and Speech Processing; Natural Language Processing (NLP); Linguistics, general
