Name: New Era for Robust Speech Recognition
ISBN: 978-3-319-64680-0

Editors:

Shinji Watanabe,
Marc Delcroix,
Florian Metze,
John R. Hershey

Shinji Watanabe
1. Mitsubishi Electric Research Laboratories (MERL), Cambridge, USA
Marc Delcroix
1. NTT Communication Science Laboratories, NTT Corporation, Kyoto, Japan
Florian Metze
1. Language Technologies Institute, Carnegie Mellon University, Pittsburgh, USA
John R. Hershey
1. Mitsubishi Electric Research Laboratories (MERL), Cambridge, USA

The field of automatic speech recognition has evolved greatly since the introduction of deep learning
Covers the state of the art in noise robustness for deep-neural-network-based speech recognition
Includes descriptions of benchmark tools and datasets widely used in the field

46k Accesses
109 Citations
4 Altmetric


Table of contents (20 chapters)

Front Matter

Pages i-xvii
Introduction
1. Front Matter
  
  Pages 1-1
2. Preliminaries
  
  Shinji Watanabe, Marc Delcroix, Florian Metze, John R. Hershey
  
  Pages 3-17
Approaches to Robust Automatic Speech Recognition
1. Front Matter
  
  Pages 19-19
2. Multichannel Speech Enhancement Approaches to DNN-Based Far-Field Speech Recognition
  
  Marc Delcroix, Takuya Yoshioka, Nobutaka Ito, Atsunori Ogawa, Keisuke Kinoshita, Masakiyo Fujimoto et al.
  
  Pages 21-49
3. Multichannel Spatial Clustering Using Model-Based Source Separation
  
  Michael I. Mandel, Jon P. Barker
  
  Pages 51-77
4. Discriminative Beamforming with Phase-Aware Neural Networks for Speech Enhancement and Recognition
  
  Xiong Xiao, Shinji Watanabe, Hakan Erdogan, Michael Mandel, Liang Lu, John R. Hershey et al.
  
  Pages 79-104
5. Raw Multichannel Processing Using Deep Neural Networks
  
  Tara N. Sainath, Ron J. Weiss, Kevin W. Wilson, Arun Narayanan, Michiel Bacchiani, Bo Li et al.
  
  Pages 105-133
6. Novel Deep Architectures in Speech Processing
  
  John R. Hershey, Jonathan Le Roux, Shinji Watanabe, Scott Wisdom, Zhuo Chen, Yusuf Isik
  
  Pages 135-164
7. Deep Recurrent Networks for Separation and Recognition of Single-Channel Speech in Nonstationary Background Audio
  
  Hakan Erdogan, John R. Hershey, Shinji Watanabe, Jonathan Le Roux
  
  Pages 165-186
8. Robust Features in Deep-Learning-Based Speech Recognition
  
  Vikramjit Mitra, Horacio Franco, Richard M. Stern, Julien van Hout, Luciana Ferrer, Martin Graciarena et al.
  
  Pages 187-217
9. Adaptation of Deep Neural Network Acoustic Models for Robust Automatic Speech Recognition
  
  Khe Chai Sim, Yanmin Qian, Gautam Mantena, Lahiru Samarakoon, Souvik Kundu, Tian Tan
  
  Pages 219-243
10. Training Data Augmentation and Data Selection
  
  Martin Karafiát, Karel Veselý, Kateřina Žmolíková, Marc Delcroix, Shinji Watanabe, Lukáš Burget et al.
  
  Pages 245-260
11. Advanced Recurrent Neural Networks for Automatic Speech Recognition
  
  Yu Zhang, Dong Yu, Guoguo Chen
  
  Pages 261-279
12. Sequence-Discriminative Training of Neural Networks
  
  Guoguo Chen, Yu Zhang, Dong Yu
  
  Pages 281-297
13. End-to-End Architectures for Speech Recognition
  
  Yajie Miao, Florian Metze
  
  Pages 299-323
Resources
1. Front Matter
  
  Pages 325-325
2. The CHiME Challenges: Robust Speech Recognition in Everyday Environments
  
  Jon P. Barker, Ricard Marxer, Emmanuel Vincent, Shinji Watanabe
  
  Pages 327-344
3. The REVERB Challenge: A Benchmark Task for Reverberation-Robust ASR Techniques
  
  Keisuke Kinoshita, Marc Delcroix, Sharon Gannot, Emanuël A. P. Habets, Reinhold Haeb-Umbach, Walter Kellermann et al.
  
  Pages 345-354
4. Distant Speech Recognition Experiments Using the AMI Corpus
  
  Steve Renals, Pawel Swietojanski
  
  Pages 355-368

About this book

This book covers the state of the art in deep-neural-network-based methods for noise robustness in distant speech recognition applications. It provides insights into, and detailed descriptions of, some of the new concepts and key technologies in the field, including novel architectures for speech enhancement, microphone arrays, robust features, acoustic model adaptation, training data augmentation, and training criteria. The contributed chapters also include descriptions of real-world applications, benchmark tools, and datasets widely used in the field.

This book is intended for researchers and practitioners working in the field of speech processing and recognition who are interested in the latest deep learning techniques for noise robustness. It will also be of interest to graduate students in electrical engineering or computer science, who will find it a useful guide to this field of research.

Editors and Affiliations

Mitsubishi Electric Research Laboratories (MERL), Cambridge, USA

Shinji Watanabe, John R. Hershey

NTT Communication Science Laboratories, NTT Corporation, Kyoto, Japan

Marc Delcroix

Language Technologies Institute, Carnegie Mellon University, Pittsburgh, USA

Florian Metze

Bibliographic Information

Book Title: New Era for Robust Speech Recognition
Book Subtitle: Exploiting Deep Learning
Editors: Shinji Watanabe, Marc Delcroix, Florian Metze, John R. Hershey
DOI: https://doi.org/10.1007/978-3-319-64680-0
Publisher: Springer Cham
eBook Packages: Computer Science, Computer Science (R0)
Copyright Information: Springer International Publishing AG 2017
Hardcover ISBN: 978-3-319-64679-4 (published 10 November 2017)
Softcover ISBN: 978-3-319-87849-2 (published 24 May 2018)
eBook ISBN: 978-3-319-64680-0 (published 30 October 2017)
Edition Number: 1
Number of Pages: XVII, 436
Number of Illustrations: 50 b/w illustrations, 26 illustrations in colour
Topics: Artificial Intelligence, Signal, Image and Speech Processing, Natural Language Processing (NLP), Linguistics, general
