Name: Speech Separation by Humans and Machines
ISBN: 978-0-387-22794-8

Editors:

Pierre Divenyi, East Bay Institute for Research and Education, USA

Provides comprehensive and authoritative discussion of how humans separate speech and the state of the art in approaching these abilities with machines

26k Accesses
441 Citations
8 Altmetric


Table of contents (21 chapters)

Front Matter

Pages i-xxiv

Speech Segregation: Problems and Perspectives
- Chris Darwin
Pages 1-4
Auditory Scene Analysis
- Elyse S. Sussman
Pages 5-12
Speech separation
- Claude Alain
Pages 13-30
Recurrent Timing Nets for F0-based Speaker Separation
- Peter Cariani
Pages 31-53
Blind Source Separation Using Graphical Models
- Te-Won Lee
Pages 55-64
Speech Recognizer Based Maximum Likelihood Beamforming
- Bhiksha Raj, Michael Seltzer, Manuel Jesus Reyes-Gomez
Pages 65-82
Exploiting Redundancy to Construct Listening Systems
- Paris Smaragdis
Pages 83-95
Automatic Speech Processing by Inference in Generative Models
- Sam T. Roweis
Pages 97-133
Signal Separation Motivated by Human Auditory Perception: Applications to Automatic Speech Recognition
- Richard M. Stern
Pages 135-154
Speech Segregation Using an Event-synchronous Auditory Image and STRAIGHT
- Toshio Irino, Roy D. Patterson, Hideki Kawahara
Pages 155-165
Underlying Principles of a High-quality Speech Manipulation System STRAIGHT and Its Application to Speech Segregation
- Hideki Kawahara, Toshio Irino
Pages 167-180
On Ideal Binary Mask As the Computational Goal of Auditory Scene Analysis
- DeLiang Wang
Pages 181-197
The History and Future of CASA
- Malcolm Slaney
Pages 199-211
Techniques for Robust Speech Recognition in Noisy and Reverberant Conditions
- Guy J. Brown, Kalle J. Palomäki
Pages 213-220
Source Separation, Localization, and Comprehension in Humans, Machines, and Human-machine Systems
- Nat Durlach, Steve Colburn, Gerald Kidd, Chris Mason, Barbara Shinn-Cunningham, Tanya Arbogast et al.
Pages 221-243
The Cancellation Principle in Acoustic Scene Analysis
- Alain de Cheveigné
Pages 245-259
Informational and Energetic Masking Effects in Multitalker Speech Perception
- Douglas S. Brungart
Pages 261-267
Masking the Feature Information In Multi-stream Speech-analogue Displays
- Pierre L. Divenyi
Pages 269-281
Interplay Between Visual and Audio Scene Analysis
- Ziyou Xiong, Thomas S. Huang
Pages 283-293

About this book

There is a serious problem in the recognition of sounds. It derives from the fact that they do not usually occur in isolation but in an environment in which a number of sound sources (voices, traffic, footsteps, music on the radio, and so on) are active at the same time. When these sounds arrive at the ear of the listener, the complex pressure waves coming from the separate sources add together to produce a single, more complex pressure wave that is the sum of the individual waves. The problem is how to form separate mental descriptions of the component sounds, despite the fact that the “mixture wave” does not directly reveal the waves that have been summed to form it. The name auditory scene analysis (ASA) refers to the process whereby the auditory systems of humans and other animals are able to solve this mixture problem. The process is believed to be quite general, not specific to speech sounds or any other type of sounds, and to exist in many species other than humans. It seems to involve assigning spectral energy to distinct “auditory objects” and “streams” that serve as the mental representations of distinct sound sources in the environment and the patterns that they make as they change over time. How this energy is assigned will affect the perceived number of auditory sources, their perceived timbres, loudnesses, positions in space, and pitches.
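
The mixture problem described above can be illustrated with a minimal sketch. It is not taken from the book: the two sinusoids standing in for a "voice" and "traffic", the sample rate, and the helper names `tone` and `mix` are all assumptions chosen for illustration. The sketch adds two source waves sample by sample, then shows that a quite different pair of signals sums to exactly the same mixture, which is why the mixture alone does not reveal its components.

```python
import math

SAMPLE_RATE = 8000  # samples per second (illustrative assumption)


def tone(freq_hz, amplitude, n_samples, sample_rate=SAMPLE_RATE):
    """Return a sampled sinusoid standing in for one source's pressure wave."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate)
        for n in range(n_samples)
    ]


def mix(*sources):
    """Add the source waves sample by sample to form the mixture wave."""
    return [sum(samples) for samples in zip(*sources)]


# Two hypothetical sources active at the same time.
voice = tone(220.0, 1.0, 80)
traffic = tone(90.0, 0.5, 80)
mixture = mix(voice, traffic)

# A different pair of signals that produces exactly the same mixture:
# the listener receives only `mixture`, so both decompositions are
# equally consistent with what arrives at the ear.
alt_a = [0.5 * m for m in mixture]
alt_b = [m - a for m, a in zip(mixture, alt_a)]
```

Because infinitely many such decompositions exist, any separation method, biological or computational, has to bring in additional constraints about what real sound sources are like.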

Editors and Affiliations

East Bay Institute for Research and Education, USA

Pierre Divenyi

Bibliographic Information

Book Title: Speech Separation by Humans and Machines
Editors: Pierre Divenyi
DOI: https://doi.org/10.1007/b99695
Publisher: Springer New York, NY
eBook Packages: Engineering, Engineering (R0)
Hardcover ISBN: 978-1-4020-8001-2 (published 02 November 2004)
Softcover ISBN: 978-1-4419-5460-2 (published 04 November 2010)
eBook ISBN: 978-0-387-22794-8 (published 16 January 2006)
Edition Number: 1
Number of Pages: XXIV, 319
Topics: Signal, Image and Speech Processing, User Interfaces and Human Computer Interaction, Engineering, general
