  • Book
  • © 2005

Speech Separation by Humans and Machines

Editors: Pierre Divenyi

  • Provides comprehensive and authoritative discussion of how humans separate speech and the state of the art in approaching these abilities with machines

Table of contents (21 chapters)

  1. Front Matter

    Pages i-xxiv
  2. Auditory Scene Analysis

    • Elyse S. Sussman
    Pages 5-12
  3. Speech separation

    • Claude Alain
    Pages 13-30
  4. Speech Recognizer Based Maximum Likelihood Beamforming

    • Bhiksha Raj, Michael Seltzer, Manuel Jesus Reyes-Gomez
    Pages 65-82
  5. Speech Segregation Using an Event-synchronous Auditory Image and STRAIGHT

    • Toshio Irino, Roy D. Patterson, Hideki Kawahara
    Pages 155-165
  6. The History and Future of CASA

    • Malcolm Slaney
    Pages 199-211
  7. Source Separation, Localization, and Comprehension in Humans, Machines, and Human-machine Systems

    • Nat Durlach, Steve Colburn, Gerald Kidd, Chris Mason, Barbara Shinn-Cunningham, Tanya Arbogast et al.
    Pages 221-243
  8. The Cancellation Principle in Acoustic Scene Analysis

    • Alain de Cheveigné
    Pages 245-259
  9. Interplay Between Visual and Audio Scene Analysis

    • Ziyou Xiong, Thomas S. Huang
    Pages 283-293

About this book

There is a serious problem in the recognition of sounds. It derives from the fact that they do not usually occur in isolation but in an environment in which a number of sound sources (voices, traffic, footsteps, music on the radio, and so on) are active at the same time. When these sounds arrive at the ear of the listener, the complex pressure waves coming from the separate sources add together to produce a single, more complex pressure wave that is the sum of the individual waves. The problem is how to form separate mental descriptions of the component sounds, despite the fact that the “mixture wave” does not directly reveal the waves that have been summed to form it. The name auditory scene analysis (ASA) refers to the process whereby the auditory systems of humans and other animals are able to solve this mixture problem. The process is believed to be quite general, not specific to speech sounds or any other type of sounds, and to exist in many species other than humans. It seems to involve assigning spectral energy to distinct “auditory objects” and “streams” that serve as the mental representations of distinct sound sources in the environment and the patterns that they make as they change over time. How this energy is assigned will affect the perceived number of auditory sources, their perceived timbres, loudnesses, positions in space, and pitches.
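
The mixture problem described above can be illustrated with a minimal sketch. It is not taken from the book: the function names (`tone`, `mix`), the sample rate, and the two sine "sources" are all assumptions chosen for illustration. It shows that the signal at the ear is the sample-by-sample sum of the sources, and that a quite different pair of components produces exactly the same sum, so the mixture alone does not reveal what was added.

```python
import math

def tone(freq_hz, n_samples, sample_rate=8000, amplitude=1.0):
    """Sampled sine wave as a list of floats (a hypothetical stand-in for one source)."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]

def mix(*sources):
    """Add the sources sample by sample, as pressure waves add at the ear."""
    return [sum(samples) for samples in zip(*sources)]

# Two illustrative sources (frequencies and amplitudes are arbitrary assumptions).
voice = tone(220, 64)
traffic = tone(90, 64, amplitude=0.5)
mixture = mix(voice, traffic)

# A different decomposition that sums to the same mixture:
# split every sample of the mixture in half.
alt_a = [0.5 * m for m in mixture]
alt_b = [m - a for m, a in zip(mixture, alt_a)]
same = all(abs(x - y) < 1e-12 for x, y in zip(mix(alt_a, alt_b), mixture))
```

Here `same` evaluates to true even though `alt_a` and `alt_b` look nothing like `voice` and `traffic`, which is why separation needs additional cues beyond the summed waveform.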

Editors and Affiliations

  • East Bay Institute for Research and Education, USA

    Pierre Divenyi
