Name: A Perspective on Single-Channel Frequency-Domain Speech Enhancement
ISBN: 978-3-031-02561-7

Overview

Authors:

Jacob Benesty, INRS-EMT, University of Quebec, Canada
Yiteng Huang, WeVoice, Inc., USA

Part of the book series: Synthesis Lectures on Speech and Audio Processing (SLSAP)

Table of contents (9 chapters)

Front Matter

Pages i-viii

Introduction
- Jacob Benesty, Yiteng Huang
Pages 1-4
Problem Formulation
- Jacob Benesty, Yiteng Huang
Pages 5-9
Performance Measures
- Jacob Benesty, Yiteng Huang
Pages 11-16
Linear and Widely Linear Models
- Jacob Benesty, Yiteng Huang
Pages 17-26
Optimal Filters with Model 1
- Jacob Benesty, Yiteng Huang
Pages 27-33
Optimal Filters with Model 2
- Jacob Benesty, Yiteng Huang
Pages 35-52
Optimal Filters with Model 3
- Jacob Benesty, Yiteng Huang
Pages 53-64
Optimal Filters with Model 4
- Jacob Benesty, Yiteng Huang
Pages 65-76
Experimental Study
- Jacob Benesty, Yiteng Huang
Pages 77-89
Back Matter

Pages 91-101

About this book

This book focuses on a class of single-channel noise reduction methods that are performed in the frequency domain via the short-time Fourier transform (STFT). The simplicity and relative effectiveness of this class of approaches make them the dominant choice in practical systems. Even though many popular algorithms have been proposed through more than four decades of continuous research, there are a number of critical areas where our understanding and capabilities still remain quite rudimentary, especially with respect to the relationship between noise reduction and speech distortion. All existing frequency-domain algorithms, no matter how they are developed, have one feature in common: the solution is eventually expressed as a gain function applied to the STFT of the noisy signal only in the current frame. As a result, the narrowband signal-to-noise ratio (SNR) cannot be improved, and any gains achieved in noise reduction on the fullband basis come with a price to pay, which is speech distortion. In this book, we present a new perspective on the problem by exploiting the difference between speech and typical noise in circularity and interframe self-correlation, which were ignored in the past. By gathering the STFT of the microphone signal of the current frame, its complex conjugate, and the STFTs in the previous frames, we construct several new, multiple-observation signal models similar to a microphone array system: there are multiple noisy speech observations, and their speech components are correlated but not completely coherent while their noise components are presumably uncorrelated. Therefore, the multichannel Wiener filter and the minimum variance distortionless response (MVDR) filter that were usually associated with microphone arrays will be developed for single-channel noise reduction in this book. This might instigate a paradigm shift geared toward speech-distortionless noise reduction techniques.
Table of Contents: Introduction / Problem Formulation / Performance Measures / Linear and Widely Linear Models / Optimal Filters with Model 1 / Optimal Filters with Model 2 / Optimal Filters with Model 3 / Optimal Filters with Model 4 / Experimental Study
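
The claim in the description above, that a gain applied only to the current-frame STFT coefficient cannot improve the narrowband SNR, can be checked numerically. The following is a minimal sketch and is not taken from the book: it assumes the classical per-bin Wiener gain as the gain function, and the function names and the speech and noise variances (phi_x, phi_v) are hypothetical values chosen for illustration.

```python
import numpy as np

def wiener_gain(phi_x, phi_v):
    """Classical per-bin Wiener gain H(k) = phi_X(k) / (phi_X(k) + phi_V(k))."""
    return phi_x / (phi_x + phi_v)

def narrowband_snr(phi_x, phi_v, gain):
    """Per-bin output SNR after scaling speech and noise by the same gain."""
    return (gain**2 * phi_x) / (gain**2 * phi_v)

def fullband_snr(phi_x, phi_v, gain):
    """Output SNR computed over all frequency bins together."""
    return np.sum(gain**2 * phi_x) / np.sum(gain**2 * phi_v)

# Hypothetical speech and noise variances in four frequency bins.
phi_x = np.array([10.0, 4.0, 1.0, 0.1])
phi_v = np.array([1.0, 1.0, 1.0, 1.0])

unit = np.ones_like(phi_x)      # no processing
h = wiener_gain(phi_x, phi_v)   # single-frame gain function

# The gain cancels in every bin, so the narrowband SNR is unchanged.
print("narrowband in :", phi_x / phi_v)
print("narrowband out:", narrowband_snr(phi_x, phi_v, h))

# The fullband SNR does change, because bins are weighted unequally.
print("fullband in :", fullband_snr(phi_x, phi_v, unit))
print("fullband out:", fullband_snr(phi_x, phi_v, h))

# The unequal weighting also attenuates the speech itself (distortion).
print("speech power kept:", np.sum(h**2 * phi_x) / np.sum(phi_x))
```

In this sketch the fullband improvement comes entirely from attenuating low-SNR bins more than high-SNR bins, which is also what removes part of the speech. The multiple-observation models described above replace the scalar gain with a filter over several observations, which is not shown here.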

Authors and Affiliations

INRS-EMT, University of Quebec, Canada

Jacob Benesty
WeVoice, Inc., USA

Yiteng Huang

About the authors

Jacob Benesty received his Ph.D. degree in control and signal processing from Orsay University, France, in April 1991. During his Ph.D. (from Nov. 1989 to Apr. 1991), he worked on adaptive filters and fast algorithms at the Centre National d'Etudes des Telecommunications (CNET), Paris, France. From January 1994 to July 1995, he worked at Telecom Paris University on multichannel adaptive filters and acoustic echo cancellation. From October 1995 to May 2003, he was first a Consultant and then a Member of the Technical Staff at Bell Laboratories, Murray Hill, NJ, USA. In May 2003, he joined the University of Quebec, INRS-EMT, in Montreal, Quebec, Canada, as a Professor. His research interests are in signal processing, acoustic signal processing, and multimedia communications. He is the inventor of many important technologies. In particular, he was the lead researcher at Bell Labs who conceived and designed the world-first, real-time, hands-free, full-duplex stereophonic teleconferencing system. Also, he and Tomas Gaensler conceived and designed the world-first, PC-based, multi-party hands-free, full-duplex stereo conferencing system over IP networks. He is the editor of the book series Springer Topics in Signal Processing. He has co-authored and co-edited many books in the area of acoustic signal processing. He is also the editor-in-chief of the reference Springer Handbook of Speech Processing (Berlin: Springer-Verlag, 2007).

Yiteng Huang received his M.S. and Ph.D. degrees from the Georgia Institute of Technology (Georgia Tech), Atlanta, in 1998 and 2001, respectively, both in electrical and computer engineering. From March 2001 to January 2008, he was a Member of Technical Staff at Bell Laboratories, Murray Hill, NJ. In January 2008, he founded WeVoice, Inc., in Bridgewater, New Jersey, and served as its CTO. His current research interests are in acoustic signal processing, multimedia communications, and wireless sensor networks. Dr. Huang served as an Associate Editor for the EURASIP Journal on Applied Signal Processing from 2004 to 2008 and for the IEEE Signal Processing Letters from 2002 to 2005. He served as a technical Co-Chair of the 2005 Joint Workshop on Hands-Free Speech Communication and Microphone Arrays and the 2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. He is a coeditor/coauthor of seven books in the area of acoustic signal processing. He received the 2008 Best Paper Award and the 2002 Young Author Best Paper Award from the IEEE Signal Processing Society, the 2000-2001 Outstanding Graduate Teaching Assistant Award from the School of Electrical and Computer Engineering, Georgia Tech, the 2000 Outstanding Research Award from the Center of Signal and Image Processing, Georgia Tech, and the 1997-1998 Colonel Oscar P. Cleaver Outstanding Graduate Student Award from the School of Electrical and Computer Engineering, Georgia Tech.

Bibliographic Information

Book Title: A Perspective on Single-Channel Frequency-Domain Speech Enhancement
Authors: Jacob Benesty, Yiteng Huang
Series Title: Synthesis Lectures on Speech and Audio Processing
DOI: https://doi.org/10.1007/978-3-031-02561-7
Publisher: Springer Cham
eBook Packages: Synthesis Collection of Technology (R0), eBColl Synthesis Collection 3
Copyright Information: Springer Nature Switzerland AG 2011
Softcover ISBN: 978-3-031-01433-8
Softcover Published: 28 March 2011
eBook ISBN: 978-3-031-02561-7
eBook Published: 31 May 2022
Series ISSN: 1932-121X
Series E-ISSN: 1932-1678
Edition Number: 1
Number of Pages: VIII, 101
Topics: Electrical Engineering, Signal, Image and Speech Processing, Engineering Acoustics
