Robust Emotion Recognition using Spectral and Prosodic Features

  • Book
  • © 2013

Overview

  • Covers how to characterize emotions, how to acquire emotion-specific information from speech conversations, and how to incorporate that information to synthesize the desired emotions
  • Proposes pitch synchronous and sub-syllabic spectral features for characterizing emotions
  • Explores global and local prosodic features at syllable, word and phrase levels to capture the emotion-discriminative information
  • Demonstrates real-life emotions using hierarchical models based on speaking rate
  • Includes supplementary material: sn.pub/extras

Part of the book series: SpringerBriefs in Speech Technology (BRIEFSSPEECHTECH)

Table of contents (7 chapters)

About this book

In this brief, the authors discuss recently explored spectral features (sub-segmental and pitch synchronous) and prosodic features (global and local, at word and syllable levels in different parts of the utterance) for discerning emotions robustly. They also examine the complementary evidence obtained from excitation source, vocal tract system and prosodic features for enhancing emotion recognition performance. Features based on speaking rate characteristics are explored using multi-stage and hybrid models to further improve recognition performance. The proposed spectral and prosodic features are evaluated on a real-life emotional speech corpus.

Authors and Affiliations

  • School of Information Technology, Indian Institute of Technology, Kharagpur, India

    K. Sreenivasa Rao, Shashidhar G. Koolagudi

About the authors

K. Sreenivasa Rao is at Indian Institute of Technology, Kharagpur, India.
Shashidhar G. Koolagudi is at Graphic Era University, Dehradun, India.
