  • Textbook
  • © 2015

Machine Learning for Audio, Image and Video Analysis

Theory and Applications

  • Presents techniques for extracting features from audio recordings, images and videos
  • Provides the mathematical background required to use the techniques described
  • Covers the most important machine learning techniques for classification, clustering and sequence analysis
  • Includes supplementary material: sn.pub/extras

Part of the book series: Advanced Information and Knowledge Processing (AI&KP)

Buying options

eBook USD 49.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book USD 64.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide
Hardcover Book USD 99.99
Price excludes VAT (USA)
  • Durable hardcover edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide


Table of contents (16 chapters)

  1. Front Matter

    Pages i-xvi
  2. Introduction

    • Francesco Camastra, Alessandro Vinciarelli
    Pages 1-10
  3. From Perception to Computation

    1. Front Matter

      Pages 11-11
    2. Audio Acquisition, Representation and Storage

      • Francesco Camastra, Alessandro Vinciarelli
      Pages 13-55
    3. Image and Video Acquisition, Representation and Storage

      • Francesco Camastra, Alessandro Vinciarelli
      Pages 57-96
  4. Machine Learning

    1. Front Matter

      Pages 97-97
    2. Machine Learning

      • Francesco Camastra, Alessandro Vinciarelli
      Pages 99-106
    3. Bayesian Theory of Decision

      • Francesco Camastra, Alessandro Vinciarelli
      Pages 107-129
    4. Clustering Methods

      • Francesco Camastra, Alessandro Vinciarelli
      Pages 131-167
    5. Foundations of Statistical Learning and Model Selection

      • Francesco Camastra, Alessandro Vinciarelli
      Pages 169-190
    6. Supervised Neural Networks and Ensemble Methods

      • Francesco Camastra, Alessandro Vinciarelli
      Pages 191-227
    7. Kernel Methods

      • Francesco Camastra, Alessandro Vinciarelli
      Pages 229-293
    8. Markovian Models for Sequential Data

      • Francesco Camastra, Alessandro Vinciarelli
      Pages 295-340
    9. Feature Extraction Methods and Manifold Learning Methods

      • Francesco Camastra, Alessandro Vinciarelli
      Pages 341-386
  5. Applications

    1. Front Matter

      Pages 387-387
    2. Speech and Handwriting Recognition

      • Francesco Camastra, Alessandro Vinciarelli
      Pages 389-419
    3. Automatic Face Recognition

      • Francesco Camastra, Alessandro Vinciarelli
      Pages 421-448
    4. Video Segmentation and Keyframe Extraction

      • Francesco Camastra, Alessandro Vinciarelli
      Pages 449-465
    5. Real-Time Hand Pose Recognition

      • Francesco Camastra, Alessandro Vinciarelli
      Pages 467-484
    6. Automatic Personality Perception

      • Francesco Camastra, Alessandro Vinciarelli
      Pages 485-498

About this book

This second edition focuses on audio, image and video data, the three main types of input that machines deal with when interacting with the real world. A set of appendices provides the reader with self-contained introductions to the mathematical background necessary to read the book.
The book is divided into three main parts. The first, From Perception to Computation, introduces methodologies aimed at representing the data in forms suitable for computer processing, especially when it comes to audio and images. The second, Machine Learning, includes an extensive overview of statistical techniques aimed at addressing three main problems: classification (automatically assigning a data sample to one of the classes belonging to a predefined set), clustering (automatically grouping data samples according to the similarity of their properties) and sequence analysis (automatically mapping a sequence of observations into a sequence of human-understandable symbols). The third, Applications, shows how the abstract problems defined in the second part underlie technologies capable of performing complex tasks such as the recognition of hand gestures or the transcription of handwritten data.

Machine Learning for Audio, Image and Video Analysis is suitable both for students who want to acquire a solid background in machine learning and for practitioners who want to deepen their knowledge of the state of the art. All application chapters are based on publicly available data and free software packages, thus allowing readers to replicate the experiments.

Reviews

“This nice book of over 560 pages is really useful for students, researchers, practitioners, and anybody who is interested in machine learning and related subjects.” (Michael M. Dediu, Mathematical Reviews, May, 2017)

Authors and Affiliations

  • University of Naples Parthenope, Department of Science and Technology, Naples, Italy

    Francesco Camastra

  • University of Glasgow, School of Computing Science, Glasgow, Scotland, United Kingdom

    Alessandro Vinciarelli
