  • Book
  • © 2019

Statistical Methods for Imbalanced Data in Ecological and Biological Studies

  • Focuses on the problem caused by imbalanced data often observed in ecology and biology
  • Introduces the latest statistical methods for imbalanced data
  • Demonstrates the application of statistical methods to several real data sets

Part of the book series: SpringerBriefs in Statistics (BRIEFSSTATIST)

Part of the book sub series: JSS Research Series in Statistics (JSSRES)


Buying options

eBook USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info


Table of contents (5 chapters)

  1. Front Matter

    Pages i-viii
  2. Introduction to Imbalanced Data

    • Osamu Komori, Shinto Eguchi
    Pages 1-10
  3. Weighted Logistic Regression

    • Osamu Komori, Shinto Eguchi
    Pages 11-25
  4. \(\beta\)-Maxent

    • Osamu Komori, Shinto Eguchi
    Pages 27-33
  5. Generalized T-Statistic

    • Osamu Komori, Shinto Eguchi
    Pages 35-43
  6. Machine Learning Methods for Imbalanced Data

    • Osamu Komori, Shinto Eguchi
    Pages 45-55
  7. Back Matter

    Pages 57-59

About this book

This book provides a comprehensive review of recent work on the challenging problems that imbalanced data cause in prediction and classification, and introduces several of the latest statistical methods for dealing with them. The book discusses the imbalance of data from two points of view. The first is quantitative imbalance, meaning that the sample size in one population greatly exceeds that in another. It includes presence-only data as an extreme case, where the presence of a species is confirmed but information on its absence is uncertain; such data are especially common in ecology when predicting habitat distribution. The second is qualitative imbalance, meaning that the data distribution of one population can be well specified whereas that of the other is highly heterogeneous. A typical case is the existence of outliers commonly observed in gene expression data, and another is the heterogeneous characteristics often observed in the case group in case-control studies. Extensions of the logistic regression model, maxent, and AdaBoost to imbalanced data are discussed, providing a new framework for improving prediction, classification, and the performance of variable selection. The weight functions introduced in these methods play an important role in alleviating the imbalance of data. The book also furnishes a new perspective on these problems and shows some applications of the recently developed statistical methods to real data sets.

Authors and Affiliations

  • Seikei University, Musashino, Japan

    Osamu Komori

  • The Institute of Statistical Mathematics, Tachikawa, Japan

    Shinto Eguchi

About the authors

Osamu Komori, The Institute of Statistical Mathematics
Shinto Eguchi, The Institute of Statistical Mathematics

