Name: Laboratory Experiments in Information Retrieval
ISBN: 978-981-13-1199-4

Authors:

Tetsuya Sakai (Waseda University, Tokyo, Japan)

Discusses the principles and limitations of statistical significance tests
Provides hands-on examples of t-tests, ANOVA, and multiple comparison procedures with Excel and R
Introduces tools for designing effective experiments by leveraging topic set size design and for power analysis

Part of the book series: The Information Retrieval Series (INRE, volume 40)

Table of contents (8 chapters)

Front Matter, pages i-ix
1. Preliminaries, pages 1-25
2. t-Tests, pages 27-41
3. Analysis of Variance, pages 43-58
4. Multiple Comparison Procedures, pages 59-80
5. The Correct Ways to Use Significance Tests, pages 81-98
6. Topic Set Size Design Using Excel, pages 99-132
7. Power Analysis Using R, pages 133-145
8. Conclusions, pages 147-148
Back Matter, pages 149-150

About this book

Covering aspects from principles and limitations of statistical significance tests to topic set size design and power analysis, this book guides readers to statistically well-designed experiments. Although classical statistical significance tests are to some extent useful in information retrieval (IR) evaluation, they can harm research unless they are used appropriately with the right sample sizes and statistical power and unless the test results are reported properly. The first half of the book is mainly targeted at undergraduate students, and the second half is suitable for graduate students and researchers who regularly conduct laboratory experiments in IR, natural language processing, recommendations, and related fields.

Chapters 1–5 review parametric significance tests for comparing system means, namely, t-tests and ANOVAs, and show how easily they can be conducted using Microsoft Excel or R. These chapters also discuss a few multiple comparison procedures for researchers who are interested in comparing every system pair, including a randomised version of Tukey's Honestly Significant Difference test. The chapters then deal with known limitations of classical significance testing and provide practical guidelines for reporting research results regarding comparison of means.

Chapters 6 and 7 discuss statistical power. Chapter 6 introduces topic set size design to enable test collection builders to determine an appropriate number of topics to create. Readers can easily use the author’s Excel tools for topic set size design based on the paired and two-sample t-tests, one-way ANOVA, and confidence intervals. Chapter 7 describes power-analysis-based methods for determining an appropriate sample size for a new experiment based on a similar experiment done in the past, detailing how to utilize the author’s R tools for power analysis and how to interpret the results. Case studies from IR for both Excel-based topic set size design and R-based power analysis are also provided.

About the author

Tetsuya Sakai is a professor and the head of the Department of Computer Science and Engineering, Waseda University, Japan. He is also a visiting professor at the National Institute of Informatics. He joined Toshiba in 1993 and obtained a Ph.D. from Waseda in 2000. From 2000 to 2001, he was supervised by the late Karen Sparck Jones at the Computer Laboratory, University of Cambridge, as a visiting researcher. In 2007, he joined NewsWatch, Inc. as the director of the Natural Language Processing Lab. In 2009, he joined Microsoft Research Asia. He joined the Waseda faculty in 2013. He is an editor-in-chief of the Information Retrieval Journal (Springer) and an associate editor of ACM TOIS. He received a Waseda University Teaching Award in 2014 and a Waseda University Presidential Teaching Award in 2016.

Bibliographic Information

Book Title: Laboratory Experiments in Information Retrieval
Book Subtitle: Sample Sizes, Effect Sizes, and Statistical Power
Authors: Tetsuya Sakai
Series Title: The Information Retrieval Series
DOI: https://doi.org/10.1007/978-981-13-1199-4
Publisher: Springer Singapore
eBook Packages: Computer Science, Computer Science (R0)
Copyright Information: Springer Nature Singapore Pte Ltd. 2018
Hardcover ISBN: 978-981-13-1198-7 (published 04 October 2018)
Softcover ISBN: 978-981-13-4581-4 (published 29 December 2018)
eBook ISBN: 978-981-13-1199-4 (published 22 September 2018)
Series ISSN: 1871-7500
Series E-ISSN: 2730-6836
Edition Number: 1
Number of Pages: IX, 150
Number of Illustrations: 10 b/w illustrations, 43 illustrations in colour
Topics: Information Storage and Retrieval; Statistics for Engineering, Physics, Computer Science, Chemistry and Earth Sciences
