  • Book
  • © 2021

Statistical Universals of Language

Mathematical Chance vs. Human Choice

  • Covers the universal mathematical properties of natural language from Zipf's law to the present
  • Explains how the properties show the essential limitation of language models in natural language processing
  • Speculates on how the properties relate to linguistic grammar, word formation, and meaning

Part of the book series: Mathematics in Mind (MATHMIN)

Buying options

eBook USD 54.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book USD 69.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info


Table of contents (25 chapters)

  1. Front Matter
  2. Language as a Complex System
    1. Front Matter
    2. Introduction
      • Kumiko Tanaka-Ishii
    3. Universals
      • Kumiko Tanaka-Ishii
    4. Language as a Complex System
      • Kumiko Tanaka-Ishii
  3. Property of Population
    1. Front Matter
    2. Relation Between Rank and Frequency
      • Kumiko Tanaka-Ishii
    3. Bias in Rank-Frequency Relation
      • Kumiko Tanaka-Ishii
    4. Related Statistical Universals
      • Kumiko Tanaka-Ishii
  4. Property of Sequences
    1. Front Matter
    2. Returns
      • Kumiko Tanaka-Ishii
    3. Long-Range Correlation
      • Kumiko Tanaka-Ishii
    4. Fluctuation
      • Kumiko Tanaka-Ishii
    5. Complexity
      • Kumiko Tanaka-Ishii
  5. Relation to Linguistic Elements and Structure
    1. Front Matter
    2. Articulation of Elements
      • Kumiko Tanaka-Ishii
    3. Word Meaning and Value
      • Kumiko Tanaka-Ishii
    4. Size and Frequency
      • Kumiko Tanaka-Ishii
    5. Grammatical Structure and Long Memory
      • Kumiko Tanaka-Ishii
  6. Mathematical Models
    1. Front Matter

About this book

This volume explores the universal mathematical properties underlying big language data and possible reasons why such properties exist, revealing how we may be unconsciously mathematical in our language use. These properties are statistical, and thus differ from the linguistic universals that help describe variation across human languages; they can only be identified over a large accumulation of usage. Taking Zipf's law as a well-known example, the book provides an overview of state-of-the-art findings on these statistical universals and reconsiders the nature of language accordingly.
Beyond this overview, the book's main focus lies in explaining the property of long memory, which was discovered and studied more recently by borrowing concepts from complex systems theory. The statistical universals may not only be precursors to the formation of the language system; they also highlight qualities of language that remain weak points in today's machine learning.
In summary, this book provides an overview of language's global properties. It will be of interest to anyone engaged in fields related to language and computing or statistical analysis methods, with an emphasis on researchers and students in computational linguistics and natural language processing. While the book does apply mathematical concepts, every effort has been made to speak to a non-mathematical audience as well by communicating mathematical content intuitively, with concise examples taken from real texts.

Reviews

“The chapters and the parts are intelligently curated and well-thought-out to give the subject matter a free-flowing and coherent structure. … The book has a lot of exciting discussions on offer. … the book has a lot to learn from. … The book is unique in the sense that language is presented … . The book provides excellent food for thought. … the book is worth every single second that a reader would be spending on reading it … .” (Firdous Ahmad Mala, risingkashmir.com, January 25, 2022)

Authors and Affiliations

  • Research Center for Advanced Science and Technology (RCAST), The University of Tokyo, Tokyo, Japan

    Kumiko Tanaka-Ishii

Bibliographic Information

  • Book Title: Statistical Universals of Language

  • Book Subtitle: Mathematical Chance vs. Human Choice

  • Authors: Kumiko Tanaka-Ishii

  • Series Title: Mathematics in Mind

  • DOI: https://doi.org/10.1007/978-3-030-59377-3

  • Publisher: Springer Cham

  • eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)

  • Copyright Information: The Editor(s) (if applicable) and The Author(s) 2021

  • Hardcover ISBN: 978-3-030-59376-6; Published: 02 April 2021

  • Softcover ISBN: 978-3-030-59379-7; Published: 02 April 2022

  • eBook ISBN: 978-3-030-59377-3; Published: 01 April 2021

  • Series ISSN: 2522-5405

  • Series E-ISSN: 2522-5413

  • Edition Number: 1

  • Number of Pages: VIII, 236

  • Number of Illustrations: 52 b/w illustrations, 100 illustrations in colour

  • Topics: Mathematics in the Humanities and Social Sciences
