  • Textbook
  • © 2014

Text Analysis with R for Students of Literature

  • Book is specifically designed for students and scholars with no programming experience who wish to learn R for text analysis
  • Reader will move with ease from simple single text analysis to corpora level analysis with the accessible nature of this text, which is written from the perspective of a literature scholar
  • Design of material will get readers analyzing text immediately and covers enough conceptual information to be applied to individual projects
  • Includes supplementary material: sn.pub/extras

Buy it now

Buying options

eBook USD 49.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book USD 64.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout


Table of contents (13 chapters)

  1. Front Matter

    Pages i-xvi
  2. Microanalysis

    1. Front Matter

      Pages 1-1
    2. R Basics

      • Matthew L. Jockers
      Pages 3-10
    3. First Foray into Text Analysis with R

      • Matthew L. Jockers
      Pages 11-23
    4. Accessing and Comparing Word Frequency Data

      • Matthew L. Jockers
      Pages 25-28
    5. Token Distribution Analysis

      • Matthew L. Jockers
      Pages 29-46
    6. Correlation

      • Matthew L. Jockers
      Pages 47-56
  3. Mesoanalysis

    1. Front Matter

      Pages 57-57
    2. Measures of Lexical Variety

      • Matthew L. Jockers
      Pages 59-67
    3. Hapax Richness

      • Matthew L. Jockers
      Pages 69-72
    4. Do It KWIC

      • Matthew L. Jockers
      Pages 73-80
    5. Do It KWIC (Better)

      • Matthew L. Jockers
      Pages 81-87
    6. Text Quality, Text Variety, and Parsing XML

      • Matthew L. Jockers
      Pages 89-98
  4. Macroanalysis

    1. Front Matter

      Pages 99-99
    2. Clustering

      • Matthew L. Jockers
      Pages 101-117
    3. Classification

      • Matthew L. Jockers
      Pages 119-133
    4. Topic Modeling

      • Matthew L. Jockers
      Pages 135-159
  5. Back Matter

    Pages 161-194

About this book

Text Analysis with R for Students of Literature is written with students and scholars of literature in mind but will be applicable to other humanists and social scientists wishing to extend their methodological tool kit to include quantitative and computational approaches to the study of text. Computation provides access to information in text that we simply cannot gather using traditional qualitative methods of close reading and human synthesis. Text Analysis with R for Students of Literature provides a practical introduction to computational text analysis using the open source programming language R. R is extremely popular throughout the sciences and, because of its accessibility, is now used increasingly in other research areas. Readers begin working with text right away, and each chapter works through a new technique or process such that readers gain a broad exposure to core R procedures and a basic understanding of the possibilities of computational text analysis at both the micro and macro scale. Each chapter builds on the previous as readers move from small scale “microanalysis” of single texts to large scale “macroanalysis” of text corpora, and each chapter concludes with a set of practice exercises that reinforce and expand upon the chapter lessons. The book’s focus is on making the technical palatable, useful, and immediately gratifying.

Reviews

“The aim of this book is … to give the Literature students just the most basic tools needed to do some relatively straightforward textual analysis. … Even though this is primarily a book intended for literature students, I would actually strongly recommend it to anyone interested in text mining, text analysis and natural language processing. It is a very gentle and approachable introduction to the whole world of textual analysis.” (Bojan Tunguz, tunguzreview.com, July, 2015)

“This is a well written book on the topic of Text Analysis. There is enough information to give you a good start using R. Followed by easy to understand details about text analysis. … This is a good book to have if you are doing text analysis.” (Mary Anne, Cats and Dogs with Data, maryannedata.com, August, 2014)

“A remarkably well-crafted book that will allow students to get a quick start and progress toward quite sophisticated text mining tasks. … exercises provided at the end of each chapter, with solutions at the end of the book, should serve well to help students solidify their knowledge and gain more confidence in their text mining skills. … a great addition to the libraries of digital humanists and natural language enthusiasts who wish to expand their programming literacy … .” (Denilson Barbosa, Computing Reviews, August, 2014)

"I can't think of a more qualified person to guide readers through powerful R techniques for text analysis. While extremely useful for people studying literature, these techniques can be also used by anybody working with texts. Even if you simply want to understand how companies and data scientists are analyzing all kinds of texts, go through this book." (Lev Manovich, Department of Computer Science, The Graduate Center, City University of New York & author of The Language of New Media)

"The open source programming language R has become one of the most central statistical and analytical tool in many sciences. While it has already been used in linguistic applications, this book is the first to discuss the application of (corpus-linguistic and other) methods with R in the context of literary studies. The author covers a wide range of descriptive, analytical, and exploratory methods beautifully and in detail in a book that will appeal to a wide and diverse audience of both students and seasoned researchers from literary studies, linguistic computing, and the digital humanities more generally." (Stefan Th. Gries, Department of Linguistics, University of California, Santa Barbara & author of Quantitative corpus linguistics with R: A Practical Introduction)

"This book does a great service for literary scholars interested in computational approaches to text analysis, giving them ready access to powerful methods for exploring patterns and relationships across large quantities of text. Its clear and lucid explanations will also make it an easy textbook to teach from, especially for instructors with prior background who can then use it as a stepping stone to introducing more complex methods. Amateurs and those with little programming background will find it imminently accessible." (Hoyt Long, Department of East Asian Languages and Civilizations, University of Chicago)

"Through my work as an epidemiologist, I encounter electronic health records in an unstructured form (i.e. text), and Text Analysis with R covers many of the initial steps for studying these records. The book is very accessible; it provides a straightforward introduction to manipulating text information without presuming a background in programming or a familiarity with the jargon used in this field. I also appreciated Jockers' thoughtful inclusion of supplemental explanations and information in footnotes throughout the book. For example, text analysis often involves the use of "regular expressions"; a footnote concisely explains wildcard and escape characters and this explanation spared me a fair bit of confusion in my own work. Although I am not a "student of literature", I thought the book contained many generalizable and expertly-taught lessons that make it a valuable introduction to manipulating and analyzing text." (Matthew Maenner, Ph.D.)

"This book is a worthy introduction to computational text analysis, and it fills an important gap in the literature. It’s very accessible and contains plenty of interesting examples and real applications, which have been collected and crafted over the many years the author taught text analysis to undergraduate and graduate students. Although it focuses on the study of literature, I would highly recommend this book to students in business administration and related fields." (Joao Quariguasi Frota Neto, School of Management, University of Bath)

Authors and Affiliations

  • Department of English, University of Nebraska, Lincoln, USA

    Matthew L. Jockers

About the author

The author, Matthew L. Jockers, is Associate Professor of English and Director of the Nebraska Literary Lab at the University of Nebraska in Lincoln. Jockers's text mining research has been featured in the New York Times, Nature, the Chronicle of Higher Education, Wired, New Scientist, Smithsonian, NBC News and many others. Jockers blogs about his research at www.matthewjockers.net.


