  • Textbook
  • © 2021

Text Data Mining

  • Focuses on text data mining from an NLP perspective
  • Offers a rich blend of fundamental theories, key techniques and predominant applications
  • Presents the latest advances in the field of text data mining

Buying options

eBook USD 49.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book USD 64.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info
Hardcover Book USD 89.99
Price excludes VAT (USA)
  • Durable hardcover edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Table of contents (11 chapters)

  1. Front Matter

    Pages i-xxi
  2. Introduction

    • Chengqing Zong, Rui Xia, Jiajun Zhang
    Pages 1-13
  3. Data Annotation and Preprocessing

    • Chengqing Zong, Rui Xia, Jiajun Zhang
    Pages 15-31
  4. Text Representation

    • Chengqing Zong, Rui Xia, Jiajun Zhang
    Pages 33-73
  5. Text Representation with Pretraining and Fine-Tuning

    • Chengqing Zong, Rui Xia, Jiajun Zhang
    Pages 75-92
  6. Text Classification

    • Chengqing Zong, Rui Xia, Jiajun Zhang
    Pages 93-124
  7. Text Clustering

    • Chengqing Zong, Rui Xia, Jiajun Zhang
    Pages 125-144
  8. Topic Model

    • Chengqing Zong, Rui Xia, Jiajun Zhang
    Pages 145-162
  9. Sentiment Analysis and Opinion Mining

    • Chengqing Zong, Rui Xia, Jiajun Zhang
    Pages 163-199
  10. Topic Detection and Tracking

    • Chengqing Zong, Rui Xia, Jiajun Zhang
    Pages 201-225
  11. Information Extraction

    • Chengqing Zong, Rui Xia, Jiajun Zhang
    Pages 227-283
  12. Automatic Text Summarization

    • Chengqing Zong, Rui Xia, Jiajun Zhang
    Pages 285-333
  13. Back Matter

    Pages 335-351

About this book

This book discusses various aspects of text data mining. Unlike other books that focus on machine learning or databases, it approaches text data mining from a natural language processing (NLP) perspective.

The book offers a detailed introduction to the fundamental theories and methods of text data mining, ranging from pre-processing (for both Chinese and English texts), text representation and feature selection, to text classification and text clustering. It also presents the predominant applications of text data mining, for example, topic modeling, sentiment analysis and opinion mining, topic detection and tracking, information extraction, and automatic text summarization. Bringing all the related concepts and algorithms together, it offers a comprehensive, authoritative and coherent overview.  

Written by three leading experts, it is valuable both as a textbook and as a reference resource for students, researchers and practitioners interested in text data mining. It can also be used for classes on text data mining or NLP.


Authors and Affiliations

  • Institute of Automation, Chinese Academy of Sciences, Beijing, China

    Chengqing Zong, Jiajun Zhang

  • School of Computer Science & Engineering, Nanjing University of Science and Technology, Nanjing, China

    Rui Xia

About the authors

Chengqing Zong is a Professor at the National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences (CASIA), and an adjunct professor in the School of Artificial Intelligence (SAIU) at the University of Chinese Academy of Sciences (UCAS). He authored the book “Statistical Natural Language Processing” (in Chinese; more than 32,000 copies sold) and has published more than 200 papers on machine translation, natural language processing and cognitive linguistics. He has served as a chair for numerous prestigious conferences, such as ACL, COLING, AAAI and IJCAI, as an associate editor for journals such as ACM TALLIP and Acta Automatica Sinica, and as an editorial board member for journals including IEEE Intelligent Systems, Journal of Computer Science and Technology, and Machine Translation. He is currently the President of the Asian Federation of Natural Language Processing (AFNLP) and a member of the International Committee on Computational Linguistics (ICCL).

Rui Xia is a Professor at the School of Computer Science and Engineering, Nanjing University of Science and Technology, China. He has published more than 50 papers in high-quality journals and top-tier conferences in the fields of natural language processing and text data mining. He serves as an area chair and senior program committee member for several top conferences, such as EMNLP, COLING, IJCAI and AAAI. He received the outstanding paper award of ACL 2019 and, in 2020, the Distinguished Young Scholar award from the Natural Science Foundation of Jiangsu Province, China.

Jiajun Zhang is a Professor at NLPR, CASIA and an adjunct professor in the SAIU of UCAS. He has published more than 80 conference papers and journal articles on natural language processing and text mining, and has received five best paper awards. He has served as an area chair or on the senior program committee for several top conferences, such as ACL, EMNLP, COLING, AAAI and IJCAI. He is the deputy director of the Machine Translation Technical Committee of the Chinese Information Processing Society of China. He received the Qian Wei-Chang Science and Technology Award of Chinese Information Processing and the CIPS Hanvon Youth Innovation Award. He was supported by the Elite Scientists Sponsorship Program of the China Association for Science and Technology (CAST).


Bibliographic Information

  • Book Title: Text Data Mining

  • Authors: Chengqing Zong, Rui Xia, Jiajun Zhang

  • DOI: https://doi.org/10.1007/978-981-16-0100-2

  • Publisher: Springer Singapore

  • eBook Packages: Computer Science, Computer Science (R0)

  • Copyright Information: Tsinghua University Press 2021

  • Hardcover ISBN: 978-981-16-0099-9 (Published: 23 May 2021)

  • Softcover ISBN: 978-981-16-0102-6 (Published: 24 May 2022)

  • eBook ISBN: 978-981-16-0100-2 (Published: 22 May 2021)

  • Edition Number: 1

  • Number of Pages: XXI, 351

  • Number of Illustrations: 207 b/w illustrations, 7 illustrations in colour

  • Additional Information: Jointly published with Tsinghua University Press, Beijing, China

  • Topics: Natural Language Processing (NLP), Data Mining and Knowledge Discovery, Machine Learning
