  • Book
  • © 2004

Survey of Text Mining

Clustering, Classification, and Retrieval

Table of contents (9 chapters)

  1. Front Matter

    Pages i-xvii
  2. Clustering and Classification

    1. Front Matter

      Pages 1-1
    2. Automatic Discovery of Similar Words

      • Pierre P. Senellart, Vincent D. Blondel
      Pages 25-43
    3. Feature Selection and Document Clustering

      • Inderjit Dhillon, Jacob Kogan, Charles Nicholas
      Pages 73-100
  3. Information Extraction and Retrieval

    1. Front Matter

      Pages 101-101
    2. Vector Space Models for Search and Cluster Mining

      • Mei Kobayashi, Masaki Aono
      Pages 103-122
    3. HotMiner: Discovering Hot Topics from Dirty Text

      • Malú Castellanos
      Pages 123-157
    4. Combining Families of Information Retrieval Algorithms Using Metalearning

      • Michael Cornelson, Ed Greengrass, Robert L. Grossman, Ron Karidi, Daniel Shnidman
      Pages 159-169
  4. Trend Detection

    1. Front Matter

      Pages 171-171
    2. Trend and Behavior Detection from Web Queries

      • Peiling Wang, Jennifer Bownas, Michael W. Berry
      Pages 173-183
    3. A Survey of Emerging Trend Detection in Textual Data Mining

      • April Kontostathis, Leon M. Galitsky, William M. Pottenger, Soma Roy, Daniel J. Phelps
      Pages 185-224
  5. Back Matter

    Pages 225-244

About this book

Extracting content from text continues to be an important research problem for information processing and management. Approaches to capture the semantics of text-based document collections may be based on Bayesian models, probability theory, vector space models, statistical models, or even graph theory.

As the volume of digitized textual media continues to grow, so does the need for robust, scalable indexing and search software that meets a variety of user needs. Knowledge extraction or creation from text requires systematic and reliable processing that can be codified and adapted to changing needs and environments.

This book draws upon experts in both academia and industry to recommend practical approaches to the purification, indexing, and mining of textual information. It addresses document identification, clustering and categorizing documents, cleaning text, and visualizing semantic models of text.

Editors and Affiliations

  • Department of Computer Science, University of Tennessee, Knoxville, USA

    Michael W. Berry
