Overview
- Authors:
  - Weili Wu, Department of Computer Science, The University of Texas at Dallas, Richardson, USA
  - Hui Xiong, Department of Computer Science and Engineering, University of Minnesota - Twin Cities, Minneapolis, USA
  - Shashi Shekhar, Department of Computer Science and Engineering, University of Minnesota - Twin Cities, Minneapolis, USA
Table of contents (10 chapters)
- Front Matter
  Pages i-viii
- Ricardo Baeza-Yates, Benjamín Bustos, Edgar Chávez, Norma Herrera, Gonzalo Navarro
  Pages 1-33
- Sudipto Guha, Rajeev Rastogi, Kyuseok Shim
  Pages 35-82
- Levent Ertöz, Michael Steinbach, Vipin Kumar
  Pages 83-103
- Ji He, Ah-Hwee Tan, Chew-Lim Tan, Sam-Yuan Sung
  Pages 105-133
- Wesley W. Chu, Victor Zhenyu Liu, Wenlei Mao
  Pages 135-159
- Steven Noel, Vijay Raghavan, C.-H. Henry Chu
  Pages 161-193
- Ji-Rong Wen, Hong-Jiang Zhang
  Pages 195-225
- Sam Y. Sung, Zhao Li, Tok W. Ling
  Pages 227-259
- Daniel J. Crichton, J. Steven Hughes, Sean Kelly
  Pages 261-298
About this book
Clustering is an important technique for discovering relatively dense sub-regions or sub-spaces of a multidimensional data distribution. Clustering has been used in information retrieval for many different purposes, such as query expansion, document grouping, document indexing, and visualization of search results. In this book, we address issues of clustering algorithms, evaluation methodologies, applications, and architectures for information retrieval.

The first two chapters discuss clustering algorithms. The chapter by Baeza-Yates et al. describes a clustering method for a general metric space, which is a common model of data relevant to information retrieval. The chapter by Guha, Rastogi, and Shim presents a survey as well as a detailed discussion of two clustering algorithms: CURE and ROCK, for numeric data and categorical data, respectively.

Evaluation methodologies are addressed in the next two chapters. Ertöz et al. demonstrate the use of text retrieval benchmarks, such as TREC, to evaluate clustering algorithms. He et al. provide objective measures of clustering quality in their chapter.

Applications of clustering methods to information retrieval are addressed in the next four chapters. Chu et al. and Noel et al. explore feature selection using word stems, phrases, and link associations for document clustering and indexing. Wen et al. and Sung et al. discuss applications of clustering to user queries and data cleansing.

Finally, we consider the problem of designing architectures for information retrieval. Crichton, Hughes, and Kelly elaborate on the development of a scientific data system architecture for information retrieval.