Overview

Editors:

Huan Liu, Arizona State University, USA
Hiroshi Motoda, Osaka University, Japan

Part of the book series: The Springer International Series in Engineering and Computer Science (SECS, volume 608)



Table of contents (22 chapters)

Front Matter

Pages i-xxv

Background and Foundation
1. Front Matter
  
  Pages 1-1
  
2. Data Reduction via Instance Selection
  
  Huan Liu, Hiroshi Motoda
  
  Pages 3-20
3. Sampling: Knowing Whole from Its Part
  
  Baohua Gu, Feifang Hu, Huan Liu
  
  Pages 21-38
4. A Unifying View on Instance Selection
  
  Thomas Reinartz
  
  Pages 39-56
Instance Selection Methods
1. Front Matter
  
  Pages 57-57
  
2. Competence Guided Instance Selection for Case-Based Reasoning
  
  Barry Smyth, Elizabeth McKenna
  
  Pages 59-76
3. Identifying Competence-Critical Instances for Instance-Based Learners
  
  Henry Brighton, Chris Mellish
  
  Pages 77-94
4. Genetic-Algorithm-Based Instance and Feature Selection
  
  Hisao Ishibuchi, Tomoharu Nakashima, Manabu Nii
  
  Pages 95-112
5. The Landmark Model: An Instance Selection Method for Time Series Data
  
  Chang-Shing Perng, Sylvia R. Zhang, D. Stott Parker
  
  Pages 113-130
Use of Sampling Methods
1. Front Matter
  
  Pages 131-131
  
2. Adaptive Sampling Methods for Scaling up Knowledge Discovery Algorithms
  
  Carlos Domingo, Ricard Gavaldà, Osamu Watanabe
  
  Pages 133-150
3. Progressive Sampling
  
  Foster Provost, David Jensen, Tim Oates
  
  Pages 151-170
4. Sampling Strategy for Building Decision Trees from Very Large Databases Comprising Many Continuous Attributes
  
  Jean-Hugues Chauchat, Ricco Rakotomalala
  
  Pages 171-188
5. Incremental Classification Using Tree-Based Sampling for Large Data
  
  Hankil Yoon, Khaled Alsabti, Sanjay Ranka
  
  Pages 189-206
Unconventional Methods
1. Front Matter
  
  Pages 207-207
  
2. Instance Construction via Likelihood-Based Data Squashing
  
  David Madigan, Nandini Raghavan, William DuMouchel, Martha Nason, Christian Posse, Greg Ridgeway
  
  Pages 209-226
3. Learning via Prototype Generation and Filtering
  
  Wai Lam, Chi-Kin Keung, Charles X. Ling
  
  Pages 227-244
4. Instance Selection Based on Hypertuples
  
  Hui Wang
  
  Pages 245-262
5. KBIS: Using Domain Knowledge to Guide Instance Selection
  
  Peggy Wright, Julia Hodges
  
  Pages 263-279

About this book

The ability to analyze and understand massive data sets lags far behind the ability to gather and store the data. To meet this challenge, knowledge discovery and data mining (KDD) is growing rapidly as an emerging field. However, no matter how powerful computers are now or will be in the future, KDD researchers and practitioners must consider how to manage ever-growing data, which is, ironically, a product of the extensive use of computers and the ease of collecting data with them. Many approaches have been used to address this data explosion, such as algorithm scale-up and data reduction. Instance, example, or tuple selection pertains to methods or algorithms that select or search for a representative portion of the data that can fulfill a KDD task as if the whole data set were used. Instance selection is directly related to data reduction and is becoming increasingly important in many KDD applications because of the need for processing efficiency, storage efficiency, or both.
One of the major means of instance selection is sampling, whereby a sample is selected for testing and analysis, and randomness is a key element in the process. Instance selection also covers methods that require search. Examples can be found in density estimation (finding the representative instances, or data points, for a cluster); boundary hunting (finding the critical instances that form boundaries between data points of different classes); and data squashing (producing weighted new data with equivalent sufficient statistics). Other important issues related to instance selection include unwanted precision, focusing, concept drift, noise and outlier removal, and data smoothing.
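To make the contrast between sampling and search-based selection concrete, here is a minimal Python sketch. It is an illustration only and is not taken from the book: `simple_random_sample` draws instances uniformly at random, while `condensed_nearest_neighbor` uses Hart's condensed nearest neighbor rule as a stand-in for boundary hunting. Whether any chapter uses this particular rule is not stated here. All function names, the tuple-based data layout, and the toy data are assumptions made for the example.

```python
import random
from typing import Hashable, List, Sequence, Tuple

# Assumed layout for this sketch: each instance is (feature vector, class label).
Instance = Tuple[Tuple[float, ...], Hashable]


def simple_random_sample(data: Sequence[Instance], k: int, seed: int = 0) -> List[Instance]:
    """Sampling: pick k instances uniformly at random, without replacement."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return rng.sample(list(data), k)


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_label(store: Sequence[Instance], point: Sequence[float]) -> Hashable:
    """Label of the stored instance closest to `point` (1-nearest-neighbor)."""
    return min(store, key=lambda item: squared_distance(item[0], point))[1]


def condensed_nearest_neighbor(data: Sequence[Instance]) -> List[Instance]:
    """Search-based selection (Hart's rule): keep an instance only if the
    instances kept so far would misclassify it under 1-NN."""
    if not data:
        return []
    store = [data[0]]
    changed = True
    while changed:  # repeat passes until a full pass adds nothing
        changed = False
        for point, label in data:
            if (point, label) in store:
                continue
            if nearest_label(store, point) != label:
                store.append((point, label))
                changed = True
    return store


if __name__ == "__main__":
    # Toy one-dimensional data with two well-separated classes (hypothetical).
    toy = [((0.0,), "a"), ((0.1,), "a"), ((0.2,), "a"),
           ((5.0,), "b"), ((5.1,), "b"), ((5.2,), "b")]
    print(simple_random_sample(toy, 3))
    print(condensed_nearest_neighbor(toy))
```

The random sample ignores class structure entirely, whereas the condensed set keeps only as many instances as are needed to classify the remaining ones correctly with a nearest-neighbor rule. The result of the condensing step depends on the order in which instances are visited.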
Instance Selection and Construction for Data Mining brings researchers and practitioners together to report new developments and applications, to share hard-won experience so that others can avoid similar pitfalls, and to shed light on the future development of instance selection. This volume serves as a comprehensive reference for graduate students, practitioners, and researchers in KDD.

Editors and Affiliations

Huan Liu, Arizona State University, USA
Hiroshi Motoda, Osaka University, Japan

Bibliographic Information

Book Title: Instance Selection and Construction for Data Mining
Editors: Huan Liu, Hiroshi Motoda
Series Title: The Springer International Series in Engineering and Computer Science
DOI: https://doi.org/10.1007/978-1-4757-3359-4
Publisher: Springer New York, NY
eBook Packages: Springer Book Archive
Copyright Information: Springer Science+Business Media Dordrecht 2001
Hardcover ISBN: 978-0-7923-7209-7 (published 28 February 2001)
Softcover ISBN: 978-1-4419-4861-8 (published 08 December 2010)
eBook ISBN: 978-1-4757-3359-4 (published 09 March 2013)
Series ISSN: 0893-3405
Edition Number: 1
Number of Pages: XXV, 416
Topics: Data Structures and Information Theory, Artificial Intelligence, Information Storage and Retrieval, Statistics, general
