Evolutionary Decision Trees in Large-Scale Data Mining

  • Book
  • © 2019

Overview

  • Sums up the author's research conducted over the last 15 years on the evolutionary induction of decision trees
  • Discusses basic elements from three domains, all of which are necessary to follow the proposed approach: evolutionary computation, decision trees, and parallel and distributed computing
  • Presents in detail an evolutionary approach to the induction of decision trees

Part of the book series: Studies in Big Data (SBD, volume 59)

Table of contents (8 chapters)

  1. Background

  2. The Approach

  3. Extensions

  4. Large-Scale Mining

About this book

This book presents a unified framework, based on specialized evolutionary algorithms, for the global induction of various types of classification and regression trees from data. The resulting univariate or oblique trees are significantly smaller than those produced by standard top-down methods, an aspect that is critical for the interpretation of mined patterns by domain analysts. The approach is flexible and can be adapted to specific data mining applications, e.g., cost-sensitive model trees for financial data or multi-test trees for gene expression data. The global induction can be applied efficiently to large-scale data without the need for extraordinary resources. With a simple GPU-based acceleration, datasets composed of millions of instances can be mined in minutes. When a dataset is too large for fast in-memory computing, the Spark-based implementation on computer clusters, which offers fault tolerance and scalability, can be applied instead.


Reviews

“The structure of the book is well-thought-out. … I recommend the book for students, researchers, and developers interested in real-life applications of big data analysis.” (K. Balogh, Computing Reviews, February 15, 2021)

Authors and Affiliations

  • Marek Kretowski, Faculty of Computer Science, Bialystok University of Technology, Bialystok, Poland
