Name: Big Data Preprocessing
ISBN: 978-3-030-39105-8

Overview

Authors:

Julián Luengo,
Diego García-Gil,
Sergio Ramírez-Gallego,
Salvador García,
Francisco Herrera

Julián Luengo
1. Department of Computer Science and AI, University of Granada, Granada, Spain
Diego García-Gil
1. Department of Computer Science and AI, University of Granada, Granada, Spain
Sergio Ramírez-Gallego
1. DOCOMO Digital España, Madrid, Spain
Salvador García
1. Department of Computer Science and AI, University of Granada, Granada, Spain
Francisco Herrera
1. Department of Computer Science and AI, University of Granada, Granada, Spain

One of the first books on preprocessing in Big Data that covers a large number of significant issues, namely the enumeration and description of some of the most recent solutions for imbalanced classification, the characteristics of novel problems and applications (with the latest published algorithms), and implementations of working techniques ready to be used in well-known Big Data
Covers data intrinsic characteristics
Presents the concept of Smart Data

21k Accesses
52 Citations


Table of contents (10 chapters)

All chapters are by Julián Luengo, Diego García-Gil, Sergio Ramírez-Gallego, Salvador García and Francisco Herrera.

Front Matter (Pages i-xiii)
Introduction (Pages 1-14)
Big Data: Technologies and Tools (Pages 15-43)
Smart Data (Pages 45-51)
Dimensionality Reduction for Big Data (Pages 53-79)
Data Reduction for Big Data (Pages 81-99)
Imperfect Big Data (Pages 101-119)
Big Data Discretization (Pages 121-146)
Imbalanced Data Preprocessing for Big Data (Pages 147-160)
Big Data Software (Pages 161-182)
Final Thoughts: From Big Data to Smart Data (Pages 183-186)

About this book

This book offers a comprehensible overview of Big Data Preprocessing, including a formal description of each problem. It also focuses on the most relevant proposed solutions and illustrates actual implementations of algorithms that help the reader deal with these problems.

This book stresses the gap that exists between big, raw data and the quality data that businesses demand. Such quality data is called Smart Data, and preprocessing is a key step in achieving it: imperfections are handled, integration tasks are performed, and other processes are carried out to eliminate superfluous information. The authors present the concept of Smart Data through data preprocessing in Big Data scenarios and connect it with the emerging paradigms of IoT and edge computing, where the end points generate Smart Data without relying completely on the cloud.

Finally, this book presents some novel areas of study that are attracting growing attention in Big Data preprocessing. Specifically, it considers the relation with Deep Learning (a technique that also relies on large volumes of data), the difficulty of finding the appropriate selection and concatenation of preprocessing techniques, and some other open problems.

Practitioners and data scientists who work in this field and want an introduction to preprocessing in large data volume scenarios will want to purchase this book. Researchers in this field who want to know which algorithms are currently implemented to help their investigations may also be interested in this book.

Authors and Affiliations

Department of Computer Science and AI, University of Granada, Granada, Spain

Julián Luengo, Diego García-Gil, Salvador García, Francisco Herrera

DOCOMO Digital España, Madrid, Spain

Sergio Ramírez-Gallego

About the authors

Julián Luengo received the M.S. degree in computer science and the Ph.D. from the University of Granada, Granada, Spain, in 2006 and 2011, respectively. He is currently an Assistant Professor in the Department of Computer Science and Artificial Intelligence at the University of Granada, Spain. His research interests include machine learning and data mining, data preparation in knowledge discovery and data mining, missing values, noisy data, data complexity and fuzzy systems. Dr. Luengo has received several awards and honors for his personal work and for his publications in journals and conferences, such as the IFSA-EUSFLAT 2009 Best Student Paper Award. He belongs to the list of Highly Cited Researchers in the area of Computer Sciences (2015-2018) (Clarivate Analytics).

Diego García-Gil received the M.Sc. degree in computer science from the University of Granada, Granada, Spain, in 2015. He is currently pursuing the Ph.D. degree with the Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain. His current research interests include machine learning, data mining, data preprocessing and Big Data.

Sergio Ramírez-Gallego received the M.Sc. degree in computer science from the University of Jaén, Jaén, Spain, in 2012. He obtained the Ph.D. degree with the Department of Computer Science and Artificial Intelligence, University of Granada, Spain in 2018. His current research interests include data mining, data preprocessing, big data, and cloud computing.

Salvador García received the B.S. and Ph.D. degrees in Computer Science from the University of Granada, Granada, Spain, in 2004 and 2008, respectively. He is currently an Associate Professor in the Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain. Dr. García has published more than 80 papers in international journals (more than 60 in Q1) and over 60 papers in international conference proceedings, with an h-index of 43 (data from Web of Science). He has organized several special sessions and workshops related to data preprocessing and evolutionary learning in conferences such as “Hybrid Intelligent Systems”, “Intelligent Systems Design and Applications” and “International Joint-Conference of Neural Networks”. He has been associated with the international program committees and organizing committees of several regular international conferences including IEEE CEC, ICPR, ICDM, IJCAI, etc. As for editorial activities, he has co-edited two special issues in international journals; he is an associate editor of the journals “Information Fusion” (Elsevier), “Swarm and Evolutionary Computation” (Elsevier) and “AI Communications” (IOS Press), and co-Editor in Chief of the international journal “Progress in Artificial Intelligence” (Springer). He is a co-author of the books entitled “Data Preprocessing in Data Mining” and “Learning from Imbalanced Data Sets”, published by Springer. His research interests include data science, data preprocessing, Big Data, evolutionary learning, Deep Learning, metaheuristics and biometrics.

Francisco Herrera (SM'15) received his M.Sc. in Mathematics in 1988 and Ph.D. in Mathematics in 1991, both from the University of Granada, Spain. He is currently a Professor in the Department of Computer Science and Artificial Intelligence at the University of Granada and Director of DaSCI Institute (Andalusian Research Institute in Data Science and Computational Intelligence). He has been the supervisor of 44 Ph.D. students. He has published more than 400 journal papers, receiving more than 66000 citations (Google Scholar, h-index 132). He is co-author of the books "Genetic Fuzzy Systems" (World Scientific, 2001), "Data Preprocessing in Data Mining" (Springer, 2015), "The 2-tuple Linguistic Model. Computing with Words in Decision Making" (Springer, 2015), "Multilabel Classification. Problem analysis, metrics and techniques" (Springer, 2016), "Multiple Instance Learning. Foundations and Algorithms" (Springer, 2016) and "Learning from Imbalanced Data Sets" (Springer, 2018). He currently acts as Editor in Chief of the international journals "Information Fusion" (Elsevier) and "Progress in Artificial Intelligence" (Springer). He is an editorial board member of a dozen journals.

Bibliographic Information

Book Title: Big Data Preprocessing
Book Subtitle: Enabling Smart Data
Authors: Julián Luengo, Diego García-Gil, Sergio Ramírez-Gallego, Salvador García, Francisco Herrera
DOI: https://doi.org/10.1007/978-3-030-39105-8
Publisher: Springer Cham
eBook Packages: Computer Science, Computer Science (R0)
Copyright Information: Springer Nature Switzerland AG 2020
Hardcover ISBN: 978-3-030-39104-1 (Published: 17 March 2020)
Softcover ISBN: 978-3-030-39107-2 (Published: 17 March 2021)
eBook ISBN: 978-3-030-39105-8 (Published: 16 March 2020)
Edition Number: 1
Number of Pages: XIII, 186
Number of Illustrations: 3 b/w illustrations, 54 illustrations in colour
Topics: Big Data, Machine Learning, Information Systems and Communication Service
