Overview

Authors:

K.G. Srinivasa,
Anil Kumar Muppalla

K.G. Srinivasa
1. M.S. Ramaiah Institute of Technology, Bangalore, India

Anil Kumar Muppalla
1. M.S. Ramaiah Institute of Technology, Bangalore, India

Provides a guide to the distributed computing technologies of Hadoop and Spark, from the perspective of industry practitioners
Supports the theory with case studies taken from a range of disciplines, including data mining, machine learning, graph processing and image processing
Supplies working source code to aid understanding through step-by-step implementation
Includes supplementary material: sn.pub/extras

Part of the book series: Computer Communications and Networks (CCN)

Table of contents (8 chapters)

Front Matter
  Pages i-xvii

Programming Fundamentals of High Performance Distributed Computing
  Front Matter, page 1
1. Introduction
  K.G. Srinivasa, Anil Kumar Muppalla
  Pages 3-31
2. Getting Started with Hadoop
  K.G. Srinivasa, Anil Kumar Muppalla
  Pages 33-72
3. Getting Started with Spark
  K.G. Srinivasa, Anil Kumar Muppalla
  Pages 73-99
4. Programming Internals of Scalding and Spark
  K.G. Srinivasa, Anil Kumar Muppalla
  Pages 101-154

Case Studies Using Hadoop, Scalding and Spark
  Front Matter, page 155
5. Case Study I: Data Clustering Using Scalding and Spark
  K.G. Srinivasa, Anil Kumar Muppalla
  Pages 157-183
6. Case Study II: Data Classification Using Scalding and Spark
  K.G. Srinivasa, Anil Kumar Muppalla
  Pages 185-217
7. Case Study III: Regression Analysis Using Scalding and Spark
  K.G. Srinivasa, Anil Kumar Muppalla
  Pages 219-259
8. Case Study IV: Recommender System Using Scalding and Spark
  K.G. Srinivasa, Anil Kumar Muppalla
  Pages 261-301

Back Matter
  Pages 303-304

About this book

This timely text/reference describes the development and implementation of large-scale distributed processing systems using open source tools and technologies. Comprehensive in scope, the book presents state-of-the-art material on building high performance distributed computing systems, providing practical guidance and best practices as well as describing theoretical software frameworks.

Features:
Describes the fundamentals of building scalable software systems for large-scale data processing in the new paradigm of high performance distributed computing
Presents an overview of the Hadoop ecosystem, followed by step-by-step instruction on its installation, programming and execution
Reviews the basics of Spark, including resilient distributed datasets, and examines Hadoop streaming and working with Scalding
Provides detailed case studies on approaches to clustering, data classification and regression analysis
Explains the process of creating a working recommender system using Scalding and Spark

Authors and Affiliations

M.S. Ramaiah Institute of Technology, Bangalore, India

K.G. Srinivasa, Anil Kumar Muppalla

Bibliographic Information

Book Title: Guide to High Performance Distributed Computing
Book Subtitle: Case Studies with Hadoop, Scalding and Spark
Authors: K.G. Srinivasa, Anil Kumar Muppalla
Series Title: Computer Communications and Networks
DOI: https://doi.org/10.1007/978-3-319-13497-0
Publisher: Springer Cham
eBook Packages: Computer Science, Computer Science (R0)
Copyright Information: Springer International Publishing Switzerland 2015
Hardcover ISBN: 978-3-319-13496-3 (published 09 March 2015)
Softcover ISBN: 978-3-319-38347-7 (published 06 October 2016)
eBook ISBN: 978-3-319-13497-0 (published 09 February 2015)
Series ISSN: 1617-7975
Series E-ISSN: 2197-8433
Edition Number: 1
Number of Pages: XVII, 304
Number of Illustrations: 43 b/w illustrations
Topics: Computer Communication Networks, Programming Techniques, Data Mining and Knowledge Discovery, Artificial Intelligence, Image Processing and Computer Vision
