Beginning Apache Spark Using Azure Databricks

Unleashing Large Cluster Analytics in the Cloud

Authors: Ilijason, Robert

  • Teaches you how to extract value from massive datasets, using a toolset that can be up and running the same day
  • Shows you why Azure Databricks is an up-and-coming, fast-growing tool that anyone in data should know about
  • Aimed at data analysts and business analysts who are curious about the hype surrounding cloud technology and Apache Spark

Buy this book

eBook £23.99
price for United Kingdom (gross)
  • ISBN 978-1-4842-5781-4
  • Digitally watermarked, DRM-free
  • Included format: EPUB, PDF
  • ebooks can be used on all reading devices
  • Immediate eBook download after purchase
Softcover £29.99
price for United Kingdom (gross)
  • ISBN 978-1-4842-5780-7
  • Free shipping for individuals worldwide
  • Immediate ebook access, if available*, with your print order
  • Usually dispatched within 3 to 5 business days.
About this book

Analyze vast amounts of data in record time using Apache Spark with Databricks in the cloud. Learn the fundamentals, and more, of running analytics on large clusters in Azure and AWS, with Databricks running on top of Apache Spark. Discover how to squeeze the most value out of your data at a fraction of what classical analytics solutions cost, while getting the results you need faster.

This book explains how the confluence of these pivotal technologies gives you enormous power at low cost when working with huge datasets. You will begin by learning how cloud infrastructure makes it possible to scale your code to large numbers of processing units without having to pay for the machinery in advance. From there you will learn how Apache Spark, an open source framework, can put all those CPUs to work on data analytics. Finally, you will see how services such as Databricks provide the power of Apache Spark without requiring you to know anything about configuring hardware or software. By removing the need for expensive experts and hardware, you can instead allocate your resources to finding business value in the data.

This book guides you through some advanced topics such as analytics in the cloud, data lakes, data ingestion, architecture, machine learning, and tools, including Apache Spark, Apache Hadoop, Apache Hive, Python, and SQL. Valuable exercises help reinforce what you have learned.


What You Will Learn

  • Discover the value of big data analytics that leverage the power of the cloud
  • Get started with Databricks using SQL and Python in either Microsoft Azure or AWS
  • Understand the underlying technology, and how the cloud and Apache Spark fit into the bigger picture
  • See how these tools are used in the real world
  • Run basic analytics, including machine learning, on billions of rows at a fraction of the usual cost, or for free


Who This Book Is For

Data engineers, data scientists, and cloud architects who want or need to run advanced analytics in the cloud. It is assumed that the reader has data experience, but perhaps minimal exposure to Apache Spark and Azure Databricks. The book is also recommended for people who want to get started in the analytics field, as it provides a strong foundation.

About the author

Robert Ilijason is a 20-year veteran in the business intelligence (BI) segment. He has worked as a contractor for some of Europe’s biggest companies and has conducted large-scale analytics projects within the areas of retail, telecom, banking, government, and more. He has seen his share of analytics trends come and go over the years, but he strongly believes that Apache Spark in the cloud, especially with Azure Databricks, is unlike most of them: a game changer.

Table of contents (11 chapters)

Bibliographic Information

Book Title: Beginning Apache Spark Using Azure Databricks
Book Subtitle: Unleashing Large Cluster Analytics in the Cloud
Authors: Robert Ilijason
Copyright: 2020
Publisher: Apress
Copyright Holder: Robert Ilijason
eBook ISBN: 978-1-4842-5781-4
DOI: 10.1007/978-1-4842-5781-4
Softcover ISBN: 978-1-4842-5780-7
Edition Number: 1
Number of Pages: XVII, 274
Number of Illustrations: 14 b/w illustrations

*Immediately available upon purchase, as print book shipments may be delayed due to the COVID-19 crisis. Ebook access is temporary and does not include ownership of the ebook. Only valid for books with an ebook version. Springer Reference Works and instructor copies are not included.