Scaling Machine Learning with Spark: Distributed ML with MLlib, TensorFlow, and PyTorch
Author: Adi Polak
Price: $64.64
Publication Date: April 11, 2023
This book presents a detailed guide to scalable machine learning with Spark. As data volumes grow rapidly, traditional approaches struggle to keep up. Polak equips readers with hands-on techniques using MLlib, TensorFlow, and PyTorch to build robust applications that can handle large datasets. Whether you are a beginner or an experienced data scientist, the practical examples and insights shared in this book make it a must-read for anyone looking to excel in distributed machine learning.
Beginning Apache Spark Using Azure Databricks: Unleashing Large Cluster Analytics in the Cloud
Author: Robert Ilijason
Price: $22.69
Publication Date: June 12, 2020
This engaging guide serves as a springboard into the world of big data analytics using Azure Databricks. Ilijason explains complex concepts with clear examples, making Spark accessible for beginners. The book emphasizes practical implementation, enabling readers to manage and analyze large datasets on cloud clusters effectively. With the ever-growing demand for cloud-based technologies, this book is an essential resource for those wanting to capitalize on data analysis and visualization in real time.
Graph Algorithms: Practical Examples in Apache Spark and Neo4j
Authors: Mark Needham, Amy E. Hodler
Price: $56.01
Publication Date: June 25, 2019
Needham and Hodler dive into the world of graph algorithms, showcasing their necessity in modern data tasks. This book takes a unique approach by integrating Apache Spark and Neo4j, demonstrating how to leverage graph structures and algorithms in big data environments. With a focus on practical examples and applications, this book is invaluable for data scientists and analysts eager to uncover insights from connected data.
Databricks Certified Associate Developer for Apache Spark Using Python
Author: Saba Shah
Price: $28.00
Publication Date: June 14, 2024
Shah’s guide paves the way for readers aspiring to become certified developers in Apache Spark. Filled with practical examples and exercises, it couples theoretical understanding with hands-on practice. Additionally, this book familiarizes readers with the Python language as it pertains to managing Spark applications, enhancing their professional competency. For those eyeing certification or advanced knowledge in Spark, this guide is a vital step in their learning journey.
Apache Airflow Best Practices: A Practical Guide to Orchestrating Data Workflow with Apache Airflow
Authors: Dylan Intorf, Dylan Storey, Kendrick van Doorn
Price: $35.99
Publication Date: October 31, 2024
This book emerges as a crucial resource for professionals looking to improve their data orchestration skills using Apache Airflow. Intorf and his co-authors share best practices for workflow management and real-world applications, providing insightful tips and strategies to overcome common challenges in data engineering. If you are keen on mastering data pipelines, this book offers essential knowledge to ensure your workflows are efficient and error-free.
Stream Processing with Apache Flink: Fundamentals, Implementation, and Operation of Streaming Applications
Authors: Fabian Hueske, Vasiliki Kalavri
Price: $47.99
Publication Date: May 21, 2019
This comprehensive guide addresses the growing demand for understanding stream processing systems like Apache Flink. Hueske and Kalavri delve into the fundamentals and operational strategies, ensuring readers grasp both the theoretical and practical elements of stream processing. With this book, professionals will learn how to design, implement, and maintain real-time streaming applications effectively, a skill that is essential in a world where timely data processing is pivotal.
Beginning Apache Spark 3: With DataFrame, Spark SQL, Structured Streaming, and Spark Machine Learning Library
Author: Hien Luu
Price: $47.66
Publication Date: October 22, 2021
Luu’s book serves as an excellent entry point into the latest iterations of Spark, particularly concentrating on practical usage of DataFrames, SQL, and machine learning. With detailed explanations and practical applications, this book empowers readers to leverage the full potential of Apache Spark 3. It is an invaluable asset for data professionals wanting to deepen their knowledge of big data technologies and embrace Spark’s advancements for impactful analytics.
Modern Data Engineering with Apache Spark: A Hands-On Guide for Building Mission-Critical Streaming Applications
Author: Scott Haines
Price: $29.06
Publication Date: March 23, 2022
Haines presents a practical and insightful guide for engineers interested in building mission-critical applications using Apache Spark. This book emphasizes hands-on examples and methodologies, enabling readers to construct robust, scalable, and efficient data processing systems. Its contemporary approach to data engineering couples theory with real-world applications, making it a must-read for engineers in the rapidly evolving field of big data.
Hands-on Guide to Apache Spark 3: Build Scalable Computing Engines for Batch and Stream Data Processing
Author: Alfonso Antolínez García
Price: $33.81
Publication Date: June 6, 2023
This guide stands out as an essential reference for building scalable engines that address both batch and stream data processing needs. Antolínez García introduces strong foundational principles, complemented by hands-on tutorials, that help both beginners and professionals build effective data processing engines. This book is particularly useful for tech enthusiasts looking to apply Spark 3 capabilities in real-world scenarios.
Data Analysis with Python and PySpark
Author: Jonathan Rioux
Price: $59.99
Publication Date: March 22, 2022
Rioux delivers an insightful exploration of data analysis leveraging Python and PySpark. This book covers an extensive range of techniques, from basic data manipulation to advanced analytics, thereby catering to a diverse audience—from beginners to seasoned analysts. The practical, hands-on approach ensures that readers can apply the concepts immediately, making this guide a staple for anyone serious about data-driven decision-making.