PRACTICAL MACHINE LEARNING ON DATABRICKS seamlessly transition ML models and MLOps on Databricks

Name: PRACTICAL MACHINE LEARNING ON DATABRICKS: seamlessly transition ML models and MLOps on Databricks
Availability: Online only
Author: Sinha, Debu

Published

Birmingham, UK : Packt Publishing Ltd., 2023.

Status

Available Online

Links

O'Reilly

Description

Take your machine learning skills to the next level by mastering Databricks and building robust ML pipeline solutions for future ML innovations.

Key Features

Learn to build robust ML pipeline solutions for Databricks transition
Master commonly available features like AutoML and MLflow
Leverage data governance and model deployment using MLflow model registry
Purchase of the print or Kindle book includes a free PDF eBook

Book Description

Unleash the potential of Databricks for end-to-end machine learning with this comprehensive guide, tailored for experienced data scientists and developers transitioning from DIY or other cloud platforms. Building on a strong foundation in Python, Practical Machine Learning on Databricks serves as your roadmap from development to production, covering all intermediary steps using the Databricks platform. You'll start with an overview of machine learning applications, Databricks platform features, and MLflow. Next, you'll dive into data preparation, model selection, and training essentials and discover the power of Databricks feature store for precomputing feature tables. You'll also learn to kickstart your projects using Databricks AutoML and automate retraining and deployment through Databricks workflows. By the end of this book, you'll have mastered MLflow for experiment tracking, collaboration, and advanced use cases like model interpretability and governance. The book is enriched with hands-on example code at every step. While primarily focused on generally available features, the book equips you to easily adapt to future innovations in machine learning, Databricks, and MLflow.

What you will learn

Transition smoothly from DIY setups to Databricks
Master AutoML for quick ML experiment setup
Automate model retraining and deployment
Leverage Databricks feature store for data prep
Use MLflow for effective experiment tracking
Gain practical insights for scalable ML solutions
Find out how to handle model drift in production environments

Who this book is for

This book is for experienced data scientists, engineers, and developers proficient in Python, statistics, and the ML lifecycle who are looking to transition to Databricks from DIY clouds. Introductory Spark knowledge is a must to make the most of this book; however, end-to-end ML workflows will be covered. If you aim to accelerate your machine learning workflows and deploy scalable, robust solutions, this book is an indispensable resource.

More Details

Format

Edition

1st edition.

Language

English

ISBN

9781801818292, 1801818290

Notes

Description

Local note

O'Reilly O'Reilly Online Learning: Academic/Public Library Edition

Cover

Title page

Copyright and credits

Contributors

About the author

About the reviewers

Table of Contents

Preface

Part 1: Introduction

Chapter 1: The ML Process and Its Challenges

Understanding the typical machine learning process

Discovering the roles associated with machine learning projects in organizations

Challenges with productionizing machine learning use cases in organizations

Understanding the requirements of an enterprise-grade machine learning platform

Scalability: the growth catalyst

Performance: ensuring efficiency and speed

Security: safeguarding data and models

Governance: steering the machine learning life cycle

Reproducibility: ensuring trust and consistency

Ease of use: balancing complexity and usability

Exploring Databricks and the Lakehouse architecture

Scalability: the growth catalyst

Performance: ensuring efficiency and speed

Security: safeguarding data and models

Governance: steering the machine learning life cycle

Reproducibility: ensuring trust and consistency

Ease of use: balancing complexity and usability

Simplifying machine learning development with the Lakehouse architecture

Summary

Further reading

Chapter 2: Overview of ML on Databricks

Technical requirements

Setting up a Databricks trial account

Exploring the workspace

Repos

Exploring clusters

Single user

Shared

No isolation shared

Single-node clusters

Exploring notebooks

Exploring data

Exploring experiments

Discovering the feature store

Discovering the model registry

Libraries

Storing libraries

Managing libraries

Databricks Runtime and libraries

Library usage modes

Unity Catalog limitations

Installation sources for libraries

Summary

Further reading

Part 2: ML Pipeline Components and Implementation

Chapter 3: Utilizing the Feature Store

Technical requirements

Diving into feature stores and the problems they solve

Discovering feature stores on the Databricks platform

Feature table

Offline store

Online store

Training Set

Model packaging

Registering your first feature table in Databricks Feature Store

Summary

Further reading

Chapter 4: Understanding MLflow Components on Databricks

Technical requirements

Overview of MLflow

MLflow Tracking

MLflow Models

MLflow Model Registry

Example code showing how to track ML model training in Databricks

Summary

Chapter 5: Create a Baseline Model Using Databricks AutoML

Technical requirements

Understanding the need for AutoML

Understanding AutoML in Databricks

Sampling large datasets

Imbalance data detection

Splitting data into train/validation/test sets

Enhancing semantic type detection

Shapley value (SHAP) for model explainability

Feature Store integration

Running AutoML on our churn prediction dataset

Summary

Further reading

Part 3: ML Governance and Deployment

Chapter 6: Model Versioning and Webhooks

Technical requirements

Understanding the need for the Model Registry

Registering your candidate model to the Model Registry and managing access

Diving into the webhooks support in the Model Registry

Summary

Further reading

Chapter 7: Model Deployment Approaches

Technical requirements

Understanding ML deployments and paradigms

Deploying ML models for batch and streaming inference

Batch inference on Databricks

Streaming inference on Databricks

Deploying ML models for real-time inference

In-depth analysis of the constraints and capabilities of Databricks Model Serving.

Incorporating custom Python libraries into MLflow models for Databricks deployment

Deploying custom models with MLflow and Model Serving

Packaging dependencies with MLflow models

Summary

Further reading

Chapter 8: Automating ML Workflows Using Databricks Jobs

Technical requirements

Understanding Databricks Workflows

Utilizing Databricks Workflows with Jobs to automate model training and testing

Summary

Further reading

Chapter 9: Model Drift Detection and Retraining

Technical requirements

The motivation behind model monitoring

Introduction to model drift

Introduction to Statistical Drift

Techniques for drift detection

Hypothesis testing

Statistical tests and measurements for numeric features

Statistical tests and measurements for categorical features

Statistical tests and measurements on models

Implementing drift detection on Databricks

Summary

Chapter 10: Using CI/CD to Automate Model Retraining and Redeployment

Introduction to MLOps

Delta Lake: more than just a data lake

Comprehensive model management with Databricks MLflow

Integrating DevOps and MLOps for robust ML pipelines with Databricks

Fundamentals of MLOps and deployment patterns

Navigating environment isolation in Databricks: multiple strategies for MLOps

Understanding ML deployment patterns

The deploy models approach

The deploy code approach

Summary

Further reading

Index

Other Books You May Enjoy.

Subjects

LC Subjects

Machine learning -- Computer programs.

Citations

APA Citation, 7th Edition (style guide)

Sinha, D. (2023). PRACTICAL MACHINE LEARNING ON DATABRICKS: seamlessly transition ML models and MLOps on Databricks (1st edition). Packt Publishing Ltd.

Chicago / Turabian - Author Date Citation, 17th Edition (style guide)

Sinha, Debu. 2023. PRACTICAL MACHINE LEARNING ON DATABRICKS: Seamlessly Transition ML Models and MLOps On Databricks. Birmingham, UK: Packt Publishing Ltd.

Chicago / Turabian - Humanities (Notes and Bibliography) Citation, 17th Edition (style guide)

Sinha, Debu. PRACTICAL MACHINE LEARNING ON DATABRICKS: Seamlessly Transition ML Models and MLOps On Databricks. Birmingham, UK: Packt Publishing Ltd, 2023.

Harvard Citation (style guide)

Sinha, D. (2023). PRACTICAL MACHINE LEARNING ON DATABRICKS: seamlessly transition ML models and MLOps on Databricks. 1st edn. Birmingham, UK: Packt Publishing Ltd.

MLA Citation, 9th Edition (style guide)

Sinha, Debu. PRACTICAL MACHINE LEARNING ON DATABRICKS: Seamlessly Transition ML Models and MLOps On Databricks. 1st edition, Packt Publishing Ltd., 2023.

Note! Citations contain only title, author, edition, publisher, and year published. Citations should be used as a guideline and should be double checked for accuracy. Citation formats are based on standards as of August 2021.

Staff View

Grouped Work ID

e5fdfd95-c7b0-cfd9-f736-83399ccd3ae1-eng

Grouping Information

Grouped Work ID	e5fdfd95-c7b0-cfd9-f736-83399ccd3ae1-eng
Full title	practical machine learning on databricks seamlessly transition ml models and mlops on databricks
Author	sinha debu
Grouping Category	book
Last Update	2025-01-24 12:33:29PM
Last Indexed	2025-05-22 03:43:05AM

Book Cover Information

Image Source	default
First Loaded	May 30, 2025
Last Used	May 30, 2025

Marc Record

First Detected	Dec 16, 2024 11:27:19 PM
Last File Modification Time	Dec 17, 2024 08:26:44 AM
Suppressed	Record had no items

MARC Record

LEADER	09029cam a22003977a 4500
001	on1407093858
003	OCoLC
005	20241217082456.0
006	m o d
007	cr \|n\|\|\|\|\|\|\|\|\|
008	231103s2022 enk o 000 0 eng d
020			\|a 9781801818292\|q (electronic bk.)
020			\|a 1801818290\|q (electronic bk.)
035			\|a (OCoLC)1407093858
037			\|a 9781801812030\|b O'Reilly Media
040			\|a YDX\|b eng\|c YDX\|d OCLCO\|d ORMDA\|d DXU
049			\|a MAIN
050		4	\|a Q325.5
082	0	4	\|a 006.3/1\|2 23/eng/20231205
100	1		\|a Sinha, Debu,\|e author.
245	1	0	\|a PRACTICAL MACHINE LEARNING ON DATABRICKS\|h [electronic resource] :\|b seamlessly transition ML models and MLOps on Databricks /\|c Debu Sinha.
250			\|a 1st edition.
260			\|a Birmingham, UK :\|b Packt Publishing Ltd.,\|c 2023.
300			\|a 1 online resource
505	0		\|a Cover -- Title page -- Copyright and credits -- Contributors -- About the author -- About the reviewers -- Table of Contents -- Preface -- Part 1: Introduction -- Chapter 1: The ML Process and Its Challenges -- Understanding the typical machine learning process -- Discovering the roles associated with machine learning projects in organizations -- Challenges with productionizing machine learning use cases in organizations -- Understanding the requirements of an enterprise-grade machine learning platform -- Scalability -- the growth catalyst -- Performance -- ensuring efficiency and speed -- Security -- safeguarding data and models -- Governance -- steering the machine learning life cycle -- Reproducibility -- ensuring trust and consistency -- Ease of use -- balancing complexity and usability -- Exploring Databricks and the Lakehouse architecture -- Scalability -- the growth catalyst -- Performance -- ensuring efficiency and speed -- Security -- safeguarding data and models -- Governance -- steering the machine learning life cycle -- Reproducibility -- ensuring trust and consistency -- Ease of use -- balancing complexity and usability -- Simplifying machine learning development with the Lakehouse architecture -- Summary -- Further reading -- Chapter 2: Overview of ML on Databricks -- Technical requirements -- Setting up a Databricks trial account -- Exploring the workspace -- Repos -- Exploring clusters -- Single user -- Shared -- No isolation shared -- Single-node clusters -- Exploring notebooks -- Exploring data -- Exploring experiments -- Discovering the feature store -- Discovering the model registry -- Libraries -- Storing libraries -- Managing libraries -- Databricks Runtime and libraries -- Library usage modes -- Unity Catalog limitations -- Installation sources for libraries -- Summary -- Further reading.
505	8		\|a Part 2: ML Pipeline Components and Implementation -- Chapter 3: Utilizing the Feature Store -- Technical requirements -- Diving into feature stores and the problems they solve -- Discovering feature stores on the Databricks platform -- Feature table -- Offline store -- Online store -- Training Set -- Model packaging -- Registering your first feature table in Databricks Feature Store -- Summary -- Further reading -- Chapter 4: Understanding MLflow Components on Databricks -- Technical requirements -- Overview of MLflow -- MLflow Tracking -- MLflow Models -- MLflow Model Registry -- Example code showing how to track ML model training in Databricks -- Summary -- Chapter 5: Create a Baseline Model Using Databricks AutoML -- Technical requirements -- Understanding the need for AutoML -- Understanding AutoML in Databricks -- Sampling large datasets -- Imbalance data detection -- Splitting data into train/validation/test sets -- Enhancing semantic type detection -- Shapley value (SHAP) for model explainability -- Feature Store integration -- Running AutoML on our churn prediction dataset -- Summary -- Further reading -- Part 3: ML Governance and Deployment -- Chapter 6: Model Versioning and Webhooks -- Technical requirements -- Understanding the need for the Model Registry -- Registering your candidate model to the Model Registry and managing access -- Diving into the webhooks support in the Model Registry -- Summary -- Further reading -- Chapter 7: Model Deployment Approaches -- Technical requirements -- Understanding ML deployments and paradigms -- Deploying ML models for batch and streaming inference -- Batch inference on Databricks -- Streaming inference on Databricks -- Deploying ML models for real-time inference -- In-depth analysis of the constraints and capabilities of Databricks Model Serving.
505	8		\|a Incorporating custom Python libraries into MLflow models for Databricks deployment -- Deploying custom models with MLflow and Model Serving -- Packaging dependencies with MLflow models -- Summary -- Further reading -- Chapter 8: Automating ML Workflows Using Databricks Jobs -- Technical requirements -- Understanding Databricks Workflows -- Utilizing Databricks Workflows with Jobs to automate model training and testing -- Summary -- Further reading -- Chapter 9: Model Drift Detection and Retraining -- Technical requirements -- The motivation behind model monitoring -- Introduction to model drift -- Introduction to Statistical Drift -- Techniques for drift detection -- Hypothesis testing -- Statistical tests and measurements for numeric features -- Statistical tests and measurements for categorical features -- Statistical tests and measurements on models -- Implementing drift detection on Databricks -- Summary -- Chapter 10: Using CI/CD to Automate Model Retraining and Redeployment -- Introduction to MLOps -- Delta Lake -- more than just a data lake -- Comprehensive model management with Databricks MLflow -- Integrating DevOps and MLOps for robust ML pipelines with Databricks -- Fundamentals of MLOps and deployment patterns -- Navigating environment isolation in Databricks -- multiple strategies for MLOps -- Understanding ML deployment patterns -- The deploy models approach -- The deploy code approach -- Summary -- Further reading -- Index -- Other Books You May Enjoy.
520			\|a Take your machine learning skills to the next level by mastering databricks and building robust ML pipeline solutions for future ML innovations Key Features Learn to build robust ML pipeline solutions for databricks transition Master commonly available features like AutoML and MLflow Leverage data governance and model deployment using MLflow model registry Purchase of the print or Kindle book includes a free PDF eBook Book Description Unleash the potential of databricks for end-to-end machine learning with this comprehensive guide, tailored for experienced data scientists and developers transitioning from DIY or other cloud platforms. Building on a strong foundation in Python, Practical Machine Learning on Databricks serves as your roadmap from development to production, covering all intermediary steps using the databricks platform. You'll start with an overview of machine learning applications, databricks platform features, and MLflow. Next, you'll dive into data preparation, model selection, and training essentials and discover the power of databricks feature store for precomputing feature tables. You'll also learn to kickstart your projects using databricks AutoML and automate retraining and deployment through databricks workflows. By the end of this book, you'll have mastered MLflow for experiment tracking, collaboration, and advanced use cases like model interpretability and governance. The book is enriched with hands-on example code at every step. While primarily focused on generally available features, the book equips you to easily adapt to future innovations in machine learning, databricks, and MLflow. 
What you will learn Transition smoothly from DIY setups to databricks Master AutoML for quick ML experiment setup Automate model retraining and deployment Leverage databricks feature store for data prep Use MLflow for effective experiment tracking Gain practical insights for scalable ML solutions Find out how to handle model drifts in production environments Who this book is for This book is for experienced data scientists, engineers, and developers proficient in Python, statistics, and ML lifecycle looking to transition to databricks from DIY clouds. Introductory Spark knowledge is a must to make the most out of this book, however, end-to-end ML workflows will be covered. If you aim to accelerate your machine learning workflows and deploy scalable, robust solutions, this book is an indispensable resource.
590			\|a O'Reilly\|b O'Reilly Online Learning: Academic/Public Library Edition
650		0	\|a Machine learning\|x Computer programs.
776	0	8	\|i Print version:\|z 9781801818292
776	0	8	\|i Print version:\|z 1801812039\|z 9781801812030\|w (OCoLC)1337143940
856	4	0	\|u https://library.access.arlingtonva.us/login?url=https://learning.oreilly.com/library/view/~/9781801812030/?ar\|x O'Reilly\|z eBook
938			\|a YBP Library Services\|b YANK\|n 305792890
994			\|a 92\|b VIA
999			\|c 360015\|d 360015
