PRACTICAL MACHINE LEARNING ON DATABRICKS: seamlessly transition ML models and MLOps on Databricks

Published
Birmingham, UK : Packt Publishing Ltd., 2023.
Status
Available Online

Description

Take your machine learning skills to the next level by mastering Databricks and building robust ML pipeline solutions for future ML innovations.

Key Features
Learn to build robust ML pipeline solutions for Databricks transition
Master commonly available features like AutoML and MLflow
Leverage data governance and model deployment using MLflow Model Registry
Purchase of the print or Kindle book includes a free PDF eBook

Book Description
Unleash the potential of Databricks for end-to-end machine learning with this comprehensive guide, tailored for experienced data scientists and developers transitioning from DIY or other cloud platforms. Building on a strong foundation in Python, Practical Machine Learning on Databricks serves as your roadmap from development to production, covering all intermediary steps using the Databricks platform.

You'll start with an overview of machine learning applications, Databricks platform features, and MLflow. Next, you'll dive into data preparation, model selection, and training essentials and discover the power of Databricks Feature Store for precomputing feature tables. You'll also learn to kickstart your projects using Databricks AutoML and automate retraining and deployment through Databricks Workflows.

By the end of this book, you'll have mastered MLflow for experiment tracking, collaboration, and advanced use cases like model interpretability and governance. The book is enriched with hands-on example code at every step. While primarily focused on generally available features, the book equips you to easily adapt to future innovations in machine learning, Databricks, and MLflow.

What you will learn
Transition smoothly from DIY setups to Databricks
Master AutoML for quick ML experiment setup
Automate model retraining and deployment
Leverage Databricks Feature Store for data prep
Use MLflow for effective experiment tracking
Gain practical insights for scalable ML solutions
Find out how to handle model drift in production environments

Who this book is for
This book is for experienced data scientists, engineers, and developers who are proficient in Python, statistics, and the ML lifecycle and are looking to transition to Databricks from DIY clouds. Introductory Spark knowledge is a must to make the most of this book; end-to-end ML workflows, however, will be covered. If you aim to accelerate your machine learning workflows and deploy scalable, robust solutions, this book is an indispensable resource.

More Details

Format
Edition
1st edition.
Language
English
ISBN
9781801818292, 1801818290

Notes

Local note
O'Reilly O'Reilly Online Learning: Academic/Public Library Edition

Citations

APA Citation, 7th Edition (style guide)

Sinha, D. (2023). PRACTICAL MACHINE LEARNING ON DATABRICKS: seamlessly transition ML models and MLOps on Databricks (1st ed.). Packt Publishing Ltd.

Chicago / Turabian - Author Date Citation, 17th Edition (style guide)

Sinha, Debu. 2023. PRACTICAL MACHINE LEARNING ON DATABRICKS: Seamlessly Transition ML Models and MLOps On Databricks. Birmingham, UK: Packt Publishing Ltd.

Chicago / Turabian - Humanities (Notes and Bibliography) Citation, 17th Edition (style guide)

Sinha, Debu. PRACTICAL MACHINE LEARNING ON DATABRICKS: Seamlessly Transition ML Models and MLOps On Databricks. Birmingham, UK: Packt Publishing Ltd, 2023.

Harvard Citation (style guide)

Sinha, D. (2023). PRACTICAL MACHINE LEARNING ON DATABRICKS: seamlessly transition ML models and mlops on databricks. 1st edn. Birmingham, UK: Packt Publishing Ltd.

MLA Citation, 9th Edition (style guide)

Sinha, Debu. PRACTICAL MACHINE LEARNING ON DATABRICKS: Seamlessly Transition ML Models and MLOps On Databricks. 1st edition, Packt Publishing Ltd., 2023.

Note! Citations contain only title, author, edition, publisher, and year published. Citations should be used as a guideline and should be double checked for accuracy. Citation formats are based on standards as of August 2021.

Staff View

Grouped Work ID
e5fdfd95-c7b0-cfd9-f736-83399ccd3ae1-eng

Grouping Information

Grouped Work ID: e5fdfd95-c7b0-cfd9-f736-83399ccd3ae1-eng
Full title: practical machine learning on databricks seamlessly transition ml models and mlops on databricks
Author: sinha debu
Grouping Category: book
Last Update: 2025-01-24 12:33:29PM
Last Indexed: 2025-05-03 03:37:44AM

Book Cover Information

Image Source: google_isbn
First Loaded: Dec 17, 2024
Last Used: Apr 28, 2025

Marc Record

First Detected: Dec 16, 2024 11:27:19 PM
Last File Modification Time: Dec 17, 2024 08:26:44 AM
Suppressed: Record had no items

MARC Record

LEADER 09029cam a22003977a 4500
001 on1407093858
003 OCoLC
005 20241217082456.0
006 m     o  d        
007 cr |n|||||||||
008 231103s2022    enk     o     000 0 eng d
020 |a 9781801818292|q (electronic bk.)
020 |a 1801818290|q (electronic bk.)
035 |a (OCoLC)1407093858
037 |a 9781801812030|b O'Reilly Media
040 |a YDX|b eng|c YDX|d OCLCO|d ORMDA|d DXU
049 |a MAIN
050 4|a Q325.5
08204|a 006.3/1|2 23/eng/20231205
1001 |a Sinha, Debu,|e author.
24510|a PRACTICAL MACHINE LEARNING ON DATABRICKS|h [electronic resource] :|b seamlessly transition ML models and MLOps on Databricks /|c Debu Sinha.
250 |a 1st edition.
260 |a Birmingham, UK :|b Packt Publishing Ltd.,|c 2023.
300 |a 1 online resource
5050 |a Cover -- Title page -- Copyright and credits -- Contributors -- About the author -- About the reviewers -- Table of Contents -- Preface -- Part 1: Introduction -- Chapter 1: The ML Process and Its Challenges -- Understanding the typical machine learning process -- Discovering the roles associated with machine learning projects in organizations -- Challenges with productionizing machine learning use cases in organizations -- Understanding the requirements of an enterprise-grade machine learning platform -- Scalability -- the growth catalyst -- Performance -- ensuring efficiency and speed -- Security -- safeguarding data and models -- Governance -- steering the machine learning life cycle -- Reproducibility -- ensuring trust and consistency -- Ease of use -- balancing complexity and usability -- Exploring Databricks and the Lakehouse architecture -- Scalability -- the growth catalyst -- Performance -- ensuring efficiency and speed -- Security -- safeguarding data and models -- Governance -- steering the machine learning life cycle -- Reproducibility -- ensuring trust and consistency -- Ease of use -- balancing complexity and usability -- Simplifying machine learning development with the Lakehouse architecture -- Summary -- Further reading -- Chapter 2: Overview of ML on Databricks -- Technical requirements -- Setting up a Databricks trial account -- Exploring the workspace -- Repos -- Exploring clusters -- Single user -- Shared -- No isolation shared -- Single-node clusters -- Exploring notebooks -- Exploring data -- Exploring experiments -- Discovering the feature store -- Discovering the model registry -- Libraries -- Storing libraries -- Managing libraries -- Databricks Runtime and libraries -- Library usage modes -- Unity Catalog limitations -- Installation sources for libraries -- Summary -- Further reading.
5058 |a Part 2: ML Pipeline Components and Implementation -- Chapter 3: Utilizing the Feature Store -- Technical requirements -- Diving into feature stores and the problems they solve -- Discovering feature stores on the Databricks platform -- Feature table -- Offline store -- Online store -- Training Set -- Model packaging -- Registering your first feature table in Databricks Feature Store -- Summary -- Further reading -- Chapter 4: Understanding MLflow Components on Databricks -- Technical requirements -- Overview of MLflow -- MLflow Tracking -- MLflow Models -- MLflow Model Registry -- Example code showing how to track ML model training in Databricks -- Summary -- Chapter 5: Create a Baseline Model Using Databricks AutoML -- Technical requirements -- Understanding the need for AutoML -- Understanding AutoML in Databricks -- Sampling large datasets -- Imbalance data detection -- Splitting data into train/validation/test sets -- Enhancing semantic type detection -- Shapley value (SHAP) for model explainability -- Feature Store integration -- Running AutoML on our churn prediction dataset -- Summary -- Further reading -- Part 3: ML Governance and Deployment -- Chapter 6: Model Versioning and Webhooks -- Technical requirements -- Understanding the need for the Model Registry -- Registering your candidate model to the Model Registry and managing access -- Diving into the webhooks support in the Model Registry -- Summary -- Further reading -- Chapter 7: Model Deployment Approaches -- Technical requirements -- Understanding ML deployments and paradigms -- Deploying ML models for batch and streaming inference -- Batch inference on Databricks -- Streaming inference on Databricks -- Deploying ML models for real-time inference -- In-depth analysis of the constraints and capabilities of Databricks Model Serving.
5058 |a Incorporating custom Python libraries into MLflow models for Databricks deployment -- Deploying custom models with MLflow and Model Serving -- Packaging dependencies with MLflow models -- Summary -- Further reading -- Chapter 8: Automating ML Workflows Using Databricks Jobs -- Technical requirements -- Understanding Databricks Workflows -- Utilizing Databricks Workflows with Jobs to automate model training and testing -- Summary -- Further reading -- Chapter 9: Model Drift Detection and Retraining -- Technical requirements -- The motivation behind model monitoring -- Introduction to model drift -- Introduction to Statistical Drift -- Techniques for drift detection -- Hypothesis testing -- Statistical tests and measurements for numeric features -- Statistical tests and measurements for categorical features -- Statistical tests and measurements on models -- Implementing drift detection on Databricks -- Summary -- Chapter 10: Using CI/CD to Automate Model Retraining and Redeployment -- Introduction to MLOps -- Delta Lake -- more than just a data lake -- Comprehensive model management with Databricks MLflow -- Integrating DevOps and MLOps for robust ML pipelines with Databricks -- Fundamentals of MLOps and deployment patterns -- Navigating environment isolation in Databricks -- multiple strategies for MLOps -- Understanding ML deployment patterns -- The deploy models approach -- The deploy code approach -- Summary -- Further reading -- Index -- Other Books You May Enjoy.
520 |a Take your machine learning skills to the next level by mastering databricks and building robust ML pipeline solutions for future ML innovations Key Features Learn to build robust ML pipeline solutions for databricks transition Master commonly available features like AutoML and MLflow Leverage data governance and model deployment using MLflow model registry Purchase of the print or Kindle book includes a free PDF eBook Book Description Unleash the potential of databricks for end-to-end machine learning with this comprehensive guide, tailored for experienced data scientists and developers transitioning from DIY or other cloud platforms. Building on a strong foundation in Python, Practical Machine Learning on Databricks serves as your roadmap from development to production, covering all intermediary steps using the databricks platform. You'll start with an overview of machine learning applications, databricks platform features, and MLflow. Next, you'll dive into data preparation, model selection, and training essentials and discover the power of databricks feature store for precomputing feature tables. You'll also learn to kickstart your projects using databricks AutoML and automate retraining and deployment through databricks workflows. By the end of this book, you'll have mastered MLflow for experiment tracking, collaboration, and advanced use cases like model interpretability and governance. The book is enriched with hands-on example code at every step. While primarily focused on generally available features, the book equips you to easily adapt to future innovations in machine learning, databricks, and MLflow. 
What you will learn Transition smoothly from DIY setups to databricks Master AutoML for quick ML experiment setup Automate model retraining and deployment Leverage databricks feature store for data prep Use MLflow for effective experiment tracking Gain practical insights for scalable ML solutions Find out how to handle model drifts in production environments Who this book is for This book is for experienced data scientists, engineers, and developers proficient in Python, statistics, and ML lifecycle looking to transition to databricks from DIY clouds. Introductory Spark knowledge is a must to make the most out of this book, however, end-to-end ML workflows will be covered. If you aim to accelerate your machine learning workflows and deploy scalable, robust solutions, this book is an indispensable resource.
590 |a O'Reilly|b O'Reilly Online Learning: Academic/Public Library Edition
650 0|a Machine learning|x Computer programs.
77608|i Print version:|z 9781801818292
77608|i Print version:|z 1801812039|z 9781801812030|w (OCoLC)1337143940
85640|u https://library.access.arlingtonva.us/login?url=https://learning.oreilly.com/library/view/~/9781801812030/?ar|x O'Reilly|z eBook
938 |a YBP Library Services|b YANK|n 305792890
994 |a 92|b VIA
999 |c 360015|d 360015