Machine Learning Lifecycle Made Easy with MLflow

Speakers: Kalyan Munjuluri & Karishma Babbar

Track: Other

Type: Talk

Room: Video Stream 2

Time: Oct 08 (Fri): 14:30

Duration: 0:45

ABSTRACT

Beyond the usual concerns in software development, machine learning development comes with additional challenges. These include trying multiple algorithms and parameters to get the best results, tracking these runs for reproducibility, and moving the model to diverse deployment environments. This talk demonstrates the use of an open-source platform called MLflow for managing the complete machine learning lifecycle with Python. The talk requires a basic understanding of Python and Machine Learning concepts.

DESCRIPTION

In theory, the crux of machine learning (ML) development lies in data collection, model creation, model training, and deployment. In reality, machine learning projects are not so straightforward. They form a cycle of improving the data, the model, and the evaluation that is never really finished. Unlike in traditional software development, ML developers experiment with multiple algorithms, tools, and parameters to optimize performance, and they need to track these experiments to reproduce their work. Furthermore, developers need to use many distinct systems to productionize models.

In this talk, we introduce MLflow, an open-source platform that aims to simplify the entire ML lifecycle while letting us use any ML library and development tool of our choice to reliably build and share ML applications. MLflow offers simple abstractions through lightweight APIs to package reproducible projects, track results, and encapsulate models in a way that is compatible with existing tools, thereby accelerating the ML lifecycle for projects of any size.

With the help of an example, we will show how MLflow can ease the bookkeeping of experiment runs and results across frameworks, quickly reproduce runs on any platform (cloud or local execution), and productionize models on diverse deployment tools.

At the end of this talk, you will be familiar with:

  • Key concepts, abstractions, and components of open-source MLflow
  • How each component of MLflow addresses the challenges of the ML lifecycle
  • How to use MLflow Tracking during model training to record experimental runs
  • How to use the MLflow Tracking user interface to visualize experimental runs with different tuning parameters and evaluation metrics
  • How to use MLflow Projects for packaging reusable and reproducible models
  • How to use the general MLflow Models format to serve models through the MLflow REST API

The purpose of the session is to introduce the audience to MLflow and give a taste of the ML development lifecycle. It is intended to provide a survey of the MLflow platform that favors breadth over depth, and we leave the audience to experiment with it further through takeaway exercises.

PRE-REQUISITES

  • Basic knowledge of the Python programming language
  • Basic understanding of machine learning concepts

TRACK

Data Science in Production, Machine Learning, Data Engineering or MLOps

URLs