Problems deploying your ML models? Here is your solution! (2024)

Decoding ML Notes

This week’s topics:

  • The ultimate guide on installing PyTorch with CUDA support in all possible ways

  • Generate a synthetic domain-specific Q&A dataset in <30 minutes

  • The power of serverless in the world of ML

Exciting news 🔥 I was invited by Maven to speak in their Lightning Lesson series about how to Architect Your LLM Twin.


This 30-min session is for ML & MLOps engineers who want to learn:

LLM system design of your LLM Twin

→ Using the 3-pipeline architecture & MLOps good practices

Design a data collection pipeline

→ data crawling, ETLs, CDC, AWS

Design a feature pipeline

→ streaming engine in Python, data ingestion for fine-tuning & RAG, vector DBs

Design a training pipeline

→ create a custom dataset, fine-tuning, model registries, experiment trackers, LLM evaluation

Design an inference pipeline

→ real-time deployment, REST API, RAG, LLM monitoring

↓↓↓

Join LIVE on Fri, May 3!

Register here (it’s free) ←

The ultimate guide on installing PyTorch with CUDA support in all possible ways

Ever wanted to quit ML while wrestling with CUDA errors? I know I did. → Discover how to install CUDA & PyTorch painlessly in all possible ways.

Here is the story of most ML people:

1. You just got excited about a new model that came out.

2. You want to try it out.

3. You install everything.

4. You run the model.

5. Bam... CUDA error.

6. You fix the error.

7. Bam... Another CUDA error.

8. You fix the error.

9. ...Yet another CUDA error.

You get the idea.

→ Now it is 3:00 am, and you have finally solved all your CUDA errors and run your model.

Now, it's time to do your actual work.

Do you relate?

If so...

I started a Medium article where I documented good practices and step-by-step instructions on how to install CUDA & PyTorch with:

- Pip
- Conda (or Mamba)
- Poetry
- Docker
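
Whichever installer you pick, a quick sanity check afterwards tells you whether PyTorch actually sees the GPU. Below is a minimal sketch that relies only on the standard `torch.cuda` calls; the `cuda_report` helper name is made up for this example, and the script is written so that it also runs (and says so) when PyTorch is not installed at all.

```python
import importlib.util


def cuda_report() -> dict:
    """Summarize whether PyTorch is installed and whether it can see a GPU."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed in this environment.
        return {"torch_installed": False, "cuda_available": False}

    import torch  # imported lazily so the script also runs without PyTorch

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version this PyTorch build was compiled against
        # (None for CPU-only builds).
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `built_with_cuda` is `None`, a CPU-only build was installed. If it is set but `cuda_available` is `False`, the usual suspects are the NVIDIA driver or a mismatch between the driver and the CUDA version of the build.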


Note: Feel free to comment with any improvements on how to install CUDA + PyTorch. Let's make the ultimate tutorial on installing these 2 beasts 🔥

Generate a synthetic domain-specific Q&A dataset in <30 minutes

How do you generate a synthetic domain-specific Q&A dataset in <30 minutes to fine-tune your open-source LLM?

This method is also known as finetuning with distillation. Here are its 3 main steps ↓

For example, let's generate a Q&A fine-tuning dataset used to fine-tune a financial advisor LLM.

Step 1: Manually generate a few input examples

Generate a few input samples (~3) that have the following structure:
- user_context: describe the type of investor (e.g., "I am a 28-year-old marketing professional")
- question: describe the user's intention (e.g., "Is Bitcoin a good investment option?")
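
As a rough illustration, the seed examples could be kept in a small typed structure like the one below. This is only a sketch: the `InputExample` class name is an assumption, the first sample reuses the two strings quoted above, and the other two samples are invented for illustration.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class InputExample:
    user_context: str  # who the investor is
    question: str      # what the investor wants to know


# Hand-written seed examples. Only the first one comes from the text above;
# the remaining two are made up to show the shape of the data.
SEED_EXAMPLES = [
    InputExample(
        user_context="I am a 28-year-old marketing professional",
        question="Is Bitcoin a good investment option?",
    ),
    InputExample(
        user_context="I am a 45-year-old teacher saving for retirement",
        question="Should I move part of my savings into index funds?",
    ),
    InputExample(
        user_context="I am a 21-year-old student with a small monthly budget",
        question="How can I start investing with little money?",
    ),
]

if __name__ == "__main__":
    for example in SEED_EXAMPLES:
        print(asdict(example))
```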

Step 2: Expand the input examples with the help of a teacher LLM

Use a powerful LLM as a teacher (e.g., GPT4 or Falcon 180B) to generate up to N similar input examples.

We generated 100 input examples in our use case, but you can generate more.

You will use the manually filled input examples to do few-shot prompting.

This will guide the LLM to give you domain-specific samples.
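
Few-shot prompting here simply means pasting the hand-written examples into the prompt ahead of the request. A minimal sketch of how such a prompt could be assembled is shown next; `build_expansion_prompt` is a hypothetical helper, the section headers mirror the prompt pattern shown right after this sketch, and any instructions that would precede the pattern in the real prompt are not reproduced.

```python
from typing import Dict, List


def build_expansion_prompt(seeds: List[Dict[str, str]], n_new: int = 100) -> str:
    """Render seed examples into a few-shot prompt asking for n_new more."""
    lines = [f"Generate {n_new} more examples with the following pattern:", ""]
    for i, seed in enumerate(seeds, start=1):
        lines += [
            f"# USER CONTEXT {i}",
            seed["user_context"],
            "",
            f"# QUESTION {i}",
            seed["question"],
            "",
        ]
    return "\n".join(lines).strip()


if __name__ == "__main__":
    seeds = [
        {
            "user_context": "I am a 28-year-old marketing professional",
            "question": "Is Bitcoin a good investment option?",
        },
    ]
    print(build_expansion_prompt(seeds, n_new=100))
```

The returned string is what you would send to the teacher LLM; its reply then has to be split back into (user context, question) pairs.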

The prompt will look like this:
"""
...
Generate 100 more examples with the following pattern:

# USER CONTEXT 1
...

# QUESTION 1
...

# USER CONTEXT 2
...
"""

Step 3: Use the teacher LLM to generate outputs for all the input examples

Now, you will have the same powerful LLM as a teacher, but this time, it will answer all your N input examples.

But first, to introduce more variance, we will use RAG to enrich the input examples with news context.

Afterward, we will use the teacher LLM to answer all N input examples.
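
The flow of this step can be sketched as a small loop. Everything concrete here is an assumption made for illustration: `retrieve_news` stands in for whatever RAG retriever you use, `teacher` stands in for the call to the teacher LLM, and the prompt layout is invented. Both are passed in as plain callables, so the sketch does not depend on any specific vendor API.

```python
from typing import Callable, Dict, List


def generate_answers(
    examples: List[Dict[str, str]],
    retrieve_news: Callable[[str], List[str]],
    teacher: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Enrich every input example with news context, then ask the teacher."""
    dataset = []
    for example in examples:
        # RAG step: fetch news snippets related to the question.
        news_context = "\n".join(retrieve_news(example["question"]))
        prompt = (
            f"# USER CONTEXT\n{example['user_context']}\n\n"
            f"# NEWS CONTEXT\n{news_context}\n\n"
            f"# QUESTION\n{example['question']}\n\n"
            "# ANSWER\n"
        )
        # Teacher step: the powerful LLM writes the answer.
        answer = teacher(prompt)
        dataset.append({**example, "news_context": news_context, "answer": answer})
    return dataset


if __name__ == "__main__":
    # Stubs so the sketch runs offline; swap in a real retriever and LLM client.
    def fake_retriever(question: str) -> List[str]:
        return [f"(news snippet related to: {question})"]

    def fake_teacher(prompt: str) -> str:
        return "(teacher answer)"

    inputs = [
        {
            "user_context": "I am a 28-year-old marketing professional",
            "question": "Is Bitcoin a good investment option?",
        }
    ]
    for row in generate_answers(inputs, fake_retriever, fake_teacher):
        print(row)
```

Each returned row already has the shape of a fine-tuning sample: the input fields, the retrieved context, and the teacher's answer.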

...and bam! You generated a domain-specific Q&A dataset with almost 0 manual work.

.

Now, you will use this data to train a smaller LLM (e.g., Falcon 7B) on a niche task, such as financial advising.

This technique is known as finetuning with distillation because you use a powerful LLM as the teacher (e.g., GPT4, Falcon 180B) to generate the data, which will be used to fine-tune a smaller LLM (e.g., Falcon 7B), which acts as the student.


✒️ Note: To ensure that the generated data is of high quality, you can hire a domain expert to check & refine it.

The power of serverless in the world of ML

Deploying & managing ML models is hard, especially when running your models on GPUs.

But serverless makes things easy.

Using Beam as your serverless provider, deploying & managing ML models can be as easy as ↓

Define your infrastructure & dependencies

In a few lines of code, you define the application that contains:

- the requirements of your infrastructure, such as the CPU, RAM, and GPU
- the dependencies of your application
- the volumes from where you can load your data and store your artifacts
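
To make the three ingredients concrete, here is a provider-agnostic sketch of such a definition. It is deliberately NOT Beam's SDK: `AppSpec` and every field name and value are hypothetical and only illustrate what information the application definition carries. Check Beam's documentation for the real syntax.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AppSpec:
    """Hypothetical description of a serverless application (not a real SDK)."""

    name: str
    cpu: int                      # number of CPU cores
    memory: str                   # RAM, e.g. "32Gi"
    gpu: Optional[str] = None     # GPU type, or None for CPU-only jobs
    python_packages: List[str] = field(default_factory=list)  # dependencies
    volumes: Dict[str, str] = field(default_factory=dict)     # name -> mount path


# Example values are invented for illustration.
spec = AppSpec(
    name="train_qa",
    cpu=4,
    memory="32Gi",
    gpu="A10G",
    python_packages=["torch", "transformers"],
    volumes={"qa_dataset": "./qa_dataset", "model_artifacts": "./artifacts"},
)

if __name__ == "__main__":
    print(spec)
```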

Deploy your jobs

Using the Beam application, you can quickly decorate your Python functions to:

- run them once on the given serverless application
- put your task/job in a queue to be processed or even schedule it using a CRON-based syntax
- even deploy it as a RESTful API endpoint
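
The decorator idea can be illustrated with a tiny local stand-in. To be clear, `ToyApp` and its method names are hypothetical and are not Beam's API; the sketch only shows how one plain Python function can be wrapped for run-once, queued, scheduled, or REST-style use without changing the function itself.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple


class ToyApp:
    """Local stand-in for a serverless app. NOT a real provider SDK."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.queue: Deque[Tuple[Callable, tuple]] = deque()
        self.schedules: Dict[str, Callable] = {}
        self.routes: Dict[str, Callable] = {}

    def run(self, func: Callable) -> Callable:
        """Run-once mode: the function is simply called directly."""
        return func

    def task_queue(self, func: Callable) -> Callable:
        """Queue mode: calls are enqueued and executed later by drain()."""
        def enqueue(*args) -> None:
            self.queue.append((func, args))
        return enqueue

    def schedule(self, when: str) -> Callable:
        """Scheduled mode: remember the function under a CRON expression."""
        def decorator(func: Callable) -> Callable:
            self.schedules[when] = func
            return func
        return decorator

    def rest_api(self, path: str) -> Callable:
        """API mode: register the function as the handler for a URL path."""
        def decorator(func: Callable) -> Callable:
            self.routes[path] = func
            return func
        return decorator

    def drain(self) -> List[object]:
        """Process every queued task in order and return the results."""
        results = []
        while self.queue:
            func, args = self.queue.popleft()
            results.append(func(*args))
        return results


app = ToyApp("financial_advisor")


def predict(question: str) -> str:
    """The single central function; a placeholder for real inference."""
    return f"answer to: {question}"


# The same function, exposed through four deployment modes.
run_once = app.run(predict)
enqueue_predict = app.task_queue(predict)
nightly = app.schedule("0 3 * * *")(predict)
endpoint = app.rest_api("/predict")(predict)

if __name__ == "__main__":
    print(run_once("Is Bitcoin a good investment option?"))
    enqueue_predict("Should I buy bonds?")
    print(app.drain())
```

With a real serverless provider, the queue, scheduler, and HTTP server behind these modes run on managed infrastructure rather than in your own process.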

.

As you can see in the image below, you can have one central function for training or inference, and with minimal effort, you can switch between all these deployment methods.

Also, you don't have to bother at all with managing the infrastructure on which your jobs run. You specify what you need, and Beam takes care of the rest.

By doing so, you can directly start to focus on your application and stop caring about the infrastructure.

This is the power of serverless!


↳🔗 Check out Beam to learn more

Images

If not otherwise stated, all images are created by the author.

Author: Virgilio Hermann JD
