Decoding ML Notes
This week's topics:
The ultimate guide on installing PyTorch with CUDA support in all possible ways
Generate a synthetic domain-specific Q&A dataset in <30 minutes
The power of serverless in the world of ML
Exciting news 🔥 I was invited by Maven to speak in their Lightning Lesson series about how to Architect Your LLM Twin.
This 30-min session is for ML & MLOps engineers who want to learn:
LLM System design of your LLM Twin
↳ Using the 3-pipeline architecture & MLOps good practices
Design a data collection pipeline
↳ data crawling, ETLs, CDC, AWS
Design a feature pipeline
↳ streaming engine in Python, data ingestion for fine-tuning & RAG, vector DBs
Design a training pipeline
↳ create a custom dataset, fine-tuning, model registries, experiment trackers, LLM evaluation
Design an inference pipeline
↳ real-time deployment, REST API, RAG, LLM monitoring
↓↓↓
Join LIVE on Fri, May 3!
The ultimate guide on installing PyTorch with CUDA support in all possible ways
Ever wanted to quit ML while wrestling with CUDA errors? I know I did. ↳ Discover how to install CUDA & PyTorch painlessly in all possible ways.
Here is the story of most ML people:
1. You just got excited about a new model that came out.
2. You want to try it out.
3. You install everything.
4. You run the model.
5. Bam... CUDA error.
6. You fix the error.
7. Bam... Another CUDA error.
8. You fix the error.
9. ...Yet another CUDA error.
You get the idea.
↳ Now it is 3:00 am, and you finally solved all your CUDA errors and ran your model.
Now, it's time to do your actual work.
Do you relate?
If so...
I wrote a Medium article documenting good practices and step-by-step instructions on how to install CUDA & PyTorch with:
- Pip
- Conda (or Mamba)
- Poetry
- Docker
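As a quick taste of the guide, here is what a CUDA-enabled install can look like with pip and conda (a sketch assuming CUDA 12.1 and a recent PyTorch; check the official PyTorch "Get Started" selector for the exact command matching your driver and CUDA version):

```shell
# pip: install PyTorch built against CUDA 12.1 from the official wheel index
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121

# conda (or mamba): let the solver pull a matching CUDA runtime for you
conda install pytorch torchvision pytorch-cuda=12.1 -c pytorch -c nvidia

# sanity check: confirm you actually got a GPU build
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```

If `torch.cuda.is_available()` prints `False`, the usual suspects are a CPU-only wheel or a driver that is older than the CUDA version the wheel was built against.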
Check it out ↓
🔗 The ultimate guide on installing PyTorch with CUDA support in all possible ways
Note: Feel free to comment with any improvements on how to install CUDA + PyTorch. Let's make the ultimate tutorial on installing these 2 beasts 🔥
Generate a synthetic domain-specific Q&A dataset in <30 minutes
How do you generate a synthetic domain-specific Q&A dataset in <30 minutes to fine-tune your open-source LLM?
This method is also known as fine-tuning with distillation. Here are its 3 main steps ↓
For example, let's generate a Q&A fine-tuning dataset used to fine-tune a financial advisor LLM.
Step 1: Manually generate a few input examples
Generate a few input samples (~3) that have the following structure:
- user_context: describe the type of investor (e.g., "I am a 28-year-old marketing professional")
- question: describe the user's intention (e.g., "Is Bitcoin a good investment option?")
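In code, the handful of seed samples from Step 1 is just a small list of records (a minimal sketch; the field names mirror the structure above, and the extra profiles are made up for illustration):

```python
# Step 1: a few (~3) manually written seed examples.
# Each record pairs an investor profile with a domain-specific question.
seed_examples = [
    {
        "user_context": "I am a 28-year-old marketing professional.",
        "question": "Is Bitcoin a good investment option?",
    },
    {
        "user_context": "I am a 45-year-old teacher saving for retirement.",
        "question": "Should I prioritize index funds or bonds?",
    },
    {
        "user_context": "I am a freelance developer with irregular income.",
        "question": "How large should my emergency fund be?",
    },
]

# every sample must carry both fields before we use it for few-shot prompting
for sample in seed_examples:
    assert {"user_context", "question"} <= sample.keys()
```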
Step 2: Expand the input examples with the help of a teacher LLM
Use a powerful LLM as a teacher (e.g., GPT-4, Falcon 180B) to generate up to N similar input examples.
We generated 100 input examples in our use case, but you can generate more.
You will use the manually filled input examples to do few-shot prompting.
This will guide the LLM to give you domain-specific samples.
The prompt will look like this:
"""
...
Generate 100 more examples with the following pattern:
# USER CONTEXT 1
...
# QUESTION 1
...
# USER CONTEXT 2
...
"""
Step 3: Use the teacher LLM to generate outputs for all the input examples
Now, you will have the same powerful LLM as a teacher, but this time, it will answer all your N input examples.
But first, to introduce more variance, we will use RAG to enrich the input examples with news context.
Afterward, we will use the teacher LLM to answer all N input examples.
...and bam! You generated a domain-specific Q&A dataset with almost 0 manual work.
Now, you will use this data to train a smaller LLM (e.g., Falcon 7B) on a niche task, such as financial advising.
This technique is known as fine-tuning with distillation because you use a powerful LLM as the teacher (e.g., GPT-4, Falcon 180B) to generate the data, which is then used to fine-tune a smaller LLM (e.g., Falcon 7B) that acts as the student.
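Step 3 glues everything together: enrich each input with retrieved news context, then have the teacher answer it (a sketch with the retriever and the teacher LLM stubbed out; swap in your real RAG pipeline and LLM client):

```python
def retrieve_news_context(question: str) -> str:
    """Stub for the RAG step: fetch relevant news to add variance.
    Replace with a real vector-DB lookup."""
    return f"[news relevant to: {question}]"


def teacher_answer(user_context: str, question: str, news: str) -> str:
    """Stub for the teacher LLM (e.g., GPT-4) answering one sample."""
    return f"Answer for '{question}' given '{user_context}' and {news}"


def build_qa_dataset(inputs: list[dict]) -> list[dict]:
    """Turn the N expanded inputs into (input, context, answer) records."""
    dataset = []
    for ex in inputs:
        news = retrieve_news_context(ex["question"])
        answer = teacher_answer(ex["user_context"], ex["question"], news)
        dataset.append({**ex, "news_context": news, "answer": answer})
    return dataset
```

The resulting records are exactly what a supervised fine-tuning loop for the student model expects: an instruction-style input plus the teacher's answer as the label.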
⚠️ Note: To ensure that the generated data is of high quality, you can hire a domain expert to check & refine it.
The power of serverless in the world of ML
Deploying & managing ML models is hard, especially when running your models on GPUs.
But serverless makes things easy.
Using Beam as your serverless provider, deploying & managing ML models can be as easy as ↓
Define your infrastructure & dependencies
In a few lines of code, you define the application that contains:
- the requirements of your infrastructure, such as the CPU, RAM, and GPU
- the dependencies of your application
- the volumes from where you can load your data and store your artifacts
Deploy your jobs
Using the Beam application, you can quickly decorate your Python functions to:
- run them once on the given serverless application
- put your task/job in a queue to be processed or even schedule it using a CRON-based syntax
- even deploy it as a RESTful API endpoint
As you can see in the image below, you can have one central function for training or inference, and with minimal effort, you can switch between all these deployment methods.
Also, you don't have to bother at all with managing the infrastructure on which your jobs run. You specify what you need, and Beam takes care of the rest.
By doing so, you can focus directly on your application and stop worrying about the infrastructure.
This is the power of serverless!
↳🔗 Check out Beam to learn more
Images
If not otherwise stated, all images are created by the author.