[VLM101] Creating a Multimodal Chatbot with Fine-tuning (feat.MCP)

This is an introductory course for understanding the concepts and applications of Vision-Language Models (VLMs). You will run the LLaVA model in an Ollama-based environment and practice integrating it with MCP (Model Context Protocol). The course covers the principles of multimodal models, quantization, serving, and integrated demo development, with a balanced mix of theory and practice.

(4.6) 9 reviews

34 learners

  • dreamingbumblebee
Hands-on focused
mcp
Vision Transformer
transformer
Llama
Model Context Protocol


What you will learn!

  • Understand what MCP is

  • Hands-on VLM fine-tuning and building a PoC demo yourself


Learning through fine-tuning and chatbot implementation
Latest multimodal technology, VLM

We use AI services like ChatGPT, Gemini, and Claude every day, but have you ever wondered how they ‘understand’ images? The core technology is the Vision-Language Model (VLM).

In this course, you will learn how to fine-tune the latest VLM models, LLaVA and Qwen2.5-VL, run them locally with Ollama, and create your own multimodal chatbot using MCP (Model Context Protocol). You will also pick up practical, immediately applicable techniques such as the CLIP vision encoder, quantization, and building an MCP server, and experience the entire workflow, from how a VLM works to MCP integration, rather than just making API calls.
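
To give a concrete feel for what running a VLM locally looks like, here is a minimal sketch, not taken from the course materials, that asks a locally served LLaVA model about an image through Ollama's HTTP API. It assumes Ollama is running on its default port with a `llava` model already pulled; the file path and prompt are placeholders.

```python
import base64
import json
import urllib.request

# Assumes Ollama is running locally (default port 11434) and a LLaVA model
# has already been pulled, e.g. with `ollama pull llava`.
def describe_image(image_path: str, prompt: str = "Describe this image.") -> str:
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")

    payload = json.dumps({
        "model": "llava",          # local VLM served by Ollama (assumed name)
        "prompt": prompt,
        "images": [image_b64],     # Ollama accepts base64-encoded images
        "stream": False,
    }).encode("utf-8")

    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    print(describe_image("cat.jpg"))  # placeholder image path
```

Because the model runs locally, there are no per-call API costs, which is the motivation the course highlights.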

📌 A glance at the evolution of multimodal AI

From CLIP to LLaVA OneVision, we summarize the development and technical context of VLM.

📌 Create your own VLM chatbot

From fine-tuning and lightweighting to local execution with Ollama, you build the model yourself.

📌 Perfect balance between theory and practice

Train and test your models using real GPUs in a RunPod environment.

📌 Anyone with basic deep learning experience can join

We explain the basic concepts step by step so that even beginners can follow along.

What you can experience in this course
Five highlights

Build multimodal AI yourself, not through API calls
This is a hands-on course in which you go beyond simply using a model to tuning it, connecting it, and completing it yourself.

Experience the evolution of VLM technology step by step
Experience the systematic development of a multimodal model from CLIP → LLaVA → LLaVA 1.5 → OneVision.

Reflects the latest multimodal technology
It contains the latest multimodal AI trends such as LLaVA OneVision and MCP.

A GPU lab you can complete for $10
Full hands-on training is available at an affordable cost, based on the RunPod environment.

Complete your own portfolio through lectures
Upon completion of the course, you will have a multimodal chatbot of your own creation.

I recommend this course to these people

🚀 I want to level up with AI development.
A developer or student who has only used the ChatGPT API and now wants to work with AI models directly.

👁 I'm interested in multimodal AI.
For those curious about how an AI that processes text and images simultaneously works, and about the principles of VLMs.

I'm curious about building a local AI environment.
For those who want to run AI models locally because cloud API costs are burdensome

💡 A course these learners need

😤 "It's frustrating to only use API"

  • If you have created a service using ChatGPT API, but are frustrated because it is expensive and has many restrictions,

  • For those who are curious about the inside of an AI model like a black box and want to touch it directly

💸 "AI service operating costs are too expensive"

  • Startup developers who are considering building their own models due to the cost of calling the OpenAI Vision API

  • Anyone planning a service that requires large-scale image processing

🚀 "I want to become a multimodal AI expert"

  • Anyone who wants to advance their career as an AI developer but has only worked with text-based LLMs

  • Job seekers who want to add a differentiated project to their portfolio

🤔 "I don't know exactly what VLM is"

  • People who want to follow AI trends but don't exactly understand what multimodal is or what VLM is

  • For those who are curious about the principles of AI that processes images and text simultaneously

After class

  • You can fully understand the operating principles of CLIP and the LLaVA series. Multimodal AI is no longer a black box.

  • You can fine-tune and deploy VLMs in a production environment using Ollama and RunPod.

  • Using quantization techniques, you can make huge models lighter and run them on personal computers.

  • You can build a workflow that integrates multiple AI tools using MCP (Model Context Protocol).

  • You will be able to build your own multimodal chatbot from start to finish.

💡 Specific changes you can achieve after taking the course

🎯 Immediately actionable practical skills

After completing the course, you will be able to work on the following hands-on projects on your own:

  • My own VLM service: an image-analysis chatbot specialized for a specific domain (medical, education, shopping, etc.)

  • Local AI workflow: an automated system in which multiple AI tools collaborate via MCP

  • Cost-effective AI service: a service that reduces API dependency and runs on your own model

📈 Portfolio for career advancement

  • GitHub repository: a complete repository containing all of the practice code and trained models

  • Technical blog material: you can write technical posts summarizing the VLM fine-tuning process and results

  • Interview material: a differentiated interview story, having fine-tuned a VLM yourself

🧠 Deep understanding and application

Beyond simple usage:

  • Fully understand the internal workings of VLM, enabling rapid learning of new models

  • Apply model optimization techniques such as quantization and GGUF conversion to other projects

  • Ability to design AI workflows using the MCP ecosystem

Learn about these things.

🧠 VLM Core Principles: From CLIP to LLaVA OneVision
How does multimodal AI 'understand' images? Learn the evolution of VLM step by step, from the principles of CLIP Vision Encoder to the latest LLaVA OneVision.
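
As a rough, self-contained illustration of the CLIP idea mentioned above, namely encoding an image and candidate texts into a shared embedding space and comparing them, here is a sketch using the Hugging Face transformers CLIP classes. The checkpoint name, image path, and captions are assumptions for illustration, not necessarily what the course uses.

```python
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

# Example checkpoint; the course may use a different vision encoder.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("cat.jpg")                      # placeholder local image
texts = ["a photo of a cat", "a photo of a dog"]   # candidate captions

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# Image-text similarity scores turned into probabilities over the captions
probs = outputs.logits_per_image.softmax(dim=-1)
for text, p in zip(texts, probs[0].tolist()):
    print(f"{text}: {p:.3f}")
```

This contrastive image-text matching is the building block that LLaVA-style models reuse as their vision encoder before attaching a language model.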

🔧 Real-world Fine Tuning: Create Your Own VLM
Fine-tune the LLaVA model directly in a RunPod GPU environment. Learn efficient training methods using Jupyter Notebook and HuggingFace Accelerate.
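
For orientation only, here is a heavily simplified sketch of what a parameter-efficient (LoRA) fine-tuning setup for a LLaVA-style model can look like with Hugging Face transformers and peft. The checkpoint, target modules, and hyperparameters are illustrative assumptions rather than the course's exact recipe.

```python
import torch
from transformers import AutoProcessor, LlavaForConditionalGeneration
from peft import LoraConfig, get_peft_model

# Illustrative checkpoint; the course may fine-tune a different LLaVA variant.
model_id = "llava-hf/llava-1.5-7b-hf"

processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Attach small trainable LoRA adapters to the language model's attention
# projections instead of updating all of the base weights.
lora_config = LoraConfig(
    r=16,                                 # assumed rank
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # assumed module names
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# From here, batches of (image, prompt, answer) pairs would be encoded with
# `processor(...)` and fed through a standard training loop or
# `transformers.Trainer`, optionally launched with HuggingFace Accelerate.
```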

Model Lightweighting: Quantization & GGUF Conversion
Learn practical techniques for converting massive VLMs to GGUF format and applying Quantization so that they can run on personal computers.
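
The GGUF conversion itself is normally done with llama.cpp's conversion and quantization tooling rather than pure Python, so as a Python-side illustration of the same trade-off (lower-precision weights for a much smaller memory footprint), here is a sketch that loads a LLaVA checkpoint in 4-bit with bitsandbytes through transformers. The checkpoint and settings are assumptions, not the course's exact steps.

```python
import torch
from transformers import AutoProcessor, BitsAndBytesConfig, LlavaForConditionalGeneration

# 4-bit NF4 quantization: weights are stored in 4 bits and de-quantized
# to float16 on the fly during computation.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model_id = "llava-hf/llava-1.5-7b-hf"  # illustrative checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Roughly: 7B params * 4 bits ≈ 3.5 GB of weights, versus ~14 GB in float16,
# which is what makes running the model on a personal machine feasible.
print(model.get_memory_footprint() / 1e9, "GB")
```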

🔗 MCP Integration: Collaboration of AI Tools
Learn how to connect multiple AI models and tools into a single workflow using the Model Context Protocol.
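
To make the MCP side concrete, here is a minimal sketch of an MCP server that exposes a single tool, written against the FastMCP helper in the official MCP Python SDK. The server name, the tool, and the idea of delegating to a local VLM are illustrative placeholders rather than the course's actual server.

```python
from mcp.server.fastmcp import FastMCP

# A minimal MCP server exposing one tool. An MCP-capable client
# (e.g. an agent or IDE) can discover and call this tool over the protocol.
mcp = FastMCP("vlm-demo")

@mcp.tool()
def describe_image(image_path: str) -> str:
    """Return a text description of the image at the given path."""
    # Placeholder: a real server would call the locally served VLM
    # (for example via Ollama's HTTP API) and return its answer.
    return f"(stub) description of {image_path}"

if __name__ == "__main__":
    # Runs the server over stdio so an MCP client can connect to it.
    mcp.run()
```

An MCP-capable client can then list this server's tools and call `describe_image` as one step in a larger workflow.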

Who created this course

  • 2016 ~ Present: NLP & LLM Development Practitioner (Working at large companies N ~ S)

Things to note before taking the class

Practice environment

  • The course is based on macOS. If you are on Windows with Docker installed, you can mostly follow along.

  • This course uses Cursor as the editor; you should be able to follow along in VS Code without any problems.

  • Cloud environment

    • RunPod: a GPU instance rental service; we use an H100 or A100

    • Estimated cost: $10 for the entire practice

    • Pros: you can start practicing right away without complicated environment setup

    • Note

      • You need to create a RunPod account and register a payment card.

Learning Materials

  • Please check the attached PDF and source code

Prerequisite Knowledge and Notes

  • LLM-related knowledge (see the previous LLM 101 course)

  • Basic Python syntax (using classes, functions, modules)

  • Deep learning/machine learning basic concepts (neural networks, training, inference, etc.)

  • Experience training models in a GPU environment is preferred (but not required)

  • Familiarity with terminal/command usage would be helpful

Recommended for these people

Who is this course right for?

  • Beginners in multimodal AI and VLMs

  • Those who want to build an MCP-based demo

What you need to know before starting

  • LLM Basics

Hello, this is dreamingbumblebee.

227 learners ∙ 29 reviews ∙ 4 answers ∙ 4.5 rating ∙ 2 courses

📱 Contact: dreamingbumblebee@gmail.com

Curriculum


23 lectures ∙ (2hr 52min)

Course Materials:

Lecture resources
Published: 
Last updated: 

Reviews

9 reviews ∙ average rating 4.6

  • luke90 (2 reviews ∙ average rating 5.0) ∙ rated 5 ∙ 61% enrolled

    It seems good for roughly grasping the concepts and building a simple demo. Not bad for quickly picking up the concepts early on.

  • haenarashin (9 reviews ∙ average rating 4.4) ∙ rated 3 ∙ 61% enrolled

    Rather than a 101 class, this feels more like something for people who have already studied or worked with the subject to skim through.

  • yyj (3 reviews ∙ average rating 5.0) ∙ rated 5 ∙ 30% enrolled

  • nar998614 (9 reviews ∙ average rating 4.7) ∙ rated 5 ∙ 100% enrolled

    The core content is explained well in a short amount of time.

  • joshuayoon7058186 (2 reviews ∙ average rating 5.0) ∙ rated 5 ∙ 100% enrolled

    Thanks to this course, I was able to quickly learn the MCP structure and how to build a demo. The first half explains complex material step by step, and the second half is hands-on, so it was great for applying it to real work right away.

Price: $51.70
