
[VLM101] Building a Multimodal Chatbot with Fine-tuning (feat.MCP / RunPod)

This is an introductory course for understanding the concept and application methods of Vision-Language Models (VLM), and practicing running the LLaVA model in an Ollama-based environment while integrating it with MCP (Model Context Protocol). This course covers the principles of multimodal models, quantization, service development, and integrated demo development, providing a balanced mix of theory and hands-on practice.

(4.6) 13 reviews

89 learners

Level: Basic

Course period: Unlimited

  • dreamingbumblebee
Hands-on focused
mcp
Vision Transformer
transformer
Llama
Model Context Protocol


What you will gain after the course

  • Understanding what MCP is

  • Hands-on VLM Tuning and Building a PoC Demo


Learn the Latest Multimodal Technology, VLM
through Fine-tuning & Chatbot Implementation

We use AI services like ChatGPT, Gemini, and Claude every day, but have you ever wondered how they 'understand' images? The core technology is the Vision-Language Model (VLM).

In this course, you'll fine-tune recent VLMs such as LLaVA and Qwen2.5v, run them locally with Ollama, and build your own multimodal chatbot using MCP (Model Context Protocol). We'll also cover practical skills you can apply directly to real-world work, such as the CLIP Vision Encoder, quantization, and MCP server setup. Beyond simple API calls, you'll experience the complete workflow, from understanding how VLMs work to integrating them with MCP.
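
To give a feel for the "run it locally with Ollama" part, here is a minimal sketch (not the course's exact code) that asks a locally served LLaVA model about an image through Ollama's REST API. It assumes `ollama pull llava` has already been run, the server is listening on the default port 11434, and the image path is a placeholder.

```python
# Minimal sketch: query a locally served LLaVA model through Ollama's REST API.
# Assumes `ollama pull llava` has been run and the server is on the default port.
import base64
import requests

with open("example.jpg", "rb") as f:  # placeholder image path
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llava",                         # any multimodal model pulled into Ollama
        "prompt": "Describe this image in one sentence.",
        "images": [image_b64],                    # images are passed base64-encoded
        "stream": False,
    },
    timeout=120,
)
print(resp.json()["response"])
```

The course builds on this kind of local endpoint: rather than a one-off script, the chatbot and MCP tools end up talking to the model served this way.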

📌 The evolution of multimodal AI at a glance

From CLIP to LLaVA OneVision, we trace the evolution and technical context of VLMs.

📌 Build Your Own VLM Chatbot

Fine-tuning, optimization, and local execution with Ollama - build the model yourself

📌 Perfect Balance of Theory and Practice

Train and test models using actual GPUs in the RunPod environment

📌 Anyone with deep learning experience is welcome

We explain the basic concepts step by step so that even beginners can follow along

5 key points you'll experience in this course

Hands-on Multimodal AI Experience - Build It Yourself, Not Just API Calls
Go beyond simply using models - this is a practice-focused curriculum where you directly tune, connect, and complete them.

Experience the evolution of VLM technology step by step
Systematically experience the development process of multimodal models from CLIP → LLaVA → LLaVA 1.5 → OneVision.

Incorporating the Latest Multimodal Technology
Covers the most recent multimodal AI trends including LLaVA OneVision, MCP, and more.

GPU hands-on practice designed to be completed for about $10
All hands-on exercises run in a RunPod environment and can be completed at an affordable cost.

Build your own portfolio through this course
Upon completion, you'll have your own multimodal chatbot as a tangible result.

We recommend this course for

🚀 I want to level up with AI development.
Developers / students who have only used the ChatGPT API and now want to work with AI models directly

👁 I'm interested in multimodal AI.
For those curious about how AI that processes text and images simultaneously actually works, and about the principles behind VLMs

I'm curious about building a local AI environment.
Those who find cloud API costs burdensome and want to run AI models locally

💡 A course for students who need this

😤 "I'm frustrated with just using APIs"

  • Those who built a service with the ChatGPT API but feel constrained by its cost and limitations

  • Those curious about the inner workings of black-box AI models and want to get hands-on experience

💸 "AI service operating costs are too expensive"

  • Startup developers considering building their own model due to the burden of OpenAI Vision API costs

  • Those planning a service that requires processing large volumes of images

🚀 "I want to become a multimodal AI expert"

  • Those who want to advance their career as an AI developer but have only worked with text-based LLMs

  • Job seekers who want to add differentiated projects to their portfolio

🤔 "I'm not sure exactly what VLM is"

  • Those who want to keep up with AI trends but don't fully understand what multimodal or VLM is

  • Those curious about the principles of AI that processes images and text simultaneously

After taking this course

  • You can fully understand the operating principles of CLIP and the LLaVA series. Multimodal AI will no longer be a black box.

  • You can fine-tune and deploy VLM in a practical environment using Ollama and RunPod.

  • With Quantization techniques, you can make huge models lightweight so they can run even on personal PCs.

  • You can build a workflow that integrates multiple AI tools using MCP (Model Context Protocol).

  • You'll be able to create your own multimodal chatbot from start to finish.

💡 Concrete Changes You'll Gain After Taking the Course

🎯 Immediately Applicable Practical Skills

After completing the course, you'll be able to independently work on the following real-world projects:

  • Your Own VLM Service: Image analysis chatbot specialized for specific domains (medical, education, shopping, etc.)

  • Local AI Workflow: An automated system where multiple AI tools collaborate using MCP

  • Cost-Effective AI Services: Services that reduce API dependency and run on your own models

📈 Portfolio for Career Development

  • GitHub Repository: A well-organized repository with complete practice code and trained models

  • Technical blog material: Can write technical posts documenting the VLM fine-tuning process and results

  • Interview Experience: A differentiated interview story with "hands-on experience fine-tuning VLM"

🧠 Deep Understanding and Application Skills

Beyond simple usage:

  • Fully understand the internal workings of VLM to quickly learn new models as well

  • Apply model optimization techniques such as quantization and GGUF conversion to other projects as well

  • The ability to design AI workflows utilizing the MCP ecosystem

Here's what you'll learn.

🧠 VLM Core Principles: From CLIP to LLaVA OneVision
How does multimodal AI 'understand' images? Learn step-by-step the evolution of VLM, from the principles of CLIP Vision Encoder to the latest LLaVA OneVision.
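
As a small, self-contained illustration of the CLIP idea this module starts from (matching images and text in a shared embedding space), the sketch below scores one image against a few candidate captions using a public CLIP checkpoint from HuggingFace. The checkpoint name, image path, and captions are placeholders, not course material.

```python
# Minimal sketch of CLIP zero-shot matching: embed an image and several
# captions into the same space, then compare them.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("example.jpg")  # placeholder image
texts = ["a photo of a cat", "a photo of a dog", "a screenshot of code"]

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image holds the similarity between the image and each caption
probs = outputs.logits_per_image.softmax(dim=-1)
for text, p in zip(texts, probs[0].tolist()):
    print(f"{p:.3f}  {text}")
```

This image-text alignment is exactly what LLaVA-style models reuse as their vision encoder before an LLM takes over the language side.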

🔧 Hands-on Fine-tuning: Building Your Own VLM
Fine-tune the LLaVA model directly in a RunPod GPU environment. Learn efficient training methods using Jupyter Notebook and HuggingFace Accelerate.
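
For a rough idea of what the fine-tuning setup can look like, here is a sketch of attaching LoRA adapters to a LLaVA checkpoint with `peft`. The course's actual training script (Accelerate launch, dataset handling, hyperparameters) may differ; the model id and target modules below are illustrative assumptions.

```python
# Rough sketch only: attach LoRA adapters to a LLaVA checkpoint with peft.
# Model id and target modules are assumptions, not the course's exact setup.
import torch
from transformers import AutoProcessor, LlavaForConditionalGeneration
from peft import LoraConfig, get_peft_model

model_id = "llava-hf/llava-1.5-7b-hf"            # assumed checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],          # common choice: attention projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()                # only the small LoRA matrices are trained
```

Training only the small LoRA matrices is what makes tuning a 7B-scale VLM feasible on a single rented GPU.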

Model Optimization: Quantization & GGUF Conversion
Learn practical techniques to convert large VLMs to GGUF format and apply quantization so they can run on personal PCs.
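
The GGUF conversion itself is typically done with llama.cpp's conversion tools; as a quick in-Python illustration of what quantization buys you, the sketch below loads a LLaVA checkpoint in 4-bit with bitsandbytes, which is a related but different quantization path from GGUF. The model id is an assumption.

```python
# Illustration of the memory benefit of quantization (not the GGUF pipeline):
# load a LLaVA checkpoint in 4-bit with bitsandbytes.
import torch
from transformers import BitsAndBytesConfig, LlavaForConditionalGeneration

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = LlavaForConditionalGeneration.from_pretrained(
    "llava-hf/llava-1.5-7b-hf",                   # assumed checkpoint
    quantization_config=bnb_config,
    device_map="auto",
)
# A ~7B model that needs roughly 14 GB in fp16 fits in a few GB of VRAM in 4-bit.
print(model.get_memory_footprint() / 1e9, "GB")
```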

🔗 MCP Integration: Collaboration of AI Tools
Learn how to connect multiple AI models and tools into a single workflow using the Model Context Protocol.
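
To make "connecting tools with MCP" concrete, here is a minimal sketch of an MCP server exposing a single image-description tool, written with the official MCP Python SDK (`pip install mcp`). The tool body is a stub where a call to the locally served VLM would go; all names are illustrative, not the course's code.

```python
# Minimal sketch of an MCP server with one tool, using the official MCP Python SDK.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("vlm-demo")

@mcp.tool()
def describe_image(image_path: str) -> str:
    """Return a text description of the image at `image_path`."""
    # In a real demo, this is where the Ollama-served VLM would be called.
    return f"(stub) description of {image_path}"

if __name__ == "__main__":
    mcp.run()  # serves over stdio so MCP clients can connect
```

An MCP-capable client (Claude Desktop, Cursor, and the like) can then discover and call this tool as part of a larger workflow.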

Who Created This Course

  • 2016 ~ Present: NLP & LLM Development Practitioner (Worked at large corporations N and S)

Notes Before Taking the Course

Practice Environment

  • The course is explained based on MacOS. If you're using a Windows machine, you should be able to follow along as long as Docker is installed.

  • The course uses Cursor. You should be able to follow along without any issues using VS Code as well.

  • Cloud Environment

    • RunPod: GPU instance rental service, using H100 or A100

    • Estimated Cost: $10 for the entire hands-on practice

    • Advantages: Can practice immediately without complex environment setup

    • Important Notice

      • RunPod account creation and payment card registration required

Learning Materials

  • You can check the attached PDF and source code

Prerequisites and Important Notes

  • LLM-related knowledge (refer to previous LLM 101 lecture)

  • Basic Python syntax (classes, functions, module usage)

  • Basic concepts of deep learning/machine learning (neural networks, training, inference, etc.)

  • Experience with model training in GPU environments is helpful (but not required)

  • Familiarity with using terminal/command line will be helpful

Recommended for these people

Who is this course right for?

  • For those new to Multimodal and VLM

  • People who want to create an MCP-based demo

Need to know before starting?

  • LLM Fundamentals

Hello, this is dreamingbumblebee.

  • Learners: 310
  • Reviews: 40
  • Answers: 4
  • Rating: 4.4
  • Courses: 2

📱 Contact: dreamingbumblebee@gmail.com

Curriculum


23 lectures ∙ (2hr 52min)

Course materials: lecture resources

Reviews

4.6 (13 reviews)

  • jukyellow7445 (1 review, average rating 5.0) · rated 5 · 61% enrolled

  • jgryu4241 (11 reviews, average rating 4.0) · rated 4 · 30% enrolled

  • sangsunkim11958 (1 review, average rating 5.0) · rated 5 · 61% enrolled

  • kimsc (25 reviews, average rating 4.8) · rated 5 · 52% enrolled · edited
    Thank you for the great lecture.

  • luke90 (2 reviews, average rating 5.0) · rated 5 · 61% enrolled
    It seems good for roughly examining concepts and creating simple demos. It's not bad for quickly grasping concepts in the early stages.

$59.40
