[LLM 101] Llama SFT Tutorial for LLM Beginners (feat. ChatApp Poc)
dreamingbumblebee
This is an introductory course for understanding the concept and application methods of Vision-Language Models (VLM), and practicing running the LLaVA model in an Ollama-based environment while integrating it with MCP (Model Context Protocol). This course covers the principles of multimodal models, quantization, service development, and integrated demo development, providing a balanced mix of theory and hands-on practice.
231 learners
Level Basic
Course period Unlimited
Reviews from Early Learners
5.0
김덕진
It is a very informative training session.
5.0
박규동
I didn't expect to be able to learn about VLM in such detail, so it was a very informative and excellent lecture.
Understanding what MCP is
Hands-on VLM Tuning and Building a PoC Demo
We use AI services like ChatGPT, Gemini, and Claude every day, but have you ever wondered how they 'understand' images? The core technology is the Vision-Language Model (VLM).
In this course, you'll fine-tune recent VLMs such as LLaVA and Qwen2.5-VL, run them locally with Ollama, and build your own multimodal chatbot using MCP (Model Context Protocol). We'll also cover practical skills you can apply directly to real-world work, such as the CLIP vision encoder, quantization, and MCP server setup. Beyond simple API calls, you'll experience the complete workflow, from understanding how VLMs work to integrating them with MCP.
📌 The evolution of multimodal AI at a glance
From CLIP to LLaVA OneVision, we organize the evolution and technical context of VLM.
📌 Build Your Own VLM Chatbot
Fine-tuning, optimization, and local execution with Ollama - build the model yourself
📌 Perfect Balance of Theory and Practice
Train and test models using actual GPUs in the RunPod environment
📌 Anyone with deep learning experience is welcome
We explain the basic concepts step by step so that even beginners can follow along
✅ Hands-on Multimodal AI Experience - Build It Yourself, Not Just API Calls
Go beyond simply using models: in this practice-focused curriculum you tune, connect, and complete them yourself.
✅ Experience the evolution of VLM technology step by step
Follow the development of multimodal models step by step: CLIP → LLaVA → LLaVA 1.5 → OneVision.
✅ Incorporating the Latest Multimodal Technology
Covers recent multimodal AI developments, including LLaVA OneVision and MCP.
✅ GPU hands-on practice designed to be completed for about $10
All hands-on exercises run in a RunPod environment and can be completed at an affordable cost.
✅ Build your own portfolio through this course
Upon completion, you'll have your own multimodal chatbot as a tangible result.

🚀 I want to level up with AI development.
Developers / students who have only used the ChatGPT API and now want to work with AI models directly

👁 I'm interested in multimodal AI.
How does AI that processes text and images simultaneously work? For those curious about the principles of VLM

⚡ I'm curious about building a local AI environment.
Those who find cloud API costs burdensome and want to run AI models locally
😤 "I'm frustrated with just using APIs"
Those who built a service with ChatGPT API but feel frustrated due to cost burden and many limitations
Those curious about the inner workings of black-box AI models and want to get hands-on experience
💸 "AI service operating costs are too expensive"
Startup developers considering building their own model due to the burden of OpenAI Vision API costs
Those planning a service that requires processing large volumes of images
🚀 "I want to become a multimodal AI expert"
Those who want to advance their career as an AI developer but have only worked with text-based LLMs
Job seekers who want to add differentiated projects to their portfolio
🤔 "I'm not sure exactly what VLM is"
Those who want to keep up with AI trends but don't fully understand what multimodal or VLM is
Those curious about the principles of AI that processes images and text simultaneously
You can fully understand the operating principles of CLIP and the LLaVA series. Multimodal AI will no longer be a black box.
You can fine-tune and deploy a VLM in a practical environment using Ollama and RunPod.
With quantization techniques, you can make huge models lightweight enough to run even on personal PCs.
You can build a workflow that integrates multiple AI tools using MCP (Model Context Protocol).
You'll be able to create your own multimodal chatbot from start to finish.
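To give a rough idea of what "running a VLM locally with Ollama" can look like from code, here is a minimal sketch that builds a request for Ollama's local `/api/generate` endpoint with an attached image. The model name `llava`, the prompt, and the placeholder image bytes are assumptions for illustration, and `ask_ollama` only works when an Ollama server is running on the default port, so it is defined but not called here.

```python
import base64
import json
import urllib.request

# Default address of a locally running Ollama server (assumed to be unchanged).
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(image_bytes, prompt, model="llava"):
    """Build the JSON body for a single, non-streaming image question."""
    return {
        "model": model,
        "prompt": prompt,
        # Ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }

def ask_ollama(payload, url=OLLAMA_URL):
    """Send the payload to a running Ollama server and return the answer text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]

# Placeholder bytes stand in for a real image file read with open(path, "rb").
payload = build_payload(b"fake image bytes", "What is in this picture?")
print(sorted(payload))
```

With a server running and a real image loaded, `ask_ollama(payload)` would return the model's description of the picture.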
🎯 Immediately Applicable Practical Skills
After completing the course, you'll be able to independently work on the following real-world projects:
Your Own VLM Service: Image analysis chatbot specialized for specific domains (medical, education, shopping, etc.)
Local AI Workflow: An automated system where multiple AI tools collaborate using MCP
Cost-Effective AI Services: Services that reduce API dependency and operate with proprietary models
📈 Portfolio for Career Development
GitHub Repository: A well-organized repository with complete practice code and trained models
Technical Blog Material: Material for technical posts documenting the VLM fine-tuning process and results
Interview Experience: A differentiated interview story built on hands-on experience fine-tuning a VLM
🧠 Deep Understanding and Application Skills
Beyond simple usage:
Understand the internal workings of VLMs well enough to pick up new models quickly
Apply model optimization techniques such as quantization and GGUF conversion to other projects
Design AI workflows that make use of the MCP ecosystem
🧠 VLM Core Principles: From CLIP to LLaVA OneVision
How does multimodal AI 'understand' images? Learn step-by-step the evolution of VLM, from the principles of CLIP Vision Encoder to the latest LLaVA OneVision.
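The matching idea behind CLIP can be sketched in a few lines: an image embedding is compared with several caption embeddings by cosine similarity, and the scores are turned into probabilities with a softmax. This is only a toy illustration. The three-dimensional vectors are made-up values rather than real encoder outputs, and `temperature=0.07` is an illustrative constant, not a value taken from the course.

```python
import math

def normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def cosine_similarity(a, b):
    """Cosine similarity = dot product of the unit-length vectors."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def rank_captions(image_emb, caption_embs, temperature=0.07):
    """Return a probability for each caption given one image embedding."""
    scores = [cosine_similarity(image_emb, emb) / temperature
              for emb in caption_embs.values()]
    return dict(zip(caption_embs.keys(), softmax(scores)))

# Toy embeddings (hypothetical values, not real CLIP outputs).
image_emb = [0.9, 0.1, 0.2]
caption_embs = {
    "a photo of a cat": [0.8, 0.2, 0.1],
    "a photo of a car": [0.1, 0.9, 0.3],
}
probs = rank_captions(image_emb, caption_embs)
best = max(probs, key=probs.get)
print(best)
```

In a real VLM, the vision encoder produces the image embedding from pixels, and models in the LLaVA family pass projected visual features to a language model instead of only ranking captions.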
🔧 Hands-on Fine-tuning: Building Your Own VLM
Fine-tune the LLaVA model directly in a RunPod GPU environment. Learn efficient training methods using Jupyter Notebook and HuggingFace Accelerate.
⚡ Model Optimization: Quantization & GGUF Conversion
Learn practical techniques to convert large VLMs to GGUF format and apply quantization so they can run on personal PCs.
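As a rough intuition for why quantization shrinks a model, the sketch below applies the simplest variant, symmetric 8-bit quantization with a single scale, to a handful of made-up weights. This is an assumption-laden toy: the quantization schemes used in GGUF files are more elaborate (for example, they quantize weights in blocks), and the weight values here are hypothetical.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in quantized]

# Hypothetical weights standing in for one small slice of a model.
weights = [0.12, -0.5, 0.33, 0.9, -0.07]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# Rounding to the nearest integer bounds the error by half a scale step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(quantized, scale, max_error)
```

Each weight is stored in one byte instead of the two bytes used by 16-bit floats, at the cost of a small rounding error per weight.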
🔗 MCP Integration: Collaboration of AI Tools
Learn how to connect multiple AI models and tools into a single workflow using the Model Context Protocol.
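MCP messages are JSON-RPC 2.0, so a tool invocation can be pictured as a small JSON object. The sketch below builds a `tools/call` request by hand. The tool name `describe_image` and its `image_path` argument are hypothetical examples, not tools from the course, and in practice an MCP SDK would normally construct and send these messages for you.

```python
import json

def make_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request asking an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Hypothetical tool that would ask a local VLM to describe an image.
request = make_tool_call(1, "describe_image", {"image_path": "cat.png"})
wire_message = json.dumps(request)
print(wire_message)
```

The server replies with a message carrying the same `id`, which is how a client matches results to the requests it sent.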
2016 ~ Present: NLP & LLM Development Practitioner (Worked at large corporations N and S)
The course is explained on macOS. If you're using a Windows machine, you should be able to follow along as long as Docker is installed.
The course uses Cursor. You should be able to follow along in VS Code as well.
Cloud Environment
RunPod: GPU instance rental service, using H100 or A100
Estimated Cost: $10 for the entire hands-on practice
Advantages: Can practice immediately without complex environment setup
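To plan around the $10 estimate, it can help to work out how many GPU hours a budget buys at a given hourly rate. The rates below are placeholder numbers chosen only to show the arithmetic. They are not RunPod's actual prices, which should be checked on the service itself.

```python
def affordable_hours(budget_usd, hourly_rate_usd):
    """Return how many GPU hours a budget covers at a fixed hourly rate."""
    if hourly_rate_usd <= 0:
        raise ValueError("hourly rate must be positive")
    return budget_usd / hourly_rate_usd

# Placeholder rates in USD per hour (hypothetical, not real pricing).
hypothetical_rates = {"gpu_option_a": 2.0, "gpu_option_b": 4.0}

budget = 10.0
hours_by_option = {name: affordable_hours(budget, rate)
                   for name, rate in hypothetical_rates.items()}
for name, hours in hours_by_option.items():
    print(f"{name}: {hours:.1f} hours")
```

Stopping the instance between sessions keeps billed time close to the time actually spent on the exercises.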
⚠ Important Notice
RunPod account creation and payment card registration required
You can check the attached PDF and source code
Basic Python syntax (classes, functions, module usage)
Basic concepts of deep learning/machine learning (neural networks, training, inference, etc.)
Experience with model training in GPU environments is helpful (but not required)
Familiarity with using terminal/command line will be helpful
Who is this course right for?
For those new to Multimodal and VLM
People who want to create an MCP-based demo
Need to know before starting?
LLM Fundamentals
461 learners ∙ 81 reviews ∙ 4 answers ∙ 4.5 rating ∙ 2 courses
📱contact: dreamingbumblebee@gmail.com
23 lectures ∙ (2hr 52min)
Course Materials:
4. LLaVA (13:21)
5. LLaVA 1.5 (08:42)
6. LLaVA NeXT (1.6) (07:28)
7. LLaVA OneVision (15:59)
52 reviews ∙ Average rating 4.6