[VLM101] Building a Multimodal Chatbot with Fine-tuning (feat. MCP / RunPod)
dreamingbumblebee
$59.40
Basic / Vision Transformer, transformer, Llama, Model Context Protocol
4.6
(33)
This introductory course covers the concepts and application methods of Vision-Language Models (VLMs), with hands-on practice running the LLaVA model in an Ollama-based environment and integrating it with the Model Context Protocol (MCP). It works through the principles of multimodal models, quantization, service development, and building an integrated demo, balancing theory with practice.
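As a preview of the Ollama integration the description mentions: Ollama exposes a local HTTP endpoint (`/api/generate`) that accepts base64-encoded images alongside a text prompt, which is how a multimodal model like LLaVA receives visual input. The sketch below is a minimal, hypothetical helper (not course material) that builds such a request body; the field names follow Ollama's documented API.

```python
import base64
import json

def build_llava_request(prompt: str, image_bytes: bytes, model: str = "llava") -> str:
    """Build a JSON body for Ollama's /api/generate endpoint.

    Ollama expects images as base64-encoded strings in the "images" list;
    the model name assumes `ollama pull llava` has been run beforehand.
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # return one complete response instead of a token stream
    }
    return json.dumps(payload)

# Sending it requires a running Ollama server (default port 11434), e.g.:
#   import urllib.request
#   body = build_llava_request("Describe this image.", open("photo.jpg", "rb").read())
#   req = urllib.request.Request(
#       "http://localhost:11434/api/generate",
#       data=body.encode(),
#       headers={"Content-Type": "application/json"},
#   )
#   print(json.loads(urllib.request.urlopen(req).read())["response"])
```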