[VLM101] Building a Multimodal Chatbot with Fine-tuning (feat. MCP / RunPod)
dreamingbumblebee
This is an introductory course on the concepts and applications of Vision-Language Models (VLMs). You will practice running the LLaVA model in an Ollama-based environment and integrating it with MCP (Model Context Protocol). The course covers the principles of multimodal models, quantization, service development, and building an integrated demo, offering a balanced mix of theory and hands-on practice.
Beginner
Vision Transformer, transformer, Llama
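
As a taste of the hands-on portion, here is a minimal sketch of querying LLaVA through Ollama's Python client. It assumes the `ollama` package is installed (`pip install ollama`) and the `llava` model has been pulled locally (`ollama pull llava`); the image path is a placeholder, and this is illustrative rather than the course's exact code.

```python
# Minimal sketch: ask LLaVA to describe a local image via the Ollama Python client.
# Assumes a running Ollama server with the `llava` model already pulled;
# "photo.jpg" is a placeholder path, not a file provided by the course.
import ollama

response = ollama.chat(
    model="llava",
    messages=[
        {
            "role": "user",
            "content": "Describe this image in one sentence.",
            "images": ["photo.jpg"],  # list of local image paths (or raw bytes)
        }
    ],
)
print(response["message"]["content"])
```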