This introductory course covers the concepts and application methods of Vision-Language Models (VLMs), with hands-on practice running the LLaVA model in an Ollama-based environment and integrating it with the Model Context Protocol (MCP). The course walks through the principles of multimodal models, quantization, serving, and integrated demo development, offering a balanced mix of theory and practice.
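As a taste of the hands-on portion, the sketch below shows one way to query a locally served LLaVA model through Ollama's HTTP generate endpoint. It assumes Ollama is running on its default port (11434) and that the `llava` model has already been pulled; the image filename is a hypothetical placeholder.

```python
import base64
import requests

# Minimal sketch: send an image plus a prompt to a local LLaVA model via Ollama.
# Assumes Ollama is running locally and `ollama pull llava` has been done.
with open("sample.jpg", "rb") as f:  # hypothetical image path
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llava",
        "prompt": "Describe this image in one sentence.",
        "images": [image_b64],  # base64-encoded image for the multimodal model
        "stream": False,        # return a single JSON object instead of a stream
    },
)
print(response.json()["response"])
```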