Understanding LLM Architecture and GPU Utilization Strategies for AI Beginners

Name: Understanding LLM Architecture and GPU Utilization Strategies for AI Beginners
Price: 143000 KRW
Rating: 5 (16 reviews)

Understand Transformer-based LLM architectures and GPU utilization strategies, and gain hands-on experience with the actual serving process using vLLM. This course covers the entire practical workflow, from building AI system pipelines to monitoring and multi-GPU utilization, and is designed for intuitive understanding through diagrams and practice without complex formulas.

214 learners

Level Basic

Course period Unlimited

hyunjinkim

GPU

attention-model

transformer

LLM

Reviews from Early Learners

5.0

WonJune Lee

43% enrolled

I am not in the deep learning industry, but I work in the field of computer vision (rule-based). Since my company requires LLM and vision-related deep learning technologies, I have been studying these topics. I have only completed about 40% of the course, but I felt compelled to leave a review now. I have taken many deep learning courses, including those by famous and highly-rated instructors, but I haven't found any course as clean and clear as this one. The best part is the quality of the lecture materials. The instructor recorded every single matrix calculation in Excel, which is incredibly helpful when reviewing. The Python code is also well-commented in many places. The quality of the lectures themselves is excellent; the instructor reminds you of parts you might have forgotten, ensuring you don't miss anything. While most other courses show calculations once or twice and then move on, this course goes through the calculations together until the end, which provides great clarity. It seems like the Q&A is monitored frequently, as I received immediate answers to my questions. The lectures seem to have been filmed this year, so it's great that they include a lot of the latest trends. It feels like this course hasn't gained much word-of-mouth yet, but I highly recommend it to anyone who needs to study these topics.

5.0

jjanec

31% enrolled

It's challenging, but I'll do my best to follow along. I like that the content contains only the essential core information.

5.0

김민서

31% enrolled

It's very helpful.

What you will gain after the course

Understanding the encoder-decoder structure and core operating principles of the Transformer model
Understanding the evolution of the latest attention mechanisms, including MHA, MQA, GQA, and MLA
Hands-on practice on how to utilize the vLLM engine, the de facto standard for current AI serving
Monitoring key performance metrics such as TTFT and TPOT in a vLLM serving environment
Design and implementation of multi-GPU architecture utilizing Tensor/Pipeline/Data Parallelism
Understanding the core concepts of Agent AI and the principles of tool calling
Experience in building AI system pipelines and monitoring performance from a real-world industry perspective
Understanding the latest LLM trends such as MLA, MTP, and n-grams based on the latest research papers
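
As a small illustration of the TTFT and TPOT metrics mentioned in the list above, the sketch below computes both from token arrival timestamps. It assumes the common definitions (TTFT is the delay until the first output token; TPOT is the average gap between the remaining tokens). The `RequestTrace` record and its field names are hypothetical and are not part of vLLM.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RequestTrace:
    """Timestamps in seconds for one streamed request (hypothetical record)."""
    sent_at: float
    token_times: List[float]  # arrival time of each output token


def ttft(trace: RequestTrace) -> float:
    """Time To First Token: delay from sending the request to the first token."""
    return trace.token_times[0] - trace.sent_at


def tpot(trace: RequestTrace) -> float:
    """Time Per Output Token: mean gap between tokens after the first one."""
    n = len(trace.token_times)
    if n < 2:
        raise ValueError("TPOT needs at least two output tokens")
    return (trace.token_times[-1] - trace.token_times[0]) / (n - 1)


# Made-up timestamps: first token after 0.5 s, then one token every 0.25 s.
trace = RequestTrace(sent_at=0.0, token_times=[0.5, 0.75, 1.0, 1.25, 1.5])
print(f"TTFT = {ttft(trace):.2f} s, TPOT = {tpot(trace):.2f} s")
```

TTFT mostly reflects the prefill phase (processing the prompt), while TPOT reflects the decode phase, which is why the two are usually tracked separately.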

In the era of autonomous AI Agents,
you can utilize various Agent tools and Public APIs such as OpenAI, Claude, and Codex.

However, in a real service environment, you must also consider
data security, network costs, token costs, and GPU resource management.

Therefore, what matters is
an understanding of Hybrid AI architecture, which combines Public APIs and self-hosted GPU-based LLMs
as the situation demands.
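
To make the hybrid idea concrete, here is a minimal routing sketch. It is only an illustration: the `Request` fields, the `choose_backend` function, the backend names, and the token threshold are all assumptions, and a real policy would depend on each organization's security and cost rules.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """One incoming prompt with the facts the router needs (hypothetical)."""
    prompt: str
    contains_sensitive_data: bool
    estimated_tokens: int


def choose_backend(req: Request, local_gpu_busy: bool, token_budget: int = 4000) -> str:
    """Return "self_hosted" or "public_api" for one request (illustrative policy)."""
    # Data security comes first: sensitive prompts never leave the local network.
    if req.contains_sensitive_data:
        return "self_hosted"
    # When the local GPUs are saturated, fall back to the public API.
    if local_gpu_busy:
        return "public_api"
    # Long prompts are expensive on a per-token API, so serve them locally.
    if req.estimated_tokens > token_budget:
        return "self_hosted"
    # Short, non-sensitive prompts go to the public API.
    return "public_api"


print(choose_backend(Request("internal contract text", True, 800), local_gpu_busy=True))
print(choose_backend(Request("summarize this news article", False, 800), local_gpu_busy=False))
```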

In that case, is using only Public APIs always the best option?

Not necessarily.

These days, many LLMs comparable to those behind public APIs (ChatGPT, Claude, etc.) are being developed both domestically and internationally.

Therefore, it is now time to learn the architecture for serving LLMs directly.

🌟 From LLM Architecture to Serving

In the era of agents, we have moved from the age of training to the age of inference. While using public APIs effectively is necessary, many companies prefer building local serving environments for various reasons such as security, governance, and cost. Learn everything from understanding LLM architecture for building local serving environments to architectural configuration and the latest trends in LLM development.

Attention is the beginning and end of the Transformer model, which serves as the foundation for current LLMs.

The Transformer and its attention mechanism emerged in 2017, and
attention has remained the dominant approach for nearly a decade.
While many efforts are being made to move beyond the Transformer structure,
no architecture has yet emerged that completely replaces the Transformer's attention.

⚠️ A vague understanding of attention is not enough.

Gain a perfect understanding of the principles of attention and learn about its evolutionary flow.
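
For readers who want to see the principle in code, the following is a minimal sketch of scaled dot-product attention using only the Python standard library. It is not the course's material (the course walks through the calculation in Excel); the function names and the tiny example matrices are assumptions chosen for illustration.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Turn raw scores into weights that sum to 1."""
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q: Matrix, k: Matrix, v: Matrix, causal: bool = False) -> Matrix:
    """softmax(Q K^T / sqrt(d_k)) V, computed one query row at a time."""
    d_k = len(k[0])
    out = []
    for i, q_row in enumerate(q):
        # Similarity of this query with every key, scaled by sqrt(d_k).
        scores = [sum(a * b for a, b in zip(q_row, k_row)) / math.sqrt(d_k) for k_row in k]
        if causal:
            # A decoder may only look at the current and earlier positions.
            scores = [s if j <= i else float("-inf") for j, s in enumerate(scores)]
        weights = softmax(scores)
        # The output is the weighted average of the value rows.
        out.append([sum(w * v_row[c] for w, v_row in zip(weights, v)) for c in range(len(v[0]))])
    return out


q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
print(attention(q, k, v, causal=True))
```

With `causal=True`, the first token can attend only to itself, so its output row is exactly the first row of `v`.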

⚠️ This course will be updated as vLLM is updated.

vLLM is updated very quickly, and its major version is still in the 0.x range.
Even so, many companies use vLLM as their inference engine, making it a de facto standard.
vLLM supports not only the Transformer models that currently form the backbone of LLMs but also alternative architectures like Mamba, and it is updated whenever models gain new features, such as Multi Token Prediction, so that it can serve them.
This course will also be updated as new vLLM features or new model types are released.

Don't miss out on the latest LLM trends.

Recommended for
these people

Who is this course right for?

Practitioners who use ChatGPT and generative AI but want to understand how LLMs actually work
Beginners who aim to become AI engineers and want to systematically learn LLM serving and system architecture
Developers who want to understand Transformer and Attention structures from a practical perspective without complex formulas
Backend and infrastructure engineers who want to understand GPU optimization and the actual workflow of building AI systems in multi-GPU environments
PMs and planners who want to understand LLM architecture and GPU utilization strategies during the AI service planning and development process

Need to know before starting?

Understanding of basic Python syntax (variables, functions, conditional statements, etc.)
Basic usage of git

Hello
This is hyunjinkim

Inflearn Verified

1,672

Learners

117

Reviews

246

Answers

4.9

Rating

Courses

Hello.

I am a 17-year veteran currently working in the Data & AI field at a large corporation.

Since obtaining my Professional Engineer Information Management certification, I have been creating content to share the knowledge I've gained with as many people as possible.

Nice to meet you. :)

Contact: hjkim_sun@naver.com

Curriculum

All

54 lectures ∙ (14hr 27min)

Course Materials:

Lecture resources

Section 1. Course Introduction

2 lectures ∙ (11min)

Section 2. Getting familiar with HuggingFace

4 lectures ∙ (1hr 13min)

3. The necessity of building a local LLM
13:37
4. HuggingFace Model Download & Inference
19:04
5. Tokenizer
22:53
6. Embedding
17:43

Section 3. Understanding the Transformer Model

7 lectures ∙ (1hr 43min)

7. Understanding Transformers
21:23
8. Attention Mechanism
16:57
9. A detailed look at the encoder model
18:05
10. A detailed look at the decoder model
11:30
11. Understanding the head
16:35
12. Encoder vs Decoder
09:17
13. Reading the model source code
09:35

Section 4. Everything about Decoders

5 lectures ∙ (1hr 15min)

Section 5. Serving Engine, vLLM

6 lectures ∙ (1hr 34min)

Section 6. Runpod & Service Development

6 lectures ∙ (1hr 33min)

Section 7. Tool Calling

5 lectures ∙ (1hr 26min)

Section 8. Performance Testing and Monitoring

4 lectures ∙ (1hr 33min)

Section 9. Multi-GPU

5 lectures ∙ (1hr 13min)

Section 10. Learn more about vLLM

4 lectures ∙ (56min)

Section 11. AI Development Trends

6 lectures ∙ (1hr 45min)

Published: 04/02/2026

Last updated: 05/26/2026

Reviews

All

16 reviews

5.0

16 reviews

boyminseo1183
Reviews 1
∙
Average Rating 5.0
06/14/2026
5
31% enrolled
It's very helpful.
- hyunjinkim
  Instructor
  06/16/2026
  Thank you for the review, Minseo Kim. I hope the lecture was very helpful ^-^
kjunekjune0812
Reviews 3
∙
Average Rating 5.0
04/08/2026
Edited
5
43% enrolled
I am not in the deep learning industry, but I work in the field of computer vision (rule-based). Since my company requires LLM and vision-related deep learning technologies, I have been studying these topics. I have only completed about 40% of the course, but I felt compelled to leave a review now. I have taken many deep learning courses, including those by famous and highly-rated instructors, but I haven't found any course as clean and clear as this one. The best part is the quality of the lecture materials. The instructor recorded every single matrix calculation in Excel, which is incredibly helpful when reviewing. The Python code is also well-commented in many places. The quality of the lectures themselves is excellent; the instructor reminds you of parts you might have forgotten, ensuring you don't miss anything. While most other courses show calculations once or twice and then move on, this course goes through the calculations together until the end, which provides great clarity. It seems like the Q&A is monitored frequently, as I received immediate answers to my questions. The lectures seem to have been filmed this year, so it's great that they include a lot of the latest trends. It feels like this course hasn't gained much word-of-mouth yet, but I highly recommend it to anyone who needs to study these topics.
- hyunjinkim
  Instructor
  04/08/2026
  Hello Wonjune Lee, Thank you for your thoughtful review! I put a lot of thought into improving the quality of the lecture materials so that students could receive meaningful resources and review them effectively even later on. I also spent a lot of time thinking about how to effectively convey operations like Attention. The conclusion I reached was that it shouldn't be explained through formulas alone, nor through simple metaphors, nor just through torch code. Believing that it can only be understood by following the flow visually, I tried my best to explain it using Excel, and I'm glad to hear that it was conveyed well :) I hope you gain great insights from the remaining parts of the course. Keep it up!
logt
Reviews 11
∙
Average Rating 5.0
06/08/2026
5
100% enrolled
I've completed the course! Thank you so much for providing such high-quality education!! Except for the notes I left in the Q&A, I had no issues performing all the exercises on a Windows-based system!
- hyunjinkim
  Instructor
  06/16/2026
  Hello logt! You left this review after completing 100% of the course! I will make sure to update the part regarding container issues on Windows OS that you mentioned in the Q&A. Thank you.
jjhgwx
Reviews 1,010
∙
Average Rating 4.9
05/04/2026
5
7% enrolled
Thank you for the great lecture!
- hyunjinkim
  Instructor
  05/07/2026
  Hello Jang jaehoon, Thank you for the course review 👍 I see you've completed 7%. I hope you enjoy the rest of the lessons and find them very helpful. You can do it!
ec93030947
Reviews 4
∙
Average Rating 5.0
06/29/2026
5
31% enrolled
It's challenging, but I'll do my best to follow along. I like that the content contains only the essential core information.

hyunjinkim's other courses

Check out other courses by the instructor!

Realtime Datalake Using Kafka & Spark

hyunjinkim

Beginner's Kafka & Spark Real-time Pipeline Course. All-in-one: Master concepts to architecture.

Basic

Kafka, Apache Spark, pyspark

Airflow Master Class

hyunjinkim

This is a course to learn about Airflow, an Orchestration tool for efficiently building and managing data pipelines. Welcome to the Airflow Master Class, where even beginners can learn step-by-step!

Basic

airflow, Data Engineering, Python

Similar courses

Explore other courses in the same field!

Understanding and Utilizing ChatGPT

papadave

With ChatGPT, create your own content (songs, comics, blogs, books, papers, etc.) and create new businesses.

Basic

ChatGPT, LLM, NLP

[Hi,AI!] AI, who are you? First Steps into AI for Elementary, Middle, and High School Students!

inflearn

<This course is part of the 'Wrtn AI Literacy Voucher Program', a collaboration between Inflearn and Wrtn, and is available completely free for elementary, middle, and high school students who are in the AI vulnerable demographic.> From the fundamentals of artificial intelligence to generative AI that creates text, images, and videos! Artificial intelligence is already being used everywhere in our daily lives. This course is designed to help you easily understand what AI is, how it learns, and how it works. Without complicated formulas, just follow along like listening to a story and you'll naturally become friends with artificial intelligence!

Beginner

AI, wrtn

How to Use AI to Boost Work Productivity: A Guide to Tool Selection and Application by Field

aladinacademy

Have you wanted to make good use of AI, but felt it was too difficult and complex to get started? This course was prepared specifically for you. Rather than complex theories or difficult technical explanations, we have selected only the practical applications that professionals can use in their work immediately. From schedule management and meeting minutes to press releases, reports, presentation materials, image generation, infographics, and automation of repetitive tasks. We explain in simple terms how to connect AI to the tasks that professionals encounter most frequently. Point 1. Reduce Work Time - Learn how to quickly handle repetitive tasks such as organizing meeting minutes, managing schedules, and drafting reports. - Understand the workflow of generating and editing practical documents like press releases and proposals on the spot. - Shorten overall work speed by connecting multiple tasks into a single flow. Point 2. Improve the Quality of Results - Learn structuring methods to make reports and presentations more readable and persuasive. - Create various outputs such as PPTs, images, and infographics all at once. - Improve visual quality using data and design guides. Point 3. Utilize AI as a Work Partner - Go beyond just using features and understand how to integrate AI into your specific workflow. - Consistently derive desired results based on prompts and data. - Learn collaboration methods that maintain context without the need for repetitive instructions.

Basic

claude, AI, Business Productivity

(2026 Edition) How to Use ChatGPT, Generative AI Prompt Engineering A to Z - Understanding and Utilizing Artificial Intelligence

Masocampus

Have you ever thought, "I wish I had Iron Man's perfect assistant, Jarvis..."?! Complete everyone's Jarvis to help you handle your work wisely!

Beginner

AI, ChatGPT, prompt engineering

Mastering AI PPT Generation

signboardkr2653

A lecture that masters any tool by learning the 'principles' of how AI creates PPTs. From data research to completion using Gemini, Gamma, and Claude, finish a full PPT deck within 30 minutes.

Beginner

ChatGPT, AI, PowerPoint

Limitations, Latest Technologies, and Future Outlook of LLMs

arigaram

Explore the limitations of LLMs, the latest technologies to overcome them, and projected future technologies.

Beginner

NLP, Service Planning, Content Planning

Jenspark AI Agent 4-Week Completion Challenge - Detailed Feature Descriptions from A to Z Like Nowhere Else in the World

seulkikang

Genspark Ambassador's Lecture >> Genspark, the AI Super Agent that smartly handles everything from PPT, Excel, images, and videos to Vibe Coding! You’ve been waiting because there were no lectures or books that covered Genspark in depth, right?! Rated 4.8 by 100 students ⭐️⭐️⭐️⭐️⭐️ After using it for a year, I have organized all the detailed features into extensive learning materials and a 4-hour lecture. If you know how to use it, Genspark solves all your work; if you don't, it just eats up your credits. Increase your productivity by 300% through this lecture!

Basic

PowerPoint, AI, AI Agent

Let's Efficiently Manipulate AI! ChatGPT Prompt Engineering Part. 2

usefulit

Learn the core techniques of 'Prompt Engineering' to get exactly the answers you want from AI. From a complete beginner's first steps to expert practical applications, maximize your AI utilization skills!

Basic

ChatGPT, prompt engineering, AI

Building Production-Ready Generative AI Applications with LLMs

hammad

Master the complete lifecycle of building modern Generative AI applications using Large Language Models, Retrieval-Augmented Generation, and Agentic AI systems. Learn to design enterprise-grade AI solutions from prompt engineering to deployment, combining LLMs with vector databases and external knowledge sources for production-ready applications.

Intermediate

agents, vector-database, prompt engineering

Introduction to AI Principles: Understanding the Fundamentals

signboardkr2653

Mastering difficult AI with just one single analogy?! A first step into AI principles, understanding through core concepts rather than just terminology.

Beginner

AI, ChatGPT, LLM

AI Tech DNA Series: ①/⑥ Core Principles of Computer Basics from an AI Perspective

sdj0831

AI Tech DNA Series: Core Principles of Computer Basics from an AI Perspective is an introductory course designed to help you easily understand the fundamental computer concepts necessary for the AI era. By explaining basic principles such as CPU, memory, storage, networks, and data structures through real-life analogies and an AI-focused lens, it builds the solid foundational strength needed to advance into Python, AI, security, and the cloud.

Beginner

Network, Computer Architecture, Operating System

Local LLM Utilization Guide Part 1 - Using small LLM(sLLM) & Evaluating and Improving LLM Performance

AISchool

Learn how to utilize various local LLMs (Qwen, Gemma) and explore different techniques to efficiently evaluate and improve the performance of LLM systems.

Intermediate

AI, LLM, LangChain

From LLM Fundamentals to the Latest RAG & LangChain: Master the LLM Basics Course in Just 5 Hours!

HappyAI

This is a course to master the fundamental theories of LLM and the core technologies of LangChain and RAG. You can easily learn the latest AI technologies used in practice, starting from LLM basics!

Basic

Chatbot, LLM, LangChain

100 Prompts to Learn AI Utilization: Beginner's Guide

sarc

In the generative AI era, anyone can use it, but those who use it well are different. For those new to GPT, Gemini, and Claude, you will fully understand the world of AI through 100 practical prompts that you learn by actual input.

Beginner

AI, ChatGPT

FEEL THE AGI: LLM and AI Agents

Feel The AGI

LLM and Agent, the fundamental technologies behind AI like ChatGPT. LLM is currently the technology that has come closest to AGI. In this lecture, we will explore LLM and Agent technologies from a layperson's perspective.

Beginner

AI, LLM, AI Agent

Large Language Models, Just the Essentials!

haesunpark

This is a lecture covering LLM theory and practical examples based on <Large Language Models, Just the Essentials!> (Insight, 2025).

Beginner

Artificial Neural Network, PyTorch, LLM

[Complete NLP Mastery I] The Birth of Attention: Understanding NLP from RNN·Seq2Seq Limitations to Implementing Attention

Sotaaz

We understand why Attention was needed and how it works by 'implementing it directly with code'. This lecture starts from the structural limitations of RNN and Seq2Seq models, experimentally verifies the information bottleneck problem and long-term dependency issues created by fixed context vectors, and naturally explains how Attention emerged to solve these limitations. Rather than simply introducing concepts, we directly confirm RNN's structural limitations and Seq2Seq's information bottleneck problems through experiments, and implement **Bahdanau Attention (additive attention)** and **Luong Attention (dot-product attention)** one by one to clearly understand their differences. Each attention mechanism forms Query–Key–Value relationships in what way, has what mathematical and intuitive differences in the weight calculation process, and why it inevitably led to later models naturally connects to their characteristics and evolutionary flow. We learn how Attention views sentences and words, and how each word receives importance weighting to integrate information in a form where formula → intuition → code → experiment are connected as one. This lecture is a process of building 'foundational strength' to properly understand Transformers, helping you deeply understand why the concept of Attention was revolutionary, and why all subsequent state-of-the-art NLP models (Transformer, BERT, GPT, etc.) adopt Attention as a core component. This lecture is optimized for learners who want to embody the flow from RNN → Seq2Seq → Attention not through concepts but through code and experiments.

Beginner

Python, Deep Learning(DL), PyTorch

Enterprise LLM Adoption and Utilization Strategies for Our Organization

leejoon8212

This video shows effective ways to link internal company data to sLM, and practical solutions for various issues from sLM adoption.

Beginner

LLM, AI

#1 OpenClaw: Creating Your Own AI Assistant

dakgangjung123

What if AI could directly open a browser and write shell scripts to resolve server issues? Through practical automation know-how implemented with OpenClaw across Windows and Linux, we guide you on the shortest path to completing your own powerful remote development environment without complex configurations.

Beginner

LLM, AI, AI Agent

AI Development Part 3: Practical Machine Learning Projects

softcampus

"Beyond Data Analysis: Mastering Predictive Modeling with 5 Real-World Projects (45 Lectures Total)" Have you finished learning data analysis but feel stuck when it's time to actually build a model? Beyond simply learning how to call libraries, this course will help you fully master the inner workings of algorithms and optimal model validation strategies—ranging from Titanic survival prediction to spam text classification. Systematically conquer projects that wield the most power in the industry, from linear models to the latest ensemble algorithms and the basics of Natural Language Processing (NLP). Step into the world of AI modeling and start predicting the future based on analyzed data.

Basic

Machine Learning(ML), NLP, Algorithm

In the era of AI Agents, practical skills for understanding AI systems are becoming increasingly important.

From Transformer-based LLM architectures to GPU utilization, vLLM serving, and multi-GPU strategies

LLM Architecture Practical Class

3 models selected as a result of the 1st evaluation of domestic Sovereign AI

Lecture Core Structure

Core 1. Understanding Hugging Face Models

Learn how to decode the config.json file through this lecture.
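
As a rough idea of what reading a config.json can tell you, the sketch below parses a made-up configuration and derives the attention type and the KV cache size per token. The key names follow the Llama-style convention used by many Hugging Face models, but the values are invented, the `summarize` helper is hypothetical, and the calculation assumes 2 bytes per element (a 16-bit dtype) and a head dimension of `hidden_size / num_attention_heads`, which does not hold for every model.

```python
import json

# Hypothetical config.json contents: real key names, made-up values.
CONFIG_TEXT = """
{
  "hidden_size": 4096,
  "num_hidden_layers": 32,
  "num_attention_heads": 32,
  "num_key_value_heads": 8,
  "vocab_size": 128256,
  "torch_dtype": "bfloat16"
}
"""


def summarize(config: dict, bytes_per_element: int = 2) -> dict:
    """Derive a few attention-related facts from a Llama-style config."""
    heads = config["num_attention_heads"]
    # Configs without this key use one K/V head per query head (plain MHA).
    kv_heads = config.get("num_key_value_heads", heads)
    head_dim = config["hidden_size"] // heads  # assumption, see lead-in
    if kv_heads == heads:
        kind = "MHA"
    elif kv_heads == 1:
        kind = "MQA"
    else:
        kind = "GQA"
    # Per token, each layer stores one K and one V vector per K/V head.
    kv_bytes = 2 * config["num_hidden_layers"] * kv_heads * head_dim * bytes_per_element
    return {
        "attention": kind,
        "head_dim": head_dim,
        "query_heads_per_kv_head": heads // kv_heads,
        "kv_bytes_per_token": kv_bytes,
    }


summary = summarize(json.loads(CONFIG_TEXT))
print(summary)
```

For these example values, four query heads share each K/V head, so the KV cache is a quarter of what plain MHA with 32 K/V heads would need.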

Core 2. Mastering Attention

Gain a perfect understanding of the principles of attention and learn about its evolutionary flow.

Core 3. Mastering Multi-GPU Architecture

We will pass on essential GPU utilization strategies, a necessary gateway to becoming a core AI engineer.
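
One of the multi-GPU strategies the course covers, Tensor Parallelism, can be sketched without any GPU at all. The example below simulates a column-wise split of a weight matrix across "GPUs" using plain Python lists. The function names are assumptions, and a real system would run each shard on its own device and combine the results with a collective operation such as all-gather.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix multiplication."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def split_columns(w: Matrix, parts: int) -> List[Matrix]:
    """Cut the weight matrix into equal column blocks, one per simulated GPU."""
    cols = len(w[0])
    if cols % parts != 0:
        raise ValueError("column count must be divisible by the number of GPUs")
    step = cols // parts
    return [[row[p * step:(p + 1) * step] for row in w] for p in range(parts)]


def tensor_parallel_matmul(x: Matrix, w: Matrix, num_gpus: int) -> Matrix:
    """Compute x @ w with w sharded by columns across num_gpus workers."""
    shards = split_columns(w, num_gpus)
    # Each worker holds only its shard and computes a slice of the output.
    partials = [matmul(x, shard) for shard in shards]
    # Simulated all-gather: join the column slices back into full rows.
    return [sum((p[i] for p in partials), []) for i in range(len(x))]


x = [[1, 2], [3, 4]]
w = [[1, 0, 2, 1], [0, 1, 1, 3]]
print(tensor_parallel_matmul(x, w, num_gpus=2))
print(matmul(x, w))  # same result computed on a single "device"
```

Each worker stores only a fraction of the weights, which is what allows a model too large for one GPU to be served across several.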

😄 Recommended for these people

AI Beginners

AI Engineer

💡 What you will learn in this lecture

Step 1. Foundation

Step 2. Attention

Step 3. Serving

Step 4. Tool Call

Step 5. Optimization

Step 6. Advanced

✅ Tools used in the lecture

✅ Server Practice Environment Guide

Runpod

Google Colab

✅ Local Practice Environment Guide

Runpod and Colab are the primary practice environments, but you will also run OpenWebUI and FastAPI in your local environment.
