
The Complete Guide to Airflow - Part 1

This course is a practical Airflow master program designed to help you understand "why Airflow works this way" and enable you to design and debug data pipelines on your own. It covers everything from Apache Airflow's core mechanisms to detailed theory and hands-on practice involving DAGs, Operators, Hooks, Scheduling, Timezones, Idempotency, and Templates.

20 learners are taking this course

Level Intermediate

Course period Unlimited

Data Engineering
airflow
orchestration

What you will gain after the course

  • An understanding of the core operating mechanisms of key Airflow components

  • An understanding of the roles and internal structure of Operators, Hooks, and the TaskFlow API

  • The ability to use various types of Operators (Bash, HTTP, SQL, S3)

  • Experience implementing a practical data pipeline using APIs, SQL, and object storage

  • The ability to resolve practical operational issues such as retries, timezone errors, and confusion between catchup and backfill

  • The ability to design stable pipelines with idempotency in mind

  • A clear understanding of the difference between interval-based schedules and execution-time-based schedules

  • The skills to design and operate Airflow in a professional environment

Master Data Pipelines in One Go! Airflow Master Class

This is a practical Airflow master course that will help you clearly understand the complexity of data pipelines and enable you to handle everything from design to debugging on your own.


Why is the data pipeline I built getting more and more complex?
Why does it take half a day just to fix one problem?

Have you ever received an emergency call in the middle of the night because your Airflow schedule got messed up?

Have you ever struggled with complex cleanup work because duplicate data kept piling up?

This course is Part 1, and Part 2 will be released as a separate course in mid-June 2026. Part 2 will cover Sensors, Assets, Dynamic Task Mapping, Task Groups, Notifications, and various additional Operators and Hooks.

This course covers theory and practice using the latest version (as of the course release), Airflow 3.1.


From the core operating principles of Airflow
to the practical use of various Operators and Hooks


We will help you grow into an 'engineer capable of both designing and operating' data pipelines on your own.



What you will be able to do by the end of this course

You will find clear answers to the question of 'Why?' regarding Airflow.

  • Through detailed theory and hands-on practice covering Airflow's core mechanisms, DAGs, Operators, Hooks, Scheduling, Timezone, Idempotency, and Templates, we help you gain a deep understanding of "why Airflow works this way." This will enable you to develop practical skills to design and debug data pipelines on your own.

You will acquire exceptional Airflow problem-solving skills that are recognized in the field

  • You will systematically understand and resolve issues frequently encountered when operating Airflow, such as the difference between execution dates and actual run times, timezone errors, the relationship between retries and idempotency, and the mechanics of catchup and backfill. By clearly grasping the internal operating principles through theory and practice, you will be able to go beyond simple troubleshooting to eliminate root causes and design systems that prevent recurrence.
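
To make the gap between a run's date and its actual start time concrete, here is a minimal Python sketch that uses only the standard library, not Airflow itself. The function name `daily_run_for` and the dictionary keys are hypothetical, chosen for illustration. It assumes a daily schedule in Korean time (a fixed UTC+9 offset) and a simplified interval-based model in which a run can start only after the interval it covers has ended.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Korean time modeled as a fixed UTC+9 offset (no daylight saving).
KST = timezone(timedelta(hours=9), "KST")


def daily_run_for(interval_start: datetime) -> dict:
    """Describe one daily run in a simplified interval-based model (illustrative only)."""
    interval_end = interval_start + timedelta(days=1)
    return {
        "data_interval_start": interval_start,
        "data_interval_end": interval_end,
        # The run covers [start, end), so it cannot begin before the interval ends.
        "earliest_run_time": interval_end,
    }


run = daily_run_for(datetime(2026, 1, 1, 0, 0, tzinfo=KST))
run_time_utc = run["earliest_run_time"].astimezone(timezone.utc)

# The run that processes January 1 (KST) starts on January 2 at 00:00 KST,
# which is still January 1 at 15:00 when expressed in UTC.
print(run["data_interval_start"].date(), "->", run_time_utc.isoformat())
# prints: 2026-01-01 -> 2026-01-01T15:00:00+00:00
```

In this model, the date a run is labeled with and the wall-clock time it starts differ by one full interval, and the same instant shows a different calendar date depending on the timezone used to display it. Confusing these is a common source of the timing surprises described above.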

Confidently design and build stable data pipelines yourself.

  • You will build confidence in Airflow by directly constructing practical data pipelines that connect various environments such as SQL, APIs, and Object Storage. Furthermore, you will develop the skills to design and implement pipelines with idempotency in mind, ensuring system stability even in unpredictable situations like duplicate data loading or processing failures.
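
As an illustration of what designing with idempotency in mind can look like, the sketch below uses Python's built-in `sqlite3` module rather than Airflow. The table `sales`, its columns, and the helper `load_partition` are hypothetical. It shows one common pattern, delete-then-insert per date partition, under the assumption that each run owns exactly one date's worth of rows.

```python
import sqlite3


def load_partition(conn: sqlite3.Connection, ds: str, amounts: list[int]) -> None:
    """Replace all rows for date `ds`, so re-running leaves the same table state."""
    with conn:  # one transaction: the delete and the inserts commit or roll back together
        conn.execute("DELETE FROM sales WHERE ds = ?", (ds,))
        conn.executemany(
            "INSERT INTO sales (ds, amount) VALUES (?, ?)",
            [(ds, amount) for amount in amounts],
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (ds TEXT, amount INTEGER)")

load_partition(conn, "2026-01-01", [10, 20])
load_partition(conn, "2026-01-01", [10, 20])  # simulated retry of the same run

row_count = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
print(row_count)  # prints: 2 (a plain INSERT on each attempt would have left 4 rows)
```

Because the load first clears the partition it is about to write, a retry or a manual re-run does not produce duplicate rows.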

Grow into an Airflow expert.

  • I have designed this course around practical content that is essential in real-world production environments. Beyond simply learning how to use Airflow, you will finish this course as a hands-on expert who can confidently answer Airflow-related questions within your team and successfully design and operate complex data pipelines.



📚

Mastering Airflow's operating principles
through theory and practice


With this course, Airflow is not difficult

Master the fundamental concepts of Airflow, then progress step by step through Operators, Hooks, scheduling, and Templates, at a pace suitable for both beginners and experienced users.


Airflow Fundamentals 01 ~ 02

  • You will learn in detail about Airflow's core components—DAGs, Tasks, and defining dependencies between Tasks—as well as XCom, the mechanism for data transfer between Tasks. Additionally, you will gain a clear understanding of Airflow's operational mechanisms by studying TaskFlow API-based DAGs, the roles of key Airflow components, automatic retries, re-executing Tasks and DAG Runs through the Clear function, and the Airflow Context.
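
The ideas of Tasks, dependencies between Tasks, and passing data between Tasks can be previewed with a small standard-library sketch. This is not Airflow's API: the task functions, the `upstream` mapping, and the `xcom` dictionary are hypothetical stand-ins, and the ordering is computed with Python's `graphlib.TopologicalSorter` purely to show the concept of running tasks only after their upstream tasks have finished.

```python
from graphlib import TopologicalSorter


def extract(xcom: dict) -> list[int]:
    return [1, 2, 3]


def transform(xcom: dict) -> list[int]:
    # Reads the value the upstream task "pushed" under its task id.
    return [value * 10 for value in xcom["extract"]]


def load(xcom: dict) -> int:
    return sum(xcom["transform"])


tasks = {"extract": extract, "transform": transform, "load": load}

# Each task id maps to the set of upstream task ids it depends on.
upstream = {"extract": set(), "transform": {"extract"}, "load": {"transform"}}

xcom: dict = {}  # stand-in for cross-task storage, keyed by task id
order = list(TopologicalSorter(upstream).static_order())
for task_id in order:
    xcom[task_id] = tasks[task_id](xcom)

print(order)         # prints: ['extract', 'transform', 'load']
print(xcom["load"])  # prints: 60
```

In Airflow itself the scheduler and executor take care of ordering and the metadata database stores the exchanged values, but the dependency graph and the keyed hand-off between tasks follow the same basic shape.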


Utilizing Bash and HTTP Operators

Practice executing shell scripts using the Bash Operator and integrating with external APIs through the HTTP Operator. Build a foundation for integration with various external systems.


Database Integration Using SQL Operators and Hooks

Learn how to connect to MySQL and PostgreSQL databases to execute SQL queries. Strengthen your ability to build database-driven data processing pipelines.


Object Storage Integration Using S3 Operator and Hook (MinIO)

Practice integrating with S3-compatible object storage using MinIO. This content is essential for building cloud storage-based data management and processing pipelines.


Airflow Scheduling 01 ~ 02

We have filled this section with extensive theory and hands-on practice to help you clearly understand the often-confusing interval-based scheduling, timezone behavior, and methods for maintaining idempotency in periodically executed DAGs. Additionally, we provide detailed explanations on catchup, backfill, and the latest cron-expression and Timetable-based Point-In-Time scheduling.
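
As a rough mental model of catchup, the following standard-library sketch lists which data intervals would get a run when a DAG with a past start date is switched on. The function `intervals_to_run` and its parameters are hypothetical, and the model is deliberately simplified: with catchup enabled every fully elapsed interval gets a run, and with catchup disabled only the most recent one does.

```python
from datetime import datetime, timedelta, timezone


def intervals_to_run(start: datetime, now: datetime, step: timedelta, catchup: bool):
    """Return the (interval_start, interval_end) pairs that would get a run."""
    intervals = []
    cursor = start
    # Only intervals that have fully elapsed are eligible for a run.
    while cursor + step <= now:
        intervals.append((cursor, cursor + step))
        cursor += step
    # Without catchup, skip the backlog and keep only the latest elapsed interval.
    return intervals if catchup else intervals[-1:]


start = datetime(2026, 1, 1, tzinfo=timezone.utc)
now = datetime(2026, 1, 4, 6, 0, tzinfo=timezone.utc)
one_day = timedelta(days=1)

all_runs = intervals_to_run(start, now, one_day, catchup=True)
latest_only = intervals_to_run(start, now, one_day, catchup=False)

print(len(all_runs))                 # prints: 3
print(latest_only[0][0].date())      # prints: 2026-01-03
```

The interval that began on January 4 is not included in either result because it has not finished yet at `now`. Backfill differs from catchup in that it is an explicit request to run a chosen date range rather than something the scheduler does automatically.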


Airflow Templating

You will learn how to configure dynamic workflows using Airflow's Jinja templates. This will allow you to write Operators and DAGs more flexibly and efficiently.


We can solve
these concerns!

📌

Working Junior Data Engineers

Those who face difficulties during operation because they do not clearly understand why an Airflow DAG runs at a specific time or why it executes redundantly
Those who find it challenging to identify the root cause of issues while debugging a running DAG

📌

Backend developers with little experience in data pipelines

Those who have introduced Airflow for batch job automation or data processing, but feel limited in its use because they do not have a deep understanding of core concepts such as Operators, Hooks, and Scheduling.

📌

Intermediate data engineers with Airflow operational experience

Those who want to effectively resolve various issues that arise during Airflow operation, such as ensuring idempotency, implementing retry logic, and timezone setting errors, and design more stable and efficient data pipelines.

Notes before taking the course


Practice Environment 💾

Practice Environment Specifications

  • Operating System (OS): The practice sessions are conducted on Windows, but they can be performed on Mac without any issues.


  • PC Specifications: A PC with at least 6 GB of RAM and internet access, on which Docker, Docker Desktop, and VSCode can be installed.

  • Airflow is installed as a Docker Container using the Astro CLI. The Airflow version is 3.1.

Learning Materials

  • Lecture materials can be downloaded within the course.

  • Practice materials can be downloaded at https://github.com/chulminkw/airflow_part_01 . By reviewing the practice code, you can gauge the level of Python and SQL proficiency required for the course in advance.

Recommended for
these people

Who is this course right for?

  • Anyone who uses Airflow but doesn't know why it works the way it does

  • Junior to mid-level data engineers

  • Backend developers (responsible for batch/data processing) who want to transition into data engineering

  • AI engineers who need to build data pipelines for MLOps (although this course does not cover AI itself)

  • All practitioners who want to learn Airflow "properly"

Need to know before starting?

  • Basic proficiency in Python and SQL

Hello, this is dooleyz3525

Learners: 27,623
Reviews: 1,476
Answers: 4,057
Rating: 4.9
Courses: 15

(Former) Encore Consulting

(Former) Oracle Korea

AI Freelance Consultant

Author of Python Machine Learning Perfect Guide


Curriculum


124 lectures ∙ (20hr 54min)




Limited time deal: $31.90 (38% off the regular price of $51.70)