Designing Large-Scale Data Processing Patterns Based on Data Workflow Management with Toss Developers

Learn the process of building data pipelines using Apache Airflow from basics to practical application. Understand Airflow's core concepts and architecture, and master advanced design patterns frequently used in practice such as dynamic DAGs, parallel processing, distributed processing, and Custom Operators through hands-on exercises. Set up a practice environment with Python and Docker, and develop practical skills to design and operate real workflows.

(4.8) 13 reviews

162 learners

Level Basic

Course period Unlimited

  • Hong
Hands-on focused
Certification
Exam
Data Analysis
Database
Big Data
Docker
docker-compose
airflow

What you will gain after the course

  • Understanding the Concept and Necessity of Apache Airflow

  • Understanding the Structure of Airflow Core Components

  • Dynamic DAG Design Methods

  • TaskGroup and Dependency Management Patterns

  • Parallel Processing and Large-Scale Data Reprocessing Strategy

  • Custom Operators, encapsulation, and decorator usage

  • Python & Docker-based Practice Environment Setup

What services would be good to use for large-scale data batch processing pipelines? 🤔

❗This content is from actual conversations.❗

😁 Toss : Hong, do you happen to know airflow??

😄 Hong : I know about it, but I haven't tried it. Why?

😁 Toss : You know that workflow lecture I made last time? After seeing it, I thought it might be good to cover Airflow as well.. Airflow is the only one I've used

😄 Hong : But I haven't used Airflow before, so I don't really know how to do it

😁 Toss : It's fine, I'm using it in real work right now, so I can actively teach you. I'll burn myself out for my student

😄 Hong : 😆😆😆😆 Nice concept. Got it. But do we really have to use this?? Honestly, I can't see much difference from regular batch processing or cron jobs?

😁 Toss : The fact that you're even asking that is already a reason to use Airflow.. There are some differences between Airflow and batch processing or cron jobs. Simply put, the reasons are the same as the reasons to use workflows, and big data is part of it too.

What did the Toss senior developer's last statement in the conversation above mean? 🤔

Is Airflow really necessary for building data processing modules? Why must we use it? Couldn't we just implement the same thing with regular batch processing modules or cron jobs?

Have you had thoughts like these? If so, learning how to use and implement Airflow through this course will be a great help for your career.


The answer lies in workflow management. How can a series of steps, from data extraction through processing to handling, flow stably like a pipeline, with each step running in order and only after the steps it depends on? And what if a single platform could support that entire series of steps?
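To make the idea of "sequential and dependent" steps concrete, here is a minimal sketch in plain Python. It is not Airflow code and not taken from the course materials: the task names and the `run_pipeline` helper are hypothetical, and it only uses the standard library's `graphlib` to run each step after the steps it depends on.

```python
from graphlib import TopologicalSorter

# Hypothetical task names; each task maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "notify": {"load"},
}


def run_pipeline(deps, actions):
    """Run every task only after all of its upstream tasks have finished."""
    executed = []
    for task in TopologicalSorter(deps).static_order():
        actions[task]()          # run the task body
        executed.append(task)    # record the order for inspection
    return executed


# Illustrative task bodies that just record that they ran.
log = []
actions = {name: (lambda n=name: log.append(f"ran {n}")) for name in dependencies}

order = run_pipeline(dependencies, actions)
print(order)
```

A workflow platform such as Airflow takes over this ordering for you and adds what a bare loop lacks, such as scheduling, retries, and monitoring.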


Rather than a boring lecture that just lists theory, I've prepared this as a practical guide that covers the core functions of workflow-based large-scale data pipeline design by walking through the operational processes together. 🚀

Features of this course

📌 Rich course content with approximately 30 diagrams and lecture summary files

* This lecture does not explain things with words alone. It also provides actual source code, diagrams, sequence diagrams, and short summary files of the lecture content.

📌 60% theory, 40% practice, complete testing environment provided

* This lecture does not simply list theory. It provides a lightweight environment where you can see the content in action and flexibly test and practice on your own.

Expertise proven through previous lectures (as of 9.27) 👨‍🏫

🧑‍🎓 307 learners, rating 5.0

🧑‍🎓 379 learners, rating 4.9

🧑‍🎓 483 learners, rating 4.7

🧑‍🎓 239 learners, rating 4.8

The course covers these topics. 🧩

* What is Airflow?

* Batch Job & Cron Job vs. Airflow

* Apache Airflow's Disadvantages and Anti-Patterns in Implementation

* Introduction to Overall Core Components Architecture

* WebServer Components Deep Dive

* Scheduler Components Deep Dive

* Executor Components Deep Dive

* MetaDataDB Components Deep Dive

* Dynamic DAG Generation Pattern [ Dynamic DAG ]

* Cross-DAG Dependencies and Data Dependencies

* Designing Complex Workflows Using TaskGroup

* Custom Operator for Reusability and Encapsulation

* Lightweight Environment Setup with Docker and docker-compose

* Airflow's Parallel Processing and Distributed Processing Strategies

* Notification using Slack

* Distributed Data Processing Using CeleryExecutor
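As a rough illustration of the dynamic DAG generation topic listed above, the sketch below builds one pipeline definition per configuration entry instead of hand-writing each one. It is plain Python, not Airflow code and not the course's implementation: `PipelineSpec`, `SOURCES`, `build_pipeline`, and `registry` are all hypothetical stand-ins. In Airflow, the same factory-in-a-loop idea is typically applied to real DAG objects.

```python
from dataclasses import dataclass, field


@dataclass
class PipelineSpec:
    """Stand-in for a DAG object; this is NOT an Airflow class."""
    pipeline_id: str
    schedule: str
    tasks: list = field(default_factory=list)


# Hypothetical per-source configuration that drives generation.
SOURCES = [
    {"name": "orders", "schedule": "@daily"},
    {"name": "payments", "schedule": "@hourly"},
]


def build_pipeline(source):
    """Factory: turn one config entry into one pipeline definition."""
    name = source["name"]
    return PipelineSpec(
        pipeline_id=f"ingest_{name}",
        schedule=source["schedule"],
        tasks=[f"extract_{name}", f"transform_{name}", f"load_{name}"],
    )


# Generate and register one pipeline per source under a unique id.
registry = {}
for source in SOURCES:
    spec = build_pipeline(source)
    registry[spec.pipeline_id] = spec

for pipeline_id, spec in registry.items():
    print(pipeline_id, spec.schedule, spec.tasks)
```

Adding a new source then means adding one configuration entry rather than copying a whole pipeline definition.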

What makes this course special

📌 Event providing 50 coupons

We will select 50 people who purchase during the early bird discount period and give each of them one 50% coupon.

The person who created this course 🤭

  • Started as a non-major and currently works as a platform backend developer in Pangyo

  • A knowledge sharer whose goal is to teach realistic development methods and theory, and who creates lectures together with capable acquaintances rather than alone

  • A knowledge sharer who was interviewed by Inflearn thanks to diligent activity


  • A server developer who majored in computer engineering in a rural area, worked as a developer at Naver, and currently does backend development at Toss

  • A developer who always gets scolded by Hong for not having enough time...

  • A developer who wants to achieve financial freedom and dreams of solo development

Reference Notes

Practice Environment

  • python3, pip3

    • Python 3.13.2

    • pip 25.0 from /opt/homebrew/lib/python3.13/site-packages/pip (python 3.13)

  • docker, docker-compose

    • Docker version 28.0.0, build f9ced58158

    • Docker Compose version 2.33.1

  • OS

    • Apple M3 Air

The discount rate for this course will be adjusted over time so that those who purchase earlier receive a higher discount. Please keep this in mind.

Who is this course right for?

  • Server and data engineers handling large-scale data in production environments

  • Developers who want to build experience in data pipeline design and operations

  • Technical personnel at companies looking to introduce or advance Airflow

  • Architects interested in distributed processing and workflow automation

  • Team leads/senior developers who want to build a stable data platform in production environments

Hello
This is

5,155 Learners
354 Reviews
121 Answers
4.7 Rating
20 Courses

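About Me

I became interested in development while lazing around at home and started studying it, and I am currently in charge of platform server development in Pangyo. I continue my work as a knowledge sharer because I want to share how I studied, along with the various problems you may encounter in practice and their solutions.

My lectures are not built on my knowledge alone. Every lecture is made together with others.

Knowledge Sharer Career

[Former] Blockchain developer working on Sandbox IP

[Former] Metaverse backend developer

[Current] Server developer becoming a long-timer in Pangyo

Interview History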

Curriculum

29 lectures ∙ (4hr 39min)

Reviews

13 reviews, average rating 4.8

  • tttos

    Reviews 8

    Average Rating 5.0

    Rating 5

    93% enrolled

    I'm a developer working at Toss who prepared this lecture together on the topic of Airflow, which can be called the flower of batch processing. While the service called Airflow is often quite unfamiliar to many people, as services grow larger, workflow services like this become extremely useful. This is because it's the service that's given top priority when handling large-scale batch processing. Even if you're in an environment where you don't need to learn Airflow yet, the various perspectives and concepts taught in this lecture will definitely help you in your development and study environment. I don't think this is a lecture that simply teaches only Airflow. I ask for your great interest and hope you'll look forward to the next lecture as well. Thank you!!

    • jhong
      Instructor

      💜

  • paulmoon008308

    Reviews 111

    Average Rating 4.9

    Rating 5

    21% enrolled

  • iamzzoon0226

    Reviews 10

    Average Rating 5.0

    Rating 5

    31% enrolled

  • dellahong

    Reviews 1

    Average Rating 5.0

    Rating 5

    62% enrolled

    I've been using Airflow for 3 years now, but as the scale of data processing has grown, frequent errors have started occurring, so I became curious about how other companies use it and decided to take this course. It's been incredibly helpful! Having both conceptual understanding and hands-on practice of Airflow in a practical context really helps with understanding and applying it :)

    • jhong
      Instructor

      Thank you for the kind review, dellahong. It's even more meaningful to receive feedback from a practitioner. I'll continue to work hard!

  • youngba8935643

    Reviews 6

    Average Rating 5.0

    Rating 5

    100% enrolled

    I think this is a meaningful lecture that introduced me to various concepts from perspectives I had never seen or couldn't see before. The content itself was... how should I put it, it seemed like content that broadened my knowledge. Thank you so much for the great lecture. Tenbagger!

    • jhong
      Instructor

      Hello Tenbagger! Thank you for leaving such a good review!! I will work hard to provide you with even better content in the future!!

      $61.60