
Toss Senior Developer's Data Workflow Management-Based Large-Scale Data Processing Design Patterns [ By. Non-CS Major & Toss Developer ]

Learn the process of building data pipelines using Apache Airflow from basics to practical application. Understand Airflow's core concepts and architecture, and master advanced design patterns frequently used in practice such as dynamic DAGs, parallel processing, distributed processing, and Custom Operators through hands-on exercises. Set up a practice environment with Python and Docker, and develop practical skills to design and operate real workflows.

43 learners are taking this course

  • Hong
Hands-on
Certification
Exam
Data Analysis
Database
Big Data
Docker
docker-compose
airflow

What you will learn!

  • Understanding the Concept and Necessity of Apache Airflow

  • Understanding the Structure of Airflow Core Components

  • Dynamic DAG Design Methods

  • TaskGroup and Dependency Management Patterns

  • Parallel Processing and Large-Scale Data Reprocessing Strategy

  • Custom Operators, encapsulation, and decorator usage

  • Python & Docker-based Practice Environment Setup

What services would be good to use for large-scale data batch processing pipelines? 🤔

❗This content is from actual conversations.❗

😁 Toss : Hong, do you happen to know airflow??

😄 Hong : I know about it, but I haven't tried it. Why?

😁 Toss : You know that workflow lecture I made last time? After seeing that, I was thinking it might be good to cover airflow as well. Airflow is all I've ever used

😄 Hong : But I haven't used airflow before, so I don't really know how to do it

😁 Toss : It's fine, I'm actually using it in real work right now, so I can proactively teach you. I'll burn myself out for my student

😄 Hong : 😆😆😆😆 Nice concept. Got it. But do we really have to use this?? I honestly can't feel much difference from regular batch processing or cron job processing?

😁 Toss : The fact that you're even asking that is already a reason why you should use airflow. There are some differences between airflow and batch processing or cron jobs. Simply put, it's the same reason you should use workflows, and big data is part of it too.

What did the Toss senior developer's last statement mean in the previous conversation??🤔

Is Airflow really necessary for building data processing modules?? Why must we use it?? From my perspective, it seems like we could just implement it using regular batch processing modules or cron jobs??

Did you perhaps have thoughts like this?? If so, studying the process of utilizing and implementing Airflow through this course will be a great help for your career.


The answer lies in workflow management. How can a series of steps, from data extraction to processing and handling, run reliably as a pipeline, in order and with each step's dependencies respected? And what if a single platform could support that entire series of steps?
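To make the idea of "sequential and dependent" execution concrete, here is a minimal sketch in plain Python. It is not Airflow code and not the course's material. It only uses the standard library's `graphlib` to show what a workflow manager adds over independent cron entries: each step runs only after its upstream steps have finished, and it receives their results. The task names (`extract`, `process`, `handle`), the sample data, and the `run_pipeline` helper are all hypothetical, chosen for illustration.

```python
from graphlib import TopologicalSorter


def run_pipeline(tasks, dependencies):
    """Run every task after all of its upstream tasks, in dependency order.

    tasks:        maps a task name to a callable taking a dict of upstream results
    dependencies: maps a task name to the set of task names it depends on
    """
    order = []
    results = {}
    # static_order() yields each node only after all of its predecessors.
    for name in TopologicalSorter(dependencies).static_order():
        upstream = {dep: results[dep] for dep in dependencies.get(name, ())}
        results[name] = tasks[name](upstream)
        order.append(name)
    return order, results


# Hypothetical three-step pipeline: extraction -> processing -> handling.
def extract(_upstream):
    return [1, 2, 3, 4]


def process(upstream):
    return [value * 10 for value in upstream["extract"]]


def handle(upstream):
    return sum(upstream["process"])


TASKS = {"extract": extract, "process": process, "handle": handle}
DEPENDENCIES = {
    "extract": set(),
    "process": {"extract"},
    "handle": {"process"},
}

order, results = run_pipeline(TASKS, DEPENDENCIES)
print(order)              # ['extract', 'process', 'handle']
print(results["handle"])  # 100
```

With three separate cron entries, the same ordering would have to be approximated by spacing out start times, and a slow or failed extraction would not stop the later steps from running. Expressing the dependencies as data is the part a workflow platform manages for you, along with scheduling, retries, and monitoring, none of which this sketch attempts.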


Rather than a boring lecture that just lists theory, I've prepared a practical guide that covers the core features of workflow-based large-scale data pipeline design by walking through the operational processes together. 🚀

Features of this course

📌 Rich course content with approximately 30 diagrams and lecture summary files

* Beyond spoken explanation, the course provides actual source code, diagrams, sequence diagrams, and short summary files of the lecture content.

📌 60% theory, 40% practice, complete testing environment provided

* Rather than only listing theory, the course provides a lightweight environment where you can see the concepts in action and freely test and practice them yourself.

Expertise proven through previous lectures (as of 9.27) 👨‍🏫

🧑‍🎓 307 (rating 5.0)

🧑‍🎓 379 (rating 4.9)

🧑‍🎓 483 (rating 4.7)

🧑‍🎓 239 (rating 4.8)

The course covers these topics. 🧩

* What is Airflow?

* Batch Job & Cron Job vs. Airflow

* Apache Airflow's Disadvantages and Anti-Patterns in Implementation

* Introduction to Overall Core Components Architecture

* WebServer Components Deep Dive

* Scheduler Components Deep Dive

* Executor Components Deep Dive

* MetaDataDB Components Deep Dive

* Dynamic DAG Generation Pattern [ Dynamic DAG ]

* Cross-DAG Dependencies and Data Dependencies

* Designing Complex Workflows Using TaskGroup

* Custom Operator for Reusability and Encapsulation

* Lightweight Environment Setup with Docker and docker-compose

* Airflow's Parallel Processing and Distributed Processing Strategies

* Notification using Slack

* Distributed Data Processing Using CeleryExecutor

What makes this course special

📌 Event: 50 coupons

During the early bird discount period, we will select 50 people who purchase the course and give each of them one 50% coupon.

The people who created this course 🤭

  • I started as a non-major and am currently working as a platform backend developer in Pangyo

  • A knowledge sharer whose goal is to teach realistic development methods and theory, and who creates lectures not alone but together with capable acquaintances

  • A knowledge sharer who was interviewed at Inflearn thanks to consistent activity


  • A server developer who majored in computer engineering in a rural area, worked as a developer at Naver, and is currently doing backend development at Toss

  • A developer who always gets scolded by Hong for not having enough time...

  • A developer who wants to achieve financial freedom and dreams of solo development

Reference Notes

Practice Environment

  • python3, pip3

    • Python 3.13.2

    • pip 25.0 from /opt/homebrew/lib/python3.13/site-packages/pip (python 3.13)

  • docker, docker-compose

    • Docker version 28.0.0, build f9ced58158

    • Docker Compose version 2.33.1

  • OS

    • Apple M3 Air

The discount rate for this course will be adjusted over time so that those who purchase early receive a higher discount. Please keep this in mind.

Recommended for these people

Who is this course right for?

  • Server and data engineers handling large-scale data in production environments

  • Developers who want to build experience in data pipeline design and operations

  • Technical personnel at companies looking to introduce or advance Airflow

  • Architects interested in distributed processing and workflow automation

  • Team leads/senior developers who want to build a stable data platform in production environments

Hello, this is Hong

  • Learners: 3,137

  • Reviews: 212

  • Answers: 85

  • Rating: 4.6

  • Courses: 16

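About me

I was lazing around at home when I became interested in development and started studying it, and I am now in charge of platform server development in Pangyo.

I continue my activity as a knowledge sharer because I want to share with you the way I studied, along with the various problems you may run into in practice and their solutions.

My lectures are not built from my knowledge alone. Every lecture is made together with others.

Career

[Former] Blockchain developer at Sandbox

[Former] Backend developer at a Nexon subsidiary

[Current] A server developer slowly stagnating in Pangyo

Interview history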

Curriculum

29 lectures ∙ (4hr 39min)

Course Materials: Lecture resources


