Realtime Datalake with Kafka & Spark
김현진
An introductory course on real-time pipelines with Kafka & Spark for beginners. An all-in-one course for mastering everything from core concepts to architecture.
Beginner
Kafka, Apache Spark, pyspark
This course teaches Airflow, an orchestration tool for efficiently building and managing data pipelines. Welcome to the Airflow Master Class, where even beginners can learn step by step!
897 learners
Airflow Concepts and Basics
Airflow-based Pipeline Development
Sending Automated Emails with Airflow
Airflow-based Public Data API Calls and Visualization
Message Alerts with Airflow via KakaoTalk and Slack
Utilizing ChatGPT with Airflow
Data Pipeline, No More Worries with Airflow 📊
👉 It covers everything from the basic concepts of Apache Airflow to the architecture configuration that can operate in a large-scale environment.
👉 About 80 practice files can be downloaded from GitHub.
But why Airflow?
Airflow is a core orchestration solution that creates and manages data pipelines that extract, process, store, and analyze data.
Airflow is the most popular pipeline management tool among similar solutions, and its adoption continues to grow.
Airflow Basics
You will learn the basics of Airflow, including the concepts and how to create workflows, through hands-on practice. It is organized so that you can learn step by step with about 60 practice files.
Pipeline Configuration
Learn how to develop and run a DAG pipeline with Airflow, including scheduling and sending emails.
Data collection
Let's configure a pipeline that receives and stores data via API from the Seoul Metropolitan Government Public Data Portal.
Monitoring and Integration
We will practice receiving alerts such as error messages and DAG status by integrating with messenger apps such as KakaoTalk and Slack.
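As a rough illustration of the alerting idea above, the following minimal sketch builds the JSON body for a failure message and prepares an HTTP POST for a Slack incoming webhook. It is not the course's code: the function names, the DAG and task names, and the webhook URL are hypothetical placeholders, and nothing is sent over the network.

```python
import json
import urllib.request


def build_failure_alert(dag_id, task_id, error):
    """Build the JSON body for a failure alert (hypothetical helper)."""
    text = f"[Airflow] DAG '{dag_id}' failed at task '{task_id}': {error}"
    # Slack incoming webhooks accept a JSON object with a "text" field.
    return {"text": text}


def build_request(webhook_url, payload):
    """Prepare, but do not send, an HTTP POST request carrying the payload."""
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values for illustration only.
payload = build_failure_alert("seoul_api_dag", "fetch_data", "timeout")
request = build_request("https://hooks.slack.com/services/XXX/YYY/ZZZ", payload)
print(request.get_method(), payload["text"])
# To actually send the alert: urllib.request.urlopen(request)
```

In a real pipeline, a function like this would typically be called from a failure callback so that the message is sent whenever a task fails.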
Data Visualization
We introduce R Shiny, a visualization framework for the R language, and visualize the data received from the Seoul Public Data Portal.
Architecture
Learn about Airflow's different deployment approaches and architectures, and how to operate reliably in high-volume environments.
Business Automation
We introduce ChatGPT and show how to call it from Python through its API. As automation practice, we retrieve stock information with Python, have ChatGPT write an introduction to stocks that are rising rapidly, and post that content to your blog automatically.
1. Basic knowledge of Python
2. Docker and Docker Compose
3. SQL
Q. How are the lectures conducted?
In Airflow, a workflow is called a DAG, and we will practice by creating DAGs together. Apart from the time spent explaining basic concepts, every chapter is built around hands-on practice.
If the practice file is long, I create a DAG file in advance and proceed by explaining the logic.
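For readers new to the term, a DAG (directed acyclic graph) is a set of tasks plus the dependencies between them. The sketch below is plain Python, not Airflow's API: it uses the standard library's `graphlib` (Python 3.9+) to compute a valid run order, and the task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical task names; each key maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "send_email": {"load"},
}


def run_order(deps):
    """Return one valid execution order for an acyclic dependency graph."""
    # static_order() raises graphlib.CycleError if the graph contains a cycle.
    return list(TopologicalSorter(deps).static_order())


order = run_order(dependencies)
print(order)
```

A scheduler such as Airflow does much more than ordering (scheduling, retries, logging), but the dependency graph is the core idea behind every DAG file.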
Q. Can I download practice files and study materials?
Of course! You can get all the practice files from GitHub. Not sure how to use Git? We'll teach you how to use Git too.
We also provide all PDF-based learning materials. You can download them from Section 0 - Download Lecture Materials.
Q. How difficult is the practical training?
In the beginning, knowing basic Python syntax is enough to follow along. The difficulty rises somewhat in the later parts, so it helps to know concepts such as Python classes and inheritance. But don't worry: the practical content is explained thoroughly as we go.
Q. What can I do if I learn Airflow?
Anything you can do with Bash Shell or Python, you can do with Airflow. If you're wondering whether something can be done with Airflow, first find out whether it can be done with Bash Shell or Python. If it can, you can do it with Airflow.
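To make that rule of thumb concrete, here is a minimal plain-Python sketch of the two kinds of work a pipeline step usually wraps: a command run as a child process and an ordinary Python function. It does not use Airflow, and the function names are hypothetical. For portability the command launches the Python interpreter itself; a Bash command line would go in the same slot.

```python
import subprocess
import sys


def run_shell_step(command):
    """Run a command as a child process and return its standard output."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def run_python_step(values):
    """An ordinary Python function standing in for any Python-based step."""
    return sum(values)


# A command-style step (a Bash command such as ["bash", "-c", "..."] fits here too).
shell_output = run_shell_step(
    [sys.executable, "-c", "print('hello from a child process')"]
)
# A Python-style step.
python_output = run_python_step([1, 2, 3])
print(shell_output, python_output)
```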
Who is this course right for?
Those who want to learn data engineering
Those curious about Airflow
Airflow users who are not getting the most out of it
Those who need to set up and manage data pipelines
What do you need to know before starting?
Python Fundamentals
Docker & Docker Compose Usage
Basic SQL Syntax (SELECT, FROM)
1,063 Learners
54 Reviews
185 Answers
4.9 Rating
2 Courses
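Hello.
I am a working professional with 15 years of experience in the data & AI field.
Since earning the Professional Engineer Information Management (정보관리기술사) certification, I have been creating content to share the knowledge I have gained with as many people as possible.
Nice to meet you. :)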
Contact: hjkim_sun@naver.com
107 lectures ∙ (24hr 56min)
43 reviews
4.9
Reviews 1 ∙ Average Rating 5.0
5
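If anyone around me is just starting to study data engineering, I would absolutely recommend this course. (Speaking as a non-major) When I started studying data engineering, I was told I needed to know git, Linux, and Python, and that knowing airflow would help too, but I spent a long time confused because I did not know how much of each I needed. Through this course I also picked up a fair amount of the git and Linux basics needed for airflow, which was great, and I hear there is plenty of DAG practice later on, so I am really looking forward to it. I will keep working through the rest of the course and master airflow, just as the title says! If you release more data engineering courses, I will definitely want to take them!
Also, I really appreciate that even the smallest details are explained so kindly. This is the most satisfying course I have taken so far!
Hello dj961024, thank you for the touching review ^_^ As with anything, I believe that understanding the underlying principles matters most, so I thought a lot about how to make the concepts easy to grasp. I am very happy to hear the course has helped you so much. If you have any questions while taking it, feel free to ask anytime, and keep up the good work!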
Reviews 2 ∙ Average Rating 5.0
5
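Although this is a beginner-level Airflow course, it was taught in depth and helped me a lot. Thank you.
Buing-ryul, thank you for the review. I am glad to hear it helped you so much ^^ I hope you put it to good use at work as well.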
$112.20