dbt, The New Standard of Data Analytics Engineering

Free yourself from the painful cycle of repetitive pipeline management in your data warehouse (DW) with dbt! Building on the efficiency dbt provides, become a data analytics engineer who creates higher value through business-centric data modeling, effective data lifecycle management, and more.

(4.9) 13 reviews

61 learners

  • DeepingSauce
dbt
Big Data
Business Productivity
Data Engineering
Data literacy


What you will learn!

  • Hands-on experience creating and managing dbt's core resources: Source, Seed, Model, Test, and Docs.

  • The full process, from dbt repository initialization and environment setup to actual model development.

  • How to solve chronic issues in data warehouse operations, such as complex data pipeline management, data quality assurance, and documentation, with dbt.

  • Advanced dbt techniques to maximize productivity, including Incremental Materialization, dbt Packages, and schema management.

  • How to build and operate more efficient and stable data pipelines using dbt with Airflow (including a comparison with existing methods).

  • Experience-based insights into how the data analytics engineer's workflow changes after adopting dbt in practice, and how it frees you to focus on the issues that matter.

🔥 End the recurring pain of DW operation with dbt.


😥 Do you find yourself spending every day tracking complex dependencies in your data pipeline, hunting down broken data, and sighing at documentation that no one updates?

If you know SQL and Airflow, but still feel like you're reinventing the wheel by spending time on these repetitive tasks, you've come to the right place.

Have you been paying attention to dbt (data build tool), which is changing the landscape of data analytics engineering? Recently, even major Silicon Valley big tech companies are increasingly requesting dbt skills in their job postings (JDs), demonstrating that dbt is no longer an optional skill, but rather a must-have. Indeed, dbt is creating significant innovation by offering powerful solutions to several long-standing challenges in the data warehouse space.


(Even in Big Tech JDs, dbt is steadily creeping in...)

However, the reality is that many companies and teams still fail to fully tap dbt's potential. "dbt sounds great, but how do we integrate it into our warehouse?" For those seeking an answer to that question, the core goal of this course is to apply dbt to real-world work, eliminate inefficiencies in data warehouse operations, and help data engineers focus on more important, fundamental problems.

This course goes beyond a simple feature walkthrough. It digs into why dbt was created ("Why") and the core principles ("How") by which it addresses persistent challenges in the field, including data lineage management, quality assurance, documentation, and backfilling. Furthermore, drawing on my own experience and vivid examples, I will show how applying these principles in practice, including effective integration with Airflow, can lead to remarkable results: at least a fivefold increase in work productivity ("Impact").

Leave repetitive, simple tasks to dbt. And with the new possibilities dbt opens up, I hope you'll seize this opportunity to explore the core values of data, including data architecting/modeling and the data engineering lifecycle, and grow into a more capable engineer.

The features of this course are:


📖 Systematic curriculum based on storytelling

Unlike the official dbt documentation or other lectures, which cover features one by one in an omnibus format, this course is structured as a story, starting from dbt project setup and gradually expanding and deepening the functionality it covers. Each section is organically connected, so you naturally build a picture of dbt as a whole.

💻 Theory + Hands-on + Practical Tips

Rich hands-on experience building and running dbt projects directly in a local environment (DuckDB), along with explanations of core dbt concepts.
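
To make the hands-on format concrete, here is a minimal sketch of the kind of model you write and run locally against DuckDB. The source name (raw) and table/column names (orders, order_id, etc.) are hypothetical placeholders, not the course's dataset, and the snippet assumes the source has been declared in a sources .yml file in the dbt project.

    -- models/staging/stg_orders.sql (hypothetical file name)
    -- A staging model: read from a declared source and apply light cleanup.
    -- Run locally with `dbt run --select stg_orders` against a DuckDB target.
    select
        order_id,
        customer_id,
        cast(ordered_at as date) as order_date,
        amount
    from {{ source('raw', 'orders') }}
    where order_id is not null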

🆚 Airflow Integration: A Clear Comparison with Existing Methods

This course covers how to use Airflow, a data pipeline orchestrator, with dbt. Specifically, it compares the inefficient pipeline construction process when using Airflow alone (without dbt) with the dramatic improvement achieved when dbt is introduced, demonstrating the power of the dbt + Airflow combination.

📈 The secret to increasing productivity by 5x: Experience-based know-how

Rather than simply listing dbt features, this course vividly conveys why dbt is powerful and how it can dramatically increase work productivity, based on the challenges and problem-solving process I went through while introducing dbt from the ground up in the field.

You'll learn things like this


  1. Acquire dbt's core philosophy and data problem-solving skills.

  • Beyond simply learning how to use dbt, understand the core values and philosophy that have made dbt the standard in data analytics engineering.

  • Based on this, you will clearly understand how dbt effectively solves chronic data warehousing problems such as difficult data lineage management, poor data quality, missing documentation, and repetitive backfill work.


  2. Strengthen your ability to build practical data pipelines using SQL and Jinja/macros.

  • Learn how to leverage advanced dbt features such as Jinja, macros, and Incremental Modeling to reduce SQL code repetition, increase reusability, and process large volumes of data efficiently (see the incremental-model sketch right after this list).

  • Learn how to apply data quality tests to your tables with just a few lines of configuration.

  3. A centralized data catalog: dbt Docs

  • dbt Docs lets you search and share table/view lists, column-level descriptions and data types, detailed lineage diagrams between models, applied tests and their results, and other metadata in one place, improving the whole team's understanding of the data and enabling a smooth, data-driven collaboration environment.

  4. Stable data pipeline operation through Airflow integration

  • Learn a practical architecture that loosely couples dbt and Airflow, cleanly separating data transformation (dbt) from orchestration (Airflow) logic and automating the entire pipeline.

  • Thanks to dbt's automated dependency management, Airflow DAG configuration becomes much simpler, reducing the human error that comes with complex manual setup and improving operational stability. A before-and-after comparison of introducing dbt makes this clear.
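
As a taste of the incremental modeling mentioned above, here is a minimal sketch of what an incremental dbt model can look like. The model, table, and column names (fct_events, stg_events, event_id, event_ts) are hypothetical, not taken from the course materials.

    -- models/marts/fct_events.sql (hypothetical model name)
    -- Incremental materialization: on incremental runs, only rows newer than
    -- the latest already-loaded timestamp are processed.
    {{ config(materialized='incremental', unique_key='event_id') }}

    select
        event_id,
        user_id,
        event_ts,
        payload
    from {{ ref('stg_events') }}

    {% if is_incremental() %}
      -- {{ this }} refers to the already-built target table
      where event_ts > (select max(event_ts) from {{ this }})
    {% endif %}

The first run builds the full table; subsequent runs transform only the new slice, which is the main lever for handling large tables efficiently.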

Hands-on examples from the course

Model development focused solely on logic: Quickly develop SQL models while observing data lineage.
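
For illustration, this "logic-only" development style comes down to referencing upstream models with ref() instead of hard-coding table names; dbt then derives the dependency graph, build order, and the lineage view in dbt docs automatically. The model and column names below are hypothetical.

    -- models/marts/customer_orders.sql (hypothetical)
    -- ref() declares dependencies, so dbt infers lineage and build order.
    select
        c.customer_id,
        c.customer_name,
        count(o.order_id) as order_count
    from {{ ref('stg_customers') }} as c
    left join {{ ref('stg_orders') }} as o
        on o.customer_id = c.customer_id
    group by c.customer_id, c.customer_name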

dbt docs, gathering all your metadata in one place: easily create rich data documentation.

Airflow integration made easy: Automate Airflow DAGs with table dependencies intact.

Data quality testing in 3 lines: implement data quality tests in just 3 lines of code.
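
The "3 lines" most likely refers to declaring a generic test (such as not_null or unique) on a column in a model's .yml file. As a SQL-flavored illustration of the same idea, a dbt singular test is simply a query saved under tests/ that returns the rows violating a rule; the test fails if any rows come back. The model and column names here are hypothetical.

    -- tests/assert_amount_is_non_negative.sql (hypothetical singular test)
    -- dbt reports this test as failed if the query returns any rows.
    select *
    from {{ ref('stg_orders') }}
    where amount < 0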

After taking the course,


You will learn the core functions of dbt and gain a level of understanding that you can apply immediately in practice.

Beyond simply 'knowing' dbt, you will clearly understand 'why' and 'how' to use it, and be able to use it with confidence.

You will be able to use dbt to improve the quality and stability of your data pipelines and enhance collaboration efficiency.

You will free yourself from the repetitive, time-consuming chores of running a data warehouse (lineage tracking, manual documentation, complex backfills, and so on).

Freed from repetitive operational work, you can grow with dbt into an engineer who focuses on core value: designing efficient SQL logic that accurately reflects business requirements and raising the structural completeness of your data models.


I recommend this to these people

"I want to automate data pipeline management in a smarter way."

Junior data (analytics) engineers who are familiar with SQL/Airflow but want to focus on efficient data transformation automation rather than repetitive DW operations.

"I know dbt is good... but how do I apply it to our company's DW?"

Data practitioners who have a basic grasp of dbt concepts but are curious how to use dbt to solve long-standing problems (lineage, quality, documentation, etc.) in a real-world DW environment.

"I want to level up to a 'REAL' Data Analytics Engineer!"

Those who want to develop the core competencies of a data analytics engineer beyond simple pipeline operation: effective data modeling, SQL reusability, and test automation.

Things to note before taking the course


💻 Practice environment

  • This lecture is based on the macOS environment.

    • Windows environment users may need to set up Anaconda prompt or WSL2 (as in the previous lectures), but no guide is provided for these.


  • It is assumed that Anaconda and other environments have already been set up. (Creating an Anaconda virtual environment and using pip are not described.)

  • Library versions used


    astronomer-cosmos==1.9.2
    dbt-duckdb==1.9.2
    dbt-core==1.9.4
  • Other major tools used besides libraries: VSCode (with Cursor), DuckDB, DBeaver

  • The learning materials are provided as source code based on the final version of the lecture (dbt repo, airflow repo).


🧑‍💻 Lecture materials

  • The lecture materials (source code) provided contain the completed form as of the last lecture.

  • Therefore, we recommend using the provided code as a reference when you get stuck, and practicing by writing your own code according to the course curriculum. By manually entering and executing code, you'll gain a deeper understanding of the material.


🚨 Note

  • Please be sure to watch the orientation. It is crucial for understanding the course objectives and scope.

  • The library versions used in the lecture videos may differ from the latest versions at the time of enrollment. As with previous lectures, this course focuses on principles, providing a deep understanding of dbt's core philosophy and problem-solving approach. Therefore, another learning goal is to cultivate the practical skill of navigating and adapting independently, referencing official documentation and community resources, without being overwhelmed by version differences or syntax changes. (With a solid grasp of the core principles, you will be able to adapt proactively to a changing technological landscape.)

  • Rather than following along from the beginning, it may be more effective to go through the entire flow once and then come back to the hands-on parts (the sections build closely on one another).

  • This course does not cover data extraction from source systems, initial loading into a data lake/warehouse, or real-time streaming data processing. This course focuses on data table transformation and modeling at the warehouse layer.


About the knowledge sharer


I am a developer who solves various problems in the field of data engineering.

I, too, ran into numerous challenges while building and operating a data warehouse (DW) in the field, including difficulty tracking data lineage, repetitive backfill tasks, and documentation that was forever out of sync. It was during that painful journey that I discovered dbt, and I personally experienced the remarkable productivity gains and positive changes in development culture it brought. With dbt, I efficiently manage roughly 5,000 data assets (source tables, DW tables, dashboards, etc.) at work, and I also use it as a core component of my personal quant investment system.

It was disheartening to see so many of my fellow engineers still stuck in the inefficient practice of "reinventing the wheel." So, I prepared this lecture to share the innovative experiences and problem-solving know-how I gained through dbt, creating a pathway for you to more easily and effectively immerse yourself in the powerful world of dbt.

I hope this course goes beyond simply teaching you how to use the tool, and becomes an opportunity for you to break free from repetitive tasks and experience the innovation and growth in the way you work that dbt brings.



Who is this course right for?

  • Those seeking solutions to chronic data warehouse operation issues such as data lineage identification, quality control, documentation, and backfills.

  • Those who have heard of or briefly used dbt, but want to grasp how to apply it effectively in real projects.

  • Those who found existing dbt lectures or documentation fragmented or cloud-environment-centric, and who want a storytelling-based dbt course in Korean.

  • Data analytics engineers (analytics engineers) seeking to maximize productivity and focus on more critical challenges (modeling, architecture design, etc.).

  • Data engineers building and operating data pipelines who struggle with repetitive tasks and management overhead.

  • Anyone curious about data warehousing work and its challenges.

  • Data analysts seeking to expand their role beyond analysis tasks via dbt.

What you need to know before starting

  • [Required] 'Python for all, including liberal arts students and non-majors' or equivalent basic Python content, along with a conceptual understanding of what a library is. (dbt is Python-based.)

  • [Required] Basic knowledge of development environment: Proficiency in Conda (or venv) virtual environments and Unix-based terminals (macOS/Linux)

  • [Required] Basic SQL query writing

  • [Recommended] Approx. 1 year of Data (Analytics/Warehouse) engineering experience OR experience running container-based Airflow and creating/operating simple DAGs.

Hello, I'm DeepingSauce.

16,198 learners · 580 reviews · 326 answers · 4.8 rating · 5 courses

I am a data engineer who designs the future with data and solves real-world problems.

I love data-driven insight, and I strive to be a life-long learner and a contributor who shares knowledge.

Curriculum


42 lectures ∙ (10hr 16min)


Reviews

4.9 · 13 reviews

  • 커리어하이 · Reviews 1 · Average Rating 5.0 · Rated 5 · 62% enrolled

    "dbt covers so much ground that I had trouble getting a grip on the concepts, but this course pins down exactly the essentials that beginners wonder about... I would have liked a bit more depth, but from here I can pick up whatever else I need on my own."

  • 찬영 · Reviews 2 · Average Rating 5.0 · Rated 5 · 100% enrolled

  • 호우 · Reviews 1 · Average Rating 5.0 · Rated 5 · 76% enrolled

    "The fundamentals are solid. That said, it would have been nice to also have an example that builds the kind of data pipeline you would actually create in practice."

  • 기웃기웃 · Reviews 1 · Average Rating 4.0 · Rated 4 · 52% enrolled

    "Great!"

  • 또다시수강생 · Reviews 1 · Average Rating 5.0 · Rated 5 · 31% enrolled

    "I signed up planning to watch only the parts of dbt I didn't know well, but it turns out there was a lot I had misunderstood. So I'm just going to watch the whole thing again from the beginning."

