Practical Data Science Part 2. Data Preprocessing

Learn why and how to handle data exploration (EDA), data cleaning, scaling, outlier handling, log transformation, and categorical encoding in real-world applications. You will also learn how to merge tabular data and process (unstructured) time series data.

(4.7) 18 reviews

234 learners

Level Basic

Course period Unlimited

  • hjkim3
Python

What you will gain after the course

  • As the first step in data analysis and machine learning, you will learn the basic concepts of 1) data cleaning, 2) scaling, 3) outlier handling, and 4) data transformation (log transformation, category encoding).

  • Before starting full-scale data analysis, you will learn exploratory data analysis (EDA) to examine the overall characteristics of the data and determine whether the collected data is suitable for analysis.

  • Learn how to process tabular data and time series data, and clearly understand the concepts of concat, join, merge, groupby, pivot_table, and walk-forward prediction.

Contains only the essentials!
Essential data preprocessing for data analysis

Big data analytics, machine learning, deep learning, artificial intelligence, and digital transformation (DT) are among the most in-demand technology fields today. In nearly every industry, training data scientists to handle these technologies is crucial and urgent.

Data preprocessing is the task that takes up most of the time of people who work with data in companies, and it has the greatest impact on data analysis (machine learning) performance.


📝 Core data preprocessing

This lecture covers effective data exploration (EDA) methods and the four key concepts of data preprocessing: data cleaning, scaling, outlier handling, and data transformation.
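
As a rough illustration of the four concepts named above, here is a minimal pandas sketch. The toy table, its column names, and the specific choices (median fill, 1.5 × IQR clipping, `log1p`, z-score scaling, one-hot encoding) are assumptions made for this example and are not taken from the course materials.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one missing age, one extreme fare, one categorical column.
df = pd.DataFrame({
    "age": [22.0, 38.0, np.nan, 35.0, 54.0],
    "fare": [7.25, 71.28, 7.92, 53.10, 512.33],
    "port": ["S", "C", "S", "S", "Q"],
})

# 1) Data cleaning: fill the missing age with the column median.
df["age"] = df["age"].fillna(df["age"].median())

# 2) Outlier handling: clip fares to the 1.5 * IQR fences.
q1, q3 = df["fare"].quantile([0.25, 0.75])
iqr = q3 - q1
df["fare_clipped"] = df["fare"].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)

# 3) Log transformation: log1p compresses the long right tail of fare.
df["fare_log"] = np.log1p(df["fare"])

# 4) Scaling: standardize age to zero mean and unit variance.
df["age_scaled"] = (df["age"] - df["age"].mean()) / df["age"].std(ddof=0)

# 5) Categorical encoding: one-hot encode the port column.
df = pd.get_dummies(df, columns=["port"])
```

Clipping and log transformation are two alternative ways of taming the same extreme value; in practice you would usually pick one of them depending on whether the outlier is an error or a genuine but skewed observation.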


👩‍💻 Theory + Practice Lecture Structure


Predicting Titanic Survivors?


Theory-backed exercises such as missing value handling, data transformation, and linear classification prediction help you apply data analysis in the field right away.
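
The kind of exercise described above might look roughly like the sketch below: fill missing values, encode a category, scale the features, and fit a linear classifier. The rows are invented Titanic-like values (not the real dataset), and the classifier is a plain logistic regression trained by gradient descent in NumPy, chosen here only to keep the example dependency-light; the course's own code may use a different model or library.

```python
import numpy as np
import pandas as pd

# Hypothetical Titanic-like rows (invented values, not the real dataset).
df = pd.DataFrame({
    "age":      [22.0, 38.0, np.nan, 35.0, 4.0, np.nan, 54.0, 27.0],
    "sex":      ["male", "female", "female", "female", "male", "male", "male", "female"],
    "survived": [0, 1, 1, 1, 0, 0, 0, 1],
})

# Missing value handling: replace missing ages with the median age.
df["age"] = df["age"].fillna(df["age"].median())

# Data transformation: encode sex as a 0/1 feature.
df["is_female"] = (df["sex"] == "female").astype(float)

# Standardize the features and prepend a bias column of ones.
X = df[["age", "is_female"]].to_numpy()
X = (X - X.mean(axis=0)) / X.std(axis=0)
X = np.column_stack([np.ones(len(X)), X])
y = df["survived"].to_numpy(dtype=float)

# Linear classification: logistic regression fit by batch gradient descent.
w = np.zeros(X.shape[1])
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-X @ w))      # predicted survival probability
    w -= 0.5 * X.T @ (p - y) / len(y)     # gradient step on the log loss

pred = (1.0 / (1.0 + np.exp(-X @ w)) >= 0.5).astype(int)
accuracy = (pred == y).mean()             # accuracy on the training rows only
```

The accuracy computed here is on the same rows used for fitting, so it says nothing about generalization; a real exercise would hold out a test split.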


🙋‍♂️ Topics needed on site

Handling tables
Time series data processing

In practice, you often need to combine tabular data in various ways. You will learn the differences between the concat, append, join, merge, groupby, and pivot_table functions and which function suits which situation.
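
To make the differences concrete, here is a minimal sketch on made-up tables; the table names, columns, and values are hypothetical. `DataFrame.append` is left out because it was deprecated and then removed in pandas 2.0, and `pd.concat` covers the same use.

```python
import pandas as pd

# Hypothetical tables, invented for illustration.
sales_jan = pd.DataFrame({"store": ["A", "B"], "month": ["Jan", "Jan"], "sales": [100, 150]})
sales_feb = pd.DataFrame({"store": ["A", "B"], "month": ["Feb", "Feb"], "sales": [120, 130]})
stores = pd.DataFrame({"store": ["A", "B"], "region": ["East", "West"]})

# concat: stack tables that share the same columns on top of each other.
sales = pd.concat([sales_jan, sales_feb], ignore_index=True)

# merge: SQL-style join on a shared key column.
merged = sales.merge(stores, on="store", how="left")

# join: similar to merge, but matches against the index of the other table.
joined = sales.join(stores.set_index("store"), on="store")

# groupby: aggregate rows that share a key.
total_by_region = merged.groupby("region")["sales"].sum()

# pivot_table: reshape into a store x month grid of aggregated values.
grid = merged.pivot_table(index="store", columns="month", values="sales", aggfunc="sum")
```

As a rule of thumb under these assumptions: use concat when the tables have the same shape, merge or join when they share a key, and groupby or pivot_table when you want summaries rather than more rows.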

In practice, we often deal with unstructured time series data. We'll explain how to use datetime and the sequential walk-forward time series prediction method, and introduce binary classification and regression prediction models using linear models.
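
A walk-forward evaluation can be sketched as follows: at each step the model is fit only on observations that came before the point being predicted. The synthetic series, the `walk_forward` helper, and the choice of a straight-line trend fitted with `np.polyfit` are all assumptions for illustration, not the course's implementation.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series: a linear trend plus small noise.
rng = np.random.default_rng(0)
idx = pd.date_range("2024-01-01", periods=30, freq="D")
series = pd.Series(np.arange(30) * 2.0 + rng.normal(0, 0.5, 30), index=idx)

# A datetime index makes calendar features and resampling straightforward.
weekly_mean = series.resample("W").mean()
day_of_week = series.index.dayofweek

def walk_forward(series, min_train=10):
    """Predict each point with a linear trend fit only on earlier points."""
    y = series.to_numpy()
    t = np.arange(len(y))
    preds = []
    for i in range(min_train, len(y)):
        slope, intercept = np.polyfit(t[:i], y[:i], deg=1)  # fit on the past only
        preds.append(slope * t[i] + intercept)              # predict one step ahead
    return pd.Series(preds, index=series.index[min_train:])

preds = walk_forward(series)
mae = (preds - series.loc[preds.index]).abs().mean()  # mean absolute error
```

Because the training window never includes the target or anything after it, the error estimate avoids the look-ahead leakage that a random train/test split would introduce on time-ordered data.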


📕 Course Features

  • All content is explained with practice code.

Go to the practice code 👉 https://github.com/data-labs/preprocessing

  • The example code is structured so that you can use it right away in your work.
  • The code is concise, yet contains the essentials and is written to be easy to use.

👩‍💻 Core Data Science

Python, the foundational language of data science.
This course assumes basic knowledge of Python.
If you do not have basic knowledge of the Python language, I recommend first learning the prerequisites through the introductory Python lecture, Practical Data Science Part 1.

Who is this course right for?

  • Data preprocessing is the most important process that determines the performance of data analysis. This will be helpful for those who want to systematically organize the data preprocessing methods required for practical work.

  • This is recommended for those who want to understand the basic concepts of combining tabular data and handling time series data and apply them immediately in the field.

What do you need to know before starting?

  • Basic knowledge of Python is required.

Hello
This is

Learners 921

Reviews 78

Answers 11

Rating 4.8

Courses 3

"Can you fix a broken radio?"

This is a question a friend asked me after I entered the Department of Electronic Engineering. Well, I did answer. "In electronic engineering, we learn the principles of how to build a radio; fixing broken electronics isn't really what we do..."

There are more cases where a problem solver is needed rather than an expert armed with theory. I believe that solving real-world problems is more important.

Recently, I have been working on solving problems in various industrial sectors—such as finance, energy, electronics, heavy equipment, logistics, drug discovery, and food—using machine learning. It is a field with so much to learn and endless opportunities. Although my primary role is a professor (Department of Computer Science and Engineering at Kangwon National University), my deep interest in solving real-world problems has led me to hold several concurrent positions. I currently serve as the Director of the AI Drug Discovery Training Center, an Adjunct Professor at KAIST, and the CEO of Data Science Lab.

I believe that the most essential talent in the AI era is a data scientist who can solve real-world problems, and I hope all of you become highly sought-after data scientists.

Curriculum

19 lectures ∙ (4hr 13min)


Reviews

18 reviews

4.7

  • dfeafe

    Reviews 8

    Average Rating 4.9

    4

    100% enrolled

    I like it because it's step-by-step and basic, like a school class.

    • hjkim3
      Instructor

      It was conducted like a class. I hope you get good results.

  • victory1791791577

    Reviews 5

    Average Rating 4.6

    5

    100% enrolled

    I always understand it well because of your kind and calm explanations. Thank you!

    • hjkim3
      Instructor

      Thank you for your kind review.

  • sungkenh0540

    Reviews 2

    Average Rating 5.0

    5

    100% enrolled

    It was very helpful for studying Python data preprocessing. I liked the various methodologies required for data preprocessing and the hands-on practice using real data.

    • hjkim3
      Instructor

      Thank you for your kind review.

  • alcatraz761636

    Reviews 2

    Average Rating 5.0

    5

    100% enrolled

    Personally, I think it is a very neat and excellent lecture. I also took the previous Part 1, and although there were some parts that were a bit difficult due to the progress of the lecture, I was able to understand it without any problems.

    • hjkim3
      Instructor

      I'm glad you figured it out on your own. If you have any questions, please ask~

  • inyong08hwang2545

    Reviews 2

    Average Rating 5.0

    5

    100% enrolled

    Thank you for the great lecture.

    $42.90
