Practical Data Science Part 2. Data Preprocessing

Name: Practical Data Science Part 2. Data Preprocessing
Price: 55000 KRW
Rating: 4.7 (18 reviews)

Learn why and how to handle data exploration (EDA), data cleaning, scaling, outlier handling, log transformation, and categorical encoding in real-world applications. You will also learn how to merge tabular data and process (unstructured) time series data.

236 learners

Level Basic

Course period Unlimited

hjkim3

Python

Reviews from Early Learners

4.7

5.0

허룡

100% enrolled

I always understand it well because of your kind and calm explanations. Thank you!

5.0

홍성은 (sungkenh)

100% enrolled

It was very helpful for studying Python data preprocessing. I liked the various methodologies required for data preprocessing and the hands-on practice using real data.

5.0

alcatraz76

100% enrolled

Personally, I think it is a very neat and excellent lecture. I also took the previous Part 1, and although there were some parts that were a bit difficult due to the progress of the lecture, I was able to understand it without any problems.

What you will gain after the course

As the first step in data analysis and machine learning, you will learn the basic concepts of 1) data cleaning, 2) scaling, 3) outlier handling, and 4) data transformation (log transformation, category encoding).
Before starting full-scale data analysis, you will learn the exploratory analysis (EDA) method to examine the overall characteristics of the data and determine whether the collected data is suitable for analysis.
Learn how to process tabular data and time series data, and gain a clear understanding of concat, join, merge, groupby, pivot_table, and walk-forward prediction.
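
To make the scaling, outlier handling, and transformation steps listed above concrete, here is a minimal sketch using pandas and NumPy. The toy data, the column names (`income`, `city`), and the 1.5 × IQR outlier rule are assumptions chosen for illustration; they are not taken from the course materials.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names and values are made up for illustration.
df = pd.DataFrame({
    "income": [1200.0, 1500.0, 1800.0, 2100.0, 90000.0],
    "city": ["Seoul", "Busan", "Seoul", "Daegu", "Busan"],
})
income = df["income"]

# Scaling: standardization (zero mean, unit variance) and min-max scaling to [0, 1].
df["income_std"] = (income - income.mean()) / income.std(ddof=0)
df["income_minmax"] = (income - income.min()) / (income.max() - income.min())

# Outlier handling: flag values more than 1.5 * IQR outside the quartiles.
q1, q3 = income.quantile([0.25, 0.75])
iqr = q3 - q1
df["is_outlier"] = (income < q1 - 1.5 * iqr) | (income > q3 + 1.5 * iqr)

# Log transformation: log1p compresses a long right tail.
df["income_log"] = np.log1p(income)

# Category encoding: one-hot encode the categorical column.
encoded = pd.get_dummies(df, columns=["city"])
```

In this toy frame only the last row (90000) is flagged as an outlier. Whether to drop, cap, or keep such a value depends on the data and the analysis goal.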

Contains only the essentials!
Essential data preprocessing for data analysis

Big data analytics, machine learning, deep learning, artificial intelligence, and digital transformation (DT) are among the most in-demand technology fields today. In nearly every industry, training data scientists to handle these technologies is crucial and urgent.

Data preprocessing is the task that takes up the most time for data practitioners in companies, and it has the greatest impact on data analysis (machine learning) performance.

📝 Core data preprocessing

This lecture covers effective data exploration (EDA) methods and the four key concepts of data preprocessing: data cleaning, scaling, outlier handling, and data transformation.

👩‍💻 Theory + Practice Lecture Structure

Predicting Titanic Survivors?

Theory-backed exercises such as missing value handling, data transformation, and linear classification prediction help you apply the data analysis skills required in the field right away.
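
As an illustration of the missing value handling step in an exercise like this, the sketch below works on a few hypothetical rows that only mimic Titanic-style columns. The column names, the values, and the choice of median and mode imputation are assumptions, not the course's actual practice code.

```python
import numpy as np
import pandas as pd

# Hypothetical rows that mimic a few Titanic-style columns (not the real dataset).
df = pd.DataFrame({
    "age": [22.0, np.nan, 35.0, np.nan, 54.0],
    "embarked": ["S", "C", None, "S", "S"],
    "cabin": [None, "C85", None, None, "E46"],
    "survived": [0, 1, 1, 0, 0],
})

# Count missing values per column before deciding how to treat each one.
missing_counts = df.isna().sum()

# Mostly-missing column: dropping it is often simpler than imputing it.
cleaned = df.drop(columns=["cabin"])

# Numeric column: fill gaps with the median, which is robust to extreme values.
cleaned["age"] = cleaned["age"].fillna(cleaned["age"].median())

# Categorical column: fill gaps with the most frequent value (the mode).
cleaned["embarked"] = cleaned["embarked"].fillna(cleaned["embarked"].mode()[0])
```

After these steps `cleaned` has no missing values, so it could be passed on to encoding and a linear classifier.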

🙋‍♂️ Topics needed in the field

Handling tables

Time series data processing

In practice, you often need to combine tabular data in various ways. We explain the differences between the concat, append, join, merge, groupby, and pivot_table functions and which of them are useful in which situations.
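
The differences between these functions can be sketched on a few small, hypothetical sales tables (all table and column names below are made up for illustration). Note that `DataFrame.append` was removed in pandas 2.0, so the sketch stacks rows with `pd.concat` instead.

```python
import pandas as pd

# Hypothetical sales tables; all names and values are illustrative.
jan = pd.DataFrame({"store": ["A", "B"], "month": ["Jan", "Jan"], "sales": [100, 80]})
feb = pd.DataFrame({"store": ["A", "B"], "month": ["Feb", "Feb"], "sales": [120, 90]})
stores = pd.DataFrame({"store": ["A", "B"], "region": ["East", "West"]})

# concat: stack tables that share the same columns on top of each other.
sales = pd.concat([jan, feb], ignore_index=True)

# merge: combine tables on a shared key column (similar to a SQL join).
merged = sales.merge(stores, on="store", how="left")

# join: combine on the index rather than on a column.
joined = sales.set_index("store").join(stores.set_index("store"))

# groupby: aggregate rows that share a key.
total_by_region = merged.groupby("region")["sales"].sum()

# pivot_table: reshape to one row per store and one column per month.
pivot = merged.pivot_table(index="store", columns="month", values="sales", aggfunc="sum")
```

As a rough rule of thumb, `concat` adds rows or columns, `merge` and `join` attach related attributes by key, and `groupby` and `pivot_table` summarize the combined result.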

In practice, we often deal with unstructured time series data. We'll explain how to use datetime and the sequential walk-forward time series prediction method, and introduce binary classification and regression prediction models using linear models.
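
A minimal sketch of the walk-forward idea follows, assuming an expanding training window and a straight-line fit from `np.polyfit` as the linear model. The series, the `walk_forward` helper, and the `min_train` parameter are hypothetical and only meant to show the mechanics.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series with a linear trend; values are made up.
dates = pd.date_range("2021-01-01", periods=10, freq="D")
series = pd.Series(np.arange(10, dtype=float) * 2.0 + 5.0, index=dates)


def walk_forward(series, min_train=5):
    """Predict each point using only the observations that came before it."""
    predictions = {}
    for t in range(min_train, len(series)):
        train = series.iloc[:t]            # past data only, no look-ahead
        x = np.arange(t, dtype=float)      # time steps 0 .. t-1
        slope, intercept = np.polyfit(x, train.to_numpy(), deg=1)
        predictions[series.index[t]] = slope * t + intercept  # one step ahead
    return pd.Series(predictions)


preds = walk_forward(series)
errors = (preds - series.loc[preds.index]).abs()
```

Each prediction is made for a date the model has not seen yet, which is what separates walk-forward evaluation from a random train/test split on time series data.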

📕 Course Features

All content is explained with practice code.

Go to the practice code 👉 https://github.com/data-labs/preprocessing

The example code is structured so that you can use it right away in your work.
The code is concise, yet contains the essentials and is written to be easy to use.

👩‍💻 Core Data Science

Practical Data Science Part 1: Introduction to Python

Python is the foundational language of data science.
This course assumes basic knowledge of Python.
If you do not yet have that background, we recommend first building the prerequisite knowledge through Practical Data Science Part 1: Introduction to Python.

Recommended for these people

Who is this course right for?

Data preprocessing is the most important process that determines the performance of data analysis. This course will be helpful for those who want to systematically organize the data preprocessing methods required for practical work.
It is also recommended for those who want to understand the basic concepts of combining tabular data and handling time series data, and to apply them immediately in the field.

What do you need to know before starting?

Basic knowledge of Python is required.

Hello
This is hjkim3

924 learners ∙ 4.8 rating

"Can you fix a broken radio?"

This is a question a friend asked me after I entered the Department of Electronic Engineering. Well, I did answer. "In electronic engineering, we learn the principles of how to build a radio; fixing broken electronics isn't really what we do..."

More often than not, what is needed is a problem solver rather than an expert armed with theory. I believe that solving real-world problems is more important.

Recently, I have been working on solving problems in various industrial sectors—such as finance, energy, electronics, heavy equipment, logistics, drug discovery, and food—using machine learning. It is a field with so much to learn and endless opportunities. Although my primary role is a professor (Department of Computer Science and Engineering at Kangwon National University), my deep interest in solving real-world problems has led me to hold several concurrent positions. I currently serve as the Director of the AI Drug Discovery Training Center, an Adjunct Professor at KAIST, and the CEO of Data Science Lab.

I believe that the most essential talent in the AI era is a data scientist who can solve real-world problems, and I hope all of you become highly sought-after data scientists.

Curriculum

19 lectures ∙ (4hr 13min)

Section 1. Introduction to Data Preprocessing

1 lecture ∙ (2min)

1. Course Introduction
02:47

Section 2. Data preprocessing

4 lectures ∙ (1hr 2min)

2. Missing Value Handling
18:05
3. Scaling
18:41
4. Outlier Detection
06:15
5. Data Transformation: Category Encoding
19:14

Section 3. Data Preprocessing Lab

3 lectures ∙ (48min)

6. Missing Data Practice
14:46
7. Data Transformation Practice
13:44
8. Linear Classification Prediction Practice
19:40

Section 4. Exploratory Analysis

4 lectures ∙ (49min)

Section 5. Handling Tables

4 lectures ∙ (42min)

Section 6. Time Series Data Processing

3 lectures ∙ (47min)

Published: 01/12/2021

Last updated: 05/19/2025

Reviews

4.7 (18 reviews)

victory1791791577 ∙ Reviews 5 ∙ Average Rating 4.6
06/25/2021 ∙ Rating 5 ∙ 100% enrolled
I always understand it well because of your kind and calm explanations. Thank you!
- hjkim3
  Instructor
  07/06/2021
  Thank you for your kind review.
sungkenh0540 ∙ Reviews 2 ∙ Average Rating 5.0
05/04/2021 ∙ Rating 5 ∙ 100% enrolled
It was very helpful for studying Python data preprocessing. I liked the various methodologies required for data preprocessing and the hands-on practice using real data.
- hjkim3
  Instructor
  07/06/2021
  Thank you for your kind review.
alcatraz761636 ∙ Reviews 2 ∙ Average Rating 5.0
03/08/2021 ∙ Rating 5 ∙ 100% enrolled
Personally, I think it is a very neat and excellent lecture. I also took the previous Part 1, and although there were some parts that were a bit difficult due to the progress of the lecture, I was able to understand it without any problems.
- hjkim3
  Instructor
  03/10/2021
  I'm glad you figured it out on your own. If you have any questions, please ask~
quber02012351 ∙ Reviews 3 ∙ Average Rating 3.0
02/24/2021 ∙ Rating 3 ∙ 100% enrolled
I really enjoyed this great lecture. I think I understood the core of data preprocessing in 5 hours. Thank you!
- hjkim3
  Instructor
  02/24/2021
  A key feature of the course is that it is organized so you can get through it in a short time. Thank you for your review!
dfeafe ∙ Reviews 8 ∙ Average Rating 4.9
07/06/2021 ∙ Rating 4 ∙ 100% enrolled
I like it because it's step-by-step and basic, like a school class.
- hjkim3
  Instructor
  07/06/2021
  I ran it like a class. I hope you get good results.

hjkim3's other courses

Check out other courses by the instructor!

Practical Data Science Part 3. Understanding Machine Learning

hjkim3

$51.70

Basic / Machine Learning(ML)

4.7

(31)

300+

The digital transformation (DT) and introduction of artificial intelligence (AI) in companies begin with the construction of machine learning models. However, the scope of machine learning technology is very broad, and in order to select the optimal method, it is necessary to clearly understand the basic concepts. In this lecture, we will introduce the core contents necessary to clearly understand the basic concepts of machine learning, focusing on five examples.

Practical Data Science Part 1. Introduction to Python

hjkim3

$51.70

Beginner / Python, Numpy, Pandas

4.9

(29)

300+

This course is for those who need to introduce data analysis, machine learning, AI, etc. to their work but are not familiar with Python programming. You will systematically learn the core functions of Python required to become a data scientist in a short period of time.

Similar courses

Explore other courses in the same field!

Airflow Basics from a Silicon Valley Data Leader

keeyonghan

$102.30

Basic / airflow, snowflake, SQL, Python

4.9

(16)

200+

With the advent of the AI era, building data pipelines has become a core competency that determines a company's competitiveness. Learn how to build efficient data pipelines using Airflow, the most widely used tool, directly from a Silicon Valley expert (formerly Head of the Data Team at Udemy, currently a professor in the Data Masters program at San Jose State University) with practical experience and extensive lecturing experience.

Python Data Analysis in Practice

daniel7

$41.80

Basic / Python, Pandas, Numpy, Seaborn, Matplotlib

Have you studied Python syntax but felt lost when it came to actually handling data? You know you need to learn NumPy and Pandas, but if you've been wondering where to start and how to connect them, this PART 2 is the answer. In PART 2 of our 50-lecture data analysis curriculum, you will experience the step-by-step process of loading, cleaning, processing, and statistically interpreting real data. You won't just learn libraries and syntax; you will establish criteria for which tools to choose in specific situations. PART 2 is the turning point where you move from simply knowing Python code to actually being able to handle data.

Apache Airflow with Silicon Valley Engineers

altoformula

$51.70

Basic / airflow, Big Data, Data Engineering, Python

4.6

(54)

600+

Active Replies

You will learn Apache Airflow, the most widely used Orchestrator for creating software data pipelines.

The Secret of Algorithmic Trading: How AI Predicts Stock Prices

cheatkeylab

$28.60 (23% off $37.40)

Basic / Deep Learning(DL), Python, transformer, lstm, Financial Technology

4.8

(31)

300+

Analyze over 40 types of economic indicators and stock data with AI to create your own powerful stock price analysis model that predicts not only S&P 500 and QQQ ETF, but also individual stocks!

Machine Learning Pipeline

aisw

Free

Basic / Machine Learning(ML), AI, Python, Docker, Tensorflow

5.0

(5)

100+

You will develop the ability to define problems based on data and clearly explain the rationale and decision-making process behind them. Additionally, rather than focusing solely on the performance of a single model, you will acquire a pipeline-oriented mindset that evaluates the completeness and reliability of the entire machine learning workflow. Furthermore, you will strengthen your problem-solving skills by tracing back the causes of errors when they occur and deriving improvement directions, and through end-to-end project experience, you will acquire practical ML pipeline capabilities that can be immediately applied in real-world settings.

Python Data Analysis Thinking for Practical Use (EDA Practice)

daniel7

$41.80

Basic / Python, Pandas, Numpy, Seaborn, Matplotlib

If you could draw graphs but couldn't explain the data, this course is a process for building the "ability to read and explain" data. In PART 3, we perform Exploratory Data Analysis (EDA) focusing on real-world cases. ✔ Checking data distribution ✔ Analyzing relationships between variables ✔ Outlier detection and visualization interpretation You will learn the analysis structure using the Titanic and Iris datasets, and through a project utilizing TMDB 5000 movie data, you will directly experience the entire analysis process: Data cleaning → Setting analysis topics → Visualization interpretation. By the end of this course, you will possess the analytical capability to read and explain data.

[Renewed] MongoDB and NoSQL (Big Data) Database Bootcamp for Beginners [From Introduction to Application] (Updated)

funcoding

$59.40

Basic / Python, MongoDB, DBMS/RDBMS, Data Engineering

4.9

(186)

3,200+

Learn NoSQL technology for handling big data, one of the fundamental skills in full-stack and data science technologies used by modern startups. MongoDB is the easiest and fastest NoSQL technology to utilize. In this course, you will learn the basics of MongoDB in a short time and master the skills to handle and utilize MongoDB with Python.

Building an AI Recommendation System by a Working Engineer | Recommendation Algorithm | Recommender | Recsys

Jay

$42.90

Basic / Python, Recommendation System, AI, recommendation, recommender-systems

5.0

(4)

This course covers everything from core recommendation system algorithms to practical implementation. - Content-based filtering - Collaborative filtering and deep learning-based recommendation model implementation - Two-step recommender systems implementation - Hands-on practice using PyTorch/RecBole - Industry know-how and recommendation result visualization
