Practical Data Science Part 2. Data Preprocessing

Learn why and how to handle data exploration (EDA), data cleaning, scaling, outlier handling, log transformation, and categorical encoding in real-world applications. You will also learn how to merge tabular data and process (unstructured) time series data.

(4.8) 17 reviews

234 learners

  • hjkim3
Python

What you will learn!

  • As the first step in data analysis and machine learning, you will learn the basic concepts of 1) data cleaning, 2) scaling, 3) outlier handling, and 4) data transformation (log transformation, category encoding).

  • Before starting full-scale data analysis, you will learn the exploratory analysis (EDA) method to examine the overall characteristics of the data and determine whether the collected data is suitable for analysis.

  • Learn how to process tabular data and time series data, and clearly understand the concepts of concat, join, merge, groupby, pivot_table, and walk-forward prediction.

Contains only the essentials!
Essential data preprocessing for data analysis

Big data analytics, machine learning, deep learning, artificial intelligence, and digital transformation (DT) are among the most in-demand technology fields today. In nearly every industry, training data scientists to handle these technologies is crucial and urgent.

Data preprocessing is the task that takes up the most time for people who work with data in companies, and it has the greatest impact on data analysis (machine learning) performance.


📝 Core data preprocessing

This lecture covers effective data exploration (EDA) methods and the four key concepts of data preprocessing: data cleaning, scaling, outlier handling, and data transformation.
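The four concepts named above can be sketched on a toy table. This is a minimal illustration with pandas and NumPy, not the course's own practice code. The column names, the values, the median fill, and the 1.5 × IQR outlier rule are all assumptions chosen for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names and values are illustrative only.
df = pd.DataFrame({
    "age": [22.0, 35.0, np.nan, 58.0, 41.0],
    "income": [30_000.0, 52_000.0, 48_000.0, 1_200_000.0, 61_000.0],
    "city": ["Seoul", "Busan", "Seoul", None, "Daegu"],
})

# 1) Data cleaning: fill missing numbers with the median,
#    and missing categories with an explicit "unknown" label.
df["age"] = df["age"].fillna(df["age"].median())
df["city"] = df["city"].fillna("unknown")

# 2) Scaling: standardize age to zero mean and unit variance.
df["age_scaled"] = (df["age"] - df["age"].mean()) / df["age"].std(ddof=0)

# 3) Outlier handling: flag values outside the 1.5 * IQR fences (an assumed rule).
q1, q3 = df["income"].quantile([0.25, 0.75])
iqr = q3 - q1
df["income_outlier"] = (df["income"] < q1 - 1.5 * iqr) | (df["income"] > q3 + 1.5 * iqr)

# 4) Data transformation: log-transform a skewed column,
#    then one-hot encode the categorical column.
df["income_log"] = np.log1p(df["income"])
encoded = pd.get_dummies(df, columns=["city"], prefix="city")

print(encoded.head())
```

Whether to drop, clip, or keep flagged outliers depends on the data and the model, so the sketch only flags them.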


👩‍💻 Theory + Practice Lecture Structure


Predicting Titanic Survivors?


Theory-backed exercises such as missing value handling, data transformation, and linear classification help you apply the data analysis skills needed in the field right away.


🙋‍♂️ Topics needed in the field

Handling tables
Time series data processing

In practice, you often need to combine tabular data in various ways. You will understand the differences between the concat, append, join, merge, groupby, and pivot_table functions, and we explain which functions are useful in which situations.
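As a rough illustration of how these table operations differ, here is a minimal pandas sketch. The tables and column names are hypothetical and not taken from the course material. `DataFrame.append` was removed in pandas 2.0, so the sketch uses `pd.concat` for row-wise stacking.

```python
import pandas as pd

# Hypothetical toy tables; names and values are illustrative only.
orders_q1 = pd.DataFrame({"customer_id": [1, 2], "quarter": ["Q1", "Q1"], "amount": [100, 250]})
orders_q2 = pd.DataFrame({"customer_id": [1, 3], "quarter": ["Q2", "Q2"], "amount": [80, 40]})
customers = pd.DataFrame({"customer_id": [1, 2, 3], "region": ["east", "west", "east"]})

# concat: stack tables that share the same columns (row-wise by default).
orders = pd.concat([orders_q1, orders_q2], ignore_index=True)

# merge: SQL-style join on a key column.
merged = orders.merge(customers, on="customer_id", how="left")

# join: similar to merge, but matches against the other table's index.
joined = orders.join(customers.set_index("customer_id"), on="customer_id")

# groupby: split rows by key, then aggregate each group.
totals = merged.groupby("region")["amount"].sum()

# pivot_table: aggregate into a two-way table (rows x columns).
pivot = merged.pivot_table(index="region", columns="quarter",
                           values="amount", aggfunc="sum", fill_value=0)

print(totals)
print(pivot)
```

As a rule of thumb under these assumptions: use concat when the tables have the same shape, merge or join when they share a key, and groupby or pivot_table when you need summaries rather than rows.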

In practice, we often deal with unstructured time series data. We explain how to use datetime and the sequential walk-forward method for time series prediction, and we introduce binary classification and regression models built on linear models.
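The walk-forward idea can be sketched as follows: at each step, fit only on past observations, then predict the next one. This is a minimal sketch under stated assumptions, not the course's implementation. The synthetic daily series, the helper name `walk_forward`, the expanding training window, and the straight-line fit with `np.polyfit` are all illustrative choices, and it covers regression only.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series with a linear trend plus noise.
dates = pd.date_range("2024-01-01", periods=30, freq="D")
rng = np.random.default_rng(0)
values = 10.0 + 0.5 * np.arange(30) + rng.normal(0.0, 0.3, size=30)
series = pd.Series(values, index=dates)


def walk_forward(series, min_train=10):
    """Expanding-window walk-forward: fit on the past, predict one step ahead."""
    t = np.arange(len(series), dtype=float)
    y = series.to_numpy()
    preds = {}
    for i in range(min_train, len(series)):
        # Fit a straight line using observations 0..i-1 only (no look-ahead).
        slope, intercept = np.polyfit(t[:i], y[:i], deg=1)
        # Predict the value at step i, which the model has not seen.
        preds[series.index[i]] = slope * t[i] + intercept
    return pd.Series(preds)


preds = walk_forward(series)
mae = (preds - series.loc[preds.index]).abs().mean()
print(f"one-step-ahead MAE over {len(preds)} steps: {mae:.3f}")
```

Because every prediction uses only earlier data, the resulting error estimates how the model would have performed if it had been used sequentially, which a random train/test split cannot show for time-ordered data.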


📕 Course Features

  • All content is explained with practice code.

Go to the practice code 👉 https://github.com/data-labs/preprocessing

  • The example code is structured so that you can use it right away in your work.
  • The code is concise, yet contains the essentials and is written to be easy to use.

👩‍💻 Core Data Science

Python is the foundational language of data science.
This course assumes basic knowledge of Python.
If you are not yet familiar with the Python language, I recommend building that prerequisite knowledge first through Practical Data Science Part 1, an introductory Python lecture.

Who is this course right for?

  • Data preprocessing is the most important process that determines the performance of data analysis. This will be helpful for those who want to systematically organize the data preprocessing methods required for practical work.

  • This is recommended for those who want to understand the basic concepts of combining tabular data and handling time series data, and to apply them immediately in the field.

What do you need to know before starting?

  • Basic knowledge of Python is required.

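Instructor (hjkim3): 921 learners ∙ 77 reviews ∙ 11 answers ∙ 4.8 rating ∙ 3 courses

"Can you fix a broken radio?"

A friend asked me this after I entered the electronics engineering department. Well, I did give an answer: "In electronics engineering we learn the principles of how radios are built; fixing broken electronics is not really our job..."

More often than not, what is needed is a problem solver rather than an expert armed with theory. I believe solving real-world problems matters more.

Recently I have been using machine learning to solve problems in industries such as finance, energy, electronics, heavy equipment, logistics, drug discovery, and food, and it feels like a field with a great deal to learn and no end of work to do. My main job is professor (Department of Computer Engineering, Kangwon National University), but because I care about solving problems in the field, I hold several concurrent positions: head of the AI Drug Discovery Support Center, adjunct professor at KAIST, and CEO of Data Science Lab.

I believe the talent most needed in the AI era is the data scientist who can solve real-world problems, and I hope all of you become sought-after data scientists.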

Curriculum

19 lectures ∙ (4hr 13min)

Reviews

  • dfeafe

    Reviews 8

    Average Rating 4.9

    4

    100% enrolled

    I like it because it's step-by-step and basic, like a school class.

    • hjkim3
      Instructor

      It was conducted like a class. I hope you get good results.

  • victory1791791577

    Reviews 5

    Average Rating 4.6

    5

    100% enrolled

    I always understand it well because of your kind and calm explanations. Thank you!

    • hjkim3
      Instructor

      Thank you for your kind review.

  • sungkenh0540

    Reviews 2

    Average Rating 5.0

    5

    100% enrolled

    It was very helpful for studying Python data preprocessing. I liked the various methodologies required for data preprocessing and the hands-on practice using real data.

    • hjkim3
      Instructor

      Thank you for your kind review.

  • alcatraz761636

    Reviews 2

    Average Rating 5.0

    5

    100% enrolled

    Personally, I think it is a very neat and excellent lecture. I also took the previous Part 1, and although there were some parts that were a bit difficult due to the progress of the lecture, I was able to understand it without any problems.

    • hjkim3
      Instructor

      I'm glad you figured it out on your own. If you have any questions, please ask~

  • quber02012351

    Reviews 3

    Average Rating 3.0

    3

    100% enrolled

    I really enjoyed this great lecture. I think I understood the core of data preprocessing in 5 hours. Thank you!

    • hjkim3
      Instructor

      A key feature of the course is that it covers the material in a short time. Thank you for your review!

$42.90
