
[Data Preprocessing] Don't worry! Pandas is here.

Do you have data but feel overwhelmed about how to read and process it in Python? Don't worry. Pandas can handle it. Pandas is one of the most widely used Python libraries for data processing, and it is both powerful and efficient. Level up your data preprocessing skills with Pandas and get to insights faster!

(5.0) 2 reviews

14 learners

Level Basic

Course period Unlimited

Python
Pandas
Data Engineering
data-science
data-processing

Reviews from Early Learners

5.0

otdootpo

100% enrolled

The lectures were well-organized, making them clear and easy to follow. I would also appreciate it if you could provide lectures on data analysis concepts.

5.0

sprun

29% enrolled

This was very helpful for studying Python data preprocessing. I hope follow-up lectures are also provided. Thank you for explaining everything clearly step-by-step from the basics.

What you will gain after the course

  • Data processing skills that can be utilized throughout your career

  • A working grasp of Pandas, which has become an essential tool for data analysis

  • Data merging, restructuring, missing value handling, duplicate data handling

  • Processing text data, categorical data, and date data

  • Downloadable textbook (PDF) and practice files

📢 Strengths of this course

  • We don't just teach you Pandas functions. We explain the "why," "when," "how," and "by what criteria" of data preprocessing so that you can make your own judgments.

  • You can practice coding directly in Google Colab using only a web browser, without needing to install anything on your PC.

  • PDF textbook files and ready-to-use practice code are provided.

  • You can develop practical preprocessing skills using a real-world IMDB movie dataset. You can build problem-solving abilities by tackling preprocessing challenges that occur in real data.

📌 Data Preprocessing using Pandas

  • Pandas is a powerful and flexible Python library that is well suited to data preprocessing.

  • Data preprocessing is the essential step of converting raw data into a format suitable for analysis or modeling.

  • By properly handling missing values, outliers, and duplicate data, you can improve data quality and enhance analysis efficiency.

  • The course also covers processing text data, categorical data, and time-series data.

  • Check out more details directly in the lecture. 😄
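
As a rough illustration of the points above about duplicates, missing values, and outliers, here is a minimal sketch. The `movies` table and its values are made up for this example and are not taken from the course materials; filling with the median and flagging with the 1.5 × IQR rule are just one common set of choices, not the only valid ones.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one duplicated row, one missing rating, one extreme runtime.
movies = pd.DataFrame({
    "title": ["A", "B", "B", "C", "D", "E"],
    "rating": [7.0, 8.0, 8.0, np.nan, 6.0, 9.0],
    "runtime": [100, 110, 110, 95, 105, 600],
})

# 1. Duplicates: drop rows that exactly repeat an earlier row.
deduped = movies.drop_duplicates().reset_index(drop=True)

# 2. Missing values: fill the missing rating with the column median.
#    (Dropping the row with dropna() is the alternative when filling is not justified.)
median_rating = deduped["rating"].median()
cleaned = deduped.assign(rating=deduped["rating"].fillna(median_rating))

# 3. Outliers: flag runtimes that fall outside 1.5 * IQR of the quartiles.
#    The rows are only flagged here; whether to remove them is a separate decision.
q1, q3 = cleaned["runtime"].quantile([0.25, 0.75])
iqr = q3 - q1
in_range = cleaned["runtime"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
cleaned["runtime_outlier"] = ~in_range

print(cleaned)
```

Flagging outliers in a separate column, instead of deleting them right away, keeps the original values available while you decide how to treat them.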

📌 Data Preprocessing? We answer these kinds of questions!

  • How should I load data from a file?

  • How do I select rows or columns that meet specific conditions in a DataFrame? Are there ways to filter or sort data based on desired criteria?

  • When combining or merging multiple DataFrames, I am confused about the differences between merge() and concat() and which situations are appropriate for each. Could you explain them clearly?

  • What are the effective ways to handle missing values? In which cases should they be deleted, and in which cases should they be replaced? For example, how should the criteria for replacing them with specific statistical values be determined?

  • In addition to visual methods for detecting outliers, are there ways to use statistical criteria or functions? Also, is it always best to remove the detected outliers?

  • When preprocessing text data, I heard "Regular Expressions" are important. What are they?

  • How do you distinguish categorical data? One-Hot Encoding and Label Encoding - in which cases is it best to use each method?

  • When dealing with time series data, are there any specific preprocessing steps to be careful of besides date/time format conversion? For example, can things like adjusting time intervals or calculating moving averages be included in preprocessing?
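
To make the filtering, sorting, and merge() versus concat() questions above a little more concrete, here is a small sketch. All table names and values are hypothetical. In short, `pd.concat` stacks tables that share the same columns, while `merge` joins tables on a key column.

```python
import pandas as pd

# Hypothetical toy tables, invented for illustration only.
movies = pd.DataFrame({
    "movie_id": [1, 2, 3],
    "title": ["A", "B", "C"],
    "year": [1999, 2010, 2021],
})
more_movies = pd.DataFrame({"movie_id": [5], "title": ["E"], "year": [2023]})
ratings = pd.DataFrame({"movie_id": [1, 2, 4], "rating": [8.1, 6.5, 7.3]})

# Filter rows with a boolean condition, then sort by a column.
recent = movies[movies["year"] >= 2010].sort_values("year", ascending=False)

# concat: append rows from a table with the same columns.
all_movies = pd.concat([movies, more_movies], ignore_index=True)

# merge: combine columns from two tables by matching a key.
# how="inner" keeps only keys present in both tables;
# how="left" keeps every row of the left table and fills gaps with NaN.
inner = movies.merge(ratings, on="movie_id", how="inner")
left = movies.merge(ratings, on="movie_id", how="left")

print(recent)
print(all_movies)
print(inner)
print(left)
```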

We provide a friendly and detailed hands-on process so that anyone can easily follow along and understand.
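
As a taste of what the text, categorical, and time-series questions look like in code, here is a minimal sketch. The data is invented for this example. Using category codes is only one way to do label encoding, and the weekly interval and 3-row window are arbitrary choices.

```python
import pandas as pd

# Hypothetical toy data, not taken from the course dataset.
df = pd.DataFrame({
    "title": ["Alpha (1999)", "Beta (2010)", "Gamma (2021)"],
    "genre": ["Drama", "Comedy", "Drama"],
})

# Regular expression: capture the four-digit year inside parentheses.
df["year"] = df["title"].str.extract(r"\((\d{4})\)", expand=False).astype(int)

# Label encoding: map each category to an integer code.
# Categories are sorted, so here Comedy -> 0 and Drama -> 1.
df["genre_code"] = df["genre"].astype("category").cat.codes

# One-hot encoding: one indicator column per category.
one_hot = pd.get_dummies(df["genre"], prefix="genre")

# Time series: a Series indexed by dates (2024-01-01 is a Monday).
sales = pd.Series(
    [10, 20, 30, 40, 50, 60],
    index=pd.to_datetime([
        "2024-01-01", "2024-01-02", "2024-01-03",
        "2024-01-08", "2024-01-09", "2024-01-10",
    ]),
)

# Adjust the time interval: aggregate daily values into weekly totals.
weekly = sales.resample("W").sum()

# Moving average over the last 3 observations (rows, not calendar days).
moving_avg = sales.rolling(window=3).mean()

print(df)
print(one_hot)
print(weekly)
print(moving_avg)
```

Integer codes imply an order between categories, so one-hot encoding is often the safer choice when the categories have no natural ranking.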

📌 We prepared this for people like you!


Those looking to get started in data analysis

Beginners who want to challenge themselves with data analysis tasks and strengthen their data processing capabilities.


Those who feel they lack the basics

Those who want to start data analysis but feel overwhelmed and don't know where to begin


Those who are new to Pandas

Those who have studied data analysis before but have difficulty utilizing it because they are not familiar with Pandas.

🏅 What will you be able to do after completing this course?

  • Master the basics of Pandas.

  • Use Pandas with confidence, even if a lack of familiarity with it has repeatedly frustrated you before.

  • Understand data preprocessing techniques and become familiar with the key tasks and tools of the preprocessing stage.

🤔 Do you have any questions?

Q. Can I take the course even if I don't know Python well?

You should have some understanding of basic Python syntax.

Q. Why should we learn data preprocessing?

There is a saying that "80% of data analysis work is data preprocessing," meaning a significant amount of time is spent on this stage. Real-world data (raw data) is rarely clean; it often has missing values, incorrect values, or inconsistent formats. Unrefined data can distort the results of an analysis, so data preprocessing is an essential step in data analysis.

🛍 Things to note before taking the course

Practice Environment

  • Tools used: Google Colaboratory. All you need is a Google account and a web browser.


Learning Materials

  • We provide learning materials in PDF format.

  • Practice files (.ipynb), practice data, etc., are provided.

Prerequisite Knowledge and Important Notes

  • This is a course for data analysis beginners, but you should already be familiar with basic Python syntax.

  • You don't need to take all the lectures in order. If you are already somewhat familiar with Pandas, you can choose and listen to only the parts you need. If you are new to Pandas, please study slowly from the beginning.

Python, Pandas, data-science, data-analysis, data-cleaning

Recommended for these people

Who is this course right for?

  • Those who are eager to learn data preprocessing with Pandas

  • Those who are entering the field of data analysis

Need to know before starting?

  • Python Basics

Hello, this is aonekoda.

  • Bachelor's degree in Computer Science, Master's degree in Statistics

  • Extensive corporate training experience at various companies including Samsung Display, Samsung Electronics, Oracle University Korea, Multicampus, and Etivers Learning.

  • Oracle Certified Instructor, Oracle Cloud Infrastructure (OCI) Certified Instructor

  • Google Cloud Authorized Trainer (GCP)

  • Lectures on Data Analysis, Data Visualization, Machine Learning, Deep Learning, Cloud, RDBMS, etc.

     


Curriculum

24 lectures ∙ (6hr 43min)


Reviews

2 reviews

5.0

  • otdootpo7073

    Reviews 1

    Average Rating 5.0

    5

    100% enrolled

    The lectures were well-organized, making them clear and easy to follow. I would also appreciate it if you could provide lectures on data analysis concepts.

    • aonekoda
      Instructor

      Thank you for the great review. I will continue to provide high-quality content to ensure your valuable time is well spent. Happy studying!

  • sprun7390

    Reviews 1

    Average Rating 5.0

    Edited

    5

    29% enrolled

    This was very helpful for studying Python data preprocessing. I hope follow-up lectures are also provided. Thank you for explaining everything clearly step-by-step from the basics.

    • aonekoda
      Instructor

      Thank you for the kind review.

