Free Python Course (Application Part 7) - Machine Learning

It's finally here. The world's easiest machine learning course. It's free, so why hesitate? Just keep pressing the 'Learn' button.

(4.9) 52 reviews

2,301 learners

  • nadocoding
Machine Learning (ML)
Artificial Intelligence
Python
Scikit-Learn
Anaconda

What you will learn!

  • How to use scikit-learn, an essential package for Python machine learning

  • Major machine learning algorithms for supervised and unsupervised learning

  • How to make Netflix? Movie recommendation system project

  • Text analysis is a bonus!

Create a movie recommendation system with Python machine learning! 🎞️

Step-by-step, from machine learning theory to practice! 🖥️

You've probably heard of machine learning, right? Machine learning is a branch of artificial intelligence in which, as the name says, a machine learns. Given high-quality data, the system learns from that data and creates a model. Using this model, the system predicts the output for new inputs. In other words, it's essentially creating a function.

By the way, this is not it :)

You can never experience all the rides at a large amusement park in one day. But once you visit, you'll get a general idea of what the park looks like, where the rides are located, and which rides to prioritize the next time you visit.

I hope you'll study my lectures like you're visiting an amusement park for the first time. While it's difficult to know everything about machine learning, you'll gain a sense of what it is, what you need to consider for learning, and what you might want to study further. Then, you'll be able to take it a step further and build a deeper understanding through a variety of resources. Let's get started.


Learn this 📑

1) Solid theoretical learning

Here are some points scattered across a graph.

If you had to find just one straight line that best represents these points, which would it be?

That's right! It's number 3. Why did you think so? It just looks that way, right?

We've just experienced the process by which a machine learns and builds a model on its own. Once this model (in this case, a straight line) is created, we can make predictions.

If this graph represents diamond price data by carat, with carats on the x-axis and price on the y-axis, you can roughly estimate how much a new 1.7-carat diamond would cost. A model that predicts continuous numerical values like this is called a regression model.
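To make the diamond example concrete, here is a minimal sketch using scikit-learn's LinearRegression. The carat and price numbers are made up for illustration (they lie exactly on a straight line), so treat this as a sketch under those assumptions rather than real diamond data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data lying exactly on price = 2000 * carat + 500 (illustration only)
carats = np.array([[0.5], [1.0], [1.5], [2.0], [2.5]])
prices = 2000 * carats.ravel() + 500

# "Learning" here means finding the straight line that best fits the points
model = LinearRegression()
model.fit(carats, prices)

# Use the fitted line to estimate the price of a new 1.7-carat diamond
predicted_price = model.predict(np.array([[1.7]]))[0]
print(f"slope={model.coef_[0]:.1f}, intercept={model.intercept_:.1f}")
print(f"predicted price for 1.7 carats: {predicted_price:.1f}")
```

The fitted slope and intercept are the model. Prediction is just plugging a new x into that line.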

Regression models can become more complex. For example, if you're trying to predict test scores based on study time, study time isn't necessarily the only factor that influences the score, right? The factors that influence the test score are called independent variables, and the resulting outcome is called the dependent variable. When there are several independent variables, a more complex model, multiple linear regression, becomes necessary. Think of it as the graph becoming more complex as the number of dimensions increases.
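A minimal sketch of multiple linear regression with two independent variables. The second variable (number of absences) and all the numbers are hypothetical, chosen only to show that the same LinearRegression class accepts more than one column.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Independent variables: [study hours, absences] (hypothetical values)
X = np.array([[1, 3], [2, 1], [4, 2], [5, 0], [7, 1], [8, 3]], dtype=float)
# Dependent variable: made-up scores following 8 * hours - 3 * absences + 20
y = 8 * X[:, 0] - 3 * X[:, 1] + 20

multi_model = LinearRegression().fit(X, y)

# One coefficient is learned per independent variable
print("coefficients:", multi_model.coef_)
print("intercept:", multi_model.intercept_)

# Predict the score for 6 hours of study and 2 absences
predicted_score = multi_model.predict(np.array([[6.0, 2.0]]))[0]
print("predicted score:", predicted_score)
```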

During the hot summer months, it's scary to use the air conditioner for long periods of time. Household electricity bills are subject to progressive tariffs, so even a little extra use can make the bill skyrocket, sometimes beyond hundreds of thousands of won. When y changes faster and faster as x changes, as with a bill that climbs steeply under a progressive tariff, a single straight line can't describe the data well. In these cases, a polynomial regression model can be used.

Of the two models fitted to the blue dots, the orange curve fits much better than the straight blue line!
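Here is a small sketch comparing a straight line with a polynomial fit. It assumes made-up usage and bill numbers that grow quadratically. PolynomialFeatures adds the squared term, and the same LinearRegression is then fitted on the expanded features.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical data: the bill grows faster and faster with usage
usage = np.arange(1, 11, dtype=float).reshape(-1, 1)
bill = 3 * usage.ravel() ** 2 + 10

# Model 1: a single straight line
linear = LinearRegression().fit(usage, bill)
linear_r2 = linear.score(usage, bill)

# Model 2: polynomial regression (degree 2) = linear regression on [1, x, x^2]
poly = PolynomialFeatures(degree=2)
usage_poly = poly.fit_transform(usage)
curved = LinearRegression().fit(usage_poly, bill)
curved_r2 = curved.score(usage_poly, bill)

print(f"R^2 straight line: {linear_r2:.3f}")
print(f"R^2 polynomial:    {curved_r2:.3f}")

# New inputs must go through the same feature transformation before predicting
predicted_bill = curved.predict(poly.transform(np.array([[12.0]])))[0]
print(f"predicted bill at usage 12: {predicted_bill:.1f}")
```

The score method returns R², where values closer to 1 mean a better fit.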

But how can we be sure that these predictive models truly perform well? Once a model is built, its performance must be evaluated. To do this, the entire dataset is split into two: one set for training and one for testing. Typically, the split is 80:20. Training is performed solely on the training set, and the model is then validated on the test set. In some cases, the sets are mixed for validation.
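The 80:20 split can be sketched with scikit-learn's train_test_split. The study-time data below is randomly generated for illustration, and random_state is fixed only so the split is repeatable.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Made-up data: 50 students, score roughly 8 * hours + 20 plus some noise
rng = np.random.default_rng(0)
hours = rng.uniform(0, 10, size=(50, 1))
scores = 8 * hours.ravel() + 20 + rng.normal(0, 3, size=50)

# Hold out 20% of the data for testing
X_train, X_test, y_train, y_test = train_test_split(
    hours, scores, test_size=0.2, random_state=0
)

# Train only on the training set, then evaluate on both sets
split_model = LinearRegression().fit(X_train, y_train)
train_score = split_model.score(X_train, y_train)
test_score = split_model.score(X_test, y_test)

print(len(X_train), "training samples,", len(X_test), "test samples")
print(f"train R^2: {train_score:.3f}, test R^2: {test_score:.3f}")
```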

In this process, if the model predicts very well on the training set but poorly on the test set, this is called overfitting. If the model predicts poorly even on the training set, this is called underfitting. When building a model, it's important to avoid both.

A kid who is roughly overfitting to grandma's data
I saw something, but it was 2% lacking.
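To see overfitting in numbers, here is a small sketch with deliberately artificial data: eight training points on the line y = x with a zigzag error added, and noise-free test points halfway between them. A degree-7 polynomial can pass through all eight training points, so it memorizes the zigzag instead of the trend. The data and the choice of degree are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Training data: the trend y = x plus an alternating (zigzag) error
x_train = np.linspace(0, 1, 8).reshape(-1, 1)
zigzag = 0.15 * np.array([1, -1, 1, -1, 1, -1, 1, -1])
y_train = x_train.ravel() + zigzag

# Test data: points halfway between the training points, following the true trend
x_test = (x_train[:-1] + x_train[1:]) / 2
y_test = x_test.ravel()

# A simple straight line versus a very flexible degree-7 polynomial
simple = LinearRegression().fit(x_train, y_train)
wiggly = make_pipeline(
    PolynomialFeatures(degree=7, include_bias=False), LinearRegression()
).fit(x_train, y_train)

simple_train, simple_test = simple.score(x_train, y_train), simple.score(x_test, y_test)
wiggly_train, wiggly_test = wiggly.score(x_train, y_train), wiggly.score(x_test, y_test)

print(f"straight line: train R^2={simple_train:.3f}, test R^2={simple_test:.3f}")
print(f"degree 7:      train R^2={wiggly_train:.3f}, test R^2={wiggly_test:.3f}")
```

The flexible model looks nearly perfect on the training set and much worse on the test set, which is the signature of overfitting. The straight line is less impressive on the training set but holds up on new points.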

In addition to continuous data, there's also categorical data. Predicting a category is classification, not regression. Instead of test scores based on study time, consider a certification exam where the result is simply pass or fail. If your data says that someone who studied for four hours failed and someone who studied for six hours passed, the task is to classify someone who studied for seven hours as either passing or failing.

A representative classification algorithm in machine learning is logistic regression. Although it's called regression, it's actually a model used for classification, and classification models can adjust their criteria as needed. For example, even if the model says, "You'll pass if you study for four hours," we might take a conservative approach and say, "You'll need to study for six hours."
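A minimal sketch of logistic regression and of "adjusting the criteria". The pass/fail data is made up, and the stricter 0.8 probability threshold is an arbitrary value chosen for illustration. By default, predict labels a sample as passing when the predicted probability is above 0.5.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Made-up data: study hours and whether the student passed (1) or failed (0)
hours = np.arange(1, 11, dtype=float).reshape(-1, 1)
passed = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

clf = LogisticRegression().fit(hours, passed)

# Probability of passing for each number of hours
pass_proba = clf.predict_proba(hours)[:, 1]

# Default criterion: pass when the probability is at least 0.5
default_min_hours = hours[pass_proba >= 0.5].min()

# Conservative criterion: only call it a pass when the probability is at least 0.8
strict_min_hours = hours[pass_proba >= 0.8].min()

print("hours needed with the default threshold:", default_min_hours)
print("hours needed with the stricter threshold:", strict_min_hours)
```

Raising the threshold does not retrain the model. It only changes how the predicted probabilities are turned into a pass or fail decision.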

The content explained so far falls under supervised learning, where the correct answers are provided along with the data. However, there's also unsupervised learning, which doesn't provide the correct answer. Unsupervised learning involves the machine discovering meaningful patterns or structures within data. Clustering, which groups data exhibiting similar patterns together, is an example of unsupervised learning. Dividing news articles into categories like science/technology, sports, and health is an example of clustering.

A representative clustering algorithm is K-means. Imagine you're picking apples from an orchard and dividing them for sale. What's the best way to do it? You could simply divide them into two groups: large and small. Or you could divide them into three groups: large, medium, and small. Or you could categorize them into pretty and ugly groups, selling the ugly ones at a lower price.

Here, K is the number of groups. If you are clustering a large amount of complex data, not just apples, it can be difficult to decide on this number. Fortunately, there is a method you can use as a reference for finding a suitable K. It is called the elbow method because the resulting graph is shaped like a bent elbow. Simply put, for each value of K it calculates the average distance from each data point to the center of its cluster (group), and takes the point where the slope of the graph starts to flatten out as K.
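The elbow method can be sketched as follows. The data is three randomly generated blobs, made up for illustration. One detail to note: scikit-learn's KMeans reports inertia_, the sum of squared distances from each point to its cluster center, rather than an average distance, but the way the curve flattens is read in the same way.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up data: three well-separated blobs of 30 points each
rng = np.random.default_rng(0)
centers = np.array([[0, 0], [10, 0], [0, 10]])
points = np.vstack([c + rng.normal(0, 0.5, size=(30, 2)) for c in centers])

# Fit K-means for K = 1..6 and record how tightly the points sit around their centers
inertias = []
for k in range(1, 7):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(points)
    inertias.append(km.inertia_)

for k, inertia in zip(range(1, 7), inertias):
    print(f"K={k}: inertia={inertia:.1f}")
```

With this data the value drops sharply up to K=3 and barely changes afterwards, so K=3 is the elbow. Plotting inertias against K makes the bend easy to spot.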

Once K is determined, we can obtain results divided into K clusters (groups) from the randomly scattered data, as shown below. If this example were to show scores based on study time, we could offer different study strategies to students in each group.
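Once K is chosen, assigning each data point to a group takes only a few lines. This sketch assumes made-up study-time and score data for nine students and K=3.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up data: [study hours, score] for nine students
study = np.array([
    [2.0, 40], [2.5, 45], [3.0, 42],
    [5.0, 65], [5.5, 68], [6.0, 63],
    [8.0, 90], [8.5, 92], [9.0, 95],
])

# Divide the students into K = 3 clusters
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(study)
print("cluster label per student:", labels)
print("cluster centers:")
print(kmeans.cluster_centers_)

# A new student is assigned to the cluster with the nearest center
new_label = kmeans.predict(np.array([[5.2, 66.0]]))[0]
print("new student belongs to cluster", new_label)
```

The label numbers themselves (0, 1, 2) are arbitrary. What matters is which students end up sharing a label.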

2) Practice and quizzes

The machine learning fundamentals described above are taught through detailed theoretical explanations and practical exercises. At the end, a quiz lets you review what you've learned.

The quiz gives you only a dataset and seven small tasks to complete using that data. If you've studied the basics well, you'll be able to handle it. And being able to solve the quiz on your own means you'll be able to split the data, train on the training set, visualize the data, and even perform evaluation and prediction. Awesome, right? 😃

Now that you've completed the quiz, it's time to put it to use! As with all other application-oriented lectures, this machine learning course also involves a project. The project topic is a movie recommendation system. Using a dataset of approximately 5,000 movies, you'll analyze the data and learn to select 10 recommended movies. There are several recommendation methods, but we'll briefly cover the following three.

1. Recommend movies that many people like
2. Recommend movies that are very similar to a specific movie
3. Customized recommendations based on individual movie tastes
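As a rough sketch of the first method, movies can be ranked by a score that balances the average rating against the number of votes. The weighted-rating formula below is a commonly used one and is an assumption here, not necessarily the one used in the course. The movie titles, the field names, and the min_votes value are all hypothetical.

```python
# Hypothetical movies: a high average from only 3 votes should not win automatically
movies = [
    {"title": "Movie A", "vote_average": 9.5, "vote_count": 3},
    {"title": "Movie B", "vote_average": 8.2, "vote_count": 1200},
    {"title": "Movie C", "vote_average": 7.9, "vote_count": 5000},
    {"title": "Movie D", "vote_average": 6.0, "vote_count": 800},
]

min_votes = 500  # hypothetical: votes needed before a movie's own average dominates
mean_rating = sum(m["vote_average"] for m in movies) / len(movies)


def weighted_rating(movie):
    """Blend the movie's own average with the overall mean, weighted by vote count."""
    v = movie["vote_count"]
    r = movie["vote_average"]
    return v / (v + min_votes) * r + min_votes / (v + min_votes) * mean_rating


# Rank from the highest weighted rating to the lowest
ranked = sorted(movies, key=weighted_rating, reverse=True)
for movie in ranked:
    print(f'{movie["title"]}: {weighted_rating(movie):.2f}')
```

Movies with few votes are pulled toward the overall mean, so a title with many votes and a solid average ranks above one with a perfect score from a handful of people.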

In this course, you'll also learn a bit about text analysis methods. And since just looking at code can be tedious, we'll build our own movie recommendation system using Streamlit, a package that lets you create beautiful web pages with just a few lines of code. Here, when you select a movie, it will recommend 10 movies based on information like the film's genre, director, and cast, and then display a Korean poster image. Sounds pretty good, right?
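Recommending movies similar to a selected one from information like genre, director, and cast can be sketched with a little text analysis: turn each movie's information into word counts and compare the counts with cosine similarity. The titles and feature strings below are made up, and the course's own implementation may differ.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical movies described by genre, director, and cast keywords
titles = ["Space War", "Galaxy Quest", "Love Letter", "Paris Romance"]
features = [
    "scifi action director_kim actor_lee",
    "scifi adventure director_kim actor_park",
    "romance drama director_choi actor_han",
    "romance comedy director_choi actor_lee",
]

# Count how often each keyword appears in each movie's description
counts = CountVectorizer().fit_transform(features)

# similarity[i][j] is close to 1 when movies i and j share many keywords
similarity = cosine_similarity(counts)


def recommend(title, top_n=2):
    """Return the top_n movies most similar to the given title."""
    idx = titles.index(title)
    order = similarity[idx].argsort()[::-1]  # most similar first
    return [titles[i] for i in order if i != idx][:top_n]


print(recommend("Space War"))
```

The same recommend function could sit behind a movie selection box on a web page, returning 10 titles instead of 2 when the dataset is large enough.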

In particular, the last method, personalized recommendations based on individual movie tastes, uses a package called Surprise. The same approach can be applied to accumulated sales history data, where it can be of great help in developing strategies such as which products to recommend to which customers and which items would sell better as a set.
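The idea behind personalized recommendation can be sketched in plain NumPy. This is not the Surprise API and not how Surprise is implemented. It is a simplified, hand-rolled version of one technique from that family (user-based collaborative filtering), using a made-up ratings table, to show how a missing rating can be estimated from users with similar tastes.

```python
import numpy as np

# Made-up ratings (rows: users, columns: movies); 0 means "not rated yet"
ratings = np.array([
    [5, 4, 0, 1],  # user 0: we want to estimate the rating for movie 2
    [5, 5, 5, 1],  # user 1: similar taste to user 0, loved movie 2
    [1, 2, 1, 5],  # user 2: rather different taste, disliked movie 2
], dtype=float)

target_user, target_movie = 0, 2


def cosine(a, b):
    """Cosine similarity between two rating vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


rated_by_target = ratings[target_user] > 0
weights, neighbor_ratings = [], []
for other in range(len(ratings)):
    # Skip the target user and anyone who has not rated the target movie
    if other == target_user or ratings[other, target_movie] == 0:
        continue
    # Compare tastes only on movies that both users have rated
    both = rated_by_target & (ratings[other] > 0)
    weights.append(cosine(ratings[target_user, both], ratings[other, both]))
    neighbor_ratings.append(ratings[other, target_movie])

# Estimated rating: average of the neighbors' ratings, weighted by similarity
predicted = float(np.dot(weights, neighbor_ratings) / np.sum(weights))
print(f"estimated rating of user {target_user} for movie {target_movie}: {predicted:.2f}")
```

The estimate lands closer to the rating given by the more similar user. Replace movies with products and ratings with purchase history, and the same idea applies to sales data.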

Images, Videos by pixabay, pexels
: https://www.pixabay.com
: https://www.pexels.com

Designed by freepik, flaticon
: https://www.freepik.com
: https://www.flaticon.com


Recommended for these people

Who is this course right for?

  • For those who found machine learning difficult

  • For those who need a really easy and detailed explanation

  • For those who want to complete their knowledge through practical projects that go beyond theory

What do you need to know before starting?

  • Basic Python syntax

  • Basic usage of Jupyter Notebook

Hello, this is nadocoding

  • Learners: 100,321

  • Reviews: 3,109

  • Answers: 915

  • Rating: 4.9

  • Courses: 11

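I'm nadocoding, and I run a coding education channel on YouTube.
I teach with friendly explanations and easy examples so that anyone can study coding easily and enjoyably.
Shall we code together? 😊

🧡 YouTube: nadocoding
🎁 Coding Self-Study: nadocoding's Introduction to Python
📚 Coding Self-Study: nadocoding's Introduction to C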

Curriculum

51 lectures ∙ (6hr 41min)

Course Materials:

Lecture resources

Reviews

52 reviews ∙ 4.9

  • KYUNG TAE BAE

    Reviews 286

    Average Rating 5.0

    5

    24% enrolled

    I can't believe there was a course this good. Thank you so much!

  • 나치웅

    Reviews 1

    Average Rating 5.0

    5

    31% enrolled

  • 김재원

    Reviews 1

    Average Rating 3.0

    3

    31% enrolled

  • 이혜정

    Reviews 1

    Average Rating 5.0

    5

    31% enrolled

  • yoonjuseo08

    Reviews 2

    Average Rating 5.0

    5

    61% enrolled

            Free
