Free Python Course (Usage Part 3) - Web Scraping (5 hours)

From HTML basics to expert scraping techniques, I'll teach you everything. This one video is all you need.

(5.0) 157 reviews

5,421 learners

  • nadocoding
Web Crawling
Web Scraping
Selenium
Python

What you will learn!

  • Coupang, Google Movies, Naver, and other site scraping strategies

  • How to handle dynamically loaded pages with ease

  • Basic knowledge of web automation using Selenium

Fun and useful web scraping,
Handle and obtain various data with your own hands!

📣 Notice
The sites scraped in this course, such as Coupang, Naver Webtoon, and Tistory, have been updated and reorganized repeatedly since the course was recorded. In addition, some features may no longer work because Selenium has since been upgraded. As a result, many of the exercises may be difficult to follow exactly as shown. New students should keep this in mind. Rather than trying to replicate every exercise, we recommend using the course to understand how each page was approached at the time of recording. We sincerely apologize for any inconvenience.

Do you remember the story of the wolf and the seven little goats?

While their mother is away, seven baby goats are left at home, and a wicked wolf comes looking for them.
"It's Mom, open the door."
But one baby goat refuses to open the door, saying, "My mother's voice isn't that scary!"

The wolf comes back, this time with a pretty voice.
"It's Mom, can you open the door?"
A baby goat says,

"Hold out your hand."
A moment later, a paw with dark fur and sharp claws appears.
"My mother's hands are very white," the goat says, and keeps the door shut.

When the wolf returns with its paws covered in white flour,
the goats are finally tricked into opening the door and suffer a crushing defeat. (I won't spoil the ending 😆😆)

So the wolf makes three attempts to break into the goats' house.

1. Lying about being the mother
2. Lying about being the mother + a pretty voice
3. Lying about being the mother + a pretty voice + paws covered in white flour

In the end, the wolf gets into the house on the third try.


Web Scraping?

That was a long introduction, but web scraping involves exactly this process. It's like a battle between a spear and a shield. Any spear will get through a flimsy shield, but to pierce a sturdy one you need a sharper, more precise, and more powerful spear.

But in web scraping, the roles of goats and wolves are actually reversed a bit.

We're the gentle baby wolf, and the target server is a massive, muscular, horned mother goat. We have to conquer that server somehow.

There are several approaches to this, and in my lecture, I will explain each of the wolf strategies above in order, one by one.

Oh, by the way, web scraping and web crawling are a little different.

As for web crawling:
Older readers (myself included) may remember that, back in the day, there was a program called "Let's Read Books, Books, Books." The highlight was a bookshelf filled with books with a cart placed next to it. Guests were given a minute or so to collect as many books as they could, and every book they collected was theirs to keep. (I'll leave the "golden books" aside for now ^^)

What would you do if you were a guest on the show?
You'd probably just try to grab every book in sight as quickly as possible, without being picky. You can think of this as web crawling.

Web scraping, on the other hand, is like a teacher handing you a blank sheet of paper the day before an exam and telling you to write down anything you want. Then, during the exam, you are allowed to look at that single sheet while you take the test.

So, you're probably going to write down everything you learned in class, like important concepts, difficult formulas, or English words, in an easy-to-reference format. This is web scraping. It's different, right?


In other words, web scraping is the act of extracting the data I want from a website and processing it into the format I want.
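To make "extract the data I want and process it into the format I want" concrete, here is a minimal sketch that uses only Python's standard library. The HTML snippet, the `title` class, and the names `TitleParser` and `extract_titles` are all made up for illustration; a real page will have different markup, and the course may well use other tools for this step.

```python
from html.parser import HTMLParser

# Hypothetical markup; a real webtoon or shopping page will look different.
SAMPLE_HTML = """
<ul>
  <li><a class="title" href="/comic/1">First Comic</a></li>
  <li><a class="title" href="/comic/2">Second Comic</a></li>
  <li><a class="ad" href="/promo">Sponsored</a></li>
</ul>
"""


class TitleParser(HTMLParser):
    """Collects the text and href of every <a class="title"> element."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._href = None  # set only while we are inside a matching <a>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("class") == "title":
            self._href = attrs.get("href")

    def handle_data(self, data):
        # Keep non-empty text that appears inside a matching link.
        if self._href is not None and data.strip():
            self.items.append({"title": data.strip(), "link": self._href})

    def handle_endtag(self, tag):
        if tag == "a":
            self._href = None


def extract_titles(html):
    """Returns a list of {"title", "link"} dicts, one per matching link."""
    parser = TitleParser()
    parser.feed(html)
    return parser.items


titles = extract_titles(SAMPLE_HTML)
for item in titles:
    print(item["title"], item["link"])
```

The extraction step is the parser and the processing step is the list of dictionaries, which is already in a shape that is easy to filter or save.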

For example, getting the titles of every comic on the Naver Webtoon page, or the real-time top 1-10 ranking.

Or, on a shopping site like Coupang, bringing in only the products that meet my exact requirements, together with their links.
In the example, the requirements are:

  • Within the top 1-5 pages
  • Over 100 reviews
  • A rating over 4.5 points
  • Excluding Apple products
  • Excluding advertising products

Let's practice getting just that list.
(I'm not saying I hate Apple or anything, it's just for practice 😊😊)
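Once the product data has been collected, the conditions listed above boil down to a simple filter. The sketch below assumes each product has already been scraped into a dictionary; the field names (`page`, `reviews`, `rating`, `is_ad`, `link`) and the sample records are hypothetical, and "over" is read here as strictly greater than.

```python
# Hypothetical records; in practice each dict would be built from a scraped search page.
products = [
    {"name": "Laptop A", "page": 1, "reviews": 250, "rating": 4.8, "is_ad": False, "link": "https://example.com/a"},
    {"name": "Apple Laptop", "page": 1, "reviews": 900, "rating": 4.9, "is_ad": False, "link": "https://example.com/b"},
    {"name": "Laptop C", "page": 2, "reviews": 80, "rating": 4.9, "is_ad": False, "link": "https://example.com/c"},
    {"name": "Laptop D", "page": 3, "reviews": 300, "rating": 4.6, "is_ad": True, "link": "https://example.com/d"},
    {"name": "Laptop E", "page": 6, "reviews": 500, "rating": 4.7, "is_ad": False, "link": "https://example.com/e"},
    {"name": "Laptop F", "page": 5, "reviews": 150, "rating": 4.5, "is_ad": False, "link": "https://example.com/f"},
]


def meets_requirements(product):
    """True only if the product passes every condition from the list."""
    return (
        1 <= product["page"] <= 5                    # within pages 1-5
        and product["reviews"] > 100                 # over 100 reviews
        and product["rating"] > 4.5                  # rating over 4.5
        and "apple" not in product["name"].lower()   # exclude Apple products
        and not product["is_ad"]                     # exclude advertising products
    )


selected = [p for p in products if meets_requirements(p)]
for p in selected:
    print(p["name"], p["link"])
```

Keeping the conditions in one function makes it easy to change a threshold later without touching the scraping code.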

Let's also practice downloading images.

I'm a huge movie fan, but I have trouble deciding what to watch. So I'm going to download 25 movie poster images, the five most-watched films from each of the past five years, and just pick one of them. Saving each image individually would take a lot of time and clicking, but with scraping I can save the files with just a few lines of code, and even give each one a custom file name.
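As a rough sketch of the idea, the loop below saves each poster under a file name built from its year and rank. The names `save_posters` and `fetch_bytes`, the `movie_<year>_top<rank>.jpg` naming scheme, the `.jpg` extension, and the example URLs are all assumptions for illustration. The demonstration at the bottom passes a stand-in fetcher so that no real request is sent.

```python
import os
import tempfile
from urllib.request import urlopen


def fetch_bytes(url):
    """Downloads the raw bytes at url (this performs a real network request)."""
    with urlopen(url) as response:
        return response.read()


def save_posters(posters, folder, fetch=fetch_bytes):
    """Saves each (year, rank, url) poster as movie_<year>_top<rank>.jpg."""
    os.makedirs(folder, exist_ok=True)
    saved = []
    for year, rank, url in posters:
        path = os.path.join(folder, f"movie_{year}_top{rank}.jpg")
        with open(path, "wb") as f:  # "wb": images are binary data
            f.write(fetch(url))
        saved.append(path)
    return saved


# Offline demonstration: hypothetical URLs and a fake fetcher instead of a real download.
demo_posters = [
    (2019, 1, "https://example.com/poster1.jpg"),
    (2019, 2, "https://example.com/poster2.jpg"),
]
demo_folder = tempfile.mkdtemp()
saved_paths = save_posters(demo_posters, demo_folder, fetch=lambda url: b"fake image bytes")
print([os.path.basename(p) for p in saved_paths])
```

To download for real, call `save_posters` without the `fetch` argument and pass image URLs that were actually scraped from a page.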

And sometimes, after retrieving some data, you'll want to manage it in Excel or process it further. In those cases, you can simply create a CSV file and open it directly in Excel. We'll practice retrieving all of the KOSPI market capitalization ranking information from Naver Finance.
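Writing a CSV file that Excel can open takes only the standard `csv` module. In the sketch below the header and rows are invented placeholders, not real market data, and the file name `ranking.csv` is arbitrary. The `utf-8-sig` encoding adds a byte-order mark, which generally helps Excel display non-English text such as Korean company names correctly.

```python
import csv
import os
import tempfile

# Placeholder rows for illustration only; these are not real market data.
header = ["Rank", "Name", "Market cap"]
rows = [
    [1, "Company A", "1,000"],
    [2, "Company B", "800"],
]

path = os.path.join(tempfile.mkdtemp(), "ranking.csv")

# newline="" lets the csv module control line endings (avoids blank rows on Windows).
with open(path, "w", encoding="utf-8-sig", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(header)
    writer.writerows(rows)

print("saved to", path)
```

Values that contain commas, such as "1,000", are quoted automatically by `csv.writer`, so they stay in a single cell.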

However, these sites may not be entirely happy about automated bots taking their information. Not only could the information be used without permission, but repeated page requests can also place a significant load on the server. Servers therefore employ various defensive measures, such as refusing to serve pages or blocking access altogether.

But as always, we will find a way.
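One common first step, offered here as an assumption rather than as the course's exact method, is to send a browser-like `User-Agent` header and to pause between requests so the server is not flooded. The `USER_AGENT` value below is a placeholder, and `build_request` and `fetch_all` are hypothetical helper names.

```python
import time
from urllib.request import Request, urlopen

# Placeholder value; substitute the User-Agent string your own browser reports.
USER_AGENT = "Mozilla/5.0 (placeholder; replace with your browser's User-Agent)"


def build_request(url):
    """Creates a request that carries a browser-like User-Agent header."""
    return Request(url, headers={"User-Agent": USER_AGENT})


def fetch_all(urls, delay_seconds=1.0):
    """Fetches each URL in turn, pausing between requests to limit server load."""
    pages = []
    for index, url in enumerate(urls):
        if index > 0:
            time.sleep(delay_seconds)
        with urlopen(build_request(url)) as response:
            pages.append(response.read())
    return pages


# Building a request does not send anything yet; fetch_all is what touches the network.
request = build_request("https://example.com/list?page=1")
print(request.get_header("User-agent"))
```

Always check a site's terms of use before scraping it, and keep the delay generous.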

Sometimes, you may need to log in or perform certain actions on a webpage to retrieve the data you want. For pages whose content changes dynamically, you can use Selenium, a web testing automation framework, to control the browser automatically. When the previous methods fail, Selenium will often solve the problem.

For example, I want to retrieve information about only the movies that are currently on sale among the popular chart movies on the Google Movies page, but on that page the user has to scroll down to load the next part of the list.
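The usual way to handle such a page is to keep scrolling until the page height stops growing. The sketch below shows that loop in plain Python so it can run without a browser; `scroll_to_bottom` and `FakePage` are hypothetical names. With Selenium, the two callables would typically wrap `driver.execute_script(...)`, as noted in the comments.

```python
import time


def scroll_to_bottom(get_height, scroll_down, pause_seconds=2.0, max_rounds=50, sleep=time.sleep):
    """Scrolls until the page height stops growing; returns the number of scrolls performed."""
    previous = get_height()
    for round_number in range(1, max_rounds + 1):
        scroll_down()
        sleep(pause_seconds)  # give the page time to load the next batch
        current = get_height()
        if current == previous:  # nothing new was loaded, so we reached the end
            return round_number
        previous = current
    return max_rounds


# With Selenium (assumed setup, not run here) the callables could be:
#   get_height  = lambda: driver.execute_script("return document.body.scrollHeight")
#   scroll_down = lambda: driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")


class FakePage:
    """Stand-in for a browser page: grows by 1000px per scroll until it reaches 3000px."""

    def __init__(self):
        self.height = 1000

    def scroll(self):
        self.height = min(self.height + 1000, 3000)


page = FakePage()
rounds = scroll_to_bottom(lambda: page.height, page.scroll, pause_seconds=0, sleep=lambda s: None)
print(rounds, page.height)
```

The `max_rounds` limit is a safety net for pages that never stop loading new content.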

Or, when I enter my desired schedule on Naver Airline Ticket and click the flight search button, it takes a long time to load before the list appears.

Even when using Selenium, a more nuanced approach is needed to reduce errors in these areas. Of course, I'll cover all of this in the lecture.
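For slow-loading results, the underlying idea is to poll until the content appears instead of sleeping for a fixed time. Selenium ships its own helper for this (`WebDriverWait`); the sketch below only illustrates the principle in plain Python, and `wait_until` and `results_loaded` are hypothetical names.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Calls condition() repeatedly until it returns a truthy value or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Simulated page: the flight list only "appears" on the third check.
attempts = []


def results_loaded():
    attempts.append(1)
    return ["flight 1", "flight 2"] if len(attempts) >= 3 else []


flights = wait_until(results_loaded, timeout=5, poll_interval=0.01)
print(flights)
```

Compared with a fixed `time.sleep(10)`, this returns as soon as the data is ready and fails loudly when it never arrives.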

Learning web scraping requires some prior knowledge. Since a basic understanding of the web is essential, we'll briefly cover HTML and XPath. Since we'll be using Google Chrome, we'll also cover how to use Chrome and its developer tools. Regular expressions may be needed during scraping, so we'll touch on them briefly as well. This may make the theory portion a bit long and tedious, but after a while we'll get to practice on a variety of pages, so please bear with me and follow along.
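As a small taste of how XPath and regular expressions fit together, the sketch below selects elements with an XPath expression and then pulls a number out of the text with a regex. It uses the standard library's `xml.etree.ElementTree`, which supports only a limited subset of XPath and needs well-formed markup, so treat it as an illustration rather than a tool for real pages. The snippet and class names are made up.

```python
import re
import xml.etree.ElementTree as ET

# Made-up, well-formed snippet; real HTML is often messier than this.
SNIPPET = """
<ul id="products">
  <li class="item"><a href="/p/1">Keyboard</a><span class="reviews">1,234 reviews</span></li>
  <li class="item"><a href="/p/2">Mouse</a><span class="reviews">87 reviews</span></li>
</ul>
"""

root = ET.fromstring(SNIPPET.strip())

results = []
# XPath: every <li> with class="item", anywhere below the root element.
for item in root.findall(".//li[@class='item']"):
    name = item.find("a").text
    review_text = item.find("span[@class='reviews']").text
    # Regex: grab the run of digits and commas, then drop the commas.
    match = re.search(r"[\d,]+", review_text)
    count = int(match.group().replace(",", ""))
    results.append((name, count))

print(results)
```

XPath answers "where is the element?", while the regular expression answers "which part of its text do I need?".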


With so much material to cover, you might find it difficult to keep track of everything, so let's take a moment to wrap things up. Unlike the previous practical topics, web scraping involves using a technique that targets websites created by others, so I'll remind you of some important points to keep in mind. If you're in a hurry or just want to get the gist, this section alone will give you a general understanding of the course.

Of course, I'll give you a quiz this time too.
Take the time to scrape the search result information from a real estate listing page yourself.

Finally, we'll work on a project. The project theme is "My Virtual Assistant."
I'll create a program that lets me, as soon as I wake up each morning, easily check the weather and read the major news and IT news. While I'm at it, I'll also fetch a new English conversation text every day for my "one English lesson a day" routine. With just one click, all of this information will be available in the format I need.

Very convenient, right? ^^
Clicking the link will take you directly to the news article. And although not covered in this topic, you can easily access the information every morning by sending the data obtained above via email or KakaoTalk.


If you've learned the basics of Python and want to build your skills, learn web scraping right now.
This one video is enough.
Plus, Nadocoding is “free”.

Designed by freepik
https://www.freepik.com

Recommended for these people

Who is this course right for?

  • Anyone who has learned Python and is wondering where to use it

  • Anyone who copies and pastes the data they need from the web one item at a time

  • Anyone who wants to collect shopping mall data in a few seconds

What do you need to know before starting?

  • Python Basics

Hello, this is nadocoding

  • Learners: 100,327
  • Reviews: 3,109
  • Answers: 915
  • Rating: 4.9
  • Courses: 11

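I'm Nadocoding, and I run a coding education channel on YouTube.
I teach with friendly explanations and easy examples so that anyone can study coding easily and enjoyably.
Shall we code together? 😊

🧡 YouTube: Nadocoding
🎁 Self-Study Coding: Nadocoding's Introduction to Python
📚 Self-Study Coding: Nadocoding's Introduction to C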

Curriculum

39 lectures ∙ (5hr 26min)

Reviews

157 reviews ∙ Average rating 5.0

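  • iambyunghyun (8 reviews, average rating 4.9)
    Rating: 5 ∙ Written after completing 100% of the course
    This is truly on the level of a talent donation.

  • 한지인 (2 reviews, average rating 5.0)
    Rating: 5 ∙ Written after completing 31% of the course
    The instructor explains everything slowly and thoroughly so that even Python beginners can understand, so the course never feels like a burden. Thank you!

  • sangcheol25 (2 reviews, average rating 5.0)
    Rating: 5 ∙ Written after completing 31% of the course
    The explanations are easy to understand.

  • Interpolte MIU1014 (13 reviews, average rating 5.0)
    Rating: 5 ∙ Written after completing 62% of the course
    Thank you for the easy-to-understand course. I like that I get to study Python again.

  • moljin (8 reviews, average rating 4.5)
    Rating: 5 ∙ Written after completing 31% of the course
    As of now, crawling Naver Webtoon does not work in full. What is the solution?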
