
The Most Powerful Crawling Technology Today: Mastering Scrapy and Selenium

For those who want to learn data science, big data, and crawling: we have prepared a variety of examples to help you quickly learn the latest and best crawling techniques.

(4.9) 112 reviews

1,261 learners

Level Intermediate

Course period Unlimited

Web Crawling
Scrapy
Selenium

News

12 articles

  • funcoding

    Hello. This is Dave Lee from Janjaemi Coding.

    This time, I'd like to share with you my plans for improving the lectures.

    This lecture is now in its fourth year. My original intention was to help you become familiar with crawling and IT-related technologies through the widest possible range of examples from real-world sites. Because the examples were built on real sites, many of those sites have changed since then. I tried to share updated code for each site as much as possible, but I think I have now reached my limit.

    Of course, I still think that seeing examples of many different cases helps you build the skills you need when you crawl a site of your own. However, this lecture is taken by two groups of people: those who want to build an IT career, and those who do not and only want to crawl. Those in the latter group have probably felt a little disappointed.

    So, regretfully, I will give up on showing a variety of sites. Instead, I will rewrite the code and update the lecture so that the Selenium part can be tested against a blog-style site that I created myself for testing purposes.

    Since I also work in the field, it is hard to do this quickly, but I will try to finish the update within November if possible and will notify you again.

    For those of you who have already taken several Janjaemi Coding lectures, I hope you will look on this positively, as a kind of additional lecture. For those who have consistently chosen my lectures, I will do my best to give you a good experience and not let you down.

    Thank you~~~

  • funcoding

    Hello.

    After much preparation, I am happy to share with you that my first Python Machine Learning Bootcamp course is now 100% open.
    There was a delay in the opening of the course, so we have also offered maximum discounts during the opening period.

    I created this lecture by improving on the points where I myself tried and failed.
    I started learning machine learning/AI seven years ago. At first I tried studying AI, but I grew tired of hearing only about AI principles and gave up. I then tried machine learning, but all I was taught was mathematical proofs and linear algebra, so I gave up again.

    Looking back now, I think it would have been much easier to learn artificial intelligence or machine learning technology if I had done it in the following order.
    Python -> pandas -> Machine learning key concepts + various practical techniques for practical application of machine learning -> Artificial intelligence

    Machine learning contains the most fundamental concepts, including those underlying artificial intelligence, along with its own particular way of thinking. In addition, various special techniques are used when applying these concepts to real problems. If you learn the core concepts and the techniques applied to real problems, and become comfortable applying machine learning, your understanding of related technologies will grow. Building on that, you can learn artificial intelligence more easily and put it to general use.

    Machine learning is a vast field, and its mathematical side draws on several disciplines. I created this lecture after thinking hard about how to organize the necessary parts, focus on them, and teach them together with techniques used on real problems. Since no lecture like this existed before, it took even more time.

    Even for developers, machine learning/artificial intelligence can seem like a vague technology that is easy to pass over. This lecture does not aim to make you one of the world's top 1% machine learning experts; there are stages. It aims to help developers understand and use machine learning, and to serve as a stepping stone for those considering a data science career by getting them familiar with the related technologies quickly.

    Going forward, I will prepare and open the artificial intelligence lecture according to the data science roadmap below.
    I really hope this lecture is helpful and leaves a strong impression.
    Thank you.

    Data Science Roadmap

    1. Python and data collection (crawling) basics (Python and web, data understanding basics)
    2. Conquering Scrapy and Selenium (Currently the most advanced crawling intermediate technology and related IT knowledge)
    3. SQL and Data Storage/Analysis Basics (Data Storage/Analysis)
    4. NoSQL(mongodb) Big Data Basics (Big Data Storage/Analysis)
    5. First Python Data Analysis (Data Preprocessing and Pandas, Latest Visualization) [Data Science Part 1]
    6. Python Machine Learning Bootcamp for Beginners (Easy! Learn concepts/applications with real problems) [Data Science Part 2]
    7. AI Artificial Intelligence Bootcamp (Data Prediction Automation, First Half of 2022) [Data Science Part 3]

     

  • funcoding

    Hello. This is Dave Lee from Janjaemi Coding.
    I'm writing to let you know that my next lecture has opened on Inflearn as Full Stack Part 3.
    (Recently, it has been taking about a week for a lecture to open after submission.)
    Whether you are building for the web or an app, server technology is essential for launching a service. These days, Docker has become practically indispensable.
    So that you can make the technology your own, I designed the course so that you test the various Docker options one by one and then build a realistic, multi-part service. To handle the server side, it also covers as much AWS, Linux usage, and nginx web server technology as you need.
    • If you are already a developer, you should learn Docker and the related latest technologies thoroughly, since they are also the foundation for Kubernetes, one of the latest server technologies, and for team-based deployment.
    • If you are still on your way to becoming a developer, I personally think that being able to handle Docker and servers is the first step to becoming a real developer.
    Material on these topics, whether books or lectures, is usually aimed at experienced developers, so it is not easy to follow.
    So I designed the course with students in mind, going back to how I first learned this myself, so that you can test things one by one as you learn.
    I hope this lecture will be helpful.
    For those of you who have already taken my course, I have offered a discount.
    Additionally, quite a few people have recently asked about the schedule for each lecture.
    Since I prepare lectures while also working, it is not easy.
    • The next lecture is on machine learning and is scheduled for the end of June.
    • For the full stack series: Flutter 2.0 was recently released. Flutter can build apps, web, and desktop programs from a single codebase, so I think it is a trend worth watching. If the market judges it to be useful, I am considering skipping React, Vue, or other web front-end technology and covering Flutter first. I'll share details after seeing how the trend looks after June.
    In my opinion, if you build the UI (front-end) with Flutter and the back-end/server with only the Docker and latest server technology from Part 3, we will be able to build web and apps simultaneously much faster than expected.
    If you have any additional suggestions, please email us at dream@fun-coding.org.
    Thank you.

    Courses currently open or scheduled to open on Inflearn

    Full Stack Course: Tech Tree that will help you create the latest web/app services from A to Z on your own

    They are numbered in the order in which they should be learned.

    1. Python and data collection (crawling) basics (Python and web, data understanding basics)
    2. MySQL and Data Storage/Analysis Basics (SQL Database Basics)
    3. NoSQL(mongodb) Big Data Basics (NoSQL Database Basics)
    4. Fastest Full Stack: Python Backend and Web Technology Basics [Full Stack Part 1]
    5. Solid Front-end Fundamentals for Full Stack: Javascript (Vanilla JS and ES6+) and Latest Web Technologies [Full Stack Part 2]
    6. Docker and the latest server technology for full stack (Linux, nginx, AWS, HTTPS, flask deployment) [Full Stack Part 3]
    7. Flutter Basics for Full Stack App Development (Full Stack Part 4, scheduled to open in the second half of 2021)
    8. Basic Vue or React Framework for Full Stack (Full Stack Part 5, scheduled for the second half of 2021)

    As app/web technologies are changing rapidly, we have adjusted our priorities. To stay ahead on more recent technologies, we will start with Flutter, the latest technology that supports both web and apps.

    * Full stack course packages are also available at a discounted price. (Discounts will be reduced soon.)
    [Beginner~Intermediate] Full-stack roadmap to learn the easiest and most up-to-date technology (Shortcut)

    Data Analysis/Science Course: The latest tech tree that can bring in the data you want, analyze it, and even make predictions.

    They are numbered in the order in which they should be learned.

    1. Python and data collection (crawling) basics (Python and web, data understanding basics)
    2. Conquering Scrapy and Selenium (Currently the most advanced crawling intermediate technology and related IT knowledge)
    3. SQL and Data Storage/Analysis Basics (Data Storage/Analysis)
    4. NoSQL(mongodb) Big Data Basics (Big Data Storage/Analysis)
    5. First Python Data Analysis (Data Preprocessing and Pandas, Latest Visualization)
    6. Machine Learning Basics (Data Prediction, June 2021)
    7. AI Artificial Intelligence Basics (Data Prediction Automation, 2nd Half of 2021)

    * We are also offering our current data science course packages at a discounted price. (The discount rate will be reduced soon.)
    [Beginner~Beginner] Learn the basic data analysis techniques for employment easily and thoroughly (Shortcut)

     

  • funcoding

    Hello.

    This is Dave Lee from Janjaemi Coding.

    The video 'Learn how to use Scrapy in real life by crawling Gmarket 5' has been updated with additional explanations.

    I added a code explanation for the part of the video that calls parse_subcategory.

    If you write this code yourself rather than using the code I provided, you also need to add the following line to your Scrapy settings.

    DUPEFILTER_CLASS = 'scrapy.dupefilters.BaseDupeFilter'
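
    For context, here is a minimal sketch of what this setting changes. By default Scrapy uses scrapy.dupefilters.RFPDupeFilter, which drops requests it has already seen during the crawl, while BaseDupeFilter filters nothing. The excerpt below is an assumed settings.py fragment for illustration, not the course's actual file.

```python
# settings.py (excerpt): an illustrative sketch, not the course's actual file.

# Scrapy's default is 'scrapy.dupefilters.RFPDupeFilter', which silently drops
# any request whose fingerprint has already been seen during the crawl.
# BaseDupeFilter performs no filtering, so repeated requests to the same URL
# are all scheduled. This is the value given in the notice above.
DUPEFILTER_CLASS = 'scrapy.dupefilters.BaseDupeFilter'
```

    If only a few URLs need to be revisited, passing dont_filter=True to an individual scrapy.Request is a narrower alternative that leaves the default filter on for everything else.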

    I think I explained this somewhere in another video, but I spent a lot of time re-recording and editing it in case I had missed it.

    I'm sharing this because I think it will help you understand better.

    Thank you.

  • funcoding

    Hello. This is Dave Lee from Janjaemi Coding.

    I have an announcement for you today~~

    A large and varied group of people has taken this class so far, so the same questions often come up in the Q&A section. I'd like to address them here.

    Quite a few people have used the Q&A section to ask about crawling code for their own projects. In those cases I tried to be as accommodating as possible and answered the parts that could be handled briefly. However, the basic purpose of the Q&A section is to answer questions about the lecture videos.

    In the online math classes I have taken, I have never seen students ask about their own math problems and have them solved for them, beyond using the formulas taught in the class. Likewise, it is realistically very difficult for this course to write the crawling code you want for you, or to work through it together.

    Moreover, since this course has unlimited access, offering that support could create the mistaken expectation that taking this course entitles you to have all the crawling code you want solved for you. I would have to write the actual code myself, and if my replies are delayed because I am writing that code, other replies may be delayed too, which would be a problem for other students.

    So, when you ask questions in the Q&A section, please do not ask about crawling code for your own projects. I really appreciate your understanding on this.

    Also, if possible:

    1) Ask your question in the form "I don't understand this part at minute X (second Y) of chapter Z."

    2) For code from the lecture, attach the code itself as text rather than as a screenshot.

    That way I can understand your question much better and answer quickly and in detail.

    Thank you~~~ Janjaemi Coding

    Courses currently open or scheduled to open on Inflearn

    Full Stack Course: Tech Tree that will help you create the latest web/app services from A to Z on your own

    They are numbered in the order in which they should be learned.

    1. Python and data collection (crawling) basics (Python and web, data understanding basics)
    2. SQL and Data Storage/Analysis Basics (SQL Database Basics)
    3. NoSQL(mongodb) Big Data Basics (NoSQL Database Basics)
    4. Fastest Full Stack: Python Backend and Web Technology Basics [Full Stack Part 1]
    5. Python Backend Intermediate and Full Stack Service Development (Full Stack Part 2, scheduled to open in September)
    6. Vue and Front-end Web Technology Basics for Full Stack (Full Stack Part 3, scheduled to open in October)
    7. AWS and Docker-based deployment technology basics for full stack (Full stack Part 4, scheduled to open in November)
    8. Flutter Basics for Full Stack App Development (Full Stack Part 5, scheduled to open in December)

    * All lectures of the full-stack course to date are available at a discounted price in one go as a roadmap package.
    [Beginner~Intermediate] The easiest and fastest full-stack roadmap

    Data Analysis/Science Course: The latest tech tree that can bring in the data you want, analyze it, and even make predictions.

    They are numbered in the order in which they should be learned.

    1. Python Introduction and Crawling Basics Bootcamp (Python and Data Collection Basics)
    2. Conquering Scrapy and Selenium (Currently the most advanced crawling intermediate technology and related IT knowledge)
    3. SQL and Data Storage/Analysis Basics (Data Storage/Analysis)
    4. NoSQL(mongodb) Big Data Basics (Big Data Storage/Analysis)
    5. Python Data Analysis Basics (Data Analysis)
    6. Machine Learning/AI Basics (Data Prediction, I am working hard on it)

    We also offer all lectures to date at a discounted price in one package as a roadmap package.
    [Beginner~Beginner] Learn the basic data analysis techniques for employment easily and thoroughly

  • funcoding

    Hello. This is Dave Lee (Janjaemi Coding).

    It's been a while since I shared news about a new lecture.

    Beginner Python Data Analysis [Easily learn the basic techniques for the entire process from preprocessing to pandas and visualization]

    This course will teach you everything from data preprocessing with Python to pandas and the latest visualization (plotly).

    1. We will analyze real-world examples from beginning to end and explain related technologies, so that after taking the class, you will be able to analyze any data right away.

    2. Pandas has tricky syntax, and even if you can program, it is not easy to use right away. But this kind of work is not something you can do in Excel... So I have organized the material so that even beginners can understand it and apply it right away with real-world examples.

    3. The older visualization tools are dated, so they often do not work well and are not very expressive. I will explain how to use newer visualization tools that are easy to use, useful for analysis, and pretty (I like them).

    4. I also include tips you will need when doing real data analysis in the field.

    Lastly, as the number of lectures grows, some of you have asked about the order in which to take them. Below, I share the order and future direction of the data analysis/science track and the full-stack track. (Data science + full-stack, isn't it great?) I am preparing to open the best lectures I can on Inflearn.

    Personally, whenever I open a lecture, I hope that those who take it find it helpful,
    make the technology their own without much worry or trouble,
    and feel that they can use it right away. That would be really great.
    Thank you~~~

    Data Analysis/Science Course

    1. Python Introduction and Crawling Basics Bootcamp (Python and Data Collection Basics)
    2. Conquering Scrapy and Selenium (Intermediate data collection skills and related IT knowledge)
    3. SQL and Data Storage/Analysis Basics (Data Storage/Analysis)
    4. NoSQL(mongodb) Big Data Basics (Big Data Storage/Analysis)
    5. Python Data Analysis Basics (Data Analysis)
    6. Machine Learning/AI Basics (Data Prediction, I am working hard on it)

    Full Stack Course

    1. Python and data collection (crawling) basics (Python and web, data understanding basics)
    2. SQL and Data Storage/Analysis Basics (SQL Database Basics)
    3. NoSQL(mongodb) Big Data Basics (NoSQL Database Basics)
    4. Backend Basics and Intermediate (I am preparing hard)
    5. Front-end basics and intermediate (I am preparing hard)
    6. Full stack basics and clone coding (I am preparing hard)

  • funcoding

    Hello. This is Dave Lee from Janjaemi Coding.

    When I was making the lectures, I tried to make them so that students could learn and utilize the lecture materials as quickly as possible. So I know that students are downloading the materials and using them well. However, it seems that some people still don't know how to download lecture-related materials from Inflearn, so I'm sharing it again with a new notice.

    Click on the table of contents in the upper right corner as shown below, and then click on the download icon on the left of each lecture's table of contents to download the materials for each lecture.

    The materials are uploaded alongside the lecture they belong to, so you will learn more effectively if you download them as you reach each lecture. I hope this helps. Thank you.

  • funcoding

    Hello, this is Dave Lee. Happy New Year.

    Selenium may fail to run depending on your PC environment. Here are some things to try.

    First, a student shared that on Windows, Chrome() can be launched by passing executable_path as follows. (Change the C:/path/ part to the actual folder where chromedriver.exe is located.) Thank you!

    -----------------------------------------------------

    from selenium import webdriver

    driver = webdriver.Chrome(executable_path=r"C:/path/chromedriver.exe")

    -----------------------------------------------------

    Also, on Mac, after updating macOS to Catalina, I found that the existing chromedriver no longer runs because of security restrictions. You can change the security settings, but I think it is simpler to move chromedriver to the /usr/local/bin directory and run it from there. If that does not work, please visit the following site, which I shared in the lecture.

    https://sites.google.com/a/chromium.org/chromedriver/

    If you download the new chromedriver, move the executable file to the /usr/local/bin directory, and run it as follows, it will run normally.

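    -----------------------------------------------------

    from selenium import webdriver

    # Location of the chromedriver binary after moving it to /usr/local/bin
    chromedriver = '/usr/local/bin/chromedriver'

    # Selenium 3 style: the driver path is passed as the first argument
    driver = webdriver.Chrome(chromedriver)

    -----------------------------------------------------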
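
    Both snippets above use the Selenium 3 calling style. In Selenium 4 the executable_path argument to webdriver.Chrome was deprecated and later removed, and the driver path is passed through a Service object instead. The following is a minimal sketch under that assumption; the helper names find_chromedriver and make_driver are hypothetical and not from the lecture, and the candidate paths are the placeholders from this notice.

```python
import os
import shutil


def find_chromedriver(candidates):
    """Return the first candidate path that exists, else search PATH."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    # shutil.which returns None when no chromedriver is found on PATH.
    return shutil.which("chromedriver")


def make_driver(driver_path):
    """Start Chrome with the Selenium 4 Service-based API."""
    # Imported lazily so the path helper above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    return webdriver.Chrome(service=Service(executable_path=driver_path))
```

    A typical call would be make_driver(find_chromedriver([r"C:/path/chromedriver.exe", "/usr/local/bin/chromedriver"])), after checking that the helper did not return None.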

    I hope you find these tips helpful. Thanks.

