Why does the crawler actually fetch so few news articles?

Question

    import time

    import pyautogui
    import requests
    from bs4 import BeautifulSoup

    # User input
    keyword = pyautogui.prompt("Enter a search keyword")
    lastpage = int(pyautogui.prompt("How many pages should be crawled?"))
    page_num = 1
    for i in range(1, lastpage * 10, 10):
        print(f"Crawling page {page_num}=========================")
        response = requests.get(f"https://search.naver.com/search.naver?where=news&query=%ED%97%88%EA%B0%9C%EC%97%B4&sm=tab_opt&sort=1&photo=0&field=0&pd=0&ds=&de=&docid=&related=0&mynews=0&office_type=0&office_section_code=0&news_office_checked=&nso=so%3Add%2Cp%3Aall&is_sug_officeid=0={keyword}&start={i}")
        html = response.text
        soup = BeautifulSoup(html, 'html.parser')
        articles = soup.select("div.info_group")  # the 10 news-article divs on the page
        for article in articles:
            links = article.select("a.info")  # list of links
            if len(links) >= 2:  # only when there are two or more links
                url = links[1].attrs['href']  # take the href of the second link
                response = requests.get(url, headers={'User-agent': 'Mozilla/5.0'})
                html = response.text
                soup = BeautifulSoup(html, 'html.parser')
                # Entertainment news
                if "entertain" in response.url:
                    title = soup.select_one(".end_tit")
                    content = soup.select_one("#articeBody")
                # Sports news
                elif "sports" in response.url:
                    title = soup.select_one("h4.title")
                    content = soup.select_one("#newsEndContents")
                    # Remove unneeded elements from the article body
                    divs = content.select("div")
                    for div in divs:
                        div.decompose()
                    paragraphs = content.select("p")
                    for p in paragraphs:
                        p.decompose()
                # Regular news
                else:
                    title = soup.select_one("#articleTitle")
                    content = soup.select_one("#articleBodyContents")
                print("=======Link========\n", url)
                print("=======Title========\n", title.text.strip())
                print("=======Body========\n", content.text.strip())
                time.sleep(0.3)
        page_num = page_num + 1

With this code I crawled news up to page 3, but it only fetches 4 articles from page 1 and about 1 each from pages 2 and 3 :(

스타트코딩 · Answer

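Hmm, it's hard to check the code as it is posted here. For the keyword you want, 기락, it may simply be that only 4 articles on page 1 and 1 each on pages 2 and 3 are registered on Naver News. Compare the articles registered on Naver News when you search by hand against what the crawler returns, and try other keywords as well. (It could also be a typo, so try deleting the source code and typing it in again from the lesson!)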
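
A note on trying other keywords: in the URL from the question, `{keyword}` is appended after `is_sug_officeid=0=`, while `query=` holds a fixed percent-encoded string, so the typed keyword may never reach the search itself. Separately, the loop only follows results that have a second `a.info` link, so results without one are skipped by design. Below is a minimal sketch, using only the standard library, that decodes the posted URL to show what each parameter ends up holding. `posted_url`, `query_params`, and `build_search_url` are hypothetical helper names, and whether Naver accepts the reduced parameter set in `build_search_url` is an assumption, not something confirmed in this thread.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE = "https://search.naver.com/search.naver"


def posted_url(keyword, start):
    """Rebuild the URL exactly as it is written in the question."""
    return (
        f"{BASE}?where=news&query=%ED%97%88%EA%B0%9C%EC%97%B4&sm=tab_opt&sort=1"
        "&photo=0&field=0&pd=0&ds=&de=&docid=&related=0&mynews=0&office_type=0"
        "&office_section_code=0&news_office_checked=&nso=so%3Add%2Cp%3Aall"
        f"&is_sug_officeid=0={keyword}&start={start}"
    )


def query_params(url):
    """Return the decoded query-string parameters as a dict of lists."""
    return parse_qs(urlsplit(url).query)


def build_search_url(keyword, start):
    """Hypothetical alternative: put the keyword in `query` and let
    urlencode handle the percent-encoding. The reduced parameter set
    is an assumption."""
    params = {"where": "news", "query": keyword, "sort": 1, "start": start}
    return f"{BASE}?{urlencode(params)}"


if __name__ == "__main__":
    params = query_params(posted_url("파이썬", 1))
    print("query           =", params["query"])
    print("is_sug_officeid =", params["is_sug_officeid"])
    print(build_search_url("파이썬", 11))
```

If the decoded `query` value is not the keyword you typed, every run searches for the same fixed term regardless of the input, and the article counts should then be compared against a manual search for that fixed term rather than for your keyword.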