Crawling isn't working :(

Question

Hello, teacher. I followed what you taught us and tried crawling several Korean sites, and it worked really well, so thank you ^^;; I am now trying to crawl the MLB site below, but nothing I have tried pulls in the data. Could there be a reason for this?

    from bs4 import BeautifulSoup
    import urllib.request as req
    import sys
    import io

    sys.stdout = io.TextIOWrapper(sys.stdout.detach(), encoding = 'utf-8')
    sys.stderr = io.TextIOWrapper(sys.stderr.detach(), encoding = 'utf-8')

    url = "http://mlb.mlb.com/stats/sortable.jsp#elem=%5Bobject+Object%5D&tab_level=child&click_text=Sortable+Player+hitting&game_type='R'&season=2018&season_type=ANY&league_code='MLB'&sectionType=sp&statType=hitting&page=1&ts=1525019114922"
    contents = req.urlopen(url).read()
    soup = BeautifulSoup(contents, "html.parser")
    filter_contents = soup.select("#datagrid > tr")
    print(filter_contents)

Answer

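Hello, Lee Seongmin.

I checked the site, and it appears to be one that inspects the request header values of incoming requests, so a plain crawler request gets rejected. The lectures cover "fake" (browser-like) headers and cookies; you will need to implement that part before sending the request.

Print contents first. If the page content shows up there, try changing the selector "#datagrid > tr" and printing the result each time; I think that should get you there.

Thank you.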
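As a rough illustration of the header and cookie suggestion above, here is a minimal sketch using only the standard library. The helper names (`build_request`, `build_opener_with_cookies`, `fetch`) and the header values are assumptions made for this example, not code from the lectures, and whether these particular headers satisfy the MLB site has not been verified.

```python
import http.cookiejar
import urllib.request as req

# Assumption: a generic desktop-browser User-Agent string; the exact value
# the site accepts is not known, so adjust it if the request is still refused.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36"
    ),
    "Accept-Language": "en-US,en;q=0.9",
}


def build_request(url, headers=None):
    """Return a Request that carries browser-like headers
    instead of urllib's default 'Python-urllib/x.y' User-Agent."""
    return req.Request(url, headers=dict(headers or BROWSER_HEADERS))


def build_opener_with_cookies():
    """Return (opener, jar); the opener stores cookies set by the server
    in jar and sends them back on later requests made through it."""
    jar = http.cookiejar.CookieJar()
    opener = req.build_opener(req.HTTPCookieProcessor(jar))
    return opener, jar


def fetch(url, timeout=10):
    """Fetch url with the headers and cookie handling above
    and return the response body as text."""
    opener, _jar = build_opener_with_cookies()
    with opener.open(build_request(url), timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

Calling `print(fetch(url))` with the URL from the question shows what the server actually returns, and that text can then be passed to `BeautifulSoup(..., "html.parser")` as before. One thing to keep in mind while reading the output: everything after `#` in a URL is a fragment that is never sent to the server, so if the printed HTML contains no table rows even with these headers, the rows may be filled in by JavaScript after the page loads.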

Seongmin Lee
