[Python] Web Crawling: CSS Selectors


 

id = #
<div id = "articleBody"> This is the article body. </div>

∴ #articleBody

class = .
<div class = "info_group"> News list </div>

∴ .info_group
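
The two selectors above can be tried on a small inline document. This is a minimal sketch, assuming BeautifulSoup (`bs4`) is installed; the sample markup is copied from the examples above, and the variable names are only illustrative.

```python
from bs4 import BeautifulSoup

# Sample markup mirroring the id and class examples above.
html = """
<div id="articleBody"> This is the article body. </div>
<div class="info_group"> News list </div>
"""
soup = BeautifulSoup(html, "html.parser")

# select_one returns the first match (or None); an id should be unique on a page.
body = soup.select_one("#articleBody")
# select returns a list of every match; a class can appear many times.
groups = soup.select(".info_group")

body_text = body.get_text(strip=True)
group_texts = [g.get_text(strip=True) for g in groups]
print(body_text)
print(group_texts)
```

Using `select_one` for ids and `select` for classes keeps the return type predictable: a single element in the first case, a list in the second.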


When the target element has no id or class of its own, reach it through its parent:
1) <div class="logo_sport"><span>Sports</span></div>

∴ .logo_sport > span

2) <div class="news_headline"><h4>Title</h4></div>

∴ .news_headline > h4
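
The `>` in these selectors is the child combinator: it matches only direct children of the parent. A short sketch, assuming BeautifulSoup is installed, contrasts it with the descendant combinator (a space). The nested `<p><span>` is added here purely for illustration and is not part of the original markup.

```python
from bs4 import BeautifulSoup

# The nested <p><span> is a hypothetical addition to show the difference.
html = """
<div class="logo_sport">
  <span>Sports</span>
  <p><span>nested</span></p>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# ">" matches only spans that are direct children of .logo_sport.
direct = [s.get_text() for s in soup.select(".logo_sport > span")]
# A space matches spans at any depth below .logo_sport.
any_depth = [s.get_text() for s in soup.select(".logo_sport span")]

print(direct)
print(any_depth)
```

The stricter `>` form is usually the safer choice when the parent contains other nested elements of the same tag.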

 

 

URL

 

 

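import requests
from bs4 import BeautifulSoup


keyword = input("Enter a search term: ")
lastpage = input("Enter the number of pages: ")
pageNum = 1

# Result offsets advance by 10 per page: 1, 11, 21, ...
for i in range(1, int(lastpage) * 10, 10):
    print(f"Page {pageNum} ----------------------------------")
    # The URL is elided in the original ("https:~"); it presumably carries
    # the keyword and ends with the offset i rather than lastpage.
    response = requests.get(f"https:~{i}")
    html = response.text
    # Parse the response so that CSS selectors can be applied to it.
    soup = BeautifulSoup(html, "html.parser")
    links = soup.select(".new_tit")
    for link in links:
        title = link.text
        url = link.attrs['href']
        print(title, url)
    pageNum += 1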
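
The paging arithmetic in the loop can be checked on its own, without any network request. The sketch below assumes 10 results per page, as the loop's step suggests. The base URL `https://example.com/search` and the parameter names `query` and `start` are hypothetical stand-ins, since the real URL is elided in the original.

```python
from urllib.parse import urlencode


def page_offsets(last_page: int) -> list[int]:
    """Return the starting result index of each page, 10 results per page."""
    return list(range(1, last_page * 10, 10))


def build_url(base: str, keyword: str, start: int) -> str:
    # "query" and "start" are hypothetical parameter names, not a real site's API.
    return f"{base}?{urlencode({'query': keyword, 'start': start})}"


# Hypothetical base URL used only to show the shape of the generated links.
urls = [build_url("https://example.com/search", "python", s) for s in page_offsets(3)]
for u in urls:
    print(u)
```

Building the query string with `urlencode` also percent-encodes non-ASCII search terms, which hand-written f-string URLs do not do explicitly.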

 

 
