Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

web scraping - How to get data with BeautifulSoup without having "None"?

I am new to web scraping and I have a problem with it.

I want to get the names of the courses in a specific set of search results on Udemy (from this link: https://www.udemy.com/courses/search/?src=ukw&q=veri+bilimi).

Here is my code:

import requests
from bs4 import BeautifulSoup

result = requests.get("https://www.udemy.com/courses/search/?src=ukw&q=veri+bilimi")

print(result.status_code)

src = result.content

soup = BeautifulSoup(src, "lxml")

print(soup.find("div", attrs={"class":"udlite-focus-visible-target udlite-heading-md course-card--course-title--2f7tE"}))

It returns "None" instead of the course names. Unfortunately, I can't see where my mistake is.

Can you help me?

Question from: https://stackoverflow.com/questions/66066256/how-to-get-data-with-beautifulsoup-without-having-none

1 Answer

The Udemy website uses JavaScript to load the course titles, so they are not in the HTML that requests downloads, and find() returns None. You need a tool that runs the page in a real browser, such as Selenium:

import time

from bs4 import BeautifulSoup
from selenium import webdriver

url = "https://www.udemy.com/courses/search/?src=ukw&q=veri+bilimi"

# Name the instance "driver" so it does not shadow the selenium.webdriver module.
driver = webdriver.Chrome()

driver.get(url)
time.sleep(6)  # wait 6 seconds so JavaScript can render the course cards

soup = BeautifulSoup(driver.page_source, "lxml")
driver.quit()  # close the browser once the rendered HTML has been captured

course_titles = soup.find_all("div", attrs={"class": "udlite-focus-visible-target udlite-heading-md course-card--course-title--2f7tE"})
for title in course_titles:
    print(title.get_text())
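
The fixed time.sleep(6) above either wastes time or is too short on a slow connection. One possible alternative is to poll until the titles show up. The sketch below is only an illustration: wait_for_result and fake_fetch are hypothetical names, not part of Selenium or BeautifulSoup, and "Veri Bilimi 101" is a placeholder title. With a real browser, the fetch function would parse driver.page_source and return the list of titles.

```python
import time


def wait_for_result(fetch, timeout=10.0, interval=0.5):
    """Call fetch() until it returns a non-empty result or the timeout elapses.

    Hypothetical helper for illustration only.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        if result:
            return result
        if time.monotonic() >= deadline:
            return result  # still empty: the caller decides how to react
        time.sleep(interval)


# Stand-in for a page whose titles appear only on the third check.
attempts = {"count": 0}


def fake_fetch():
    attempts["count"] += 1
    return ["Veri Bilimi 101"] if attempts["count"] >= 3 else []


titles = wait_for_result(fake_fetch, timeout=5.0, interval=0.01)
print(titles)
```

Selenium also ships its own explicit-wait class, WebDriverWait in selenium.webdriver.support.ui, which is probably the better fit in real code; the helper here only shows the idea without needing a browser.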

Install Selenium first if you have not already, for example with pip install selenium.
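
To see why the original code printed None, and to make the lookup a little less brittle, here is a minimal sketch that runs on made-up HTML instead of the live site. STATIC_HTML and RENDERED_HTML are invented stand-ins for what requests and a browser might see, and extract_course_titles is a hypothetical helper. It assumes the trailing part of the class name (such as "--2f7tE") is generated and may change, so it matches on the stable-looking prefix only. It uses the built-in "html.parser" so that lxml is not required.

```python
from bs4 import BeautifulSoup

# Made-up HTML: what requests might receive (no course cards yet) ...
STATIC_HTML = '<div id="app"></div>'
# ... and what a browser might hold after JavaScript has run.
RENDERED_HTML = """
<div id="app">
  <div class="udlite-heading-md course-card--course-title--2f7tE">Course A</div>
  <div class="udlite-heading-md course-card--course-title--9xQz1">Course B</div>
</div>
"""


def extract_course_titles(html):
    """Return the text of every div whose class contains the title prefix."""
    soup = BeautifulSoup(html, "html.parser")
    # [class*=...] matches a substring of the class attribute, so the
    # suffix after "course-title" does not have to be known in advance.
    cards = soup.select('div[class*="course-card--course-title"]')
    return [card.get_text(strip=True) for card in cards]


static_titles = extract_course_titles(STATIC_HTML)
rendered_titles = extract_course_titles(RENDERED_HTML)
print(static_titles)    # empty: the titles are not in the static HTML
print(rendered_titles)
```

Returning a list also means an empty result is easy to check for, whereas calling get_text() on the None returned by find() would raise an AttributeError.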

