Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Beautiful Soup cannot scrape after the first div tag

I would like to scrape the restaurant name, which appears on the page as:

Popeyes

Please see the image below for the HTML on this website.

Can someone please show me how to scrape the restaurant name "Popeyes" in Python, using Beautiful Soup or any other web-scraping package?

Thanks in advance!

Screenshot of the HTML

Below is the code I used to scrape the data; however, it stopped at the first div tag and I couldn't go any further.

'''
from bs4 import BeautifulSoup as soup  # HTML data structure
from urllib.request import urlopen as uReq  # Web client

# URL to web scrape from.
# in this example we web scrape a DoorDash store page
page_url = "https://www.doordash.com/store/popeyes-toronto-254846/en-CA"

# opens the connection and downloads html page from url
uClient = uReq(page_url)

# parses html into a soup data structure to traverse html
# as if it were a json data type.
page_soup = soup(uClient.read(), "html.parser")
uClient.close()

# attribute access returns only the first <div> in the document
page_soup.div
'''
Question from: https://stackoverflow.com/questions/65862006/beautiful-soup-cannot-scrape-after-the-first-div-tag


1 Answer


You can try this (I may have made a mistake in the class name):

import urllib.request
from bs4 import BeautifulSoup

url_1 = 'https://www.doordash.com/store/popeyes-toronto-254846/en-CA'
sauce_1 = urllib.request.urlopen(url_1).read()
soup_1 = BeautifulSoup(sauce_1, 'lxml')  # the 'lxml' parser needs the lxml package installed

# find_all searches the whole document, not just the first <div>
for x in soup_1.find_all('h1', class_='sc-AnqlK keKZVr sc-jFpLkX bsGprJ'):
    print(x.get_text(strip=True))  # print only the heading text
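
To see why `page_soup.div` appears to "stop", here is a minimal sketch on a small inline HTML string. The markup and the `store-title` class are made up for illustration and are not the real DoorDash page. It shows that attribute access such as `soup.div` only returns the first matching tag, while `find` and `select_one` search the whole tree. Matching on a single stable class (or on the tag alone) may also be more robust than the full class string, since class names like `sc-AnqlK` look auto-generated and could change.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page; class names are invented.
SAMPLE_HTML = """
<html><body>
  <div id="root">
    <div class="header">
      <h1 class="sc-abc store-title">Popeyes</h1>
    </div>
  </div>
</body></html>
"""


def first_h1_text(html):
    """Return the text of the first <h1> anywhere in the document, or None."""
    soup = BeautifulSoup(html, "html.parser")
    heading = soup.find("h1")  # searches the whole tree, however deeply nested
    return heading.get_text(strip=True) if heading else None


def h1_text_by_class(html, css_class):
    """Return the text of the first <h1> carrying css_class among its classes."""
    soup = BeautifulSoup(html, "html.parser")
    heading = soup.select_one(f"h1.{css_class}")  # CSS selector: one class is enough
    return heading.get_text(strip=True) if heading else None


if __name__ == "__main__":
    soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
    print(soup.div["id"])                               # only the first <div>
    print(first_h1_text(SAMPLE_HTML))                   # the nested heading text
    print(h1_text_by_class(SAMPLE_HTML, "store-title"))
```

On the real page, the same calls only work if the `h1` is present in the HTML that was downloaded.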

Let me know if this helps!
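
If the loop prints nothing, the heading may simply not be in the response. Some sites reject the default `urllib` User-Agent, and some build the page with JavaScript after loading; I have not checked which, if either, applies to this DoorDash page. A rough sketch for sending a browser-like User-Agent follows. The helper names `build_request` and `fetch_html` and the `"Mozilla/5.0"` value are my own placeholders, not something from the question.

```python
import urllib.request


def build_request(url, user_agent="Mozilla/5.0"):
    """Create a Request that carries a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_html(url, timeout=10):
    """Download the raw bytes of a page; raises URLError/HTTPError on failure."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return response.read()
```

The bytes returned by `fetch_html(url_1)` can be passed to `BeautifulSoup` in place of `sauce_1`. If the `h1` is still missing, the page content is probably produced in the browser, and a plain HTTP request will not be enough.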

