I just started web scraping so that I can practice visualizing data. Based on some tutorials, I wrote a small script that I would like to turn into a function. Turning it into a function seems to work, but as soon as I change the tag and call the function a second time, I get an error message. My code is far from perfect, but I don't understand why calling the function twice fails. Below is the code, first the version that runs without an error and then the version that raises one.
import requests
import pandas as pd
from bs4 import BeautifulSoup

total_points = []

def getTotalpoints(tag):
    url = f'https://www.procyclingstats.com/team/{tag}/analysis/start'
    html_content = requests.get(url).text
    soup = BeautifulSoup(html_content, "lxml")
    team_riders = soup.find_all("table", attrs={"class": "basic"})

    table = soup.findAll('table')[0]
    rows = table.findAll('tr')
    heading = table.find('tr')
    headings = []
    for item in heading.find_all("th"):  # loop through all th elements
        # convert the th elements to text and strip "\n"
        item = (item.text).rstrip("\n")
        # append the clean column name to headings
        headings.append(item)
    headings_true = headings[4]
    # print(headings)
    points = []
    for row in rows[1:]:
        points.append(row.findAll('td')[4].text)
    total_points.append(points)
    df_season_points_astana_2020 = pd.DataFrame(data=points, columns=[headings_true])
    return

getTotalpoints('astana-pro-team-2010')
The code above gives me the right results when I change the tag. But when I add another call to the function, I get an error.
total_points = []

def getTotalpoints(tag):
    url = f'https://www.procyclingstats.com/team/{tag}/analysis/start'
    html_content = requests.get(url).text
    soup = BeautifulSoup(html_content, "lxml")
    team_riders = soup.find_all("table", attrs={"class": "basic"})

    table = soup.findAll('table')[0]
    rows = table.findAll('tr')
    heading = table.find('tr')
    headings = []
    for item in heading.find_all("th"):  # loop through all th elements
        # convert the th elements to text and strip "\n"
        item = (item.text).rstrip("\n")
        # append the clean column name to headings
        headings.append(item)
    headings_true = headings[4]
    # print(headings)
    points = []
    for row in rows[1:]:
        points.append(row.findAll('td')[4].text)
    total_points.append(points)
    df_season_points_astana_2020 = pd.DataFrame(data=points, columns=[headings_true])
    return

getTotalpoints('astana-pro-team-2010')
getTotalpoints('astana-pro-team-2011')
Error:
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-84-8f4e84667c62> in <module>
      1 getTotalpoints('astana-pro-team-2010')
----> 2 getTotalpoints('astana-pro-team-2011')

<ipython-input-83-ab852e5a1f1d> in getTotalpoints(tag)
      8     team_riders = soup.find_all("table", attrs={"class": "basic"})
      9
---> 10     table = soup.findAll('table')[0]
     11     rows = table.findAll('tr')
     12     heading = table.find('tr')

IndexError: list index out of range
I hope someone can explain to me what I'm doing wrong!
Thanks in advance.
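For context, the traceback shows that `soup.findAll('table')` returned an empty list for the second tag, so indexing it with `[0]` raised the `IndexError`. The post does not establish why that page yielded no table (a different layout or a failed or blocked request are both possible). The following is only a minimal sketch of one way to guard against that case, assuming the parsing step is separated from the download. The function name `extract_points`, its `column` parameter, and the sample HTML strings are hypothetical and do not reflect the real procyclingstats markup.

```python
from bs4 import BeautifulSoup


def extract_points(html, column=4):
    """Return the text of one column from the first table, or [] if there is none."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")  # find() returns None when nothing matches
    if table is None:
        return []
    points = []
    for row in table.find_all("tr")[1:]:  # skip the header row
        cells = row.find_all("td")
        if len(cells) > column:  # ignore rows that are too short
            points.append(cells[column].get_text(strip=True))
    return points


# Hypothetical sample pages, not real procyclingstats markup.
page_with_table = """
<table>
  <tr><th>a</th><th>b</th><th>c</th><th>d</th><th>Points</th></tr>
  <tr><td>1</td><td>2</td><td>3</td><td>4</td><td>120</td></tr>
  <tr><td>1</td><td>2</td><td>3</td><td>4</td><td>85</td></tr>
</table>
"""
page_without_table = "<html><body><p>No data</p></body></html>"

print(extract_points(page_with_table))     # ['120', '85']
print(extract_points(page_without_table))  # []
```

In the original function, a similar check could go before the `[0]` lookup, and printing `requests.get(url).status_code` for the failing tag would show whether the request itself succeeded.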
question from:
https://stackoverflow.com/questions/65928368/the-function-is-not-working-when-changing-tag