I am trying to use BeautifulSoup to scrape a table, but I only need the data from one of its columns. I have put the code in a function so that I can apply it to multiple pages more easily. Calling the function several times gives me one list per call, but when I convert this list of lists into a DataFrame, each list ends up as a row, with its values spread across the columns.
import pandas as pd
import requests
from bs4 import BeautifulSoup

total_points = []

def getTotalpoints(tag):
    url = f'https://www.procyclingstats.com/team/{tag}/analysis/start'
    html_content = requests.get(url).text
    soup = BeautifulSoup(html_content, "lxml")
    team_riders = soup.find_all("table", attrs={"class": "basic"})
    table = soup.findAll('table')[0]
    rows = table.findAll('tr')
    heading = table.find('tr')
    headings = []
    for item in heading.find_all("th"):  # loop through all th elements
        # convert the th elements to text and strip "\n"
        item = (item.text).rstrip("\n")
        # append the clean column name to headings
        headings.append(item)
    headings_true = headings[4]
    # print(headings)
    points = []
    for row in rows[1:]:
        # keep only the fifth cell of each data row
        points.append(row.findAll('td')[4].text)
    total_points.append(points)
    return
getTotalpoints('astana-pro-team-2010')
getTotalpoints('astana-pro-team-2013')
getTotalpoints('astana-pro-team-2016')
print(total_points)
[['1372', '1076', '581', '579', '334', '288', '282', '222', '183', '146', '116', '106', '106', '102', '78', '77', '68', '54', '43', '41', '40', '38', '25', '11', '10', '5', '5'], ['2225', '838', '682', '538', '457', '456', '411', '410', '329', '286', '284', '237', '205', '196', '150', '114', '110', '109', '104', '72', '68', '67', '56', '46', '45', '28', '16', '10', '10'], ['1178', '849', '772', '701', '663', '572', '548', '530', '355', '267', '249', '247', '239', '200', '188', '175', '160', '133', '113', '109', '96', '75', '74', '68', '50', '40', '38', '37', '31', '5', '', '']]
df = pd.DataFrame(total_points)
print(df)
      0     1    2    3    4    5    6    7    8    9  ...  22  23  24  25
0  1372  1076  581  579  334  288  282  222  183  146  ...  25  11  10   5
1  2225   838  682  538  457  456  411  410  329  286  ...  56  46  45  28
2  1178   849  772  701  663  572  548  530  355  267  ...  74  68  50  40
   26    27    28    29    30    31
0   5  None  None  None  None  None
1  16    10    10  None  None  None
2  38    37    31     5
How can I make each list its own column, with its values in the rows beneath it? I would like the result to look like this:
column 1 column 2 column 3
row 1 row 1 row 1
row 2 row 2 row 2
row 3 row 3 row 3
row 4 row 4 row 4
etc etc etc
So every list in its own column instead of every row in its own column.
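To make the goal concrete, here is a minimal sketch of the layout I am after, not a definitive solution. It uses shortened stand-ins for the scraped lists so it runs without any network access, and the column labels in `tags` are hypothetical names I made up for illustration. It assumes that wrapping each list in a `pd.Series` lets pandas pad the shorter lists with NaN, and that transposing the original DataFrame with `.T` gives the same shape.

```python
import pandas as pd

# Shortened stand-ins for the scraped lists (the real ones hold 27 to 32 values).
total_points = [
    ['1372', '1076', '581'],
    ['2225', '838', '682', '538'],
    ['1178', '849'],
]

# Hypothetical column labels, one per call to getTotalpoints.
tags = ['2010', '2013', '2016']

# Option 1: build the frame from a dict of Series, one Series per list.
# Lists of unequal length are aligned on the index and padded with NaN.
df = pd.DataFrame({tag: pd.Series(points) for tag, points in zip(tags, total_points)})

# Option 2: build the frame as before (one list per row), then transpose it.
df_t = pd.DataFrame(total_points).T
df_t.columns = tags

print(df)
print(df_t)
```

Both options should give one column per list. The values are still strings at this point, so they would need converting before doing any arithmetic on them.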
Thanks for your answers!
question from:
https://stackoverflow.com/questions/65938382/get-results-in-to-df-when-using-function