After following this Python script, I tried several ways to save the data for home games only and for away games only, as shown on https://understat.com/league/EPL ([screenshot][1]).
Basically, I want to scrape the data in the blue box (the home and away views); for the data in the red box, this script is already doing the job.
[1]: https://i.stack.imgur.com/G9K4F.png
Code:

    import json

    import pandas as pd

    # `soup` is assumed to be the BeautifulSoup object for the league page,
    # created earlier in the script (not shown here).

    # Based on the structure of the webpage, the data is in a JSON variable
    # inside <script> tags
    scripts = soup.find_all('script')

    string_with_json_obj = ''

    # Find data for teams
    for el in scripts:
        if 'teamsData' in el.text:
            string_with_json_obj = el.text.strip()

    # print(string_with_json_obj)

    # Strip unnecessary symbols and get only the JSON data
    ind_start = string_with_json_obj.index("('") + 2
    ind_end = string_with_json_obj.index("')")
    json_data = string_with_json_obj[ind_start:ind_end]
    json_data = json_data.encode('utf8').decode('unicode_escape')

    # Convert JSON data into a Python dictionary
    data = json.loads(json_data)

    # Get teams and their relevant ids and put them into a separate dictionary
    teams = {}
    for team_id in data.keys():
        teams[team_id] = data[team_id]['title']

    # EDA to get a feeling of how the JSON is structured
    # Column names are all the same, so we just use the first element
    columns = []
    # Check the sample of values for each column
    values = []
    for team_id in data.keys():
        columns = list(data[team_id]['history'][0].keys())
        values = list(data[team_id]['history'][0].values())
        break

    # Get data for all teams: one DataFrame per team, one row per match
    dataframes = {}
    for team_id, team in teams.items():
        teams_data = []
        for row in data[team_id]['history']:
            teams_data.append(list(row.values()))
        df = pd.DataFrame(teams_data, columns=columns)
        dataframes[team] = df

This script only scrapes the data for "all" games. I would like to build the same data, but separately for away games and for home games, and export it to CSV.
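
One possible way to get the home and away split is to filter each team's match rows before building the DataFrames. The sketch below is not a confirmed solution: it assumes that every row in `history` carries an `h_a` key whose value is `'h'` for a home game and `'a'` for an away game, and it uses a small made-up `data` dictionary in place of the scraped one. The helper names `split_home_away` and `export_csv`, the output file names, and the sample numbers are all hypothetical.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Made-up sample shaped like the parsed `data` dict from the script.
# Assumption: each history row has an 'h_a' key ('h' = home, 'a' = away).
data = {
    "1": {"title": "Team A", "history": [
        {"h_a": "h", "xG": 1.8, "scored": 2},
        {"h_a": "a", "xG": 0.7, "scored": 0},
        {"h_a": "h", "xG": 1.1, "scored": 1},
    ]},
    "2": {"title": "Team B", "history": [
        {"h_a": "a", "xG": 0.9, "scored": 1},
        {"h_a": "h", "xG": 2.3, "scored": 3},
    ]},
}


def split_home_away(data):
    """Return {'h': {team: DataFrame}, 'a': {team: DataFrame}}."""
    frames = {"h": {}, "a": {}}
    for team in data.values():
        df = pd.DataFrame(team["history"])
        for venue in ("h", "a"):
            # Keep only the rows played at this venue
            subset = df[df["h_a"] == venue].reset_index(drop=True)
            frames[venue][team["title"]] = subset
    return frames


def export_csv(frames, out_dir):
    """Write home.csv and away.csv with a leading 'team' column."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for venue, label in (("h", "home"), ("a", "away")):
        # Stack all teams; the dict keys become the 'team' index level
        combined = pd.concat(frames[venue], names=["team", "match"])
        combined = combined.reset_index(level="team")
        path = out_dir / f"{label}.csv"
        combined.to_csv(path, index=False)
        paths.append(path)
    return paths


frames = split_home_away(data)
paths = export_csv(frames, tempfile.mkdtemp())
```

With the real scraped `data`, `frames['h']` and `frames['a']` would hold one DataFrame per team for home and away games respectively. Aggregated home or away tables like the ones on the site would still need to be computed from these per-match rows, for example with `sum()` over the numeric columns.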
Credits: https://towardsdatascience.com/web-scraping-advanced-football-statistics-11cace1d863a
Asked by JoseM117 (from Stack Overflow)