Just in case you want to avoid looping twice, you can also use a BeautifulSoup CSS selector that chains the class and the <a> tag. Take your soup and select like this:
soup.select('.p-list-sec a')
To shape the information for further processing, you can use a single for loop or a one-line list comprehension:
[{'url':link['href'], 'title':link['title']} for link in soup.select('.p-list-sec a')]
Output
[{'url': 'link1', 'title': 'tltle1'},
{'url': 'link2', 'title': 'tltle2'},
{'url': 'link3', 'title': 'tltle3'},
{'url': 'link1', 'title': 'tltle1'},
{'url': 'link2', 'title': 'tltle2'},
{'url': 'link3', 'title': 'tltle3'},
{'url': 'link1', 'title': 'tltle1'},
{'url': 'link2', 'title': 'tltle2'},
{'url': 'link3', 'title': 'tltle3'}]
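To try the selector and the comprehension in isolation, here is a minimal sketch. The HTML snippet is made up for illustration and is not the markup from the question. It also swaps link['title'] for link.get('title'), on the assumption that some links might lack a title attribute: indexing raises a KeyError in that case, while .get returns None.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page
html = """
<div class="p-list-sec">
  <a href="link1" title="title1">First</a>
  <a href="link2">No title attribute</a>
</div>
<a href="outside" title="ignored">Outside the container</a>
"""

soup = BeautifulSoup(html, 'html.parser')

# '.p-list-sec a' matches every <a> nested inside an element with class p-list-sec
links = [
    {'url': link.get('href'), 'title': link.get('title')}
    for link in soup.select('.p-list-sec a')
]
print(links)
```

The link outside the .p-list-sec container is skipped, and the link without a title ends up with None instead of stopping the whole comprehension.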
To store it in a CSV file, feel free to push it into pandas or the csv module.
Pandas:
import pandas as pd
pd.DataFrame([{'url':link['href'], 'title':link['title']} for link in soup.select('.p-list-sec a')]).to_csv('url.csv', index=False)
CSV:
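import csv
# Collect one dict per link inside the .p-list-sec containers
data_list = [{'url':link['href'], 'title':link['title']} for link in soup.select('.p-list-sec a')]
keys = data_list[0].keys()  # column names; assumes at least one link was found
# newline='' keeps the csv module from writing blank rows on Windows
with open('url.csv', 'w', newline='') as output_file:
    dict_writer = csv.DictWriter(output_file, keys)
    dict_writer.writeheader()
    dict_writer.writerows(data_list)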
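If the selector might match nothing, data_list[0] raises an IndexError. A small variation, sketched here with made-up rows and a hypothetical helper name rows_to_csv, passes the column names explicitly so the header is still written for an empty list. It writes to an in-memory buffer only to keep the example self-contained.

```python
import csv
import io

# Hypothetical sample rows standing in for the scraped links
data_list = [
    {'url': 'link1', 'title': 'title1'},
    {'url': 'link2', 'title': 'title2'},
]

def rows_to_csv(rows, fieldnames=('url', 'title')):
    """Return the rows as CSV text; works even when rows is empty."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(fieldnames))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = rows_to_csv(data_list)
print(csv_text)
```

To write a real file instead, the same DictWriter calls work on a handle opened with open('url.csv', 'w', newline='').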