python - 'list' object has no attribute 'timeout'

Question

Welcome To Ask or Share your Answers For Others

python - 'list' object has no attribute 'timeout'

asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

python - 'list' object has no attribute 'timeout'

I am trying to download Pdfs using urllib.request.urlopen from a page but it returns an error: 'list' object has no attribute 'timeout':

def get_hansard_data(page_url):
    #Read base_url into Beautiful soup Object
    html = urllib.request.urlopen(page_url).read()
    soup = BeautifulSoup(html, "html.parser")
    #grab <div class="itemContainer"> that hold links and dates to all hansard pdfs
    hansard_menu = soup.find_all("div","itemContainer")

    #Get all hansards
    #write to a tsv file
    with open("hansards.tsv","a") as f:
        fieldnames = ("date","hansard_url")
        output = csv.writer(f, delimiter="")

        for div in hansard_menu:
            hansard_link = [HANSARD_URL + div.a["href"]]
            hansard_date = div.find("h3", "catItemTitle").string

            #download
            
            with urllib.request.urlopen(hansard_link) as response:
                data = response.read()
                r = open("/Users/Parliament Hansards/"+hansard_date +".txt","wb")
                r.write(data)
                r.close()

            print(hansard_date)
            print(hansard_link)
            output.writerow([hansard_date,hansard_link])
        print ("Done Writing File")

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Answer

深蓝 · Answer 1 · 2021-10-23T21:17:17+0000

A bit late, but might still be helpful to someone else (if not for topic starter). I found the solution by solving the same problem.

The problem was that page_url (in your case) was a list, rather than a string. The reason for that is mos likely that page_url comes from argparse.parse_args() (at least it was so in my case). Doing page_url[0] should work but it is not nice to do that inside the def get_hansard_data(page_url) function. Better would be to check the type of the parameter and return an appropriate error to the function caller, if the type does not match.

The type of an argument could be checked by calling type(page_url) and comparing the result like for example: typen("") == type(page_url). I am sure there might be more elegant way to do that, but it is out of the scope of this particular question.

Categories

python - 'list' object has no attribute 'timeout'

python - 'list' object has no attribute 'timeout'

Please log in or register to add a comment.

Please log in or register to answer this question.

1 Answer

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags