Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
266 views
in Technique[技术] by (71.8m points)

python - Creating a list of dictionaries results in a list of copies of the same dictionary

I want to get all the iframe from a webpage.

Code:

site = "http://" + url
f = urllib2.urlopen(site)
web_content =  f.read()

soup = BeautifulSoup(web_content)
info = {}
content = []
for iframe in soup.find_all('iframe'):
    info['src'] = iframe.get('src')
    info['height'] = iframe.get('height')
    info['width'] = iframe.get('width')
    content.append(info)
    print(info)       

pprint(content)

result of print(info):

{'src': u'abc.com', 'width': u'0', 'height': u'0'}
{'src': u'xyz.com', 'width': u'0', 'height': u'0'}
{'src': u'http://www.detik.com', 'width': u'1000', 'height': u'600'}

result of pprint(content):

[{'height': u'600', 'src': u'http://www.detik.com', 'width': u'1000'},
{'height': u'600', 'src': u'http://www.detik.com', 'width': u'1000'},
{'height': u'600', 'src': u'http://www.detik.com', 'width': u'1000'}]

Why is the value of the content not right? It's suppose to be the same as the value when I print(info).

Question&Answers:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

You are not creating a separate dictionary for each iframe, you just keep modifying the same dictionary over and over, and you keep adding additional references to that dictionary in your list.

Remember, when you do something like content.append(info), you aren't making a copy of the data, you are simply appending a reference to the data.

You need to create a new dictionary for each iframe.

for iframe in soup.find_all('iframe'):
   info = {}
    ...

Even better, you don't need to create an empty dictionary first. Just create it all at once:

for iframe in soup.find_all('iframe'):
    info = {
        "src":    iframe.get('src'),
        "height": iframe.get('height'),
        "width":  iframe.get('width'),
    }
    content.append(info)

There are other ways to accomplish this, such as iterating over a list of attributes, or using list or dictionary comprehensions, but it's hard to improve upon the clarity of the above code.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...