Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
295 views
in Technique[技术] by (71.8m points)

urllib2 - Python and urllib

I'm trying to download a zip file ("tl_2008_01001_edges.zip") from an ftp census site using urllib. What form is the zip file in when I get it and how do I save it?

I'm fairly new to Python and don't understand how urllib works.

This is my attempt:

import urllib, sys

zip_file = urllib.urlretrieve("ftp://ftp2.census.gov/geo/tiger/TIGER2008/01_ALABAMA/Autauga_County/", "tl_2008_01001_edges.zip")

If I know the list of ftp folders (or counties in this case), can I run through the ftp site list using the glob function?

Thanks.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Use urllib2.urlopen() for the zip file data and directory listing.

To process zip files with the zipfile module, you can write them to a disk file which is then passed to the zipfile.ZipFile constructor. Retrieving the data is straightforward using read() on the file-like object returned by urllib2.urlopen().

Fetching directories:

>>> files = urllib2.urlopen('ftp://ftp2.census.gov/geo/tiger/TIGER2008/01_ALABAMA/').read().splitlines()
>>> for l in files[:4]: print l
... 
drwxrwsr-x    2 0        4009         4096 Nov 26  2008 01001_Autauga_County
drwxrwsr-x    2 0        4009         4096 Nov 26  2008 01003_Baldwin_County
drwxrwsr-x    2 0        4009         4096 Nov 26  2008 01005_Barbour_County
drwxrwsr-x    2 0        4009         4096 Nov 26  2008 01007_Bibb_County
>>> 

Or, splitting for directory names:

>>> for l in files[:4]: print l.split()[-1]
... 
01001_Autauga_County
01003_Baldwin_County
01005_Barbour_County
01007_Bibb_County

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...