Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
894 views
in Technique[技术] by (71.8m points)

curl - How can I un-shorten a URL using python?

I have seen this thread already - How can I unshorten a URL?

My issue with the resolved answer (that is using the unshort.me API) is that I am focusing on unshortening youtube links. Since unshort.me is used readily, this returns almost 90% of the results with captchas which I am unable to resolve.

So far I am stuck with using:

def unshorten_url(url):
    resolvedURL = urllib2.urlopen(url)  
    print resolvedURL.url

    #t = Test()
    #c = pycurl.Curl()
    #c.setopt(c.URL, 'http://api.unshort.me/?r=%s&t=xml' % (url))
    #c.setopt(c.WRITEFUNCTION, t.body_callback)
    #c.perform()
    #c.close()
    #dom = xml.dom.minidom.parseString(t.contents)
    #resolvedURL = dom.getElementsByTagName("resolvedURL")[0].firstChild.nodeValue
    return resolvedURL.url

Note: everything in the comments is what I tried to do when using the unshort.me service which was returning captcha links.

Does anyone know of a more efficient way to complete this operation without using open (since it is a waste of bandwidth)?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

one line functions, using requests library and yes, it supports recursion.

def unshorten_url(url):
    return requests.head(url, allow_redirects=True).url

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...