Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
312 views
in Technique[技术] by (71.8m points)

python - Download file via hyperlink in PhantomJS using Selenium

I am using selenium to do a click function on a hyperlink, which is loaded on a certain page. The script works for google chrome, but does not for phantomjs. Why is this not working?

from selenium import webdriver

driver = webdriver.Chrome()   
#driver = webdriver.PhantomJS(executable_path = "/Users/jameslemieux/PythonProjects/phantomjs-1.9.8-macosx/bin/phantomjs")

driver.get("http://www.youtube-mp3.org/?e=t_exp&r=true#v=hC-T0rC6m7I")

elem = driver.find_element_by_link_text('Download')
elem.click()


driver.save_screenshot('/Users/jameslemieux/Desktop/Misc./test_image.png')

driver.quit()

This works in chrome, but it always opens up a new chrome window to complete the task. I read that I should use phantomjs to have it run behind the scenes, however when i switch the drivers to phantomjs, the download does not seem to go through. The screenshot grabs, and it is indeed at the right page, and the 'Download' is definitely there. So the

elem.click()

is not doing what it should, or it IS clicking, but phantomjs doesnt know how to deal with a direct download link. Please help, ive been at this for hours on end.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Since PhantomJS would never proceed with a download request, we need to download the file manually.

The idea here is to click the "Convert" button, wait for the "Download" link to appear, get the href attribute, containing the link to the generated mp3 file, and download it via urllib.urlretrieve():

import urllib
from urlparse import urljoin

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

base_url = 'http://www.youtube-mp3.org/'

driver = webdriver.PhantomJS()
driver.get("http://www.youtube-mp3.org/?e=t_exp&r=true#v=hC-T0rC6m7I")

# convert the video to mp3
driver.find_element_by_id('submit').click()

# wait for download link to appear
element = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.LINK_TEXT, "Download")))
link = element.get_attribute('href')
url = urljoin(base_url, link)

# download the song
urllib.urlretrieve(url, 'song.mp3')

driver.quit()

# enjoy the great song

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...