I just saw this, a year and a bit later, but hopefully not too late for some googler...
Typhoeus by far the best solution for this. It wraps libcurl in a really elegant fashion. You can set the max_concurrency
up to about 200 without it choking.
With respect to timeouts, if you pass Typhoeus a :timeout
flag, it will just register a timeout as the response... and then you can even put the request back in another hydra to try again if you like.
Here's your program rewritten with Typhoeus. Hopefully this helps anybody who comes across this page later!
require 'typhoeus'
urls = [
'http://www.google.com/',
'http://www.yandex.ru/',
'http://www.baidu.com/'
]
hydra = Typhoeus::Hydra.new
successes = 0
urls.each do |url|
request = Typhoeus::Request.new(url, timeout: 15000)
request.on_complete do |response|
if response.success?
puts "Successfully requested " + url
successes += 1
else
puts "Failed to get " + url
end
end
hydra.queue(request)
end
hydra.run
puts "Fetched all urls!" if successes == urls.length
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…