Radek. I'm going to show you how to fish.
When you call Mechanize::Page::parser, it's giving you the Nokogiri document. So your "xpath" and "at_xpath" calls are invoking Nokogiri. The problem is in your xpaths. In general, start out with the most general xpath you can get to work, and then narrow it down. So, for example, instead of this:
puts post_page.parser.xpath('/html/body/div/div/div/div/div/table/tbody/tr/td/div[2]/text()').to_s.strip
start with this:
puts post_page.parser.xpath('//table').to_html
This gets any tables, anywhere, and then prints them as HTML. Examine the HTML to see what tables it brought back. It probably grabbed several when you want only one, so you'll need to tell it how to pick out the one table you want. If, for example, you notice that the table you want has CSS class "userdata", then try this:
puts post_page.parser.xpath("//table[@class='userdata']").to_html
Any time you get back nothing (an empty result), you goofed up the xpath, so fix it before proceeding. Once you're getting the table you want, then try to get the rows:
puts post_page.parser.xpath("//table[@class='userdata']//tr").to_html
If that worked, then take off the "to_html" and you now have an array of Nokogiri nodes, each one a table row.
And that's how you do it.