Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

0 votes
193 views
in Technique by (71.8m points)

java - Trying to parse HTML hidden by JavaScript

I've created a simple script in Java that uses Jsoup to parse a page of data. The site's creators have since changed the page so that, if there is more than a certain amount of data, it gives you the option to refine your search, or you can click a link and the data will come up. I've been tearing my hair out trying to find a solution: the URL doesn't change, and the href for the link is just javascript:void(0);. Is there any way I can get at the HTML containing the data using just my script?


He who fights dragons for too long becomes a dragon himself; gaze too long into the abyss, and the abyss gazes back…

1 Answer

0 votes
by (71.8m points)

Try something that drives a real web browser, such as Selenium. It's the only one I have used and I've never needed anything else, though other tools may suit you better, so it may be worth testing a few. Once the browser has run the JavaScript and the data is on the page, get the rendered elements from Selenium (or whichever web driver you choose) and parse them into Jsoup Elements. That way you don't have to change libraries completely, just add one.
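A minimal sketch of that approach, assuming Selenium 4 and a recent Jsoup are on the classpath and Chrome is installed. The URL, the link text, and the row selector are hypothetical placeholders, since the original question does not name the site; replace them with the real values.

```java
import java.time.Duration;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

public class RenderedPageScraper {

    // Parses already-rendered HTML with Jsoup and returns the text of each matching element.
    static List<String> extractTexts(String html, String cssQuery) {
        Document doc = Jsoup.parse(html);
        return doc.select(cssQuery).eachText();
    }

    public static void main(String[] args) {
        // Hypothetical values: the question does not give the real page or markup.
        String url = "https://example.com/search";
        By showAllLink = By.linkText("Show all results");
        String rowSelector = "table#results tr";

        WebDriver driver = new ChromeDriver();
        try {
            driver.get(url);
            // Click the javascript:void(0) link so the browser runs its handler.
            driver.findElement(showAllLink).click();
            // Wait until the script has inserted at least one result row.
            new WebDriverWait(driver, Duration.ofSeconds(10))
                .until(ExpectedConditions.presenceOfElementLocated(By.cssSelector(rowSelector)));
            // Hand the rendered HTML to Jsoup; existing Jsoup parsing code can be reused here.
            List<String> rows = extractTexts(driver.getPageSource(), rowSelector);
            rows.forEach(System.out::println);
        } finally {
            driver.quit();
        }
    }
}
```

Keeping the Jsoup step in its own method means the existing parsing logic only needs a new source for its HTML string; the browser is used solely to run the page's JavaScript.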

Also, there are ways to work around the JavaScript by watching what changes in the browser's address bar.
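As a rough illustration of the address-bar idea: if clicking around the site does change the query string, the same URL can be built directly and fetched without a browser. This sketch uses only the JDK (Java 10 or later); the base URL and the parameter names q and page are made up for the example and would need to match whatever the real site shows.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class QueryUrlBuilder {

    // Appends URL-encoded query parameters to a base URL, in the map's iteration order.
    static String buildQueryUrl(String baseUrl, Map<String, String> params) {
        String query = params.entrySet().stream()
            .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
            .collect(Collectors.joining("&"));
        return query.isEmpty() ? baseUrl : baseUrl + "?" + query;
    }

    public static void main(String[] args) {
        // Hypothetical parameters copied from what the address bar might show.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "blue widgets");
        params.put("page", "2");
        // prints https://example.com/search?q=blue+widgets&page=2
        System.out.println(buildQueryUrl("https://example.com/search", params));
    }
}
```

The resulting string can then be passed to Jsoup.connect(url).get() as before. This only helps when the site encodes the search state in the URL, which may not be the case for the page in the question.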



...