It ended up being an intricate XPath expression:
library(XML)
# Parse the page, then collect the href of every resource whose fifth cell reads 'Specimen'
sitePage <- htmlParse("http://ipt.humboldt.org.co/")
hyperlinksYouNeed <- getNodeSet(sitePage, "//table[@id='resourcestable']
//td[5][.='Specimen']
/preceding-sibling
::td[3]
/a
/@href")
Let me explain the XPath expression bit by bit:
//table[@id='resourcestable']
-> This selects the table on the page whose id is 'resourcestable'.
//td[5][.='Specimen']
-> Now we keep only the rows whose fifth cell (td[5]) is exactly 'Specimen'. The node we are standing on is that fifth cell.
/preceding-sibling
-> From that cell we start looking backwards along the same row.
::td[3]
-> Three steps, to be precise, counting backwards from where we are. Be careful: preceding-sibling counts backwards, so td[1] is the Type column, td[2] is the Organisation column and td[3] is the Name column we want.
/a
-> Now get the a node inside that cell.
/@href
-> And finally, more precisely, the content of its href attribute.
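To check the expression without hitting the live site, here is a minimal sketch that runs the same steps against a small inline table. The HTML, the hrefs and the filler cells are invented for illustration and are not the real page layout; only the id 'resourcestable' and the 'Specimen' value come from the answer above. It uses xpathSApply with xmlGetAttr to read the href from each a node, which is an alternative to ending the path in /@href.

```r
library(XML)

# Hypothetical stand-in for the real page: five cells per row,
# link in the 2nd cell, the value we filter on in the 5th cell.
html <- "
<table id='resourcestable'>
  <tr><td>1</td><td><a href='/resource/birds'>Birds</a></td><td>Org A</td><td>x</td><td>Specimen</td></tr>
  <tr><td>2</td><td><a href='/resource/frogs'>Frogs</a></td><td>Org B</td><td>x</td><td>Observation</td></tr>
  <tr><td>3</td><td><a href='/resource/ferns'>Ferns</a></td><td>Org C</td><td>x</td><td>Specimen</td></tr>
</table>"

# asText = TRUE tells htmlParse that the argument is markup, not a file name or URL
doc <- htmlParse(html, asText = TRUE)

# Same steps as above, stopping at the a node instead of its href attribute
xp <- "//table[@id='resourcestable']//td[5][.='Specimen']/preceding-sibling::td[3]/a"

# Apply xmlGetAttr(node, 'href') to every matched a node
links <- xpathSApply(doc, xp, xmlGetAttr, "href")
print(links)
```

Only the first and third rows have 'Specimen' in the fifth cell, so only their links should come back; changing ::td[3] to ::td[1] or ::td[2] is a quick way to see how the backwards counting behaves.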