Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
427 views
in Technique[技术] by (71.8m points)

web scraping - Web Scrapping by google sheet

I want to scrap data from link below: http://fipiran.ir/Symbol/HistoryPrice?symbolpara=%DA%86%D9%83%D8%A7%D9%BE%D8%A7

and Also I want 365 rows of the table. I tried:

=importhtml("http://fipiran.ir/Symbol/HistoryPrice?symbolpara=%DA%86%D9%83%D8%A7%D9%BE%D8%A7","table",0)

and

=importxml("http://fipiran.ir/Symbol/HistoryPrice?symbolpara=%DA%86%D9%83%D8%A7%D9%BE%D8%A7","//*[@id='gbox_grid']
")

But nothing comes out.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

You cannot use IMPORTHTML on the page because this page is loaded dynamically via JavaScript (see related question).

What you can do, though, is use the API call it does (example) and get the data that way. I’m not sure if this is allowed intentionally or not, and you should check with the owners of the page.

You can add a function that does that in Google Apps Script:

function IMPORTJSON(url, columns) {
 if (!Array.isArray(columns)) return []
 columns = columns.flat()
 
 const response = UrlFetchApp.fetch(url)
 return JSON.parse(response.getContentText()).data.map(
   obj => columns.map(key => obj[key])
 )
}

And to use it:

=IMPORTJSON("http://example.com/xxx", {"propertyCol1", "propertyCol2", "etc"})

Notice that this example ignores pagination and uses the data only from the first page.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...