im scraping from this website https://www.bi.go.id/id/statistik/informasi-kurs/transaksi-bi/Default.aspx to get kurs price table.it looks like im getting blocked by that website.
with selenium and bs4,but when im trying to get the table i got error like this
ChunkedEncodingError: ("Connection broken: ConnectionResetError(10054,
'An existing connection was forcibly closed by the remote host', None,
10054, None)", ConnectionResetError(10054, 'An existing connection was
forcibly closed by the remote host', None, 10054, None))
and this is my code
driver = webdriver.Chrome()
driver.get("https://www.bi.go.id/id/statistik/informasi-kurs/transaksi-bi/Default.aspx")
wait = WebDriverWait(driver, 10)
driver.implicitly_wait(10) #secs
# click "usd"
book = wait.until(EC.element_to_be_clickable((By.ID,"selectPeriod")))
sel = Select(book)
sel.select_by_value("range")
bookk = wait.until(EC.element_to_be_clickable((By.ID,"ctl00_PlaceHolderMain_g_6c89d4ad_107f_437d_bd54_8fda17b556bf_ctl00_ddlmatauang1")))
sel = Select(bookk)
sel.select_by_value("USD ")
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
start_date = driver.find_element_by_id("ctl00_PlaceHolderMain_g_6c89d4ad_107f_437d_bd54_8fda17b556bf_ctl00_txtFrom")
start_date.send_keys("20-Nov-15")
end_date = driver.find_element_by_id("ctl00_PlaceHolderMain_g_6c89d4ad_107f_437d_bd54_8fda17b556bf_ctl00_txtTo")
end_date.send_keys(time.strftime("%d-%m-%Y"))
time.sleep(5)
buttons = driver.find_elements_by_xpath("//input[@value='Cari']")
buttons[1].click()
src = driver.page_source # gets the html source of the page
headers = {
#"Referer": "https://id.investing.com/commodities/gold-historical-data",
"User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:80.0) Gecko/20100101 Firefox/80.0",
"X-Requested-With": "XMLHttpRequest"
}
parser = BeautifulSoup(src,"lxml") # initialize the parser and parse the source "src"
url = "https://www.bi.go.id/id/statistik/informasi-kurs/transaksi-bi/Default.aspx"
r = requests.get(url, headers=headers)
html = r.text
table = parser.find("table", attrs={"class" : "table1"}) # A list of attributes that you want to check in a tag)
rows = table.find_all('tr')
data = []
for row in rows[1:]:
cols = row.find_all('td')
cols = [ele.text.strip() for ele in cols]
data.append([ele for ele in cols if ele])
result = pd.DataFrame(data, columns=['nilai', 'kurs_jual', 'kurs_beli', 'tanggal'])
result.to_csv("kurs1.csv", index=False)
df = pd.read_csv("kurs1.csv")
pd.set_option('display.max_rows', df.shape[0]+1)
print(df)
what should i do? please help me,actually a month ago i already succeed but suddenly the class,id from that website changed so i have to change it all.when i try to run it again i got connection error.im stuck since weeks ago! thank you in advance the csv file shouldve been like this
question from:
https://stackoverflow.com/questions/65894976/scraping-python-existing-connection-was-forcibly-closed-by-the-remote-host 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…