
selenium - scraping Python: existing connection was forcibly closed by the remote host

I'm scraping this website https://www.bi.go.id/id/statistik/informasi-kurs/transaksi-bi/Default.aspx with Selenium and bs4 to get the kurs (exchange rate) price table. It looks like I'm getting blocked by that website, because when I try to get the table I get an error like this:

ChunkedEncodingError: ("Connection broken: ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None)", ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))
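For reference, the 10054 in this traceback is Windows' "connection reset by peer": the server closed the TCP connection mid-response, which usually points to rate limiting or bot filtering rather than a network fault. A minimal sketch to test the bare request outside Selenium, retrying with backoff and the same browser User-Agent used in the code below (this is just a diagnostic, not the fix):

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

url = "https://www.bi.go.id/id/statistik/informasi-kurs/transaksi-bi/Default.aspx"
session = requests.Session()
# retry up to 3 times with increasing backoff in case the reset is transient
session.mount("https://", HTTPAdapter(max_retries=Retry(total=3, backoff_factor=1)))
headers = {"User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:80.0) Gecko/20100101 Firefox/80.0"}
resp = session.get(url, headers=headers, timeout=30)
print(resp.status_code, len(resp.text))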

And this is my code:

import time

import pandas as pd
import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import Select, WebDriverWait

driver = webdriver.Chrome()
driver.get("https://www.bi.go.id/id/statistik/informasi-kurs/transaksi-bi/Default.aspx")

wait = WebDriverWait(driver, 10)

driver.implicitly_wait(10)  # secs


# choose "range" as the period
book = wait.until(EC.element_to_be_clickable((By.ID, "selectPeriod")))
sel = Select(book)
sel.select_by_value("range")

# choose "USD" as the currency (the option value really has trailing spaces)
bookk = wait.until(EC.element_to_be_clickable((By.ID, "ctl00_PlaceHolderMain_g_6c89d4ad_107f_437d_bd54_8fda17b556bf_ctl00_ddlmatauang1")))
sel = Select(bookk)
sel.select_by_value("USD  ")

driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

# fill in the date range; the fields expect the dd-Mon-yy format
start_date = driver.find_element_by_id("ctl00_PlaceHolderMain_g_6c89d4ad_107f_437d_bd54_8fda17b556bf_ctl00_txtFrom")
start_date.send_keys("20-Nov-15")
end_date = driver.find_element_by_id("ctl00_PlaceHolderMain_g_6c89d4ad_107f_437d_bd54_8fda17b556bf_ctl00_txtTo")
end_date.send_keys(time.strftime("%d-%b-%y"))  # dd-Mon-yy, same format as the start date

time.sleep(5)
buttons = driver.find_elements_by_xpath("//input[@value='Cari']")  # "Cari" = "Search"
buttons[1].click()

src = driver.page_source  # HTML source of the page as rendered by Selenium
headers = {
    #"Referer": "https://id.investing.com/commodities/gold-historical-data",
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:80.0) Gecko/20100101 Firefox/80.0",
    "X-Requested-With": "XMLHttpRequest"
}
parser = BeautifulSoup(src, "lxml")  # initialize the parser and parse the source "src"

# this plain GET is where the ConnectionResetError is raised; note that its
# result (html) is never used below -- the table comes from the Selenium source
url = "https://www.bi.go.id/id/statistik/informasi-kurs/transaksi-bi/Default.aspx"
r = requests.get(url, headers=headers)
html = r.text

table = parser.find("table", attrs={"class": "table1"})  # the kurs table
rows = table.find_all('tr')
data = []
for row in rows[1:]:  # skip the header row
    cols = row.find_all('td')
    cols = [ele.text.strip() for ele in cols]
    data.append([ele for ele in cols if ele])

result = pd.DataFrame(data, columns=['nilai', 'kurs_jual', 'kurs_beli', 'tanggal'])
result.to_csv("kurs1.csv", index=False)

df = pd.read_csv("kurs1.csv")
pd.set_option('display.max_rows', df.shape[0] + 1)
print(df)

What should I do? Please help me. A month ago I had already succeeded, but then the classes and ids on that website changed, so I had to update them all; when I try to run it again, I get this connection error. I've been stuck for weeks! Thank you in advance. The CSV file should have come out like before, with the nilai, kurs_jual, kurs_beli, and tanggal columns.
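One detail worth noting in the code above: the table is parsed from driver.page_source, which Selenium has already rendered, so the separate requests.get call (the line where the ChunkedEncodingError is actually raised) is never used afterwards. A minimal sketch of the same extraction with that call dropped, assuming the Selenium session is already on the results page and the table still has class "table1":

import pandas as pd
from bs4 import BeautifulSoup

# src = driver.page_source, captured after clicking "Cari" as in the code above
parser = BeautifulSoup(src, "lxml")
table = parser.find("table", attrs={"class": "table1"})
# read_html turns the table element straight into a DataFrame (needs lxml installed)
df = pd.read_html(str(table))[0]
df.to_csv("kurs1.csv", index=False)
print(df)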

question from: https://stackoverflow.com/questions/65894976/scraping-python-existing-connection-was-forcibly-closed-by-the-remote-host


1 Answer

Waiting for answers.
