Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

rselenium - Is there an R function to fill in an online form?

I'm trying to fill in an online form and scrape the results. Using RSelenium, I can fill in the data for one row:

    library(RSelenium)
    library(xml2)
    library(tidyverse)
    library(rvest)
    
    
    
    # Start Selenium Server --------------------------------------------------------
    
    # https://docs.ropensci.org/RSelenium/articles/basics.html#connecting-to-a-selenium-server-1
    # https://www.akipredictor.com/en/aki_predictor/
    
    rD <- rsDriver(browser = "firefox", port = 4545L, verbose = FALSE)
    remDr <- rD[["client"]]
    
    # form ------------------------------------------------------------------
    remDr$navigate('https://www.akipredictor.com/en/aki_predictor/')
    remDr$findElement(using = "name", value = "agree_to_legal_terms")$clickElement()
    
    #Pre-admission information
    webElemAge <- remDr$findElement(using = "name", value = "age")
    webElemAge$sendKeysToElement(list("70"))
    webElemBaselineSCreat <- remDr$findElement(using = "name", value = "baseline_screat")
    webElemBaselineSCreat$sendKeysToElement(list("1"))
    webElemIsDiabetic <- remDr$findElement(using = "name", value = "is_diabetic")
    webElemIsDiabetic$sendKeysToElement(list("Yes"))
    webElemIsElectiveAdmited <- remDr$findElement(using = "name", value = "is_elective_admitted")
    webElemIsElectiveAdmited$sendKeysToElement(list("Unplanned admission"))
    webElemTypeOfSurgery <- remDr$findElement(using = "name", value = "type_of_surgery")
    webElemTypeOfSurgery$sendKeysToElement(list("Transplant surgery"))
    
    # ICU admission information
    remDr$findElement(using = "name", value = "show_admission")$clickElement()
    webElemBloodGlucose <- remDr$findElement(using = "name", value = "blood_glucose")
    webElemBloodGlucose$sendKeysToElement(list("200"))
    webElemHasSuspectedSepsis <- remDr$findElement(using = "name", value = "has_suspected_sepsis")
    webElemHasSuspectedSepsis$sendKeysToElement(list("Yes"))
    webElemHDSupport <- remDr$findElement(using = "name", value = "hd_support")
    webElemHDSupport$sendKeysToElement(list("Pharmacological"))
    # Day 1 information
    remDr$findElement(using = "name", value = "show_day1")$clickElement()
    webElemCreatinineD1 <- remDr$findElement(using = "name", value = "creatinine_d1")
    webElemCreatinineD1$sendKeysToElement(list("1.2"))
    webElemApacheIID1 <- remDr$findElement(using = "name", value = "apacheII_d1")
    webElemApacheIID1$sendKeysToElement(list("30"))
    webElemMaxLactateD1 <- remDr$findElement(using = "name", value = "max_lactate_d1")
    webElemMaxLactateD1$sendKeysToElement(list("10"))
    webElemBilirrubinD1 <- remDr$findElement(using = "name", value = "bilirubin_d1")
    webElemBilirrubinD1$sendKeysToElement(list("2"))
    webElemHoursOfICUStay <- remDr$findElement(using = "name", value = "hours_of_icu_stay")
    webElemHoursOfICUStay$sendKeysToElement(list("24"))
    remDr$findElement(using = "name", value = "predict_day1_dev")$clickElement()
    
    # extract HTML -----------------
    Sys.sleep(5) # give the page time to fully load
    html <- remDr$getPageSource()[[1]]
    
    results <- read_html(html) %>% # parse HTML
      html_nodes("div") %>% # extract all <div> nodes
      .[12] %>%
      .[[1]] %>%
      html_text() # string
    
    results <- results %>%
      str_replace_all("\n", "") %>%
      str_replace_all(" ", "") %>%
      str_replace_all("RiskofdevelopingAKIduringthefirstweekofICUstay", "") %>%
      str_replace_all("AdvancedoptionsChoosetheclassificationthresholdClickonthequestionmarkforadditionalexplanationSincethepredictedriskisabovethechosenclassificationthreshold,thepatientisclassifiedasdevelopingAKIwithinthefirstweekofICUstay.DependingontheintendeduseoftheAKIpredictor,theusermaychoosetoadapttheclassificationthreshold,andevaluatetheeffectonthestatisticsbelow.Thedefaultclassificationthresholdof14.5maximizedbothsensitivityandspecificityinthestudieddatabase.Sensitivity:63.8%Specificity:81.9%PPV:38.0%NPV:92.8%ΔNetbenefitNone:6.8%ΔNetbenefitAll:6.4%Atthechosenclassificationthreshold,AKIpredictorcorrectlyidentifies63.8%ofthepatientswhodevelopedAKIinthestudieddatabaseAtthechosenclassificationthreshold,AKIpredictorcorrectlyidentifies81.9%ofthepatientswhodidnotdevelopAKIinthestudieddatabaseInthestudieddatabase,38.0%ofthepatientswhodevelopedAKIhadapredictedriskabovethechosenclassificationthresholdInthestudieddatabase,92.8%ofthepatientswhodidnotdevelopAKIhadapredictedriskbelowthechosenclassificationthresholdAtthechosenclassificationthreshold,AKIpredictorincreasesthepercentageofcorrectlyidentifiedAKIby6.8%inthestudieddatabase,withoutincreasingfalseclassifications,ascomparedtoconsideringnopatientwilldevelopAKI.OnlyuseaclassificationthresholdthatresultsinaΔNetbenefitNone>0Atthechosenclassificationthreshold,AKIpredictordecreasesthepercentageofmisclassifiedAKIby6.4%inthestudieddatabase,whilekeepingthesamenumberofcorrectclassifications,ascomparedtoconsideringallpatientswilldevelopAKIOnlyuseaclassificationthresholdthatresultsinaΔNetbenefitAll>0Clickonthestatisticsfordetails", "")
    
    results
    
    
    remDr$close()

I need to do the same process using data from a data frame. I have tried the following code:

    rD <- rsDriver(browser = "firefox", port = 4560L, verbose = FALSE)
    remDr <- rD[["client"]]
    remDr$navigate('https://www.akipredictor.com/en/aki_predictor/')
    
    
    
    scrape.AKIpredictor <- function(age, baselineSCreat, IsDiabetic, IsElectiveAdmited , TypeOfSurgery,
                                    Glucose, SuspectedSepsis, HDSupport,
                                    CreatinineD1, ApacheIID1, MaxLactateD1, BilirrubinD1, HoursOfICUStay) {
      
      remDr$findElement(using = "name", value = "agree_to_legal_terms")$clickElement()
    
      #Pre-admission information
      webElemAge <- remDr$findElement(using = "name", value = "age")
      webElemAge$sendKeysToElement(list(age))
      webElemBaselineSCreat <- remDr$findElement(using = "name", value = "baseline_screat")
      webElemBaselineSCreat$sendKeysToElement(list(baselineSCreat))
      webElemIsDiabetic <- remDr$findElement(using = "name", value = "is_diabetic")
      webElemIsDiabetic$sendKeysToElement(list(IsDiabetic))
      webElemIsElectiveAdmited <- remDr$findElement(using = "name", value = "is_elective_admitted")
      webElemIsElectiveAdmited$sendKeysToElement(list(IsElectiveAdmited))
      webElemTypeOfSurgery <- remDr$findElement(using = "name", value = "type_of_surgery")
      webElemTypeOfSurgery$sendKeysToElement(list(TypeOfSurgery))
      
      # ICU admission information
      remDr$findElement(using = "name", value = "show_admission")$clickElement()
      webElemBloodGlucose <- remDr$findElement(using = "name", value = "blood_glucose")
      webElemBloodGlucose$sendKeysToElement(list(Glucose))
      webElemHasSuspectedSepsis <- remDr$findElement(using = "name", value = "has_suspected_sepsis")
      webElemHasSuspectedSepsis$sendKeysToElement(list(SuspectedSepsis))
      webElemHDSupport <- remDr$findElement(using = "name", value = "hd_support")
      webElemHDSupport$sendKeysToElement(list(HDSupport))
      
      # Day 1 information
      remDr$findElement(using = "name", value = "show_day1")$clickElement()
      webElemCreatinineD1 <- remDr$findElement(using = "name", value = "creatinine_d1")
      webElemCreatinineD1$sendKeysToElement(list(CreatinineD1))
      webElemApacheIID1 <- remDr$findElement(using = "name", value = "apacheII_d1")
      webElemApacheIID1$sendKeysToElement(list(ApacheIID1))
      webElemMaxLactateD1 <- remDr$findElement(using = "name", value = "max_lactate_d1")
      webElemMaxLactateD1$sendKeysToElement(list(MaxLactateD1))
      webElemBilirrubinD1 <- remDr$findElement(using = "name", value = "bilirubin_d1")
      webElemBilirrubinD1$sendKeysToElement(list(BilirrubinD1))
      webElemHoursOfICUStay <- remDr$findElement(using = "name", value = "hours_of_icu_stay")
      webElemHoursOfICUStay$sendKeysToElement(list(HoursOfICUStay))
      remDr$findElement(using = "name", value = "predict_day1_dev")$clickElement()
      
      Sys.sleep(5) # give the page time to fully load
      html <- remDr$getPageSource()[[1]]
      
      results <- read_html(html) %>% # parse HTML
        html_nodes("div") %>% # extract all <div> nodes
        .[12] %>%
        .[[1]] %>%
        html_text() # string
      
      results <- results %>% # strip whitespace and the fixed page text around the result
        str_replace_all("\n", "") %>%
        str_replace_all(" ", "") %>%
        str_replace_all("RiskofdevelopingAKIduringthefirstweekofICUstay", "") %>%
        str_replace_all("AdvancedoptionsChoosetheclassificationthresholdClickonthequestionmarkforadditionalexplanationSincethepredictedriskisabovethechosenclassificationthreshold,thepatientisclassifiedasdevelopingAKIwithinthefirstweekofICUstay.DependingontheintendeduseoftheAKIpredictor,theusermaychoosetoadapttheclassificationthreshold,andevaluatetheeffectonthestatisticsbelow.Thedefaultclassificationthresholdof14.5maximizedbothsensitivityandspecificityinthestudieddatabase.Sensitivity:63.8%Specificity:81.9%PPV:38.0%NPV:92.8%ΔNetbenefitNone:6.8%ΔNetbenefitAll:6.4%Atthechosenclassificationthreshold,AKIpredictorcorrectlyidentifies63.8%ofthepatientswhodevelopedAKIinthestudieddatabaseAtthechosenclassificationthreshold,AKIpredictorcorrectlyidentifies81.9%ofthepatientswhodidnotdevelopAKIinthestudieddatabaseInthestudieddatabase,38.0%ofthepatientswhodevelopedAKIhadapredictedriskabovethechosenclassificationthresholdInthestudieddatabase,92.8%ofthepatientswhodidnotdevelopAKIhadapredictedriskbelowthechosenclassificationthresholdAtthechosenclassificationthreshold,AKIpredictorincreasesthepercentageofcorrectlyidentifiedAKIby6.8%inthestudieddatabase,withoutincreasingfalseclassifications,ascomparedtoconsideringnopatientwilldevelopAKI.OnlyuseaclassificationthresholdthatresultsinaΔNetbenefitNone>0Atthechosenclassificationthreshold,AKIpredictordecreasesthepercentageofmisclassifiedAKIby6.4%inthestudieddatabase,whilekeepingthesamenumberofcorrectclassifications,ascomparedtoconsideringallpatientswilldevelopAKIOnlyuseaclassificationthresholdthatresultsinaΔNetbenefitAll>0Clickonthestatisticsfordetails", "")
      
      remDr$findElement(using = "name", value = "empty_form")$clickElement()
      
      
      return(results)
        
    }
    
    #data frame
    age <- c(50, 70, 80)
    baselineSC


1 Answer


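RSelenium's sendKeysToElement() only accepts strings, so the data has to be converted to character first. With that conversion in place, looping over the rows with a for loop solved the problem.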
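To illustrate the point about strings, here is a minimal sketch that runs without a browser. It uses a small made-up data frame (`patients` is a hypothetical name, not from the original post) and compares two ways of converting every column to character. `sapply()` returns a character matrix rather than a data frame, which is why `data[row, "age"]` style indexing works in the code that follows; assigning the result of `lapply()` into `patients[]` keeps the data-frame structure instead.

```r
# Hypothetical two-row example; the column names mirror the ones used in this answer.
patients <- data.frame(
  age = c(50, 70),
  baselineSCreat = c(1, 1.5),
  IsDiabetic = c("Yes", "No"),
  stringsAsFactors = FALSE
)

# Option 1: sapply() collapses the data frame into a character matrix.
as_matrix <- sapply(patients, as.character)

# Option 2: assigning into as_frame[] keeps a data frame with character columns.
as_frame <- patients
as_frame[] <- lapply(patients, as.character)

# Either way, each cell is a single string that can be wrapped in list()
# and passed to sendKeysToElement().
keys_from_matrix <- list(as_matrix[2, "baselineSCreat"])
keys_from_frame  <- list(as_frame[2, "baselineSCreat"])
```

One caveat with the `sapply()` route: for a data frame with a single row it simplifies to a named vector instead of a matrix, so two-dimensional indexing such as `data[row, "age"]` would fail. The `lapply()` variant does not have that edge case.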

library(RSelenium)
library(xml2)
library(tidyverse)
library(rvest)

age <- c(50, 70, 80)
baselineSCreat <- c(1, 1.5, 1.1)
IsDiabetic <- c("Yes", "No", "Yes")
IsElectiveAdmited <- c("Planned admission", "Unplanned admission", "Planned admission")
TypeOfSurgery <- c("Transplant surgery", "Cardiovascular surgery (non transplant)", "Abdominal surgery")
Glucose <- c(200, 180, 140)
SuspectedSepsis <- c("Yes", "No", "Yes")
HDSupport <- c("None", "Mechanical", "Pharmacological")
CreatinineD1 <- c(1.1, 1.6, 1.2)
ApacheIID1 <- c(30, 40, 10)
MaxLactateD1 <- c(10, 15, 5)
BilirrubinD1 <- c(2, 3, 4)
HoursOfICUStay <- c(24, 24, 24)

data <- data.frame(age, baselineSCreat, IsDiabetic, IsElectiveAdmited , TypeOfSurgery,
                   Glucose, SuspectedSepsis, HDSupport,
                   CreatinineD1, ApacheIID1, MaxLactateD1, BilirrubinD1, HoursOfICUStay)

data <- sapply(data, as.character)

rD <- rsDriver(browser = "firefox", port = 4560L, verbose = FALSE)
remDr <- rD[["client"]]

remDr$navigate('https://www.akipredictor.com/en/aki_predictor/')
remDr$findElement(using = "name", value = "agree_to_legal_terms")$clickElement()

output <- matrix(ncol=1, nrow=nrow(data))

Sys.sleep(5)
# Fill in and submit the form once for each row of data
for(row in 1:nrow(data)) {
  #Pre-admission information
  webElemAge <- remDr$findElement(using = "name", value = "age")
  webElemAge$sendKeysToElement(list(data[row, "age"]))
  webElemBaselineSCreat <- remDr$findElement(using = "name", value = "baseline_screat")
  webElemBaselineSCreat$sendKeysToElement(list(data[row, "baselineSCreat"]))
  webElemIsDiabetic <- remDr$findElement(using = "name", value = "is_diabetic")
  webElemIsDiabetic$sendKeysToElement(list(data[row, "IsDiabetic"]))
  webElemIsElectiveAdmited <- remDr$findElement(using = "name", value = "is_elective_admitted")
  webElemIsElectiveAdmited$sendKeysToElement(list(data[row, "IsElectiveAdmited"]))
  webElemTypeOfSurgery <- remDr$findElement(using = "name", value = "type_of_surgery")
  webElemTypeOfSurgery$sendKeysToElement( list(data[row, "TypeOfSurgery"]))
  
  # ICU admission information
  remDr$findElement(using = "name", value = "show_admission")$clickElement()
  Sys.sleep(1)
  webElemBloodGlucose <- remDr$findElement(using = "name", value = "blood_glucose")
  webElemBloodGlucose$sendKeysToElement(list(data[row, "Glucose"]))
  webElemHasSuspectedSepsis <- remDr$findElement(using = "name", value = "has_suspected_sepsis")
  webElemHasSuspectedSepsis$sendKeysToElement(list(data[row, "SuspectedSepsis"]))
  webElemHDSupport <- remDr$findElement(using = "name", value = "hd_support")
  webElemHDSupport$sendKeysToElement(list(data[row, "HDSupport"]))
  
  # Day 1 information
  remDr$findElement(using = "name", value = "show_day1")$clickElement()
  Sys.sleep(1)
  webElemCreatinineD1 <- remDr$findElement(using = "name", value = "creatinine_d1")
  webElemCreatinineD1$sendKeysToElement(list(data[row, "CreatinineD1"]))
  webElemApacheIID1 <- remDr$findElement(using = "name", value = "apacheII_d1")
  webElemApacheIID1$sendKeysToElement(list(data[row, "ApacheIID1"]))
  webElemMaxLactateD1 <- remDr$findElement(using = "name", value = "max_lactate_d1")
  webElemMaxLactateD1$sendKeysToElement(list(data[row, "MaxLactateD1"]))
  webElemBilirrubinD1 <- remDr$findElement(using = "name", value = "bilirubin_d1")
  webElemBilirrubinD1$sendKeysToElement(list(data[row, "BilirrubinD1"]))
  webElemHoursOfICUStay <- remDr$findElement(using = "name", value = "hours_of_icu_stay")
  webElemHoursOfICUStay$sendKeysToElement(list(data[row, "HoursOfICUStay"]))
  remDr$findElement(using = "name", value = "predict_day1_dev")$clickElement()

  Sys.sleep(1) # give the page time to fully load
  html <- remDr$getPageSource()[[1]]
  
  output[row,] <- read_html(html) %>% # parse HTML
    html_nodes("div") %>% # extract all <div> nodes
    .[12] %>%
    .[[1]] %>%
    html_text() # string
  
  output[row,] <- output[row,] %>% # strip whitespace and the fixed page text around the result
    str_replace_all("\n", "") %>%
    str_replace_all(" ", "") %>%
    str_replace_all("RiskofdevelopingAKIduringthefirstweekofICUstay", "") %>%
    str_replace_all("AdvancedoptionsChoosetheclassificationthresholdClickonthequestionmarkforadditionalexplanationSincethepredictedriskisabovethechosenclassificationthreshold,thepatientisclassifiedasdevelopingAKIwithinthefirstweekofICUstay.DependingontheintendeduseoftheAKIpredictor,theusermaychoosetoadapttheclassificationthreshold,andevaluatetheeffectonthestatisticsbelow.Thedefaultclassificationthresholdof14.5maximizedbothsensitivityandspecificityinthestudieddatabase.Sensitivity:63.8%Specificity:81.9%PPV:38.0%NPV:92.8%ΔNetbenefitNone:6.8%ΔNetbenefitAll:6.4%Atthechosenclassificationthreshold,AKIpredictorcorrectlyidentifies63.8%ofthepatientswhodevelopedAKIinthestudieddatabaseAtthechosenclassificationthreshold,AKIpredictorcorrectlyidentifies81.9%ofthepatientswhodidnotdevelopAKIinthestudieddatabaseInthestudieddatabase,38.0%ofthepatientswhodevelopedAKIhadapredictedriskabovethechosenclassificationthresholdInthestudieddatabase,92.8%ofthepatientswhodidnotdevelopAKIhadapredictedriskbelowthechosenclassificationthresholdAtthechosenclassificationthreshold,AKIpredictorincreasesthepercentageofcorrectlyidentifiedAKIby6.8%inthestudieddatabase,withoutincreasingfalseclassifications,ascomparedtoconsideringnopatientwilldevelopAKI.OnlyuseaclassificationthresholdthatresultsinaΔNetbenefitNone>0Atthechosenclassificationthreshold,AKIpredictordecreasesthepercentageofmisclassifiedAKIby6.4%inthestudieddatabase,whilekeepingthesamenumberofcorrectclassifications,ascomparedtoconsideringallpatientswilldevelopAKIOnlyuseaclassificationthresholdthatresultsinaΔNetbenefitAll>0Clickonthestatisticsfordetails", "")


  remDr$findElement(using = "name", value = "empty_form")$clickElement()
  
}

output
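The loop above repeats the same find-then-send pattern for every field. One way to shorten it, sketched here under assumptions, is a named vector that maps data column names to the form's `name` attributes, plus a small helper. `fill_fields()`, `pre_admission_fields`, and `fake_driver` are hypothetical names introduced for illustration; the helper relies only on the `findElement()` and `sendKeysToElement()` calls already used above. A stand-in driver object replaces `remDr` so the sketch runs without a browser.

```r
# Column name in the data -> "name" attribute of the form field
# (pre-admission section only; the names are taken from the code above).
pre_admission_fields <- c(
  age               = "age",
  baselineSCreat    = "baseline_screat",
  IsDiabetic        = "is_diabetic",
  IsElectiveAdmited = "is_elective_admitted",
  TypeOfSurgery     = "type_of_surgery"
)

# Hypothetical helper: send one value per mapped field, always as a string.
fill_fields <- function(driver, values, field_map) {
  for (col in names(field_map)) {
    elem <- driver$findElement(using = "name", value = field_map[[col]])
    elem$sendKeysToElement(list(as.character(values[[col]])))
  }
  invisible(NULL)
}

# Stand-in for remDr that records what would be typed, so this runs offline.
typed <- character()
fake_driver <- list(
  findElement = function(using, value) {
    list(sendKeysToElement = function(keys) typed[[value]] <<- keys[[1]])
  }
)

one_row <- list(
  age = 70, baselineSCreat = 1.5, IsDiabetic = "No",
  IsElectiveAdmited = "Unplanned admission",
  TypeOfSurgery = "Abdominal surgery"
)
fill_fields(fake_driver, one_row, pre_admission_fields)
print(typed)
```

In the real loop the call would presumably look like `fill_fields(remDr, data[row, ], pre_admission_fields)`, followed by the click on `show_admission` and a second mapping for the next section. Because the helper converts with `as.character()`, it should work whether or not the data was converted beforehand.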

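The long `str_replace_all()` call only matches when the explanatory text on the page is identical for every patient, including the statistics embedded in it. A possibly more robust alternative, sketched here, keeps whatever sits between two marker strings taken from the code above. `extract_between()` is a hypothetical helper, and the sample text and the value `42.0%` are made up for illustration; they are not real output from the site.

```r
# Hypothetical helper: return the text found between two fixed marker strings,
# after removing all whitespace (as the original cleanup does).
extract_between <- function(text, start_marker, end_marker) {
  squashed <- gsub("\\s+", "", text)
  start <- as.integer(regexpr(start_marker, squashed, fixed = TRUE))
  if (start == -1L) return(NA_character_)   # start marker not found
  rest <- substring(squashed, start + nchar(start_marker))
  end <- as.integer(regexpr(end_marker, rest, fixed = TRUE))
  if (end == -1L) return(NA_character_)     # end marker not found
  substring(rest, 1L, end - 1L)
}

# Made-up page text, shaped like what the cleanup code above expects.
sample_text <- paste0(
  "Risk of developing AKI during the first week of ICU stay\n",
  " 42.0% \n",
  "Advanced options Choose the classification threshold"
)

risk <- extract_between(
  sample_text,
  start_marker = "RiskofdevelopingAKIduringthefirstweekofICUstay",
  end_marker   = "Advancedoptions"
)
print(risk)
```

Inside the loop, the same call could be applied to the `html_text()` result for each row. Returning `NA` when a marker is missing makes it easier to spot rows where the page did not load in time.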