Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

cluster analysis - R Sampling dataframe rows N times to run clustering and capture output

I am running PAM clustering on samples drawn from a data frame df. How can I repeat this process N times, drawing a different sample each time and appending the output of each run to a final data frame?

sample_size <- 10000
df_lean <- df %>%
  rownames_to_column('ID') %>%
  dplyr::sample_n(sample_size) %>%
  column_to_rownames('ID')

cluster_size = 3
pam_fit <- pam(gower_dist, diss = TRUE, k = cluster_size)
pam_results <- df_lean %>%
  dplyr::select(-ID) %>%
  mutate(cluster = pam_fit$clustering) %>%
  group_by(cluster) %>%
  do(the_summary = summary(.))
pam_results$the_summary

df_result <- df_lean %>%
  rownames_to_column('MIN_ESN') %>%
  mutate(cluster = pam_fit$clustering)
Question from: https://stackoverflow.com/questions/65876129/r-sampling-dataframe-rows-n-times-to-run-clustering-and-capture-output


1 Answer


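Try putting all your code in a function and repeating it with replicate (using simplify = FALSE so the result stays a list) or purrr::rerun. Note that rerun is deprecated as of purrr 1.0.0, although it still works: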

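library(dplyr)
library(tibble)   # rownames_to_column(), column_to_rownames()
library(cluster)  # daisy(), pam()

run_function <- function() {
  sample_size <- 10000
  # Draw a fresh random sample on every call, keeping row names via 'ID'
  df_lean <- df %>%
    rownames_to_column('ID') %>%
    dplyr::sample_n(sample_size) %>%
    column_to_rownames('ID')

  cluster_size <- 3
  # Assumption: gower_dist is the Gower dissimilarity of the current sample,
  # so it is recomputed here for each run (the question does not show this step)
  gower_dist <- daisy(df_lean, metric = "gower")
  pam_fit <- pam(gower_dist, diss = TRUE, k = cluster_size)

  # Per-cluster summaries, as in the question (computed but not returned).
  # 'ID' is already the row names here, so there is no ID column to drop.
  pam_results <- df_lean %>%
    mutate(cluster = pam_fit$clustering) %>%
    group_by(cluster) %>%
    do(the_summary = summary(.))

  # Sampled rows with their cluster assignment
  df_result <- df_lean %>%
    rownames_to_column('MIN_ESN') %>%
    mutate(cluster = pam_fit$clustering)

  return(df_result)
}

N <- 100
# result is a list of N data frames, one per run
result <- purrr::rerun(N, run_function())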
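The repeated calls return a list with one data frame per run, while the question asks for a single final data frame. A minimal sketch of that last step follows. It assumes the built-in iris data as a stand-in for df, and run_once is a hypothetical, scaled-down version of run_function. dplyr::bind_rows stacks the list, and its .id argument adds a column recording which run each row came from:

```r
library(dplyr)
library(cluster)

set.seed(42)  # only to make this illustration reproducible

# Hypothetical stand-in for run_function(): sample rows, cluster them with PAM
# on a Gower dissimilarity, and return the sample with cluster labels attached
run_once <- function(data, sample_size = 50, k = 3) {
  sampled <- dplyr::slice_sample(data, n = sample_size)
  gower_dist <- daisy(sampled, metric = "gower")
  pam_fit <- pam(gower_dist, diss = TRUE, k = k)
  mutate(sampled, cluster = pam_fit$clustering)
}

N <- 5
# A list of N data frames, one per run (same shape as the rerun() result)
runs <- lapply(seq_len(N), function(i) run_once(iris))

# Stack all runs into one data frame; 'run' identifies the source run
final_df <- bind_rows(runs, .id = "run")
```

With the list from the answer, the equivalent call would be bind_rows(result, .id = "run"). Cluster labels are assigned independently in each run, so cluster 1 in one run need not correspond to cluster 1 in another.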
