r - Using an If function, how do I change string of text in dataframe to another string?

asked Oct 7, 2021 in Technique by 深蓝 (71.8m points)

I am using the following dataset:

 structure(list(...1 = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 
9, 10, 11, 12), V1 = c("overstress", "flicker", "lotteri", "life", 
"charg", "capac", "health", "drain", "degrad", "protector", "bright", 
"use", "overstress", "flicker", "lotteri", "life", "charg", "capac", 
"health", "drain", "degrad", "protector", "bright", "use", "overstress", 
"flicker", "lotteri", "life", "charg", "capac", "health", "drain", 
"degrad", "protector", "bright", "use"), term = c("corr1", "corr1", 
"corr1", "corr1", "corr1", "corr1", "corr1", "corr1", "corr1", 
"corr1", "corr1", "corr1", "corr2", "corr2", "corr2", "corr2", 
"corr2", "corr2", "corr2", "corr2", "corr2", "corr2", "corr2", 
"corr2", "corr3", "corr3", "corr3", "corr3", "corr3", "corr3", 
"corr3", "corr3", "corr3", "corr3", "corr3", "corr3"), correlation = c(0.5, 
0.43, 0.42, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0.53, 
0.29, 0.25, 0.25, 0.23, 0.2, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, 0.45, 0.16, 0.15)), row.names = c(NA, -36L), class = c("tbl_df", 
"tbl", "data.frame"))

I want to replace the values corr1, corr2 and corr3 in the term column with toil1, toil2 and toil3. I tried the following code, but it only produces a warning:

three_terms_corrs_gathered$term <- if
(three_terms_corrs_gathered$term  == "corr1"){toil1} else if
(three_terms_corrs_gathered$term  == "corr2"){toil2} else
{toil3}

Warning message:
In if (three_terms_corrs_gathered$term == "corr1") { :
  the condition has length > 1 and only the first element will be used

So only the first condition is applied. What am I doing wrong?

question from:https://stackoverflow.com/questions/65906378/using-an-if-function-how-do-i-change-string-of-text-in-dataframe-to-another-str

1 Answer

深蓝 · Answer 1 · 2021-10-06T19:13:10+0000

Three options:

1. "Merge" mentality. This works very well when you have many disparate replacements, because the code is compact and the mapping is easy to read and maintain. The example here has only two replacements, but the code is the same whether corrs_df has 2 rows or 200, and entries in corrs_df that match nothing in the data are silently ignored, doing no harm.

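library(dplyr)
# `dat` is the data frame from the question (three_terms_corrs_gathered there).
# Lookup table: one row per replacement.
corrs_df <- data.frame(term = c("corr1", "corr2"), newterm = c("toil1", "toil2"))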
dat %>%
  left_join(corrs_df, by = "term") %>%
  slice(c(1:3, 28:30))
# # A tibble: 6 x 5
#    ...1 V1         term  correlation newterm
#   <dbl> <chr>      <chr>       <dbl> <chr>  
# 1     1 overstress corr1        0.5  toil1  
# 2     2 flicker    corr1        0.43 toil1  
# 3     3 lotteri    corr1        0.42 toil1  
# 4     4 life       corr3       NA    <NA>   
# 5     5 charg      corr3       NA    <NA>   
# 6     6 capac      corr3       NA    <NA>   

dat %>%
  left_join(corrs_df, by = "term") %>%
  mutate(term = coalesce(newterm, term)) %>%
  slice(c(1:3, 28:30))
# # A tibble: 6 x 5
#    ...1 V1         term  correlation newterm
#   <dbl> <chr>      <chr>       <dbl> <chr>  
# 1     1 overstress toil1        0.5  toil1  
# 2     2 flicker    toil1        0.43 toil1  
# 3     3 lotteri    toil1        0.42 toil1  
# 4     4 life       corr3       NA    <NA>   
# 5     5 charg      corr3       NA    <NA>   
# 6     6 capac      corr3       NA    <NA>

(You can of course drop the helper column afterwards with %>% select(-newterm).) The coalesce function effectively says "give me the first non-NA value from these variables". newterm is NA whenever the row's term is not present in corrs_df, which we take to mean "make no change".
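To make the join, coalesce and select steps concrete, here is a minimal sketch on made-up data. The names toy and lookup and the value "other" are hypothetical and not part of the question's dataset; the lookup table here lists all three replacements the question asks for.

```r
library(dplyr)

# Toy data (hypothetical); "other" has no entry in the lookup table
toy <- data.frame(term = c("corr1", "corr2", "corr3", "other"),
                  stringsAsFactors = FALSE)

# Lookup table with one row per replacement
lookup <- data.frame(term    = c("corr1", "corr2", "corr3"),
                     newterm = c("toil1", "toil2", "toil3"),
                     stringsAsFactors = FALSE)

result <- toy %>%
  left_join(lookup, by = "term") %>%          # unmatched rows get newterm = NA
  mutate(term = coalesce(newterm, term)) %>%  # first non-NA of newterm, term
  select(-newterm)                            # drop the helper column

result$term
```

The unmatched value "other" should pass through unchanged, because its newterm is NA and coalesce falls back to the original term.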

2. dplyr::case_when. (If you use data.table, data.table::fcase does effectively the same thing.)

dat %>%
  mutate(
    term = case_when(
      term == "corr1" ~ "toil1",
      term == "corr2" ~ "toil2",
      TRUE ~ term)
  ) %>%
  slice(c(1:3, 28:30))
# # A tibble: 6 x 4
#    ...1 V1         term  correlation
#   <dbl> <chr>      <chr>       <dbl>
# 1     1 overstress toil1        0.5 
# 2     2 flicker    toil1        0.43
# 3     3 lotteri    toil1        0.42
# 4     4 life       corr3       NA   
# 5     5 charg      corr3       NA   
# 6     6 capac      corr3       NA
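The question asks for three replacements, while the case_when call above maps only two. As a small sketch on a toy vector (the name terms and the value "other" are made up for illustration), the third mapping is just one more line. Conditions are checked in order, and the final TRUE branch leaves anything unmatched as it was.

```r
library(dplyr)

# Toy vector (hypothetical), including one value that should stay unchanged
terms <- c("corr1", "corr2", "corr3", "other")

recoded <- case_when(
  terms == "corr1" ~ "toil1",
  terms == "corr2" ~ "toil2",
  terms == "corr3" ~ "toil3",
  TRUE             ~ terms    # fallback: keep the original value
)

recoded
```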

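3. Nested ifelse. Since you're already using dplyr, it is better to use if_else, for several reasons (e.g., it is stricter about the types of its two branches, and it does not strip attributes the way base ifelse does).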

dat %>%
  mutate(
    term = if_else(term == "corr1", "toil1",
                   if_else(term == "corr2", "toil2", term))
  ) %>%
  slice(c(1:3, 28:30))
# # A tibble: 6 x 4
#    ...1 V1         term  correlation
#   <dbl> <chr>      <chr>       <dbl>
# 1     1 overstress toil1        0.5 
# 2     2 flicker    toil1        0.43
# 3     3 lotteri    toil1        0.42
# 4     4 life       corr3       NA   
# 5     5 charg      corr3       NA   
# 6     6 capac      corr3       NA

This works fine for one or two levels of nesting, but it quickly looks messy and becomes hard to follow. In my experience, code that is harder to follow is also harder to maintain, and it becomes easy to misplace a particular option or value. Maintainability and readability are very important in my opinion.
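As for the warning in the question: if expects a single TRUE or FALSE, but comparing a whole column with == returns one value per row. Older versions of R warn and use only the first element; since R 4.2.0 this is an error. The replacement values in the question are also unquoted, so R would look for objects named toil1, toil2 and toil3. The sketch below uses a toy vector (hypothetical values) and base ifelse only to stay free of packages; the dplyr options above do the same job.

```r
# Toy vector (hypothetical), standing in for the term column
term <- c("corr1", "corr2", "corr3")

# The comparison is vectorized: one TRUE/FALSE per element
cond <- term == "corr1"
n_cond <- length(cond)

# `if` cannot use a condition of length 3; a vectorized function
# evaluates every element instead. Note the quoted replacement strings.
fixed <- ifelse(cond, "toil1",
                ifelse(term == "corr2", "toil2", "toil3"))

fixed
```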
