Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


r - map over tibble columns while preserving groups

I have a grouped tibble. I would like to use map() to iterate over the columns of the tibble. And within each column, I would like map() to act separately on each group. In other words, I would like map() to respect the grouping structure of the tibble.

But map() doesn't seem to respect the grouping structures of tibbles. Here is a minimal example:

library(dplyr)
library(purrr)
data(iris)
iris %>%
  group_by(Species) %>%
  map(length)

In the iris dataset, there are three species and four columns (not counting "Species"). I would therefore like map() to return a list of 3 × 4 = 12 lengths, or else to return a nested list that has, in total, 12 lengths. But it returns a list of 5 elements: one for each column, counting the grouping column. Each of these five elements is simply the total length of a column (150). How can I adapt the code above to provide the result that I want?
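The behaviour described above follows from the fact that a grouped tibble is still a list of columns, and map() iterates over list elements without consulting the grouping metadata. One possible way to get the nested list of 12 lengths is to split the data by group first and then map over the columns of each piece. This is a minimal sketch, not the only approach; the names `by_species` and `lengths_by_group` are made up for illustration.

```r
library(purrr)

# Drop the grouping column, then split the remaining four columns by species.
# by_species is a named list of three data frames (one per species).
by_species <- split(iris[, names(iris) != "Species"], iris$Species)

# Outer map(): one element per species. Inner map(): one element per column.
lengths_by_group <- map(by_species, function(d) map(d, length))

# Three species x four columns = 12 lengths, each equal to 50.
str(lengths_by_group)
```

Replacing `length` with a model-fitting function would give a nested list of fitted objects in the same shape.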

In this minimal example, a satisfactory alternative to using map() is

iris %>%
  group_by(Species) %>%
  summarize(
    across(everything(), length)
  )

which returns

# A tibble: 3 x 5
  Species    Sepal.Length Sepal.Width Petal.Length Petal.Width
* <fct>             <int>       <int>        <int>       <int>
1 setosa               50          50           50          50
2 versicolor           50          50           50          50
3 virginica            50          50           50          50

But in most cases, this alternative won't work. The problem is that I'll usually want summarize() to return loess objects, not integers. And when I try to get it to return loess objects, it fails with errors like

Error: Problem with `summarise()` input `..1`.
x Input must be a vector, not a `loess` object.
Question from: https://stackoverflow.com/questions/66065866/map-over-tibble-columns-while-preserving-groups
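The error quoted in the question arises because summarize() requires every summary column to be a vector, and a bare loess object does not qualify. Wrapping the fitted model in list() produces a list-column, which is allowed. The sketch below assumes dplyr 1.1.0 or later, where pick(everything()) returns the current group's non-grouping columns; the name `models` and the choice of Sepal.Width as the single predictor are illustrative only.

```r
library(dplyr)  # assumes dplyr >= 1.1.0 for pick()

models <- iris %>%
  group_by(Species) %>%
  summarize(
    # list() turns the single loess object into a length-1 list,
    # so the result is stored as a list-column with one model per species.
    mdl = list(loess(Sepal.Length ~ Sepal.Width, data = pick(everything())))
  )

models
```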


1 Answer


do() lets you work on one group at a time.

Edit: As you said, do() is superseded, and the approach below is the more straightforward (and encouraged) way of doing it. (When I tried this before writing my do() answer, what I was missing was cur_data().)

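library(dplyr)
library(purrr)

# Predictor columns; each one is modelled against Sepal.Length
colnms <- names(iris)[2:4]
colnms
# [1] "Sepal.Width"  "Petal.Length" "Petal.Width" 

iris %>%
  group_by(Species) %>%
  summarize(
    other = colnms,
    # cur_data() is the current group's data, so each model is fit per species
    mdl = map(colnms, ~ loess(as.formula(paste("Sepal.Length ~", .x)),
                              data = cur_data()))
  )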
# Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,  :
#   pseudoinverse used at 0.0975
# Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,  :
#   neighborhood radius 0.2025
# Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,  :
#   reciprocal condition number  2.8298e-016
# Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,  :
#   There are other near singularities as well. 0.01
# # A tibble: 9 x 3
# # Groups:   Species [3]
#   Species    other        mdl    
#   <fct>      <chr>        <list> 
# 1 setosa     Sepal.Width  <loess>
# 2 setosa     Petal.Length <loess>
# 3 setosa     Petal.Width  <loess>
# 4 versicolor Sepal.Width  <loess>
# 5 versicolor Petal.Length <loess>
# 6 versicolor Petal.Width  <loess>
# 7 virginica  Sepal.Width  <loess>
# 8 virginica  Petal.Length <loess>
# 9 virginica  Petal.Width  <loess>
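In dplyr 1.1.0 and later, cur_data() is deprecated in favor of pick(), and summarize() calls that return more than one row per group are deprecated in favor of reframe(). A possible adaptation of the code above for those versions is sketched here. The helper `fit_one` and the names `predictors` and `fits` are hypothetical, introduced only for this example; reformulate() from base R builds the formula in place of paste() and as.formula().

```r
library(dplyr)  # assumes dplyr >= 1.1.0 for reframe() and pick()
library(purrr)

predictors <- names(iris)[2:4]

# Hypothetical helper: fit Sepal.Length against one predictor column.
fit_one <- function(predictor, data) {
  loess(reformulate(predictor, response = "Sepal.Length"), data = data)
}

fits <- iris %>%
  group_by(Species) %>%
  reframe(
    predictor = predictors,
    # pick(everything()) is the current group's data (without Species);
    # map() passes it on to fit_one() as the `data` argument.
    mdl = map(predictors, fit_one, data = pick(everything()))
  )

fits
```

Unlike summarize(), reframe() returns an ungrouped tibble, so the `# Groups:` line no longer appears in the printed result. The same loess warnings about near singularities may still be emitted.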
