apply - When you want to apply a function to the rows or columns of a matrix (and higher-dimensional analogues); not generally advisable for data frames, as it will coerce them to a matrix first.
    # Two dimensional matrix
    M <- matrix(seq(1,16), 4, 4)

    # apply min to rows
    apply(M, 1, min)
    [1] 1 2 3 4

    # apply max to columns
    apply(M, 2, max)
    [1]  4  8 12 16

    # 3 dimensional array
    M <- array( seq(32), dim = c(4,4,2))

    # Apply sum across each M[*, , ] - ie Sum across 2nd and 3rd dimension
    apply(M, 1, sum)
    # Result is one-dimensional
    [1] 120 128 136 144

    # Apply sum across each M[*, *, ] - ie Sum across 3rd dimension
    apply(M, c(1,2), sum)
    # Result is two-dimensional
         [,1] [,2] [,3] [,4]
    [1,]   18   26   34   42
    [2,]   20   28   36   44
    [3,]   22   30   38   46
    [4,]   24   32   40   48
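To make the data frame caveat concrete, here is a small sketch. The data frame `df` and its columns are made up for illustration. Because apply converts its input with as.matrix first, a data frame with mixed column types ends up as a character matrix before the function ever sees a row.

```r
# Hypothetical data frame with one integer and one character column
df <- data.frame(id = 1:3, label = c("a", "b", "c"), stringsAsFactors = FALSE)

# apply() coerces df to a matrix first, so every row arrives as character
row_classes <- apply(df, 1, class)
print(row_classes)

# sapply() treats the data frame as a list of columns and keeps each type
col_classes <- sapply(df, class)
print(col_classes)
```

For column-wise work on a data frame, sapply or lapply is usually the safer choice for this reason.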
If you want row/column means or sums for a 2D matrix, be sure to investigate the highly optimized, lightning-quick colMeans, rowMeans, colSums, rowSums.
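As a quick sanity check, the dedicated functions give the same numbers as the general apply calls. This is only a sketch of the equivalence, not a benchmark, and the variable names are illustrative.

```r
M <- matrix(seq(1, 16), 4, 4)

row_totals_fast  <- rowSums(M)          # dedicated routine
row_totals_apply <- apply(M, 1, sum)    # general-purpose equivalent

col_means_fast  <- colMeans(M)
col_means_apply <- apply(M, 2, mean)

# all.equal() ignores the integer/double storage difference between the two
print(isTRUE(all.equal(row_totals_fast, row_totals_apply)))
print(isTRUE(all.equal(col_means_fast, col_means_apply)))
```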
lapply - When you want to apply a function to each element of a list in turn and get a list back.
This is the workhorse of many of the other *apply functions. Peel back their code and you will often find lapply underneath.
    x <- list(a = 1, b = 1:3, c = 10:100)
    lapply(x, FUN = length)
    $a
    [1] 1

    $b
    [1] 3

    $c
    [1] 91

    lapply(x, FUN = sum)
    $a
    [1] 1

    $b
    [1] 6

    $c
    [1] 5005
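Two everyday variations on the calls above, sketched with illustrative variable names: an anonymous function in place of a named one, and extra arguments passed after FUN, which lapply forwards to every call.

```r
x <- list(a = 1, b = 1:3, c = 10:100)

# An anonymous function works just like a named one
spans <- lapply(x, function(v) max(v) - min(v))

# Arguments after FUN are forwarded to each call, here head(v, n = 2)
firsts <- lapply(x, head, n = 2)

str(spans)
str(firsts)
```

Either way the result is a list with the same length and names as the input.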
sapply - When you want to apply a function to each element of a list in turn, but you want a vector back, rather than a list.
If you find yourself typing unlist(lapply(...)), stop and consider sapply.
    x <- list(a = 1, b = 1:3, c = 10:100)
    # Compare with above; a named vector, not a list
    sapply(x, FUN = length)
     a  b  c
     1  3 91

    sapply(x, FUN = sum)
       a    b    c
       1    6 5005
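For this simple case, where every call returns a single value, the two spellings produce the same object. A minimal check, with illustrative variable names:

```r
x <- list(a = 1, b = 1:3, c = 10:100)

via_unlist <- unlist(lapply(x, length))
via_sapply <- sapply(x, length)

# Both are named integer vectors with names a, b, c
print(identical(via_unlist, via_sapply))
```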
In more advanced uses of sapply it will attempt to coerce the result to a multi-dimensional array, if appropriate. For example, if our function returns vectors of the same length, sapply will use them as columns of a matrix:
    sapply(1:5, function(x) rnorm(3, x))
If our function returns a 2-dimensional matrix, sapply will do essentially the same thing, treating each returned matrix as a single long vector:
    sapply(1:5, function(x) matrix(x, 2, 2))
Unless we specify simplify = "array", in which case it will use the individual matrices to build a multi-dimensional array:
    sapply(1:5, function(x) matrix(x, 2, 2), simplify = "array")
Each of these behaviors is of course contingent on our function returning vectors or matrices of the same length or dimension.
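The shapes involved are easier to see side by side. The sketch below swaps rnorm for a deterministic rep so the results are reproducible, and adds a ragged case to show what happens when the lengths differ. The variable names are illustrative.

```r
# Equal-length vectors become the columns of a matrix
m1 <- sapply(1:5, function(x) rep(x, 3))
print(dim(m1))

# 2 x 2 matrices are flattened to length-4 columns by default
m2 <- sapply(1:5, function(x) matrix(x, 2, 2))
print(dim(m2))

# simplify = "array" stacks the matrices along a third dimension instead
a3 <- sapply(1:5, function(x) matrix(x, 2, 2), simplify = "array")
print(dim(a3))

# Results of differing lengths cannot be simplified; a plain list comes back
ragged <- sapply(1:3, function(x) seq_len(x))
print(is.list(ragged))
```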
vapply - When you want to use sapply but perhaps need to squeeze some more speed out of your code.
For vapply, you basically give R an example of what sort of thing your function will return, which can save some time coercing returned values to fit in a single atomic vector.
    x <- list(a = 1, b = 1:3, c = 10:100)
    # Note that since the advantage here is mainly speed, this
    # example is only for illustration. We're telling R that
    # everything returned by length() should be an integer of
    # length 1.
    vapply(x, FUN = length, FUN.VALUE = 0L)
     a  b  c
     1  3 91
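Beyond speed, the FUN.VALUE template also acts as a check on what the function returns. This goes slightly beyond the point made above, so treat it as a supplementary sketch. vapply stops with an error when a result does not match the template, and it returns a correctly typed empty vector for empty input, where sapply returns an empty list.

```r
x <- list(a = 1, b = 1:3, c = 10:100)

# Matches the template: one integer per element
lens <- vapply(x, length, FUN.VALUE = integer(1))
print(lens)

# range() returns two values, so vapply raises an error
# (caught here only to keep the example running)
bad <- tryCatch(
  vapply(x, range, FUN.VALUE = numeric(1)),
  error = function(e) "vapply refused: result did not match FUN.VALUE"
)
print(bad)

# Empty input: sapply gives list(), vapply gives a zero-length integer vector
empty_s <- sapply(list(), length)
empty_v <- vapply(list(), length, FUN.VALUE = integer(1))
print(class(empty_s))
print(class(empty_v))
```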
mapply - For when you have several data structures (e.g. vectors, lists) and you want to apply a function to the 1st elements of each, and then the 2nd elements of each, etc., coercing the result to a vector/array as in sapply.
This is multivariate in the sense that your function must accept multiple arguments.
    # Sums the 1st elements, the 2nd elements, etc.
    mapply(sum, 1:5, 1:5, 1:5)
    [1]  3  6  9 12 15

    # To do rep(1,4), rep(2,3), etc.
    mapply(rep, 1:4, 4:1)
    [[1]]
    [1] 1 1 1 1

    [[2]]
    [1] 2 2 2

    [[3]]
    [1] 3 3

    [[4]]
    [1] 4
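Two more patterns that come up often, sketched with illustrative variable names: a user-defined function of two arguments, and MoreArgs, which supplies arguments held fixed across all calls rather than iterated over.

```r
# A two-argument function applied pairwise: (1, 4), (2, 5), (3, 6)
sums <- mapply(function(x, y) x + y, 1:3, 4:6)
print(sums)

# times varies per call; x = 42 is held fixed via MoreArgs
reps <- mapply(rep, times = 1:3, MoreArgs = list(x = 42))
print(reps)
```

The first call simplifies to a vector because every result has length 1. The second stays a list because the results differ in length.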
Map - A wrapper to mapply with SIMPLIFY = FALSE, so it is guaranteed to return a list.
    Map(sum, 1:5, 1:5, 1:5)
    [[1]]
    [1] 3

    [[2]]
    [1] 6

    [[3]]
    [1] 9

    [[4]]
    [1] 12

    [[5]]
    [1] 15
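A short check of the wrapper relationship, with illustrative variable names:

```r
via_Map    <- Map(sum, 1:5, 1:5, 1:5)
via_mapply <- mapply(sum, 1:5, 1:5, 1:5, SIMPLIFY = FALSE)
print(identical(via_Map, via_mapply))

# Map stays a list even where mapply would have simplified to a vector
pairwise <- Map(function(x, y) x + y, 1:3, 4:6)
print(is.list(pairwise))
```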
rapply - For when you want to apply a function to each element of a nested list structure, recursively.
To give you some idea of how uncommon rapply is, I forgot about it when first posting this answer! Obviously, I'm sure many people use it, but YMMV.
rapply is best illustrated with a user-defined function to apply:
    # Append ! to string, otherwise increment
    myFun <- function(x){
        if(is.character(x)){
            return(paste(x,"!",sep=""))
        } else {
            return(x + 1)
        }
    }

    # A nested list structure
    l <- list(a = list(a1 = "Boo", b1 = 2, c1 = "Eeek"),
              b = 3, c = "Yikes",
              d = list(a2 = 1, b2 = list(a3 = "Hey", b3 = 5)))

    # Result is named vector, coerced to character
    rapply(l, myFun)

    # Result is a nested list like l, with values altered
    rapply(l, myFun, how="replace")
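An alternative to branching on type inside the function is rapply's classes argument, which restricts the function to leaves of the given class. The sketch below uses a smaller nested list than the one above (`nested` is an illustrative name) and toupper as the function.

```r
nested <- list(a = list(a1 = "Boo", b1 = 2), b = 3, c = "Yikes")

# how = "replace": non-character leaves are left untouched
replaced <- rapply(nested, toupper, classes = "character", how = "replace")
str(replaced)

# Default how = "unlist": non-matching leaves are dropped from the result
flat <- rapply(nested, toupper, classes = "character")
print(flat)
```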
tapply - For when you want to apply a function to subsets of a vector and the subsets are defined by some other vector, usually a factor.
The black sheep of the *apply family, of sorts. The help file's use of the phrase "ragged array" can be a bit confusing, but it is actually quite simple.
A vector:
    x <- 1:20
A factor (of the same length!) defining groups:
    y <- factor(rep(letters[1:5], each = 4))
Add up the values in x within each subgroup defined by y:
    tapply(x, y, sum)
     a  b  c  d  e
    10 26 42 58 74
More complex examples can be handled where the subgroups are defined by the unique combinations of a list of several factors.
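As a sketch of the multi-factor case, pass a list of factors as the second argument. The second factor `z` here is made up for illustration. The result is a table with one cell per combination of levels.

```r
x <- 1:20
y <- factor(rep(letters[1:5], each = 4))
z <- factor(rep(c("odd", "even"), times = 10))  # hypothetical second grouping

# One cell per (y, z) combination: 5 rows (a to e) by 2 columns (even, odd)
by_both <- tapply(x, list(y, z), sum)
print(by_both)

# Summing across z recovers the single-factor result
print(rowSums(by_both))
```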
tapply is similar in spirit to the split-apply-combine functions that are common in R (aggregate, by, ave, ddply, etc.). Hence its black sheep status.