I'm looking for a way to do simple aggregates / counts via data.table.
Consider the iris data, which has 50 observations per species. To count the observations per species I have to summarise over a column other than Species, for example "Sepal.Length".
library(data.table)
dt = as.data.table(iris)
dt[,length(Sepal.Length), Species]
I find this confusing because it looks like I'm doing something on Sepal.Length at first glance, when really it's only Species that matters.
This is what I would prefer to say, but I don't get valid output:
dt[,length(Species), Species]
Correct input and output, but clunky code:
> dt[,length(Sepal.Length), Species]
      Species V1
1:     setosa 50
2: versicolor 50
3:  virginica 50
Incorrect input and output, but nicer code:
> dt[,length(Species), Species]
      Species V1
1:     setosa  1
2: versicolor  1
3:  virginica  1
Is there an elegant way around this?
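One possible way around this, offered as a minimal sketch rather than the definitive answer: data.table provides the special symbol .N, which holds the number of rows in the current group, so no column has to be named in j at all. The length(Species) attempt returns 1 because, inside j, a grouping column is reduced to a single value per group. The variable name counts below is only illustrative.

```r
library(data.table)

dt <- as.data.table(iris)

# .N is data.table's built-in symbol for the number of rows in the current group,
# so the count does not depend on any particular column.
counts <- dt[, .N, by = Species]
print(counts)
```

The result has one row per species, with the count in a column named N instead of V1.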