I found a nice example of plotting convex hull shapes using ggplot with ddply here:
Drawing outlines around multiple geom_point groups with ggplot
I thought I'd try something similar--create something like an Ashby Diagram--to practice with the data.table package:
test<-function()
{
library(data.table)
library(ggplot2)
set.seed(1)
Here I define a simple table:
dt<-data.table(xdata=runif(15),ydata=runif(15),level=rep(c("a","b","c"),each=5),key="level")
And then I define the hull positions by level:
hulls<-dt[,as.integer(chull(.SD)),by=level]
setnames(hulls,"V1","hcol")
My thought was then to merge hulls with dt, so that I could eventually manipulate hulls into the proper form for ggplot (shown below for reference):
ashby<-ggplot(dt,aes(x=xdata,y=ydata,color=level))+
geom_point()+
geom_line()+
geom_polygon(data=hulls,aes(fill=level))
}
But whichever way I try to merge hulls and dt, I get an error. For example, merge(hulls,dt) produces the error shown in Footnote 1.
This seems like it should be simple, and I'm sure I'm just missing something obvious. Any pointer to a similar post, or thoughts on how to prep hulls for ggplot, would be greatly appreciated. Or if you think it's best to stick with the ddply approach, please let me know.
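To make the size of the join concrete, here is a minimal sketch that counts rows before and after it. It assumes the same toy data as above and calls chull(xdata, ydata) explicitly rather than chull(.SD); the name joined is introduced here only for illustration. The point is that chull() returns row positions within each group, so hulls carries only level as a join column, and joining on level pairs every hull index with every point of that level.

```r
library(data.table)

set.seed(1)
dt <- data.table(xdata = runif(15), ydata = runif(15),
                 level = rep(c("a", "b", "c"), each = 5), key = "level")

# chull() returns row positions within each group's subset, not coordinates
hulls <- dt[, .(hcol = chull(xdata, ydata)), by = level]
setkey(hulls, level)

# Joining on level alone matches each of the 5 points in a level
# against every hull index of that level ("joined" is an illustrative name)
joined <- hulls[dt, allow.cartesian = TRUE]

nrow(dt)      # number of points
nrow(hulls)   # one row per hull vertex
nrow(joined)  # 5 * nrow(hulls), because each level has 5 points
```

The row count of the join grows as (points per level) x (hull vertices per level), which appears to be what the allow.cartesian check is warning about.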
Example undesired output:
test<-function(){
library(data.table)
library(ggplot2)
set.seed(1) #seed set before the random draws so the data are reproducible
dt<-data.table(xdata=runif(15),ydata=runif(15),level=rep(c("a","b","c"),each=5),key="level")
hulls<-dt[,as.integer(chull(.SD)),by=level]
setnames(hulls,"V1","hcol")
setkey(dt, 'level') #setting the key seems unneeded
setkey(hulls, 'level')
hulls<-hulls[dt, allow.cartesian = TRUE]
ggplot(dt,aes(x=xdata,y=ydata,color=level))+
geom_point()+
geom_polygon(data=hulls,aes(fill=level))
}
This results in a mess of criss-crossing polygons.
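One possible way to avoid the join altogether, sketched here under the assumption that the goal is one polygon per level, is to use the chull() indices to subset each group's rows directly. The name hull_pts and the alpha value are choices made for this illustration and are not part of the original code.

```r
library(data.table)
library(ggplot2)

set.seed(1)
dt <- data.table(xdata = runif(15), ydata = runif(15),
                 level = rep(c("a", "b", "c"), each = 5), key = "level")

# Keep only the rows lying on each level's hull, in the order chull() returns
# them, so geom_polygon() can connect the vertices without crossing
hull_pts <- dt[, .SD[chull(xdata, ydata)], by = level]

p <- ggplot(dt, aes(x = xdata, y = ydata, color = level)) +
  geom_point() +
  geom_polygon(data = hull_pts, aes(fill = level), alpha = 0.3)  # alpha is an assumption
```

Because hull_pts already holds the xdata and ydata of the hull vertices, no merge back onto dt is needed; printing p should draw one outline per level.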
Footnote 1:
Error in vecseq(f__, len__, if (allow.cartesian) NULL else
as.integer(max(nrow(x), : Join results in 60 rows; more than 15 =
max(nrow(x),nrow(i)). Check for duplicate key values in i, each of
which join to the same group in x over and over again. If that's ok,
try including j and dropping by (by-without-by) so that j runs for
each group to avoid the large allocation. If you are sure you wish to
proceed, rerun with allow.cartesian=TRUE. Otherwise, please search for
this error message in the FAQ, Wiki, Stack Overflow and datatable-help
for advice.