This answer is an ADDENDUM to the first answer, especially targeting your second question about significantly speeding up the whole process.
To make the run time estimation reproducible, we will fix the seed;
all other definitions are yours.
set.seed(4789)
n = 200
q = 0.5
X <- mvtnorm::rmvnorm(n = n, mean = c(0,0),
sigma = matrix(c(1,1,1,4), ncol = 2))
x0 = matrix(c(X[1,1:2]), nrow = 1)
y0 = x0 - 0.5 * log(n) * (colMeans(X) - x0)
X = rbind(X, y0)
x01 = y0[1]; x02 = y0[2]
x1 = X[,1]; x2 = X[,2]
pInit = matrix(rep(1/(n + 1), n + 1), nrow = n + 1)
First, let's do it with augmented Lagrangian and optim()
as inner solver.
f1 <- function(p) sum(sqrt(pmax(0, p)))
heq1 <- function(p) c(sum(x1 * p) - x01, sum(x2 * p) - x02, sum(p) - 1)
hin1 <- function(p) p - 1e-06
system.time( sol <- alabama::auglag(pInit, fn = function(p) -f1(p),
heq = heq1, hin = hin1) )
## user system elapsed
## 24.631 0.054 12.324
-1 * sol$value; heq1(sol$par)
## [1] 7.741285
## [1] 1.386921e-09 3.431108e-10 4.793488e-10
This problem is convex with linear constraints. Therefore we can apply an efficient convex solver such as ECOS. For modeling we will make use of the CVXR package.
# install.packages(c("ECOSolveR", "CVXR"))
library(CVXR)
p <- Variable(201)
obj <- Maximize(sum(sqrt(p)))
cons <- list(p >= 0, sum(p) == 1,
sum(x1*p)==x01, sum(x2*p)==x02)
prbl <- Problem(obj, cons)
system.time( sol <- solve(prbl, solver="ECOS") )
## user system elapsed
## 0.044 0.000 0.044
ps <- sol$getValue(p)
cat("The maximum value is:", sum(sqrt(pmax(0, ps))))
## The maximum value is: 7.74226
c(sum(ps), sum(x1*ps) - x01, sum(x2*ps) - x02)
## [1] 1.000000e+00 -1.018896e-11 9.167819e-12
We see that the convex solver is about 500 times faster (!) than the first approach with a standard nonlinear solver. IMPORTANT: We do not need a starting value because a convex problem has only one optimum.