Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
292 views
in Technique[技术] by (71.8m points)

dplyr - R function to see if combinations of multiple vectors match a target vector

Given a set of any # of vectors:

a<-c("giraffe", "dolphin", "pig")
b<-c("elephant" , "pig")
c<-c("zebra","cobra","spider","porcupine")
d<-c("porcupine")
e<-c("spider","cobra")
f<-c("elephant","pig","porcupine")

and a target vector:

target<- c("elephant" , "pig","cobra","spider","porcupine")

Is there a way to check if any combinations of the vectors can match the target vector (order doesn't matter)?

In this case, answers would be:

  • b,d,e
  • e,f

Clarifying: I need to know which combinations exactly match the target vector with no duplicates. Any answers that would repeat a value (e.x. b,d,e,f) would not work.

question from:https://stackoverflow.com/questions/65943841/r-function-to-see-if-combinations-of-multiple-vectors-match-a-target-vector

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

The solutions shown in the question consist of non-overlapping vectors so we assume that that is a requirement so that we are looking to partition the target into disjoint vectors that cover it. If the vectors may overlap then instead of using = or == in the constraints involving A below use >=.

The assumed problem is known as a set partitioning problem and the problem with overlaps is known as a set covering problem.

Assuming the list of vectors L and the target shown in the Note at the end form the objective (all one's), incidence matrix A of vectors, animals and the right hand of the constraint equations rhs derived from the target and run the linear program shown.

If a solution is found then we add a constraint that will eliminate it in the next iteration by insisting that at least one of its zeros be one. We iterate 5 times (i.e. up to 5 solutions) or until we can find no more solutions.

We show a solution using the lpSolveAPI package and then in the section after that repeat it using the CVXR package.

lpSolveAPI

library(lpSolveAPI)

animals <- sort(unique(unlist(L)))
A <- +outer(animals, L, Vectorize(`%in%`))
rownames(A) <- animals
nr <- nrow(A)
nc <- ncol(A)

rhs <- rownames(A) %in% target

lp <- make.lp(nr, nc)
set.objfn(lp, rep(1, nc))
for(i in 1:nr) add.constraint(lp, A[i, ], "=", rhs[i])
for(j in 1:nc) set.type(lp, j, type = "binary")

soln <- solns <- NULL
for(s in 1:5) {
  if (!is.null(soln)) add.constraint(lp, 1-soln, ">=", 1)
  if (solve(lp) != 0) break
  soln <- get.variables(lp)
  solns <- c(solns, list(names(L)[soln == 1]))
}
solns   
## [[1]]
## [1] "e" "f"
##
## [[2]]
## [1] "b" "d" "e"

CVXR

An alternative to lpSolve is CVXR. We use nc, A and rhs from above. Below we find up to 5 solutions.

library(CVXR)

x <- Variable(nc, boolean = TRUE)
objective <- Minimize(sum(x))
constraints <- list(A %*% x == matrix(rhs))

solns <- soln <- NULL
for(i in 1:5) {
  if (!is.null(soln)) constraints <- c(constraints, sum((1 - soln) * x) >= 1)
  prob <- Problem(objective, constraints)
  result <- solve(prob)
  if (result$status != "optimal") break
  soln <- result$getValue(x)
  solns <- c(solns, list(names(L)[soln == 1]))
}
solns
## [[1]]
## [1] "e" "f"
##
## [[2]]
## [1] "b" "d" "e"

Note

L <- within(list(), {
  a <- c("giraffe", "dolphin", "pig")
  b <- c("elephant" , "pig")
  c <- c("zebra","cobra","spider","porcupine")
  d <- c("porcupine")
  e <- c("spider","cobra")
  f <- c("elephant","pig","porcupine")
})
L <- L[order(names(L))]
target<- c("elephant" , "pig","cobra","spider","porcupine")

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...