Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
467 views
in Technique[技术] by (71.8m points)

r - Benchmarking: using `expression` `quote` or neither

Generally, when I run benchmarks, I wrap my statements in expression. Recently, it was suggested to either (a) not do so or (b) use quote instead of expression.

I find two advantages to wrapping the statements:

  • compared to entire statements, they are more easily swapped out.
  • I can lapply over a list of inputs, and compare those results

However, in exploring the different methods, I noticed a discrepency between the three methods (wrapping in expression, wrapping in quote, or not wrapping at all)

The question is:
Why the discrepency?
(it appears that wrapping in quote does not actually evaluate the call.)

EXAMPLE:

# SAMPLE DATA
  mat <-  matrix(sample(seq(1e6), 4^2*1e4, T), ncol=400) 

# RAW EXPRESSION TO BENCHMARK IS: 
  # apply(mat, 2, mean)

# WRAPPED EXPRESSION: 
  expr <- expression(apply(mat, 2, mean))
  quot <- quote(apply(mat, 2, mean))

# BENCHMARKS
  benchmark(raw=apply(mat, 2, mean), expr, quot)[, -(7:8)]
  #    test replications elapsed relative user.self sys.self
  #  2 expr          100   1.269       NA     1.256    0.019
  #  3 quot          100   0.000       NA     0.001    0.000
  #  1  raw          100   1.494       NA     1.286    0.021


# BENCHMARKED INDIVIDUALLY 
  benchmark(raw=apply(mat, 2, mean))[, -(7:8)]
  benchmark(expr)[, -(7:8)]
  benchmark(quot)[, -(7:8)]

  # results
  #    test replications elapsed relative user.self sys.self
  #  1  raw          100   1.274        1      1.26    0.018
  #    test replications elapsed relative user.self sys.self
  #  1 expr          100   1.476        1     1.342    0.021
  #    test replications elapsed relative user.self sys.self
  #  1 quot          100   0.006        1     0.006    0.001
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Your issue is that quote does not produce an expression but a call, so within the call to benchmark, there is no expression to evaluate.

If you evaluate the `call it will actually get evaluated, and the timings are reasonable.

class(quot)
[1] "call"
>class(expr)
[1] "expression"


 benchmark(raw=apply(mat, 2, mean), expr, eval(quot))[, -(7:8)]
        test replications elapsed relative user.self sys.self
3 eval(quot)          100    0.76    1.000      0.77        0
2       expr          100    0.83    1.092      0.83        0
1        raw          100    0.78    1.026      0.78        0

In general, I tend to create a function that contains the call / process I wish to benchmark. Note that it is good practice to include things like assigning the result to a value.

eg

 raw <- function() {x <- apply(mat, 2, mean)}

In which case it looks like that there is a slight improvement by eval(quote(...)).

benchmark(raw(), eval(quote(raw()))

                test replications elapsed relative user.self sys.self 
2 eval(quote(raw()))          100    0.76    1.000      0.75     0.01        
1              raw()          100    0.80    1.053      0.80     0.00        

But often these small differences can be due to overheads in functions and may not reflect how the performance scales to larger problems. See the many questions with benchmarkings of data.table solutions, using a small number of replications but big data may better reflect performance.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...