Ra Timing Tests
The table below compares run times for Ra 1.0.6 with those for R 2.6.2.
(Note: this will be updated for the latest version of Ra shortly.)
TestName                      N    R262  ----Ra106/R262-----  jit2/jit1
                                   secs  jit=0  jit=1  jit=2
convolve                   1600    36.1   0.71   0.03   0.03        0.9
base/TAOCP.R                 80    19.3   0.73   0.06   0.03        0.5
looped.dnorm             800000    10.3   1.06   0.08   0.08        1.0
ROCR/auc                2000000    23.9   0.91   0.07   0.07        1.0
dd.for                      667   121.0   0.97   0.85   0.83        1.0
dd.for.prealloc             667    87.9   0.75   0.07   0.06        0.8
dd.for.tabulate             667    85.2   0.74   0.05   0.03        0.7
dd.fast                     667     3.3   0.97   0.97   0.97        1.0
dd.fast.tabulate            667     0.9   1.00   1.01   1.02        1.0
while x <- x + 1        4000000    10.5   0.79   0.06   0.06        1.0
while x <- x + 1i       4000000    10.0   0.83   0.85   0.84        1.0
repeat x <- x + 1       4000000    12.7   0.85   0.09   0.09        1.0
repeat x <- x + 1i      4000000    15.8   0.87   0.88   0.88        1.0
vadim1 1               20000000     5.5   0.56   0.14   0.14        1.0
vadim2 i               20000000     5.7   0.62   0.52   0.53        1.0
vadim3 i-1             20000000    20.9   0.82   0.07   0.08        1.1
add1 x <- x + 1        20000000    26.8   0.90   0.08   0.09        1.1
vadim4 x[i-1]          20000000    37.9   0.85   0.07   0.07        1.0
vadim5 x[i] <- 1.0     20000000    82.9   0.92   0.03   0.03        1.0
vadim6 x[i] <- x[i-1]  20000000   184.6   0.63   0.03   0.03        1.0
For the specifics of each test see the R code here. In summary:
convolve         Convolution example in the "Extending R" manual.
base/TAOCP.R     From the R base library. Integer arithmetic.
looped.dnorm     From one of Luke Tierney's bytecode compiler documents.
ROCR/auc         From the ROCR package.
                 Has a for loop with real arithmetic and subscripts.
dd.*             Distribution of a determinant, from Venables and Ripley,
                 "S Programming", p. 154. Has nested for loops.
                 Several variants of the routine are compared.
while*           Compares jitted and non-jittable while loops.
                 The "x + 1i" loop is not jittable because its loop
                 condition and body are complex (not real), but
                 note that Ra is nonetheless faster than R.
repeat*          The same, but for repeat loops.
vadim*           From Vadim Ogranovich's post to r-devel.
                 The tests are listed in order of execution time on R 2.6.2.
                 "add1" is an extension to Vadim's tests.
                 The biggest gains are seen when subscripted assignments
                 are jitted. Note the improvements in speed even when
                 jitting is not enabled.
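To make the loop shapes concrete, here is a minimal sketch of the kind of loop the "vadim6 x[i] <- x[i-1]" row exercises: a subscripted assignment inside a for loop. It is not the actual test code (that is linked above). The function name `fill_prev` is hypothetical, and the fallback definition of `jit` is an assumption added only so that the snippet also runs under a standard R without Ra, where it has no effect on speed.

```r
# Fallback so the sketch runs where Ra's jit() is not available.
# Assumption: under plain R this stub does nothing, so no speedup results.
if (!exists("jit", mode = "function")) {
    jit <- function(jit = 1, ...) invisible(0L)
}

# Hypothetical example in the spirit of "vadim6 x[i] <- x[i-1]".
fill_prev <- function(n) {
    jit(jit = 1)             # the same call that the jit=1 column refers to
    x <- double(n)           # preallocate the result vector
    x[1] <- 1.0
    for (i in 2:n)           # requires n >= 2
        x[i] <- x[i - 1]     # subscripted assignment with real values
    x
}

x <- fill_prev(1000)
```

Changing the argument to `jit(jit = 0)` or `jit(jit = 2)` would correspond to the other two ratio columns in the table.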
The columns are defined as follows:
N            Number of loop iterations.
R262         Time in seconds for R 2.6.2.
jit=0        Ra 1.0.6 with jit(jit=0) divided by R 2.6.2 time.
             Shows speed improvement of Ra without jitting, relative to R.
             Smaller time ratios mean that Ra is faster.
jit=1        Ra 1.0.6 with jit(jit=1) divided by R 2.6.2 time.
             Arguably the most important column.
jit=2        Ra 1.0.6 with jit(jit=2) divided by R 2.6.2 time.
             Note that jit=2 is currently experimental.
jit2/jit1    Ra 1.0.6 with jit(jit=2) divided by Ra 1.0.6 with jit(jit=1).
             Shows improvement with experimental extra optimization on "for" loops.
             Only affects nested "for" loops.
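As a worked reading of these columns, the sketch below takes a few ratios from the table and turns them into "times faster" figures, then recomputes the jit2/jit1 column. The data frame and its column names are illustrative only. Because the published ratios are rounded to two decimals, recomputing jit2/jit1 from them can differ slightly from the table for some rows (convolve, for example, is listed as 0.9).

```r
# Ratios copied from the table (Ra 1.0.6 time divided by R 2.6.2 time).
ratios <- data.frame(
    test = c("looped.dnorm", "base/TAOCP.R", "dd.fast", "vadim6"),
    jit0 = c(1.06, 0.73, 0.97, 0.63),
    jit1 = c(0.08, 0.06, 0.97, 0.03),
    jit2 = c(0.08, 0.03, 0.97, 0.03)
)

# A ratio r means Ra takes r times as long as R, so Ra is 1/r times faster.
ratios$speedup.jit1 <- round(1 / ratios$jit1, 1)

# The last table column is the jit=2 ratio divided by the jit=1 ratio.
ratios$jit2.over.jit1 <- round(ratios$jit2 / ratios$jit1, 1)

print(ratios)
```

For instance, the jit=1 ratio of 0.03 for vadim6 corresponds to Ra running roughly 33 times faster than R 2.6.2 on that test, while 0.97 for dd.fast means the two are essentially level.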
The standard deviation for all the time ratios is less than 5%. The measurements were made on a 3 GHz Pentium D running Windows XP.
The code that generated the results above is listed here and is also included in the Ra sources. This code was run for each of the configurations in the table. The relative times were then calculated manually, using the R 2.6.2 time as the reference.
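For readers who want to take comparable measurements themselves, the following is a minimal sketch of a timing loop built on base R's `system.time`. It is not the script referred to above. The helper `time_runs` and the stand-in workload `work` are hypothetical names, and the repetition count is an arbitrary choice.

```r
# Hypothetical helper: call f() several times, return elapsed seconds per run.
time_runs <- function(f, reps = 5) {
    vapply(seq_len(reps),
           function(k) system.time(f())[["elapsed"]],
           numeric(1))
}

# Stand-in workload (not one of the tests in the table).
work <- function() {
    x <- 0
    for (i in 1:1000000) x <- x + 1
    x
}

elapsed   <- time_runs(work)
mean.secs <- mean(elapsed)

# Relative spread of the runs; only meaningful if the runs are long enough
# for the timer resolution, so guard against a zero mean.
rel.sd <- if (mean.secs > 0) sd(elapsed) / mean.secs else NA_real_
```

Running the same workload under each interpreter and dividing the mean times gives a ratio of the kind shown in the table.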
The loop count N is large for all the above tests. Readers interested in results with lower loop counts should look here. For results on previous versions of Ra, look here.
The tests tend to show the jitter in a good light. This is because they make heavy use of arithmetic with vectors in loops, do not call C or Fortran routines, and use a high number of loop iterations. Let me know if there are other tests you would like to see.