
Ra Timing Tests

Here is a table of timing tests which compare Ra 1.3.1 to R 2.13.1.
                                    vanilla    compiled      jit=1
TestName                        N      secs       ratio      ratio

convolve                     1600     20.20         4.0       22.4
otago.wrapper                2000     14.36         3.6       12.3
base/TAOCP.R                   80      8.89         3.2       11.7
looped.dnorm               800000      5.64         3.4       14.0
ROCR/auc                  2000000     14.29         4.2       14.7

dd.for.c.wrapper              667     83.32         1.1        1.0
dd.for.prealloc.wrapper       667     60.42         1.9        2.4
dd.for.tabulate.wrapper       667     37.50         3.8       15.3
dd.fast.wrapper               667     71.39         1.0        1.0
dd.fast.tabulate.wrapper      667      0.49         1.0        1.0

while  x <- x + 1         4000000      5.90         4.9       11.0
repeat x <- x + 1         4000000      7.86         6.3       10.0
for.if                      20000      0.78         1.1        0.9
while  x <- x + 1i        4000000      5.87         2.1        1.0
repeat x <- x + 1i        4000000      9.99         3.6        1.0

vadim1 1                 20000000      0.92         0.3        1.6
vadim2 i                 20000000      1.33         0.6        0.6
vadim3 i-1               20000000      9.60         2.3        9.6
add1   x <- x + 1        20000000     15.08         2.9       11.5
vadim4 x[i-1]            20000000     18.08         2.6        8.8
vadim5 x[i] <- 1.0       20000000     52.31         4.8       27.2
vadim6 x[i] <- x[i-1]    20000000     76.15         5.0       23.4
x[i,1]                   10000000     12.78         1.2        8.1

luke.la1.wrapper         20000000      0.98         3.0        1.1
luke.la2.wrapper         20000000      1.14         3.3        1.1

dirk1                        5000      0.89         1.2        0.9
dirk2                      300000      8.76         4.8       26.6

The columns are defined as follows:
N               Number of loop iterations

vanilla secs    Time in seconds for standard Ra 1.3.1 code.
                Often a little less than the R 2.13.1 time (not shown).

compiled ratio  Vanilla time divided by the time after compiling the code
                with Luke Tierney's bytecode compiler.
                Large numbers mean that the compiler is doing well.

jit=1 ratio     Vanilla time divided by the time after jit-compiling the code.
                Large numbers mean that the jitter is doing well.
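
As a rough illustration of how a ratio like those in the "compiled" column can be obtained, the sketch below times a function before and after byte-compiling it with cmpfun from the compiler package. The names looped.add, elapsed and time.ratio are made up for this illustration and are not part of the test suite, and the jit=1 column would additionally need Ra's jitter, which is not shown. Recent versions of R byte-compile functions automatically, so on those the ratio will tend to be close to 1.

```r
library(compiler)   # provides cmpfun(); shipped with R since 2.13.0

# Hypothetical test function: N iterations of scalar arithmetic in a loop.
looped.add <- function(N) {
    x <- 0
    for (i in 1:N)
        x <- x + 1
    x
}

# Elapsed wall-clock seconds for one call of f(N).
elapsed <- function(f, N) system.time(f(N))[["elapsed"]]

# Ratio in the sense of the table: vanilla time divided by compiled time,
# so values above 1 mean the compiled version is faster.
time.ratio <- function(f, N) {
    vanilla.secs  <- elapsed(f, N)
    compiled.secs <- elapsed(cmpfun(f), N)
    vanilla.secs / compiled.secs
}

ratio <- time.ratio(looped.add, 2000000)
print(ratio)
```

A single run like this is noisy; repeating the measurement and averaging gives a steadier ratio.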


For the specifics of each test see the R code here. In summary:
convolve      Convolution example in the "Extending R" manual.

otago.wrapper From Ross Ihaka's Otago talk
              (the "straightforward" coding, with large loop counts).

base/TAOCP.R  From R base library. Integer arithmetic.

looped.dnorm  From one of Luke Tierney's bytecode compiler docs.

ROCR/auc      From the ROCR package.
              Has a for loop with real arithmetic and subscripts.

dd.*          Distribution of determinant from Venables and Ripley, "S Programming", p. 154.
              Has nested for loops.
              Some variants of the routine are compared.

while*        Compare jittable and non-jittable while loops.
              The "x + 1i" loop is not jittable because it has a
              complex (not real) loop condition and body, but
              note that Ra is nonetheless faster than R.

repeat*       Same, but with repeat loops.

vadim*        From Vadim Ogranovich's post to r-devel.
              They are listed in order of execution time on R 2.13.1.
              "add1" is an extension to Vadim's tests.
              The biggest gains are seen when subscripted assignments
              are jitted. Note the improvements in speed even when
              jitting is not enabled.


luke.la*      Examples from Luke's compiler help page (with large loop counts).
              For this kind of code the compiler outperforms the jitter.

dirk*         Examples from Dirk's blog.
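
For orientation, the loop shapes named in the while*, repeat* and vadim* rows look roughly like the functions below. These are reconstructions from the row labels only, not the actual test code, and the function names are invented for this sketch. In particular, the form of the complex loop condition is an assumption. The call that enables jitting in Ra is left out so that the code also runs under standard R.

```r
# "while x <- x + 1": real (double) arithmetic, the jittable case.
real.while <- function(N) {
    x <- 0
    while (x < N)
        x <- x + 1
    x
}

# "while x <- x + 1i": complex arithmetic, the non-jittable case.
# Assumption: complex values cannot be ordered with "<" in R, so this
# sketch tests the imaginary part; the real test may be written differently.
complex.while <- function(N) {
    x <- 0i
    while (Im(x) < N)
        x <- x + 1i
    x
}

# "repeat x <- x + 1": the same work expressed as a repeat loop.
real.repeat <- function(N) {
    x <- 0
    repeat {
        x <- x + 1
        if (x >= N) break
    }
    x
}

# "x[i] <- x[i-1]": subscripted read plus subscripted assignment,
# the pattern for which the table shows the largest gains.
copy.subscript <- function(N) {
    stopifnot(N >= 2)       # 2:N would count downwards for N < 2
    x <- double(N)          # preallocated vector of zeros
    x[1] <- 1
    for (i in 2:N)
        x[i] <- x[i - 1]
    x
}
```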


The measurements were made with a 32-bit Ra build on an Intel Q 820 1.73 GHz machine running 64-bit Windows Vista. The standard deviation of each time ratio is less than 5%, except for the luke.la* tests, where it is larger for reasons not yet understood.

The loop count N is large for all the above tests. Readers interested in results with lower loop counts (on an older version of Ra) should look here.

The tests are somewhat skewed in that they tend to show the jitter in a good light. This is because they make heavy use of arithmetic with vectors in loops, mostly do not call C or Fortran routines, and use a high number of loop iterations.
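
To make this caveat concrete, compare an explicit loop with its vectorized equivalent. The loop form is the style the tests above favour. The vectorized form does its arithmetic inside R's compiled C code, so one would expect a jitter or compiler to have little left to speed up there. Both functions are invented for illustration and are not among the tests.

```r
# Explicit loop: element-by-element arithmetic evaluated by the interpreter.
sumsq.loop <- function(x) {
    total <- 0
    for (i in seq_along(x))
        total <- total + x[i] * x[i]
    total
}

# Vectorized equivalent: the multiplication and the sum each run in C.
sumsq.vec <- function(x) sum(x * x)

x <- as.double(1:1000)
stopifnot(isTRUE(all.equal(sumsq.loop(x), sumsq.vec(x))))
```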

