I just tried the same code in R’s data.table and Julia’s DataFrames, and the results are a bit surprising.
I just did a quick analysis of volatility and returns, starting with my usual program — R.
It involves a large data set, CRSP (979M compressed), kept in a data.table. Since I have been thinking about using Julia more, this seemed like a good experiment.
I initially used daily data, and because the sample has 33,617,369 rows and six columns, it is quite representative of the work I do.
I’ve made comparisons with Julia before: “Is Julia ready for prime time?” and “Which numerical computing language is best: Julia, MATLAB, Python or R?”.
In any case, I used R version 3.6.0 and Julia 1.4.1. In case anybody complains, yes, this is not the latest version of R, but it’s such a pain to upgrade it on all my systems that I never managed to do it. Besides, it shouldn’t make any difference here.
Anyways, here are the 2 main calls in the two languages:
The R code looks more readable, which is unusual, as Julia’s code is typically much better looking.
One core only, and Julia took 2.8 seconds, R 3.1 seconds. That did surprise me. data.table is supposed to be quite fast. There is a benchmark site that regularly compares such data operations, finding that R’s data.table is several times faster than Julia’s DataFrames in most cases. I can’t explain the difference, but I did only use one core.
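For readers who want to try something similar, here is a minimal sketch in Julia of the kind of grouped calculation being timed. It is not the code behind the numbers above: the data are synthetic rather than CRSP, and the column names `permno` and `ret` and the helper `by_security` are hypothetical.

```julia
using DataFrames, Statistics, Random

Random.seed!(1)

# Synthetic stand-in for a daily returns panel (NOT the CRSP data).
# `permno` and `ret` are hypothetical column names.
n = 1_000_000
df = DataFrame(permno = rand(1:5_000, n), ret = 0.01 .* randn(n))

# Mean return and volatility (standard deviation) per security.
by_security(d) = combine(groupby(d, :permno),
                         :ret => mean => :mean_ret,
                         :ret => std  => :vol)

# Warm-up call on a small slice, so compilation is not part of the timing.
by_security(first(df, 1_000))

# Time the grouped calculation on the full table.
elapsed = @elapsed res = by_security(df)

println("threads: ", Threads.nthreads(), ", seconds: ", round(elapsed, digits = 3))
```

Printing `Threads.nthreads()` makes it easy to confirm that the run really used a single thread when comparing against a single-threaded R run.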
But, if I take the total time, including loading the data in, timed by:
R took 11.7 seconds and Julia 29.7 seconds. The reason is, of course, it takes forever to load Julia packages.
When I google such timing results, the answer is usually it doesn’t matter because one starts Julia once. Once in the REPL, everything is fast.
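To see where the time goes, one can time the package load, the first call, and a repeat call separately. The following is only a sketch under my own assumptions: the data are synthetic, and the names `id`, `x`, `summarise`, and `script.jl` are hypothetical.

```julia
# Run as a script (e.g. `julia script.jl`) so that package loading is included.
t0 = time()
using DataFrames, Statistics
t_load = time() - t0            # seconds spent loading the packages

# Small synthetic table: 1,000 groups of 100 observations each.
df = DataFrame(id = repeat(1:1_000, inner = 100), x = randn(100_000))

# Standard deviation of x within each group.
summarise(d) = combine(groupby(d, :id), :x => std => :vol)

t_first  = @elapsed summarise(df)   # first call: includes JIT compilation
t_second = @elapsed summarise(df)   # second call: runs already compiled code

println("load: ", t_load, "  first call: ", t_first, "  second call: ", t_second)
```

In a long REPL session only the last number matters, but a command line call pays for the first two on every run.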
Fair enough, except I have quite a bit of code that only runs in command line calls.
Besides, waiting a third of a minute before the program has loaded is quite a pain.
It’s a good thing I didn’t try to plot in Julia. Not only does using Plots take a long time, plot() reliably crashed on me, so badly I had to do killall julia and then also kill the plot window.
I am porting my main risk library from R to Julia, so I may end up using it for regular work, like updating extremerisk.org daily, especially if the startup times improve.
© All rights reserved, Jon Danielsson, 2020