
I have always thought that maintaining a code library is a great way to keep up your chops in a language you do not use every day. A lot of people use listservs and coding fora for this, but code libraries have the advantage that over time you learn a lot about one particular topic or subject.

Recently my interest has turned to nonparametric statistical techniques, and for the most surprising reason: nonparametric estimators can be visualized. Something about a smooth effects curve flanked by 95% confidence intervals has stuck in my imagination for a few months now. I have been plotting to start a new code library, but cannot make up my mind as to what programming language it should be in.
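To make the picture concrete, here is a minimal sketch of the kind of curve I have in mind. It is written in Python purely for illustration and uses only the standard library. The estimator (Nadaraya-Watson with a Gaussian kernel), the function name nw_fit, the bandwidth, and the simulated sine data are all my own assumptions for this example, not a settled design. The band is the usual pointwise asymptotic one and ignores smoothing bias.

```python
import math
import random

def nw_fit(x, y, grid, h):
    """Nadaraya-Watson regression with a pointwise 95% band.

    Hypothetical helper for illustration: Gaussian kernel, fixed bandwidth h,
    asymptotic standard errors that ignore smoothing bias.
    Returns a list of (estimate, lower, upper) tuples, one per grid point.
    """
    n = len(x)
    roughness = 1.0 / (2.0 * math.sqrt(math.pi))  # integral of K(u)^2 for the Gaussian kernel
    out = []
    for g in grid:
        # Kernel weight of every observation at this grid point.
        k = [math.exp(-0.5 * ((g - xi) / h) ** 2) / math.sqrt(2.0 * math.pi) for xi in x]
        k_sum = sum(k)
        m_hat = sum(ki * yi for ki, yi in zip(k, y)) / k_sum  # local weighted mean
        f_hat = k_sum / (n * h)                               # kernel density estimate of x
        # Local (kernel-weighted) residual variance.
        sigma2_hat = sum(ki * (yi - m_hat) ** 2 for ki, yi in zip(k, y)) / k_sum
        se = math.sqrt(roughness * sigma2_hat / (n * h * f_hat))
        out.append((m_hat, m_hat - 1.96 * se, m_hat + 1.96 * se))
    return out

# Simulated data: a sine curve plus Gaussian noise (assumed, for illustration only).
random.seed(0)
x = [random.uniform(0.0, 2.0 * math.pi) for _ in range(300)]
y = [math.sin(xi) + random.gauss(0.0, 0.3) for xi in x]
grid = [0.5 + 0.1 * i for i in range(53)]  # stays away from the boundaries
band = nw_fit(x, y, grid, h=0.3)
```

Plotting the three columns of band against grid gives the smooth curve with its flanking interval. A real library would also need bandwidth selection and some boundary correction, which this sketch leaves out.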

Here are the choices I considered, and the relative pros and cons.

  1. Matlab/Octave: Matlab is proprietary, and given that I will not be in academia in the near future, this could be a short-lived choice. But coming from a programming language that is an also-ran when measured by number of users, I find the large reach enticing. Plus, Matlab ages slowly, so current versions are good for at least a few years. Matlab comes with excellent optimization libraries, and I aim to explore those carefully should I write the library in Matlab. Matlab has great graphics, which is important for nonparametric statistics. The main con of Matlab is that it is slow, and I am not sure how it scales to large data.
    My idea is to first write a pure Matlab version of the toolbox, as libraries are called in Matlab, and then fold in MEX code as I (hopefully) get better at it.
    The Octave implementation is nice and also has a MEX-like system. I think the problem with Octave is that the quality of the toolboxes drops off fairly quickly.
  2. Python: I have only recently begun using Python. Computer scientists love it, and it is open source. Some economists have recently begun to write econometrics code for Python, most notably John Stachurski at ANU, who provides a fully featured advanced undergraduate textbook in econometrics with Python code examples. The code I have come across and attempted to write has turned out very neat, and this is important to me. The problem is that a project very much like the one I had in mind is already being written in Python.
    Another point in favor of Python is that it has great IDEs, including hooks for Visual Studio, which I have recently discovered.
  3. R: Now you’d think this would be my first choice to write a statistics package. But I don’t like R (yet). I find the syntax unreadable, and the way functions are scattered across the vast number of packages impossible to keep track of. R is also (very) slow, although with C++ code folded in apparently it can be made much faster. I don’t know.
    I have an ongoing project where I am translating the empirical examples from Wooldridge’s book to R, but I think that R is best suited as a scripting language that leverages the work of others in providing pre-packaged routines.
    I have access to the Revolution Analytics repackaging RevoR, and I like the IDE, but I remain unconvinced. R does have a fantastic library for nonparametric econometrics, the np package.

It is worthwhile spending some time making sure that the language chosen suits the project, because while switching is possible, it becomes less likely over time, which makes dynamically inefficient equilibria likely.


The latest buzz in technical computing circles seems to be about the language Julia. A language aimed to be a crossover between R and Matlab, it has been generating a lot of interest around the interwebz, for example here, here, here, and here, and those are just the ones that showed up in my Google Reader.

However, Julia has only been available for a few platforms till now: OS X, FreeBSD, and Linux. Last week, a Windows binary was made available, and I decided that this was too much temptation to resist.

The steps to writing your first program and running it under Windows 7 (64-bit) are simple. Download the tarball, unpack it, and run the “julia.bat” file for interactive mode, or write your .jl programs for non-interactive mode.

Here is a simple example:

# filename: first.jl
# purpose: first program in Julia
# author: informationmatrix
# date revised: 15th august 2012

dA = 2+2
println(dA)
sH = "Hello World!"
println(sH)

Run this program on the command line using:

julia "./julia-examples/first.jl"

And that is it.

A lot is being made of the fact that Julia’s just-in-time compiler makes striking speedups possible, and I hope to explore those speedups for econometric problems in the near future. By the way, the resemblance of the code to Matlab is pretty striking, but we are asked to ignore the similarity.

I have days where my typing gets so bad that I am reduced to two-finger typing. It is very off-putting. I am not a touch typist, but I have gotten to be a fairly fast typist over the years. Still, some days, it feels like the keyboard and I are not friends.