Monthly Archives: August 2012

I have always thought that maintaining a code library is a great way to keep up your chops in a language you do not use every day. A lot of people use listservs and coding fora for this, but code libraries have the advantage that over time you learn a lot about one particular topic.

Recently my interest has turned to nonparametric statistical techniques, and for the most surprising reason: nonparametric estimators can be visualized. Something about a smooth effects curve flanked by 95% confidence intervals has stuck in my imagination for a few months now. I have been plotting to start a new code library, but cannot make up my mind as to what programming language it should be in.

Here are the choices I considered, and the relative pros and cons.

  1. Matlab/Octave: Matlab is proprietary and, given that I will not be in academia in the near future, this could be a short-lived choice. But coming from a programming language that is an also-ran when assessed by number of users, the large reach is enticing. Plus, Matlab ages slowly, so current versions are good for at least a few years. Matlab comes with excellent optimization libraries, and I aim to explore those carefully should I write the library in Matlab. Matlab has great graphics, and this is important for nonparametric statistics. The main con of Matlab is that it is slow, and I am not sure how it scales to large data.
    My idea is to first write a pure Matlab version of the toolbox, as libraries are called in Matlab, and then fold in MEX code as I (hopefully) get better at it.
    The Octave implementation is nice and also has a MEX-like system. I think the problem with Octave is that the quality of the toolboxes drops off fairly quickly.
  2. Python: I have only recently begun using Python. Computer scientists love it, and it is open source. Some economists have recently begun to write econometrics code for Python, most notably John Stachurski at ANU, who provides a fully featured advanced undergraduate textbook in econometrics with Python code examples. The code I have come across and attempted to write has turned out very neat, and this is important to me. The problem here is that a project very much like the one I had in mind is already being written in Python.
    Another point in favor of Python is that it has great IDEs, including hooks for Visual Studio, which I have recently discovered.
  3. R: Now you’d think this would be my first choice to write a statistics package. But I don’t like R (yet). I find the syntax unreadable, and the way functions are scattered around the vast numbers of packages impossible to keep track of. R is also (very) slow, although with C++ code folded in apparently it can be made much faster. I don’t know.
    I have an ongoing project where I am translating the empirical examples from Wooldridge’s book to R, but I think that R is best suited as a scripting language that leverages the work of others in providing pre-packaged routines.
    I have access to the Revolution Analytics repackaging, RevoR, and I like the IDE, but I remain unconvinced. R does have a fantastic library for nonparametric econometrics, the np package.

It is worthwhile spending some time making sure that the language chosen suits the project: switching later is possible, but it becomes less likely over time, so an early bad choice can easily lock you into a dynamically inefficient equilibrium.

For a number of reasons, hosting a blog on wordpress.com is not a very satisfactory option for me. Not only does it allow no changes to the CSS, but installation of packages is also severely limited. I have decided to move to my old domain, which I let lapse last year.

I intend to have it up and running by later today, once I have figured out which hosting plan to buy. The market for web-hosts is crowded and the product is not very well differentiated across sellers.

The workflow for today is roughly going to be:

  1. Buy web-hosting and set up website and wordpress blog.
  2. Download and compile OCaml for 64-bit Windows. This has a significant number of steps and might take a while.
  3. Code up the econometrics “Hello World” – generate some data and compute the regression coefficients, the standard errors and the t-statistics and write them to the console and to a text file.
  4. Blog about it on the new blog.
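
Step 3 is small enough to sketch in full. What follows is a minimal Python sketch, not the OCaml program planned above: it assumes a single regressor plus an intercept so that the closed-form OLS formulas suffice, uses only the standard library, and the names `simple_ols`, `format_report`, and the output file `ols_hello.txt` are illustrative choices rather than anything from the original plan.

```python
import math
import random


def simple_ols(x, y):
    """OLS of y on a constant and x; returns (coefs, std_errs, t_stats)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Residual variance with n - 2 degrees of freedom (two estimated parameters).
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    sigma2 = sum(e * e for e in resid) / (n - 2)

    # Classical (homoskedastic) standard errors.
    se_slope = math.sqrt(sigma2 / sxx)
    se_intercept = math.sqrt(sigma2 * (1.0 / n + x_bar ** 2 / sxx))

    coefs = (intercept, slope)
    ses = (se_intercept, se_slope)
    # Assumes a noisy fit: a perfect fit gives zero standard errors.
    t_stats = tuple(b / s for b, s in zip(coefs, ses))
    return coefs, ses, t_stats


def format_report(coefs, ses, t_stats):
    """Lay the estimates out as a small fixed-width table."""
    lines = ["{:<10}{:>12}{:>12}{:>12}".format("term", "coef", "std err", "t stat")]
    for name, b, s, t in zip(("intercept", "slope"), coefs, ses, t_stats):
        lines.append("{:<10}{:>12.4f}{:>12.4f}{:>12.4f}".format(name, b, s, t))
    return "\n".join(lines)


if __name__ == "__main__":
    # Generate data from y = 1 + 2x + e with standard normal x and e.
    rng = random.Random(2012)
    x = [rng.gauss(0, 1) for _ in range(200)]
    y = [1.0 + 2.0 * xi + rng.gauss(0, 1) for xi in x]

    report = format_report(*simple_ols(x, y))
    print(report)                          # write to the console
    with open("ols_hello.txt", "w") as f:  # and to a text file
        f.write(report + "\n")
```

With more than one regressor the same exercise needs a matrix inverse (or a linear solve) for the coefficient covariance, which is where a numerical library would normally come in.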

One of the advantages of having two machines running the identical OS is that when I configure one of the machines using a procedure with many steps, in addition to recording it in my notebook, I get to review the procedure while repeating it on the other machine, and so remember it better.

The latest buzz in technical computing circles seems to be about the language Julia. Aimed at being a crossover between R and Matlab, it has been generating a lot of interest around the interwebz, for example here, here, here, and here, and those are just the ones that showed up in my Google Reader.

However, Julia has only been available for a few platforms till now: OS X, FreeBSD, and Linux. Last week, a Windows binary was made available, and I decided that this was too much temptation to resist.

The steps to writing your first program and running it under Windows 7 (64-bit) are simple. Download the tarball, unzip it, and run the “julia.bat” file for interactive mode, or write your .jl programs for non-interactive mode.

Here is a simple example:

# filename: first.jl
# purpose: first program in Julia
# author: informationmatrix
# date revised: 15th august 2012

dA = 2+2
println(dA)
sH = "Hello World!"
println(sH)

Run this program on the command line using

julia "./julia-examples/first.jl"

And that is it.

A lot is being made of the fact that Julia’s just-in-time (JIT) compiler makes striking speedups possible, and I hope to explore those speedups for econometric problems in the near future. By the way, the resemblance of the code to Matlab is hard to miss, but we are asked to ignore the similarity.

I usually don’t admit to habits or dependencies, but I love coffee. It has taken me a few years to admit this to myself. That clear-headed feeling you get after chugging through a large Americano at just the right temperature is indescribable. The funny thing is that I forget this every few weeks and suffer through terrible slump days without resorting to this simple remedy.

I have days where my typing gets so bad that I am reduced to two-finger typing. It is very off-putting. I am not a touch typist, but I have gotten to be a fairly fast typist over the years. Still, some days, it feels like the keyboard and I are not friends.