Archive

Uncategorized

While I try to find a new home for defunct blog, I will be blogging here.

Advertisements

Not being a gamer, I was not aware that some gaming keyboards, in particular, my recently acquired Cooler Master Quickfire Rapid I, have a Windows key lock that prevent exiting from gaming mode in case that the Windows key is pressed accidentally I use the Windows key quite frequently on Ubuntu to emulate the Windows “Snap to Sides” feature that maps Ctrl+Win+RightArrow to snap a window to the right of the screen, for example. I was not able to figure out why it was not working with my new keyboard. Turns out the Windows key lock is turned on by default, and can be turned off using Fn+PrtSc.

I spent some amount of time trying to figure out if I could get Google Analytics to track the traffic on my WordPress.com account, but it seems that this is only a feature available using WordPress Business plan, or if I migrate the site to WordPress.org.


I also spent some time figuring out how to set up a new account and property within a Google Analytics account. And then there is the question of configuring useful views of the data within a property (Account > Property > Views). For now, I have created two accounts within my GA account, one for my main site and the other for all the blogs I have on services such as tumblr and this site on WordPress.com.

Spark is the flavour of the season when it comes to distributed computing frameworks, and I have been caught up in the excitement. I have narrowed down the set of Spark resources to three, which I am going to try and use over the next 3 months to try and learn Spark:

  • The first is Advanced Analytics with Spark, which is hot off the presses, but has already gotten very good reviews on Amazon. Not surprising given that Sean Owen is one of the authors.
  • The second, of which I have already the first few chapters, but intend to systematically re-read over the next month, is Learning Spark. I must admit, I am somewhat intrigued by how much Matei Zaharia has managed to achieve at (what I am guessing is) a relatively young age.
  • Lastly, there is a brand new edX.org course on Spark that has just started. It is a little disappointing that the course uses Python 2.7 and Spark 1.3, instead of moving to the impending release of Spark 1.4, which also supports SparkR and Python 3.4. I think that the Spark guys are holding off on releasing the latest version of Spark till the Spark summit later this month.

In any case, Spark is an important new technology for data analysis, and significant improvement over the disk-only storage model of Hadoop MapReduce. That is not to say that Spark is the only in-memory distributed computing framework that can do ad hoc querying, machine learning, and graph processing on big data — there is the newly promoted to top-level project Apache Flink, but Spark definitely appears to have a significant head start. I look forward to learning more about Apache Spark.

It has been a long time since I blogged here, so here is a quick update:

  • Coffee is still as important to me as it was, but I no longer forget to consume it as part of my daily routine.
  • I still struggle with typing. I have tried several different keyboards over the past 3 years, including the Microsoft Ergonomic 4000 (of which I have gone through 3), the Cooler Master Rapid-i tenkeyless, which I got recently, and the Logitech K400r.
  •  I have been constantly reminded of Julia sporadically over the years, as people experiment with it, and as the tooling and support for Julia has gotten better over. The second JuliaCon is going to be held in Cambridge, Mass this month. I have not experimented with Julia since the time the last blog post mentioning Julia was written on this blog.
  • I still struggle with the pros and cons of multiple computing devices, and over the last 3 years, smartphones have only added to that struggle. I now own 4 (!) laptops, 2 mobile phones, and a (defunct) desktop PC. 3 of the 4 laptops now run Ubuntu (14.04, 14.10 and 15.04 (!)), and the other one runs Windows 7. One of my phones runs Android Lollipop, and the other runs Windows 8.1 (looking forward to the Windows 10 update).
    I constantly forget which machines I have updated and which ones contain a particular piece of software. I have myriad unwritten rules about which machines are to be used for what, but most of those are compromised in favor of just doing what is most convenient.
  • I did buy the domain, and set up a new blog on that page, but that blog never contained anything interesting at all, and I eventually shut it down, but not before it was auto-renewed for a year, and I vowed to use it for something productive, but didn’t.
  • I never got anywhere with OCaml or F# or any of the other functional programming languages that I wanted to learn at that point. I don’t think I ever tried after writing that blog post.
  • I did end up investing a fair amount of effort in learning how to, and maintaining a couple of R and Python package projects. They are not terribly complex, but it was helpful to understand the process of package creation, documentation, and distribution. For those who are looking to create R packages, even though RStudio makes packaging R code extremely simple, might benefit from Hadley Wickham’s excellent new book R Packages.
  • I completely abandoned trying to work with Matlab or Octave since I discovered NumPy and I even thought about translating the best known example of the use of Matlab — Andrew Ng’s machine learning course on Coursera —  using NumPy and SciPy, but have not gotten around to it yet.
  • I would say that I am better at R than I used to be. I will leave it at that. I did manage to do a lot of work that integrates C++ code with R using Rcpp.
  • I never took up the work on understanding reinforcement learning and dynamic programming, even though I have had problems that I could potentially solve using those methods.

I have some idea of dynamic programming problems, based on my graduate level macroeconomics courses. But what I am trying to figure out are the kinds of machine learning problems that DP is most effective in providing insights into, and the kinds of insights that DP can provide. 

I aim to collect some simple examples over the next few weeks that demonstrates the things that you can get out of setting up problems as reinforcement learning or DP problems. Stay tuned.