Hi there! My name is Herb Susmann. I'm a Data Scientist at Silent Spring Institute. On the side I'm interested in data journalism, motion design and fortune cookies.

Optimizing R for Big Data

I was fortunate enough to be able to present at UP-Stat 2014 on some of the things I've learned about writing performant R code while I was working on speeding up rlme. The talk seemed to be a big hit - I was awarded "Best Student Presentation"!

Until I have time to write up my talk, here are my presentation slides:

Programmer Networks the Social, Hacks the Computational

Each week the student newspaper here on campus (The Lamron) puts together a piece, the “Invasion of Privacy”, on a member of the school community. I was chosen to be invaded a few weeks ago. Check out the result!

Programmer Networks the Social, Hacks the Computational

Photo credit: Evan Goldstein, Assoc. Photo Editor, The Lamron.

Raytracing in Bash

Last semester I took an introductory course in raytracing.

We practiced an iterative development cycle in which we built up more and more complex ray tracers over the course of the semester. The very first ray tracer was pretty simple: it had to be able to intersect rays with simple geometric objects and display the results, but there didn’t have to be any lighting calculations or anything yet.

Once I figured out the assignment in OCaml, I decided to give it a shot entirely in Bash!

(Well, not ENTIRELY in Bash. I will admit I shelled out to bc for floating point operations. A friend pointed out you could do floating point in Bash by having separate variables for the integer and decimal components, but that’ll have to wait for version two.)
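To make the two options concrete, here is a minimal sketch of both. The function names and the scale factor of 1000 are made up for this example and are not taken from raytracer.sh. The pure-Bash version stores each number as a single scaled integer (fixed point), which is a simpler cousin of the separate-variables idea rather than an implementation of it.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not taken from raytracer.sh.

SCALE=1000  # assumed fixed-point scale: 1.5 is stored as 1500

# Multiply two scaled integers and rescale the product.
fmul_fixed() {
    echo $(( $1 * $2 / SCALE ))
}

# Convert a scaled integer back to a decimal string (non-negative values only).
to_decimal() {
    printf '%d.%03d\n' $(( $1 / SCALE )) $(( $1 % SCALE ))
}

# The bc route: hand the expression to bc and let it do the floating point work.
fmul_bc() {
    echo "$1 * $2" | bc -l
}
```

For example, `to_decimal "$(fmul_fixed 1500 2250)"` multiplies 1.5 by 2.25 using integer arithmetic only, while `fmul_bc 1.5 2.25` asks bc to do the same thing. The fixed-point version truncates anything past three decimal places, so it trades precision for not needing an external program.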

It prints out the raytraced image directly to the console using special unicode characters and coloring through escape codes. Here’s what the result looks like:

Bash Ray Tracer

Not exactly pretty, but it works! (If only there were a way to change the line spacing in gnome-terminal.)
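The console output trick can be sketched in a few lines. This is not the code from raytracer.sh: the `pixel` helper is hypothetical, and it assumes a terminal that understands 24-bit color escape codes and can display the Unicode full block character.

```shell
# Hypothetical helper: print one "pixel" as a colored full block character.
# Arguments: red, green, blue, each in the range 0-255.
# \033[38;2;R;G;Bm sets the foreground color; \033[0m resets it afterwards.
pixel() {
    printf '\033[38;2;%d;%d;%dm█\033[0m' "$1" "$2" "$3"
}

# Draw a small 8x4 test gradient, one row of pixels per line.
for y in 0 1 2 3; do
    for x in 0 1 2 3 4 5 6 7; do
        pixel $(( x * 36 )) $(( y * 85 )) 128
    done
    printf '\n'
done
```

A ray tracer would call something like `pixel` once per ray with the color of whatever the ray hit, and print a newline at the end of each scanline.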

Here are the results from a scene rendered with my OCaml raytracer, for comparison:

Bash Ray Tracer

Feel free to download the script: raytracer.sh.

Sending SMS from the Command Line with Mutt & Gmail

We have a genetic algorithm at work that takes quite a while to run. I set it up right before I left work to run while I was gone for the night, but wanted a simple way to know when it was done. In twenty minutes I hacked together a simple way to get a text message when the program finished running.

The idea is to use Mutt to send an email to the email-to-SMS gateway that my cell phone service provider runs. (Wikipedia has a full list of all the gateways.)

Setting up Mutt with Gmail

First I needed to set up Mutt to access my Gmail account. Fortunately, the Ubuntu Help Wiki has a tutorial on how to do just that.

Sending an SMS

Once Mutt is set up, sending an SMS is dead simple. My cell phone provider is Verizon; to send an SMS you simply email the 10-digit phone number at vtext.com:

  
    $ echo "This is a text!" | mutt 5553423218@vtext.com
  

To get a text message once my program was done running, I just ran

  
    $ ./genetic-algorithm && echo "Algorithm completed" | mutt 5553423218@vtext.com
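One caveat with `&&` is that the text only goes out if the program exits successfully. A slightly more defensive variant could look like the sketch below. The `notify_when_done` function and the `SMS_ADDRESS` and `MAILER` variables are names invented for this example; the address is the same placeholder number used above.

```shell
# Hypothetical wrapper: run a command, then text its outcome either way.
# SMS_ADDRESS and MAILER are assumptions for this sketch, overridable from
# the environment.
SMS_ADDRESS="${SMS_ADDRESS:-5553423218@vtext.com}"
MAILER="${MAILER:-mutt}"

notify_when_done() {
    "$@"            # run the command exactly as given
    status=$?       # remember its exit code
    if [ "$status" -eq 0 ]; then
        message="Finished OK: $1"
    else
        message="Failed with exit code $status: $1"
    fi
    echo "$message" | "$MAILER" "$SMS_ADDRESS"
    return "$status"
}
```

With that defined, `notify_when_done ./genetic-algorithm` would send a text whether the run succeeded or crashed, and still pass the original exit code along.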
  

Faster SSH with Multiplexing

SSH has a Multiplexing feature that will reuse existing open connections to a server when starting new connections. In other words, you can open one session with a server, and all subsequent connections will piggyback on the first. This means the later connections can be established much faster. This is useful if you only need to run one quick command on a remote server, as no time will be wasted waiting for authentication and login to occur.

Multiplexing can be enabled by default in the SSH config, or used on a case by case basis. I’ll describe how to do the latter.

First, you need to start a master SSH session that later connections will be slaved to. The -M flag puts SSH into master mode, and the -S flag specifies the control file. For security reasons, this file should live in a directory that only you can access (so /tmp is not a good place for it).

$ mkdir ~/.ssh/controlmasters
$ ssh -M -S ~/.ssh/controlmasters/user.server user@server

To open another slaved connection, pass the same control file with the -S flag (this time without -M):

$ ssh -S ~/.ssh/controlmasters/user.server user@server
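The commands above can be wrapped in a few small functions. This is only a sketch: the function names and the `CONTROL_DIR` variable are invented for this example, and the socket naming simply mirrors the user.server convention used above. It relies on the standard OpenSSH flags -M, -S, -f, -N and -O.

```shell
# Hypothetical helpers around SSH multiplexing; names are made up for this sketch.
CONTROL_DIR="${CONTROL_DIR:-$HOME/.ssh/controlmasters}"

# Build the control file path for a user@server target (user@server -> user.server).
control_path() {
    echo "$CONTROL_DIR/$(echo "$1" | tr '@' '.')"
}

# Start a master connection in the background.
# -f backgrounds ssh after authentication, -N runs no remote command.
mux_start() {
    mkdir -p "$CONTROL_DIR" && chmod 700 "$CONTROL_DIR"
    ssh -M -f -N -S "$(control_path "$1")" "$1"
}

# Run a command over the existing master connection.
mux_run() {
    target=$1
    shift
    ssh -S "$(control_path "$target")" "$target" "$@"
}

# Ask the master to exit, which closes the shared connection.
mux_stop() {
    ssh -S "$(control_path "$1")" -O exit "$1"
}
```

A session might then look like `mux_start user@server`, a handful of `mux_run user@server date` calls, and finally `mux_stop user@server`. The `chmod 700` keeps the control directory accessible to its owner only, in line with the security note above.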

To see how much faster the slaved connection is than a normal SSH session, I timed how long it took to run a command using each method.

First, without multiplexing:

$ time ssh server "date"
Wed Mar 20 18:48:02 EDT 2013

real  0m0.116s
user  0m0.012s
sys   0m0.000s

Then with multiplexing:

$ time ssh -S /root/.ssh/controlmasters/user.server server "date"
Wed Mar 20 18:48:25 EDT 2013

real  0m0.031s
user  0m0.004s
sys   0m0.000s

The command ran 3.7 times faster with multiplexing than without.
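That figure is just the ratio of the two "real" times reported by time. A quick way to check the arithmetic from the shell (the `speedup` function name is made up for this example):

```shell
# Hypothetical helper: ratio of two wall-clock times, to one decimal place.
speedup() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

speedup 0.116 0.031   # prints 3.7
```

Keep in mind this is a single measurement of each method, so the exact ratio will vary from run to run and from server to server.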

For more information and a discussion of the downsides of using SSH multiplexing, check out SSH ControlMaster: The Good, The Bad, The Ugly.