Tuesday, March 20, 2012

Installing xgraph on Mac OS X Lion

After years on Thinkpads running Linux, I decided to take the plunge on an Apple MacBook Air. Macs have always had a premium price for the hardware, and I never considered the cost justified, since you're buying a lot of software with that money and I have all the software I need with Linux. But the deal with Macs is they simply work, and the Air is a beautifully designed machine.

After searching for a quick data plotting application, I was frustrated that I still couldn't find anything which beat the old tried-and-true: xgraph, written at Berkeley in the 1980s. It produces nice plots, with data sets clearly differentiated in nice colors against a neutral grey background, and it uses a super-simple data format trivially processed with scripting. The code's beauty is its simplicity and its excellent defaults: no widgets or pull-down menus to deal with. At work I can tell people think I'm a bit strange for using such ancient code, but I get the last laugh when I toss well-scaled plots up on my screen in a fraction of the time it would take them using their tools.

The primary competition is almost as venerable: gnuplot. Gnuplot is even better, but I've never wrapped my head around the syntax. So I stick with xgraph. For publication-ready stuff I still use xmgrace, descendant of xmgr, another very old code. I'd love to hear options which do a better job in less time than either xmgrace or xgraph.

Anyway, I had to compile the xgraph source in OS X. This was especially important to me because I've made a few custom modifications to the C source for myself (in particular, providing a "-aspect" command line option which sets the autoscale ranges of the x and y axes to the same span, and also changing the tick-labels to a "%g" format).

First issue was getting it to compile. "dialog.c" contains a procedure called "getline". This conflicts with a procedure of the same name in the standard library, so I changed the two references to "getline" in dialog.c (the definition and the call) to "getline_xgraph". Now it compiled.
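To make the name clash concrete, here is a minimal sketch of the rename. The function body and its signature are hypothetical stand-ins (the real helper in dialog.c may look different); the point is only that a private line-reading helper can coexist with stdio.h once it no longer shares the name "getline" with the library function.

```c
#include <assert.h>
#include <stdio.h>   /* on POSIX systems this header also declares getline() */
#include <string.h>

/*
 * Hypothetical stand-in for the helper in dialog.c.
 * If this function were still named "getline", its declaration would
 * collide with the library's
 *     ssize_t getline(char **lineptr, size_t *n, FILE *stream);
 * and the compiler would report conflicting types.
 *
 * Copies the next line of *text into out (without the newline),
 * advances *text past it, and returns 1. Returns 0 when no text is left.
 */
int getline_xgraph(const char **text, char *out, size_t out_size)
{
    const char *p = *text;
    size_t n = 0;

    assert(out_size > 0);
    if (*p == '\0') {
        out[0] = '\0';
        return 0;                      /* nothing left to read */
    }
    while (*p != '\0' && *p != '\n') {
        if (n + 1 < out_size) {
            out[n++] = *p;             /* silently truncate long lines */
        }
        p++;
    }
    out[n] = '\0';
    if (*p == '\n') {
        p++;                           /* skip the line terminator */
    }
    *text = p;
    return 1;
}
```

Renaming is the least invasive fix here because only the definition and the call site need to change.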

But then I faced a tougher problem: I was getting a persistent segmentation fault when I ran it. gdb wasn't turning up the problem, either.

The clue was in the compiler warnings: xgraph plays fast and loose with casting between addresses and integers. Sometimes that works, sometimes... So in desperation I checked the gcc man page and started reading, looking for an option which treated all integers as longs. I didn't think this would work, because external libraries might not cooperate.
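Here is a small sketch of why such casts are fragile. The function names are made up for illustration and none of this is xgraph code. In a 64-bit build on OS X, an int is 32 bits wide while a pointer is 64 bits, so an address squeezed through an int can lose its upper half; in a 32-bit build both are 32 bits and the same cast happens to work.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if a plain int is wide enough to hold a pointer in this build. */
int int_can_hold_pointer(void)
{
    return sizeof(int) >= sizeof(void *);
}

/*
 * Mimics the legacy habit of storing an address in an int.
 * Returns 1 if the pointer comes back unchanged, 0 if bits were lost.
 */
int survives_int_roundtrip(void *p)
{
    int narrowed = (int)(intptr_t)p;           /* may drop the high bits */
    return (void *)(intptr_t)narrowed == p;
}

/*
 * The portable alternative: intptr_t is defined to be wide enough
 * for a pointer, so this round trip preserves the address.
 */
int survives_intptr_roundtrip(void *p)
{
    intptr_t wide = (intptr_t)p;
    return (void *)wide == p;
}
```

Whether survives_int_roundtrip returns 1 for a given address depends on the build: it is only guaranteed when int_can_hold_pointer() is true.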

I didn't find it. But what I found instead was the "-m32" option to compile the code in 32-bit mode even on a 64-bit machine. Hmm... seemed promising. I added this to the head of the "CFLAGS" definition in the Makefile (the Makefile is generated by "configure", so editing it by hand isn't the best approach: the change is lost if configure is run again). I tried it and, ding-ding-ding-ding, the code ran. My little victory shout was probably out of place, since I was on Caltrain at the time...

So now it works.

There's still an issue that when the window is resized the plot can become corrupted. But that's quickly fixed by minimizing then restoring the window.

Here's a screen grab, comparing speed from a ride to smoothed speed as processed with my Ruby script: [image: plot]

Sunday, March 11, 2012

fun with JavaScript

After adventures in Perl, then Java, then Ruby, I've been immersed in JavaScript, the language of web UIs.

Writing command line apps is straightforward. For example, if the user wants to analyze a specific ride, the ride ID is passed as a command line argument. Done. Now get on to the juicy data analysis.

But with a user interface, things are much more complex. A bunch of my Strava app ideas begin the same: user specifies a ride, app maps the ride for user reference, various analyses are done and/or actions are taken.

In the Strava API some methods require an authorization key from logging in, while others do not. Among those that do is the method which returns a reduced set of coordinates for a route to allow mapping without transferring the full set of data for the ride. This is important for maps, because it substantially reduces the bandwidth requirement, not only loading the data from Strava, but then subsequently uploading it to Google for map generation. And obviously anything which results in activities being modified, like implementing my motorized segment filter to "clean up" ride data where I kept my computer on while in a train or car, also requires logging in.

So I set for myself the task of implementing the following:

  1. User logs in to Strava.
  2. User is presented with a scrollable list of recent rides from which one can be selected. Alternatively, the URL for a ride can be entered in a text entry widget.
  3. The ride is mapped (with the Google Maps API).

This was all a surprising amount of work. There are always details. For example, when you download a list of recent activity info from Strava, it sends the latest 50 rides (no runs), but includes only the ID and the name. I also want to see the time and distance. But to get those, I need to make a separate request, one per ride. This takes a lot of time, so I want to at least display the name first, then fill in the distance and time as those become available. Fortunately, the jQuery Ajax method works asynchronously, so I can fill in the ride names (which I know) quickly, then as the queries are answered, fill in the time and distance. This took a lot of time, for what is just one small part of the big picture.

I still need to debug it before making anything public. Everything is still a bit rough around the edges.

Then what? Well, I do have that motorized segment filter I want to implement as a first step. It requires accessing the full dataset, then applying smoothing functions, then estimating power (if Strava hasn't already estimated it for me), then scanning the data for potential transitions to/from a motorized vehicle, then analyzing whether the gaps between these transitions are consistent with typical human power. It's a decent amount of bandwidth and a good amount of number crunching. If I convert it to JavaScript, I dump all that in the lap of the user. If I keep moving with my Ruby implementation, then it stays with the server. I'm leaning toward the JavaScript: even an iPhone can match the number-crunching speed of a Cray computer from 1984, and I'm not sure about running all this on random web servers. But maybe I should give the server-side solution a try, since the server is a guaranteed high-bandwidth connection to Strava. This way the client would need to access only the decimated data stream needed for mapping. So I could try both.
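As a rough idea of the number crunching involved, here is a sketch of just the smoothing step. This is not the actual filter (that lives in a Ruby script). A centered moving average is assumed purely for illustration, and the function name and window parameter are made up.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Centered moving average over n samples (e.g. speed readings).
 * Each output is the mean of the input samples within half_width
 * positions on either side; the window shrinks near the two ends.
 * The in and out arrays must not overlap.
 */
void smooth_moving_average(const double *in, double *out,
                           size_t n, size_t half_width)
{
    for (size_t i = 0; i < n; i++) {
        size_t lo = (i >= half_width) ? i - half_width : 0;
        size_t hi = (i + half_width < n) ? i + half_width : n - 1;
        double sum = 0.0;

        for (size_t j = lo; j <= hi; j++) {
            sum += in[j];
        }
        out[i] = sum / (double)(hi - lo + 1);
    }
}
```

The cost is on the order of n times the window size, which is small even for a long ride, so this step alone would not decide between running in the browser and running on the server.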