StravaToCSV : It's Ruby's turn
StravaToCSV has become my "test app" for various programming languages (Perl, Java, and now Ruby). And for that it works fairly well: I need to process command line arguments, open an HTTP connection to Strava, download JSON data, convert it, then output it as CSV. So there's a decent amount there.
This project went much more smoothly than my Java implementation. It was fairly quick, taking a bit longer only when I wanted to keep the program from imploding when it was fed a bad activity specification.
This version takes as its only command line arguments activity numbers (for full activities) or activity-segment pairs, where the activity number is separated from the matched segment by a "#". It will sequentially load each of these, outputting a CSV stream with the header determined by what fields it finds in the first non-empty activity. It adds the activity number as the first column of the CSV stream.
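As a rough illustration of that argument format, here is a minimal sketch of splitting an argument into its activity and segment parts. The helper name `parse_spec` is my own invention, and the pattern is slightly stricter than the one in the script (it rejects a trailing "#" with no segment number), so treat it as an assumption rather than what the program does.

```ruby
# Hypothetical helper (not part of the script): split an argument such as
# "12345#678" into an activity number and an optional segment number.
# Returns nil when the argument does not fit the expected format.
def parse_spec(arg)
  match = arg.match(/\A(\d+)(?:#(\d+))?\z/)
  return nil if match.nil?

  { activity: match[1], segment: match[2] }
end
```

A full activity comes back with a nil segment, which makes it easy to tell the two cases apart later.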
At first I expected the JSON library to have some sort of validity checking method for the stream: does a stream represent valid data? But I didn't find any. Instead when attempting to convert data it will throw a "JSON::JSONError" exception if it can't figure things out. So I use the Ruby "begin... rescue... end" construct to check for this. This is about as simple as exception handling gets, in my experience.
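Pulled out on its own, the pattern looks something like the sketch below. The helper name `parse_json_or_nil` is hypothetical; the only library pieces it relies on are `JSON.parse` and the `JSON::JSONError` base class mentioned above.

```ruby
require 'json'

# Hypothetical helper: return the parsed data, or nil when the text
# is not valid JSON (for example an HTML error page).
def parse_json_or_nil(text)
  JSON.parse(text)
rescue JSON::JSONError
  nil
end
```

Returning nil keeps the caller simple: it can skip to the next activity whenever the result is nil.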
Anyway, it seems to work. It's my first experience with Ruby, so I'm probably not doing things in the best fashion, but it's nice to see it do what I wanted, anyway.
I'm sure the Ruby coders are hurling at the mere sight of this... Feedback welcome!
    #! /usr/bin/ruby
    require 'net/http'
    require 'json'

    # strip a path from the program name: for error messages
    progname = __FILE__.gsub(/.*\//,"")

    # loop through command line options
    # each should be a valid strava activity
    activities = []
    ARGV.each do |arg|
      # check that arg is a valid format
      if arg.match(/^\d+#?\d*$/).nil?
        warn "ERROR: poorly formatted arg #{arg}"
        exit 1
      end
      activities << arg
    end

    headers = []
    activities.each do |activity|
      warn "#{progname}: processing activity #{activity}"
      url = "http://www.strava.com/api/v1/streams/#{activity}"

      # get the data structure from the file
      # get does not raise exceptions in Ruby 1.8, according to class documentation
      http = Net::HTTP.get(URI.parse(url))

      # convert to data
      begin
        data = JSON.parse(http)
      rescue JSON::JSONError
        warn "#{progname}: Error parsing Strava activity #{activity}!"
        next
      end

      # if we don't have data, then there was an issue: go on to next activity
      if data.length.zero?
        warn "#{progname}: No data found for activity #{activity}!"
        next
      end

      # list the returned fields
      if headers.length.zero?
        headers = data.keys
        # print the headers, but special case for latlng, which must be split
        puts "activity," + headers.join(",").sub("latlng","lat,lng")
      end

      # iterate over the indices of the first field
      (0 ... data.values[0].length).each do |i|
        output = [activity]
        headers.each do |k|
          z = data[k][i]
          output << (z.nil? ? "" : z)
        end
        puts output.join(",")
      end
    end
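To make the row-building step easier to see without hitting the network, here is a small offline sketch. The helper name `csv_rows` and the sample values are made up for illustration; I'm only assuming the parsed data is a hash of field name to array of samples, with latlng holding [lat, lng] pairs, which is what the script expects.

```ruby
# Hypothetical helper: turn one parsed activity (field name => array of
# samples) into CSV lines, with the activity number as the first column.
def csv_rows(activity, data, headers)
  count = data.values.first.length
  (0...count).map do |i|
    row = [activity]
    headers.each { |k| row << data[k][i] }
    # flatten splits each [lat, lng] pair into two columns;
    # missing samples become empty fields
    row.flatten.map { |v| v.nil? ? "" : v }.join(",")
  end
end

# Invented sample data, standing in for a parsed Strava response.
sample = {
  "time"      => [0, 1],
  "latlng"    => [[37.1, -122.2], [37.2, -122.3]],
  "heartrate" => [120, nil]
}
puts csv_rows("12345", sample, sample.keys)
```

The script gets the same lat/lng split for free, because Array#join flattens nested arrays as it goes.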
Anyway, I'm getting a really positive feeling about Ruby. I don't feel like I'm fighting it: it's a fairly coherent design. With other languages I often feel they've been pushed beyond their original scope, that they've become a house of cards of layer upon layer forced in place to accommodate unforeseen paradigms or needs. Java, with its HTTP module, feels very much like that, as does all of C++. And Perl with its ad hoc object handling and endless selection of CPAN libraries is just a big bucket of chaos. Maybe Ruby is moving in that direction now that the number of cooks has grown. But at least for this example I feel as if it all works together.
I ran some quick benchmarks, converting a recent long ride. Here are the execution times for the three versions, where I avoided other internet activity while the code was running:
Perl: 32 seconds, then 42 seconds (two iterations)
Ruby: 38 seconds, then 44 seconds (two iterations)
Java: 30 seconds, then GSON threw an exception
Java barfed on me, and I'm not sure why. I'd need to debug that... maybe the activity was too large, because it runs on small activities. But it doesn't surprise me: the syntax is all fairly opaque and that makes it prone to coding errors. Perl might have been faster than Ruby, maybe not... but in the Perl version I cheated by downloading the URL with wget, which is optimized compiled code, while in the Ruby version I'm using Ruby to download. I guess the main thing this proves is my AT&T/Yahoo DSL is slow, slow, slow.
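If I wanted to separate the download time from the conversion time, something like the sketch below would do it using Ruby's standard Benchmark module. I did not run this for the numbers above; the helper name `time_phases` and the stand-in fetcher are hypothetical, and the fake fetcher just sleeps briefly so the example runs offline.

```ruby
require 'benchmark'
require 'json'

# Hypothetical helper: time the download and the JSON conversion separately.
# The fetcher is any callable that returns the response body as a string.
def time_phases(fetcher)
  body = nil
  download = Benchmark.realtime { body = fetcher.call }
  parse = Benchmark.realtime { JSON.parse(body) }
  { download: download, parse: parse }
end

# Stand-in fetcher so the sketch runs without a network connection; in the
# script it would be something like -> { Net::HTTP.get(URI.parse(url)) }.
fake_fetcher = -> { sleep 0.05; '{"time":[0,1,2]}' }
timings = time_phases(fake_fetcher)
warn format("download: %.3fs, parse: %.3fs", timings[:download], timings[:parse])
```

If the download number dwarfs the parse number, the benchmark is mostly measuring the DSL line rather than the language.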
Comments
We have a strava-parser too: https://github.com/torhve/Turan/blob/master/apps/turan/stravastreamparser.py (not written by me)
If you have any questions please contact me!
Step 1 is an app which lets users map their rides on Google Maps. Of course Strava already provides this, so it's just a building block to what I want to do. But just creating a dialog to let riders scroll through their ride list and select one takes days' worth of available time.
Please have a look at one of my exercises on my site:
http://turan.no/exercise/7226/Kvinnheradmeisterskap_Tempo
If you feel like you can achieve your goal working with us I'd be happy to receive patches. Or even if you want to use parts of our code to do what you want, so we could share hard stuff, like signal smoothing, etc. I'd be very interested.