Tuesday, February 21, 2012

Ruby SmoothData class: now static-free

Last time I posted what I called a "SmoothData" class for Ruby. But even as I was hitting "publish" on Blogger I realized I wasn't being honest. There wasn't much "class" about it: it was a single static method which was similar to what one would write for any sequential procedural language. Indeed, since it was essentially ported from Perl, this shouldn't be a surprise.

Initially my attitude was "this function doesn't need to preserve any local state. There don't need to be multiple instances of it. It's just an algorithm and nothing more. Why make life too complicated just for a dogmatic adherence to the object-oriented cult?"

But then I tried to use it and I realized... hmm... maybe I should reconsider.

The plan was, for various measured Strava parameters (altitude, speed, power), to have both measured values and smoothed values. So, for example, for altitude, I'd have an altitude array and a smoothed altitude array, each a function of time, which is another array. I'd "remember" that altitude was smoothed with respect to time and not, for example, with respect to distance. Furthermore, I'd "remember" to calculate the smoothed values the first time I needed them, and after that I could refer to the calculated values.

But then I realized this is exactly the sort of algorithm which could be improved with an object. What I want is something which I can initialize with arrays of raw x and y values, give it a smoothing time constant ("tau"), then if I ask for the smoothed value of y it can do the dirty work of figuring out if it needs to calculate that or refer to the saved value.
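
The "calculate once, then reuse" behavior described above might be sketched roughly as follows. The class name `LazySmoother`, the `calls` counter, and the placeholder `compute` method are hypothetical and exist only to illustrate the caching idea; no real smoothing happens here.

```ruby
# Hypothetical illustration of lazy evaluation with a cached result.
class LazySmoother
  attr_reader :y, :calls

  def initialize(y)
    @y = y
    @ys = nil     # cached result; nil means "not yet calculated"
    @calls = 0    # counts how often the expensive step actually runs
  end

  # Replacing the raw data invalidates the cache.
  def y=(ynew)
    @y = ynew
    @ys = nil
  end

  # Calculate on the first request, then reuse the saved value.
  def ys
    @ys ||= compute
  end

  private

  # Placeholder for the real smoothing work (assumption: just copies y as floats).
  def compute
    @calls += 1
    @y.map { |v| v.to_f }
  end
end

s = LazySmoother.new([0, 0, 1, 0, 0])
s.ys
s.ys
puts s.calls   # prints 1: the second request used the saved value

s.y = [1, 2, 3]
s.ys
puts s.calls   # prints 2: new data forced a recalculation
```

The point of the sketch is only that the caller never has to track whether the smoothed array is current; the object does.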

There's still that method though, the one which does the difficult work of calculating the array of smoothed y values. It could be done either as a static method or as an object method.

My first thought was that it should be static: then I could call it from outside the scope of the class with arbitrary arrays, and it would return the smoothed array without perturbing instance variables. But this was slightly cumbersome: I had to pass arguments to the smoothing function, and calling the function required prepending the class name to the function name.
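
For comparison, the two calling styles look roughly like this in Ruby. The class `Scaler` and its trivial `scale` methods are hypothetical stand-ins chosen to keep the example short; they are not the smoothing code.

```ruby
# Hypothetical stand-in: multiplying by a factor replaces the real smoothing.
class Scaler
  # Class ("static") method: all data must be passed in on every call.
  def self.scale(y, factor)
    y.map { |v| v * factor }
  end

  def initialize(y, factor)
    @y = y
    @factor = factor
  end

  # Instance method: works on the object's own state, so no arguments are needed.
  def scale
    @y.map { |v| v * @factor }
  end
end

static_result   = Scaler.scale([1, 2, 3], 2)       # class name prefix required
instance_result = Scaler.new([1, 2, 3], 2).scale   # state lives in the object

puts static_result == instance_result              # prints true
```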

So another reevaluation... did I really need to make the method static?

The bias in object-oriented programming is against static methods. They're like "gotos" and global variables in procedural languages. Sure, they have a place, but should be used only when really needed, because they are more often than not a good sign you're not doing things in the way for which the language is designed. This is a controversial topic: some people espouse the "get it done however you can get it done as quickly as possible and don't sweat the details" approach, but I prefer doing things after a certain model, because I believe it makes code easier to manage, expand, and reuse. In the end, investing more time up front saves time.

So I de-staticked the smoothing procedure, and things look much better now.

I stuck the code in this Google Docs folder. Here's how it looks:


class SmoothData
  def initialize(x=nil, y=nil, tau=0)
    @x = x
    @y = y

    # make sure tau is set to a floating point number to avoid integer division
    @tau = Float(tau)

    @ys = nil
  end

  def smoothData
    # if no x values, nothing to do
    return if @x.nil? || @x.empty?

    # make sure x and y have the same length
    if @x.length != @y.length
      warn "smoothing attempted with unequal length of x and y arrays: adjusting y"
      # if x is shorter than y, then crop y
      if (@x.length < @y.length)
        @y = @y[0 ... @x.length]
      # if x is longer than y, then pad y with zeros
      else
        # concat appends the zeros one by one; << would append a single nested array
        @y.concat(Array.new(@x.length - @y.length, 0))
      end
    end

    @ys = Array.new(@x.length)

    # if no smoothing is requested, done
    if (@tau == 0)
      @ys = @y
      return
    end

    # otherwise run the filter in both the negative and positive directions
    [-1, 1].each do |d|
      xold  = nil
      yold  = nil
      ysold = nil

      # along each direction, we run an exponential convolution
      (0 ... @x.length).each do |i|

        # if this is the negative direction, use points starting
        # from the last, otherwise go in forward order.
        # So i is an initial counter but n is the point we're going
        # to process
        n = (d == 1) ? i : @x.length - 1 - i


        # for either the first point in the sequence or if there is
        # a gap much larger than the smoothing constant, use the unsmoothed
        # point
        if ((i == 0) || (@x[n] - @x[n - d]).abs > 100 * @tau)
          @ys[n] = @ys[n].nil? ? @y[n] : (@ys[n] + @y[n]) / 2
          xold  = @x[n]
          yold  = @y[n]
          ysold = @y[n]

        else
          # calculate the proper contribution of the new point with a running
          # average of old points
          # u is a normalized difference between this point and the preceding one
          u = (@x[n] - xold).abs / @tau

          # apply the exponential decay to z
          z = Math.exp(-u)

          # dy is the difference between the present point and the previous point
          dy = @y[n] - yold

          # the following works with non-uniform point spacing
          ysdir = ysold.nil? ? @y[n] : ysold * z + @y[n] * (1 - z) + (dy / u) * ((u + 1) * z - 1)
          
          # update ys: average with the other direction's result if it already exists
          @ys[n] = @ys[n].nil? ? ysdir : (@ys[n] + ysdir) / 2

          # update "previous" points before going to next point
          xold = @x[n]
          yold = @y[n]
          ysold = ysdir
        end
      end
    end
  end

  def x=(xnew = nil)
    @x = xnew
    @ys = nil
  end

  def y=(ynew = nil)
    @y = ynew
    @ys = nil
  end

  def tau=(tau = 0)
    @ys = nil
    # make sure tau is set to a floating point number to avoid integer division
    @tau = Float(tau)
  end

  def resetYS
    @ys = nil
  end

  def x
    @x
  end

  def y
    @y
  end

  def tau
    @tau
  end

  def ys
    if @ys.nil?
      smoothData
    end
    @ys
  end
end
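
The line that computes `ysdir` appears to be what one gets by assuming y varies linearly between adjacent points and convolving it with a one-sided exponential of time constant tau. A sketch of that derivation follows; the integration variable s (the distance back from the current point, in units of tau) is my own notation, while u, z, and dy are as in the code.

```latex
% Assumption: y is linear between x_{n-1} and x_n, so y(s) = y_n - \Delta y \, s/u,
% with s = |x_n - x|/\tau, \; u = |x_n - x_{n-1}|/\tau, \; z = e^{-u}, \; \Delta y = y_n - y_{n-1}.
\begin{align*}
y^{s}_{n} &= z\, y^{s}_{n-1} + \int_{0}^{u} \left( y_n - \Delta y\,\frac{s}{u} \right) e^{-s}\, ds \\
          &= z\, y^{s}_{n-1} + y_n\,(1 - z) - \frac{\Delta y}{u}\left[ 1 - (u + 1)\, z \right] \\
          &= z\, y^{s}_{n-1} + y_n\,(1 - z) + \frac{\Delta y}{u}\left[ (u + 1)\, z - 1 \right]
\end{align*}
```

The first term is the decayed contribution of everything before the previous point, and the remaining terms are the contribution of the newest interval, which is why the expression still works when the points are unevenly spaced.
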

Here's an example, using the interactive Ruby shell (irb):

% irb
irb(main):001:0> require 'SmoothData'
=> true
irb(main):002:0> sb = SmoothData.new
=> #
irb(main):003:0> sb.x= [1, 2, 3, 4, 5]
=> [1, 2, 3, 4, 5]
irb(main):004:0> sb.y= [0, 0, 1, 0, 0]
=> [0, 0, 1, 0, 0]
irb(main):005:0> sb.x
=> [1, 2, 3, 4, 5]
irb(main):006:0> sb.y
=> [0, 0, 1, 0, 0]
irb(main):007:0> sb.ys
=> [0, 0, 1, 0, 0]
irb(main):008:0> sb.tau= 1
=> 1
irb(main):009:0> puts sb.ys
0.183939720585721
0.183939720585721
0.367879441171442
0.183939720585721
0.183939720585721
=> nil

In this case, I created a new SmoothData object with no parameters, so it had no initial data. Then I gave it arrays for x and y. I still hadn't set the smoothing constant, which defaults to 0, so no smoothing was applied: when I asked for "ys", I just got y back. Then I set the smoothing constant to 1, and when I asked for "ys" again, I got a smoothed representation of the data. The initial function was symmetric, and the smoothed function was also symmetric, as I wanted.
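
The symmetry property can also be checked in isolation. Below is a compact, standalone sketch of the two-direction filter, not the class itself: the function names `smooth_one_way` and `smooth_both_ways` are hypothetical, and the sketch assumes strictly increasing x and tau > 0. It leaves out the large-gap check and the length reconciliation.

```ruby
# Compact, standalone sketch of two-direction exponential smoothing.
# Hypothetical helper names; assumes strictly increasing x and tau > 0.

# One pass in direction d (+1 forward, -1 backward).
def smooth_one_way(x, y, tau, d)
  order = (0...x.length).to_a
  order.reverse! if d == -1

  out  = Array.new(x.length)
  prev = nil
  order.each do |n|
    if prev.nil?
      out[n] = y[n].to_f                    # first point in this direction: unsmoothed
    else
      u  = (x[n] - x[prev]).abs / tau.to_f  # normalized spacing
      z  = Math.exp(-u)                     # exponential decay factor
      dy = y[n] - y[prev]
      out[n] = out[prev] * z + y[n] * (1 - z) + (dy / u) * ((u + 1) * z - 1)
    end
    prev = n
  end
  out
end

# Average the forward and backward passes.
def smooth_both_ways(x, y, tau)
  fwd = smooth_one_way(x, y, tau, 1)
  bwd = smooth_one_way(x, y, tau, -1)
  fwd.zip(bwd).map { |a, b| (a + b) / 2 }
end

x  = [1, 2, 3, 4, 5]
y  = [0, 0, 1, 0, 0]
ys = smooth_both_ways(x, y, 1)

# A symmetric input should give a symmetric output (up to rounding).
symmetric = ys.zip(ys.reverse).all? { |a, b| (a - b).abs < 1e-12 }
puts symmetric
```

Running a single direction alone would skew the result toward later (or earlier) points; averaging the two passes is what restores the symmetry.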
