Friday, June 20, 2014

more Vector - Powertap power comparisons

After the last comparison, which was a ride with some steep hills in San Francisco, I did a few more rides comparing Vector to Powertap.

On that previous ride, my maximal power curve had been somewhat depressed on the Vector. Further investigation showed this appeared to be due to two points where cadence anomalously dropped on the Vector (but not the Powertap). Since Vector multiplies force by cadence by crank length to get power, if any of the three is off, the power is off.
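To make that dependence concrete, here is a minimal sketch of the force × cadence × crank length relation. The function name and the numbers (200 N average tangential force, 172.5 mm cranks, 90 rpm) are invented for illustration. This is just the textbook torque times angular velocity formula, not Garmin's actual computation, which I haven't seen.

```python
import math

def crank_power(force_n, cadence_rpm, crank_length_m):
    """Power in watts from average tangential pedal force, cadence, and crank length."""
    angular_velocity = 2.0 * math.pi * cadence_rpm / 60.0  # rad/s
    torque = force_n * crank_length_m                      # N*m
    return torque * angular_velocity

# Hypothetical values: same force and crank length, but the second
# sample has cadence misreported at half its true value.
true_power = crank_power(200.0, 90.0, 0.1725)
bad_power = crank_power(200.0, 45.0, 0.1725)
```

Because power is linear in cadence, a cadence reading that drops by half for a second or two takes the reported power down by half for the same samples, even if the force measurement is perfect.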

The first of these two rides was a spin around Mountain View at lunch, stopping along the way to drop a package at the post office, then pick up some cherries at the local market (Ava's), then after a bit of a loop, a stop at Trader Joe's for figs and apricots. I made good use of my handlebar bag, combining some solid power efforts with fresh summer fruit for the afternoon.

Here's the maximal power curve from that ride:


The next day I rode into work with SF2G, the Dawn of the Dead variant of Bayway. This involved some decent tempo riding, but no maximal efforts. It really was a gorgeous morning and the ride was quite enjoyable.

The maximal power curve follows, although to make it I removed a single time point from the Vector data, since it had an anomalous power spike. The Vector seems slightly prone to these. If you're looking at best power over the last 6 weeks, for example, it seems there's a good chance the 1-second power will be contaminated. But


In both of these cases the result is basically what I'd expect: the Vector reporting slightly more power than Powertap. No signs of any funny cadence signals.

One comment on these results: you need to be really careful when comparing data from two meters on two Garmin units, especially with auto-pause set, because the two units don't report power for every time point. Vector, in particular, will report no power when it's idle, presumably to save precious battery capacity. So you need to make sure any sort of analysis you do is robust against missing data. A nice thing about maximal power curves, for durations up to the longest continuous effort in a ride, is that they tend to focus on periods of continuous pedaling rather than periods with a lot of idleness, such as waiting at traffic lights. I've been bitten by this when looking at time-smoothed data and seeing strange differences which turned out to be due to one unit or the other having missing data.
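One way to be robust against missing data is to key every sample by its timestamp and only average over windows with no gaps. The following is a minimal sketch of that idea, not the code I actually used. The `{second: watts}` layout, the function names, and the sample values are all assumptions made for illustration.

```python
def continuous_segments(samples):
    """Split {second: watts} into lists of watts, one per run of consecutive seconds."""
    segments, current, prev_t = [], [], None
    for t in sorted(samples):
        if prev_t is not None and t != prev_t + 1:
            segments.append(current)  # a gap: close the current run
            current = []
        current.append(samples[t])
        prev_t = t
    if current:
        segments.append(current)
    return segments

def best_mean_power(samples, duration):
    """Best average power over `duration` consecutive seconds, or None if no run is long enough."""
    best = None
    for seg in continuous_segments(samples):
        if len(seg) < duration:
            continue
        window = sum(seg[:duration])
        best_sum = window
        for i in range(duration, len(seg)):
            window += seg[i] - seg[i - duration]  # slide the window by one second
            best_sum = max(best_sum, window)
        mean = best_sum / duration
        if best is None or mean > best:
            best = mean
    return best

# Invented data: seconds 3 and 4 are missing (unit idle or auto-paused).
vector = {0: 200, 1: 210, 2: 220, 5: 400, 6: 410}
best_3s = best_mean_power(vector, 3)
best_2s = best_mean_power(vector, 2)
```

A naive version that ignores timestamps and slides a window over the raw list of values would average 220, 400, and 410 together as if they were three consecutive seconds, which is exactly the sort of artifact that produces strange differences between units.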

Returning from work, I took the train, but still had to climb Potrero Hill with its peak 24% grade. Here's the power versus time for that ride, where I smoothed the power by 3 seconds (Powertap needs it more than Vector, because Powertap power tends to be spiky: it doesn't average over pedal strokes like Vector, SRM, Quarq, etc.):


Again everything looks good. There are some minor differences. The match isn't as good if I multiply Vector power by 97% in an attempt to adjust for drivetrain losses the Powertap doesn't see. But with each unit claiming only around 2% accuracy, holding the comparison to that standard is too strict.
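The smoothing and the 97% check can be sketched roughly as follows. This assumes a centered 3-second moving average and a mean absolute difference as the measure of agreement. Both are my choices for illustration, and the sample values are invented, so the numbers say nothing about the real meters.

```python
def smooth(samples, width=3):
    """Centered moving average over {second: watts}; drops seconds lacking a full window."""
    half = width // 2
    out = {}
    for t in samples:
        window = [samples.get(t + k) for k in range(-half, half + 1)]
        if None not in window:  # skip windows that touch missing data
            out[t] = sum(window) / width
    return out

def mean_abs_difference(a, b, scale_a=1.0):
    """Mean of |scale_a * a - b| over the seconds present in both streams."""
    common = sorted(set(a) & set(b))
    if not common:
        return None
    return sum(abs(scale_a * a[t] - b[t]) for t in common) / len(common)

# Invented data; the Powertap stream is spikier and has a gap at second 5.
vector_raw = {0: 300, 1: 310, 2: 290, 3: 300, 4: 300}
powertap_raw = {0: 250, 1: 350, 2: 300, 3: 260, 4: 340, 6: 300}

vector_s = smooth(vector_raw)
powertap_s = smooth(powertap_raw)
unscaled = mean_abs_difference(vector_s, powertap_s)
scaled = mean_abs_difference(vector_s, powertap_s, scale_a=0.97)
```

Comparing `unscaled` against `scaled` over a whole ride gives a single number for whether the drivetrain-loss adjustment helps or hurts the match.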

So in summary, the agreement is generally quite good. I think one potential source of variability is the back-pedaling calibration. There you want a light touch on the pedals: try to float the feet. It seems a blocky back-pedal could throw off the power zeroing. Additionally, the power might not be as good on the first ride after re-installation; Garmin recommends doing a few sprints. There is also an occasional one-second power spike which can wreck Strava's peak 1-second power. Since the best sprints of 3 seconds or longer likely don't contain that spike, though, they are likely good. Finally, I observed a strange cadence drop on two brief occasions (1-2 seconds each) during one uphill effort. This affected my power metrics for that effort.

I think Garmin could address the power spike issue with post-processing. For example, consider the case where I have 5 data points, and the power on points 1, 2, 4, and 5 forms a relatively tight distribution, but point 3 is substantially higher. A dramatic increase in power from point 2 to point 3 is plausible on its own, but then points 4 and 5 would likely either also be high (a sustained effort) or be close to zero (I stopped pedaling). It would be highly unlikely for power to spike for one pedal stroke, return to the previous range by the following stroke, and then stay there for an additional stroke.
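The 5-point test described above can be sketched in a few lines. The function name and the two thresholds (`spread_frac`, `spike_ratio`) are hypothetical values chosen for illustration, not anything Garmin uses, and replacing the spike with the neighbor mean is just one simple option.

```python
def despike(power, spread_frac=0.2, spike_ratio=1.5):
    """Replace isolated one-sample spikes with the mean of their four neighbors.

    A sample is treated as a spike only if the two samples before and the two
    after form a tight cluster and the sample sits well above all of them.
    """
    cleaned = list(power)
    for i in range(2, len(power) - 2):
        neighbors = [power[i - 2], power[i - 1], power[i + 1], power[i + 2]]
        mean = sum(neighbors) / 4.0
        if mean <= 0:
            continue
        tight = (max(neighbors) - min(neighbors)) <= spread_frac * mean
        if tight and power[i] > spike_ratio * max(neighbors):
            cleaned[i] = mean
    return cleaned

# Invented samples illustrating the three cases in the text.
spike = despike([250, 260, 900, 255, 245])    # isolated spike: replaced
sprint = despike([250, 260, 900, 880, 870])   # power stays high: kept
stopped = despike([250, 260, 900, 0, 0])      # rider stops pedaling: kept
```

Only the first case is altered. In the other two the neighbors are not tightly clustered, so the high sample is left alone as a plausible real effort.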

Another approach for power spikes would be to consider the more detailed power through the pedal stroke. If the spike comes from an anomalous blip from a sensor, then it's unlikely that pedal stroke will exhibit a realistic force-versus-time pattern. The software could check for anomalous force-versus-time during the pedal stroke and isolate the anomalous portion.

Fortunately such intelligence is in the pods, not the pedals, and so it could be addressed with firmware, rather than hardware. I fully expect that the pedals will only continue to improve with additional updates.

Then there's that cadence blip. Here again it's the domain of the pods, which contain the accelerometers used to determine cadence. For some reason a pedal stroke may have gone undetected. Why? I'm not sure. I don't know their algorithm. They watch gravity rotate relative to the noninertial frame of reference of the pedal pod. Perhaps there's an algorithm tweak which could make this more robust. Or perhaps future pedal pods will have improved hardware for cadence detection. It would be fun to dig into the firmware and try some stuff. Alas, that would require Garmin to go open source.
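Just to illustrate the "watch gravity rotate" idea, here is a toy sketch that estimates cadence by tracking the angle of the gravity vector across two accelerometer axes. This is emphatically not Garmin's algorithm, which I don't know. The function name, the 50 Hz sample rate, and the clean synthetic signal are all assumptions, and a real pod would also see centripetal acceleration and road vibration, which this ignores.

```python
import math

def cadence_rpm(ax, ay, sample_rate_hz):
    """Mean cadence from gravity components measured in the rotating pod frame."""
    total = 0.0
    prev = math.atan2(ay[0], ax[0])
    for x, y in zip(ax[1:], ay[1:]):
        angle = math.atan2(y, x)
        step = angle - prev
        # Unwrap so each step is the shortest rotation between samples.
        if step > math.pi:
            step -= 2.0 * math.pi
        elif step <= -math.pi:
            step += 2.0 * math.pi
        total += step
        prev = angle
    duration_s = (len(ax) - 1) / sample_rate_hz
    revolutions = abs(total) / (2.0 * math.pi)
    return revolutions / duration_s * 60.0

# Synthetic, noise-free signal: 2 seconds at 90 rpm sampled at 50 Hz.
rate = 50.0
times = [n / rate for n in range(101)]
ax = [math.cos(2.0 * math.pi * 1.5 * t) for t in times]
ay = [math.sin(2.0 * math.pi * 1.5 * t) for t in times]
estimate = cadence_rpm(ax, ay, rate)
```

An approach like this accumulates rotation continuously instead of counting discrete pedal-stroke events, so a single noisy sample shifts the estimate slightly instead of dropping a whole stroke. Whether that would hold up with real accelerometer data is another matter.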

So overall I'm certainly willing to use these pedals for power tracking. I recognize that 1-second power becomes unreliable, but it's also unreliable (to a lesser degree) with Powertap, due to the fractional-pedal-stroke issue.


Fabrício Souza said...

what software you use to compare two power values?


djconnel said...

Sorry for the long delay. I use my own Perl scripts.