Last time I showed a comparison of Powertap to Vector power with a least-square line through the points. I noted this was "hard to interpret". Here's why.

A traditional least-square fit begins by assuming the x-values are perfectly known and only the y-values are uncertain. It further assumes the y-value errors are normally distributed (Gaussian), with the same error distribution for all points, or, for a weighted least-square fit, with variances inversely proportional to the weighting factors.

This seems like a lot of technicality, but it introduces subtle biases into the result when comparing two values which contribute roughly equally to the error. For example, consider my power comparison of the Garmin Vector to the Powertap:

The slope was 0.97. Therefore, if I flip the axes the slope should be 1.03. If the Vector is 3% lower than the Powertap, then the Powertap is 3% higher than the Vector, on average. But that's not what I get. I get 0.99:

How can that be? The Vector is 3% lower than the Powertap, yet the Powertap is only 1% lower than the Vector?
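This asymmetry is easy to reproduce. The sketch below uses synthetic data (all values invented for illustration, not the actual ride data): two noisy measurements of the same underlying power, fit against each other both ways with a zero-intercept least-square line:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "true" power and two noisy measurements of it
# (illustration only; not the ride data from the post).
true_power = rng.uniform(100, 400, size=2000)
p_powertap = true_power + rng.normal(0, 15, size=2000)
p_vector = 0.97 * true_power + rng.normal(0, 15, size=2000)

def slope(x, y):
    """Zero-intercept least-square slope of y against x."""
    return np.sum(x * y) / np.sum(x * x)

k_vp = slope(p_powertap, p_vector)   # Vector fitted against Powertap
k_pv = slope(p_vector, p_powertap)   # Powertap fitted against Vector

# The two estimates of the Vector/Powertap ratio disagree:
print(k_vp, 1.0 / k_pv)
```

Each fit blames all the scatter on its own y-axis, so swapping the axes does not simply invert the slope.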

The answer is in the assumptions of the least-square fit. Consider fitting the following, where P_{V} is Vector power and P_{P} is Powertap power:

P_{V} = k_{VP} × P_{P}

Here k_{VP} is a coefficient relating the P_{P} power to the P_{V} power.

The error associated with this estimate is thus as follows:

E_{VP} = ∑ ( k_{VP} × P_{P} - P_{V} )^{2}

To set the value k_{VP} to minimize this error, I differentiate with respect to that coefficient:

∂ E_{VP} / ∂ k_{VP} = ∑ 2 P_{P} ( k_{VP} × P_{P} - P_{V} )

= 2 k_{VP} ∑ P_{P}^{2} − 2 ∑ P_{P} P_{V}

I then set this derivative to zero (the usual minimization or maximization strategy for analytic functions). This yields the following solution:

k_{VP} = (∑ P_{P} P_{V}) / ∑ P_{P}^{2}

Note this is most strongly influenced by points with larger values of P_{P}. In this fitting algorithm, the errors are assumed to be in P_{V}, which is on the "y-axis". P_{P} values are considered "reliable" and are used to weight the sums. However, I had no reason to assume P_{P} were more or less reliable than P_{V}.
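As a sanity check, this closed-form coefficient can be computed directly and compared against a generic least-squares solver. The power arrays below are made-up placeholders, not the actual ride data:

```python
import numpy as np

# Hypothetical measurement arrays; substitute real ride data.
p_p = np.array([150.0, 220.0, 310.0, 405.0])  # Powertap watts
p_v = np.array([146.0, 212.0, 300.0, 394.0])  # Vector watts

# Closed-form zero-intercept solution derived above:
# k_VP = sum(P_P * P_V) / sum(P_P^2)
k_vp = np.sum(p_p * p_v) / np.sum(p_p ** 2)

# Cross-check with a generic solver fitting P_V = k * P_P.
k_check, = np.linalg.lstsq(p_p[:, None], p_v, rcond=None)[0]
print(k_vp, k_check)
```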

If I instead go through the same exercise assuming the Vector powers are reliable and the Powertap values are not, I get:

k_{PV} = (∑ P_{P} P_{V}) / ∑ P_{V}^{2}

The product of these two coefficients equals 1, as expected, if the data are exactly proportional; but if they are not, the product falls below one (a consequence of the Cauchy-Schwarz inequality). That means the result depends on which power I assert is the more reliable. If I pick one arbitrarily, my conclusion about the ratio of Vector to Powertap power differs. This obviously isn't so good.
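A quick numerical check of this claim, again on made-up illustrative numbers: for exactly proportional data the product of the two coefficients is 1; add scatter and it drops below 1.

```python
import numpy as np

def slope(x, y):
    """Zero-intercept least-square slope of y against x."""
    return np.sum(x * y) / np.sum(x * x)

p_p = np.array([150.0, 220.0, 310.0, 405.0])  # illustrative values

# Exactly proportional data: product of the two coefficients is 1.
prod_exact = slope(p_p, 0.97 * p_p) * slope(0.97 * p_p, p_p)

# Scattered data: the product (sum(x*y))^2 / (sum(x^2) * sum(y^2))
# is below 1 by the Cauchy-Schwarz inequality.
p_v = np.array([146.0, 218.0, 296.0, 399.0])
prod_scatter = slope(p_p, p_v) * slope(p_v, p_p)

print(prod_exact, prod_scatter)
```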

Note the culprit is the factor which appears both in the numerator and the denominator. In the k_{VP} formula that is P_{P} while in the k_{PV} formula that is P_{V}. To "balance" the fit between the two powers I should pick a common value for the two formulas. For example, I can choose the sum of the two powers (equivalent in this context to using the average). Thus I get a new pair of coefficients:

k_{VP}' = (∑(P_{P} + P_{V}) P_{V}) / ∑ (P_{P} + P_{V}) P_{P}

k_{PV}' = (∑(P_{P} + P_{V}) P_{P}) / ∑ (P_{P} + P_{V}) P_{V}

It's trivially obvious that k_{VP}' k_{PV}' = 1, as I'd expect.
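The balanced coefficient is a one-liner, and the reciprocal property holds by construction. Illustrative data again, not the ride data:

```python
import numpy as np

def balanced_slope(x, y):
    """Balanced zero-intercept slope of y against x, weighting by x + y."""
    w = x + y
    return np.sum(w * y) / np.sum(w * x)

# Illustrative data only.
p_p = np.array([150.0, 220.0, 310.0, 405.0])
p_v = np.array([146.0, 218.0, 296.0, 399.0])

k_vp = balanced_slope(p_p, p_v)  # Vector vs Powertap
k_pv = balanced_slope(p_v, p_p)  # Powertap vs Vector
print(k_vp * k_pv)  # 1, by construction: the sums simply swap places
```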

Here's the result of applying this formula to fitting the Vector power versus the Powertap power, comparing to the traditional least-square fit:

A summary of the slopes is as follows:

weighting term | Vector/Powertap | comment
---|---|---
P_{P} | 0.9696 | least-square P_{V} vs P_{P}
P_{V} | 1.0057 | 1 / L.S. P_{P} vs P_{V}
P_{V} + P_{P} | 0.9874 | balanced fit
1 | 0.9838 | <P_{V}> / <P_{P}>

Note the first line is the traditional least-square fit. The second line fits Powertap versus Vector, then takes the reciprocal to get the opposite slope. The third is the proposed balanced weighting scheme. Here it gives the same result to the listed precision (but not exactly) as the average of the first two results, which is also clearly balanced. The fourth line is a simple ratio of averages. This places equal emphasis on the lower-power points. It is also clearly balanced, and gives the result closest to the proposed scheme.
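All four rows of the table are instances of one formula, a weighted ratio ∑ w P_{V} / ∑ w P_{P} with a different weighting term w. A sketch on illustrative data (the slopes below are not the ones in the table, which came from the real ride data):

```python
import numpy as np

def weighted_ratio(w, p_p, p_v):
    """Slope estimate sum(w * P_V) / sum(w * P_P) for weighting term w."""
    return np.sum(w * p_v) / np.sum(w * p_p)

# Illustrative data only.
p_p = np.array([150.0, 220.0, 310.0, 405.0])
p_v = np.array([146.0, 218.0, 296.0, 399.0])

results = {}
for name, w in [("P_P", p_p), ("P_V", p_v),
                ("P_P + P_V", p_p + p_v), ("1", np.ones_like(p_p))]:
    results[name] = weighted_ratio(w, p_p, p_v)
    print(f"{name:10s} {results[name]:.4f}")
```

Note the balanced estimate is the mediant of the first two, so it always lands between them.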

These fits have been to a one-parameter curve: a line with adjustable slope and zero intercept. A more flexible least-square fit has an additional intercept term, yielding two fitted coefficients, but the same approach can be applied there. The key requirement is that fitting y versus x should yield a slope equal to the reciprocal of the slope yielded by fitting x versus y. The formulas described here still apply: first shift the points so the average values of x and y are zero; then the formulas listed here apply to the slope, since slope is translation-invariant. You can then shift the points back and trivially calculate the intercept.
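The two-parameter version, then, is just the balanced slope applied to centered data, with the intercept recovered from the means. A sketch, again with invented data:

```python
import numpy as np

def balanced_line(x, y):
    """Balanced two-parameter fit: center the data, apply the balanced
    zero-intercept slope, then recover the intercept from the means."""
    xc = x - np.mean(x)
    yc = y - np.mean(y)
    w = xc + yc
    slope = np.sum(w * yc) / np.sum(w * xc)
    intercept = np.mean(y) - slope * np.mean(x)
    return slope, intercept

# Illustrative data: a known line plus symmetric scatter.
x = np.array([100.0, 180.0, 260.0, 340.0, 420.0])
y = 0.97 * x + 5.0 + np.array([2.0, -3.0, 1.0, -2.0, 2.0])

m_yx, b_yx = balanced_line(x, y)  # fit y versus x
m_xy, b_xy = balanced_line(y, x)  # fit x versus y
print(m_yx * m_xy)  # reciprocal slopes: product is exactly 1
```

The intercepts are consistent too: b_xy = -b_yx / m_yx, as they must be for the two fitted lines to be the same line.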

If the two axes have different units or if the uncertainty in the points is estimated to be different, then the following can be used:

k_{VP}' = ∑(P_{P} / σ_{P} + P_{V} / σ_{V}) P_{V} / ∑ (P_{P} / σ_{P} + P_{V} / σ_{V}) P_{P}

where σ_{P} is the uncertainty in P_{P} and σ_{V} is the uncertainty in P_{V}. If σ_{P} = σ_{V}, the new terms obviously cancel.
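A sketch of the uncertainty-normalized weighting, with assumed (made-up) σ values; the cancellation for equal σ's is easy to verify numerically:

```python
import numpy as np

def balanced_slope_sigma(p_p, p_v, sigma_p, sigma_v):
    """Balanced zero-intercept slope with each axis normalized by its
    estimated uncertainty (sigma values are assumed, per-axis scales)."""
    w = p_p / sigma_p + p_v / sigma_v
    return np.sum(w * p_v) / np.sum(w * p_p)

# Illustrative data and assumed uncertainties.
p_p = np.array([150.0, 220.0, 310.0, 405.0])
p_v = np.array([146.0, 218.0, 296.0, 399.0])

k_equal = balanced_slope_sigma(p_p, p_v, 1.0, 1.0)
k_scaled = balanced_slope_sigma(p_p, p_v, 3.0, 3.0)  # equal sigmas cancel
print(k_equal, k_scaled)
```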

## 2 comments:

Dan, Michael Barnes here, former LKHC participant. Sorry to miss you at Diablo TT, I was sick that day. BTW, nice mention of you in new Summerson book.

OK, down to business. I taught Econometrics about 20 years ago, but much of it has stuck with me. A few thoughts:

1) You are assuming that the relationship here really is linear, i.e. percent power loss is consistent throughout a wide power range. That may not be true. This can matter when you are trying to tease apart differences when true values are close to one.

2) Least squares parameter estimators are BLUE (best linear unbiased estimator), where "best" means minimum variance, if dependent-variable error terms are i.i.d.--independently and identically distributed.

3) If you add the additional assumption that the y-error terms are also normally distributed, then least squares estimators are BLU (best unbiased estimators) because then the least squares estimators are MLE (maximum likelihood estimators) and those achieve the Cramer-Rao lower bound, and you can't do any better than that.

4) The distinction between BLUE and BLU tends to go away with large sample sizes--that's the law of large numbers.

5) If the independent variables are measured with error, your parameter estimates are not consistent. That means they don't tend to get better (more precise) with more data. That's a real no-no. Errors in independent variables also tend to reduce the size of parameter estimates.

6) More on errors in variables here:

http://en.wikipedia.org/wiki/Errors-in-variables_models

Thanks, Michael! Good comments. The key factor here is the classic linear least-square fit treats the x and y axes differently, and if you have no basis to pick one versus the other, a balanced fit such as the one I proposed makes more sense.

Yes -- I assume the relationship is linear (more than that -- strictly proportional). But this is really just an attempt to disprove the null hypothesis, that being that both measure power perfectly, and that since there is a reasonable estimate for the power lost to the drivetrain, the Vector should generally measure a few % higher than the Powertap. I don't observe that, but the deviation is consistent with the claimed accuracies of the units.

I'm curious to compare the Vector with my own Powertap (as opposed to the borrowed one I used here). Static testing had demonstrated that unit reads low, but Powertap claimed static testing is unreliable. I don't believe them. Comparing that unit against the Vector, alongside this comparison of the borrowed unit with the same Vector, should be insightful. However, I'm obviously a lot slower than DCRainmaker in conducting experiments :).
