Thursday, June 19, 2014

balanced alternative to least-square linear fit

Last time I showed a comparison of Powertap to Vector power with a least-square line through the points. I noted this was "hard to interpret". Here's why.

A traditional least-square fit begins by assuming that the x-values are perfectly known and only the y-values are uncertain. It additionally assumes that the errors in the y-values follow a normal (Gaussian) distribution, with the same error distribution for all points, or, for a weighted least-square fit, with variances inversely proportional to the weighting factors.

This seems like a lot of technicality, but it introduces subtle biases in the result when comparing two values which contribute relatively equally to the error. For example, consider my power comparison for the Garmin Vector to the Powertap:


The slope was 0.97. Therefore, if I flip the axes the slope should be 1.03. If the Vector is 3% lower than Powertap then the Powertap is 3% more than Vector, on average. But that's not what I get. I get 0.99:


How can that be? The Vector is 3% lower than Powertap and Powertap is 1% lower than Vector?

The answer is in the assumptions of the least-square fit. Consider fitting the following, where PV is Vector power and PP is Powertap power:

PV = kVP × PP

Here kVP is a coefficient relating the PP power to the PV power.

The error associated with this estimate is thus as follows:

EVP = ∑ ( kVP × PP − PV )²

To set the value kVP to minimize this error, I differentiate with respect to that coefficient:

∂ EVP / ∂ kVP = ∑ 2 PP ( kVP × PP − PV )
= 2 kVP ∑ PP² − 2 ∑ PP PV

I then set this derivative to zero (the usual minimization or maximization strategy for analytic functions). This yields the following solution:

kVP = (∑ PP PV) / ∑ PP²
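As a minimal sketch of this formula in Python: the readings below are invented for illustration (they are not the ride data plotted here), and the function name is just a label chosen for this example.

```python
# Hypothetical power readings in watts, invented for illustration
# (these are NOT the ride data plotted in this post).
powertap = [100.0, 200.0, 300.0, 400.0]  # PP, treated as exact ("x-axis")
vector = [99.0, 192.0, 297.0, 384.0]     # PV, treated as uncertain ("y-axis")


def ls_slope_through_origin(x, y):
    """Return k minimizing sum((k*x - y)**2), i.e. k = sum(x*y) / sum(x*x)."""
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    return sum_xy / sum_xx


k_vp = ls_slope_through_origin(powertap, vector)
print(f"kVP = {k_vp:.4f}")  # prints "kVP = 0.9700" for these made-up numbers
```

Each point enters both sums multiplied by its x-value, which is why the high-power points dominate the result.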

Note this is most strongly influenced by points with larger values of PP. In this fitting algorithm, the errors are assumed to be in PV, which is on the "y-axis". PP values are considered "reliable" and are used to weight the sums. However, I had no reason to assume PP were more or less reliable than PV.

If I instead go through the same exercise assuming the Vector powers are reliable and the Powertap values are not, I get:

kPV = (∑ PP PV) / ∑ PV²

The product of these two coefficients will be 1, as expected, if the data are truly proportional. If not, the product will no longer be one: since (∑ PP PV)² ≤ (∑ PP²)(∑ PV²), it will be less than one. That means the result depends on which power I assert is the more reliable. If I pick one at random, my conclusion about the ratio of Vector to Powertap power depends on that arbitrary choice. This obviously isn't so good.
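A quick numerical check of this asymmetry, as a sketch only: the scattered readings are made up for illustration, and the "exact" set is constructed to be perfectly proportional so the two cases can be compared.

```python
def slope_through_origin(x, y):
    """Zero-intercept least-square slope of y versus x: sum(x*y) / sum(x*x)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


# Hypothetical readings in watts (invented, not the ride data).
powertap = [100.0, 200.0, 300.0, 400.0]
vector = [99.0, 192.0, 297.0, 384.0]

k_vp = slope_through_origin(powertap, vector)  # Vector versus Powertap
k_pv = slope_through_origin(vector, powertap)  # Powertap versus Vector
product = k_vp * k_pv

# Perfectly proportional data: every Vector reading is exactly 0.97 x Powertap.
exact_vector = [0.97 * p for p in powertap]
exact_product = (slope_through_origin(powertap, exact_vector)
                 * slope_through_origin(exact_vector, powertap))

print(f"scattered data:    kVP * kPV = {product:.6f}")
print(f"proportional data: kVP * kPV = {exact_product:.6f}")
```

With any scatter at all the product drops below one, so the two fits cannot both be describing the same ratio.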

Note the culprit is the factor which appears both in the numerator and the denominator. In the kVP formula that is PP while in the kPV formula that is PV. To "balance" the fit between the two powers I should pick a common value for the two formulas. For example, I can choose the sum of the two powers (equivalent in this context to using the average). Thus I get a new pair of coefficients:

kVP' = (∑(PP + PV) PV) / ∑ (PP + PV) PP
kPV' = (∑(PP + PV) PP) / ∑ (PP + PV) PV

It's trivially obvious that kVP' × kPV' = 1, as I'd expect.
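Here is a minimal sketch of the balanced coefficient, again with invented readings rather than the ride data. The function name is hypothetical, chosen for this example.

```python
def balanced_slope(x, y):
    """Balanced zero-intercept slope of y versus x.

    Both sums are weighted by (x + y), so swapping x and y gives
    exactly the reciprocal.
    """
    numerator = sum((a + b) * b for a, b in zip(x, y))
    denominator = sum((a + b) * a for a, b in zip(x, y))
    return numerator / denominator


# Hypothetical readings in watts (invented, not the ride data).
powertap = [100.0, 200.0, 300.0, 400.0]
vector = [99.0, 192.0, 297.0, 384.0]

k_vp_balanced = balanced_slope(powertap, vector)  # Vector / Powertap
k_pv_balanced = balanced_slope(vector, powertap)  # Powertap / Vector

print(f"kVP' = {k_vp_balanced:.6f}")
print(f"kPV' = {k_pv_balanced:.6f}")
print(f"product = {k_vp_balanced * k_pv_balanced:.6f}")
```

Expanding the sums gives kVP' = (∑ PP PV + ∑ PV²) / (∑ PP² + ∑ PP PV), which is the mediant of the two traditional results kVP and 1 / kPV. For positive powers it therefore always falls between them.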

Here's the result of applying this formula to fitting the Vector power versus the Powertap power, comparing to the traditional least-square fit:


A summary of the slopes is as follows:

| weighting term | Vector/Powertap | comment |
|---|---|---|
| PP | 0.9696 | least-square PV vs PP |
| PV | 1.0057 | 1 / L.S. PP vs PV |
| PV+PP | 0.9874 | proposed balanced fit |
| 1 | 0.9838 | <PV> / <PP> |

Note the first line is the traditional least-square fit. The second line is fitting Powertap versus Vector, but then taking the reciprocal to get the opposite. The third is the proposed balanced weighting scheme. Here it gives the same result to listed precision (but not exactly) as the average of the first two results, which is also clearly balanced. The fourth line is a simple ratio of averages. This places equal emphasis on the lower power points. It is also clearly balanced, and gives a result closest to the proposed scheme.
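All four rows share one form, ∑ w PV / ∑ w PP, differing only in the weighting term w. The sketch below makes that explicit. It uses invented readings, so its numbers will not match the table, and the helper name is hypothetical.

```python
def weighted_ratio(pp, pv, weights):
    """Vector/Powertap estimate: sum(w * PV) / sum(w * PP)."""
    numerator = sum(w * v for w, v in zip(weights, pv))
    denominator = sum(w * p for w, p in zip(weights, pp))
    return numerator / denominator


# Hypothetical readings in watts (invented, not the ride data).
powertap = [100.0, 200.0, 300.0, 400.0]
vector = [99.0, 192.0, 297.0, 384.0]

# One weighting term per table row.
schemes = {
    "PP": powertap,                                      # least-square PV vs PP
    "PV": vector,                                        # 1 / (least-square PP vs PV)
    "PV+PP": [p + v for p, v in zip(powertap, vector)],  # balanced scheme
    "1": [1.0] * len(powertap),                          # ratio of averages
}

ratios = {name: weighted_ratio(powertap, vector, w) for name, w in schemes.items()}
for name, ratio in ratios.items():
    print(f"{name:>6}: {ratio:.4f}")
```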

These fits have been to a one-parameter curve: a line of adjustable slope and zero intercept. A more flexible least-square fit has an additional intercept term, yielding two fitted coefficients. But the same approach can be applied there. The key indicator is that fitting y versus x should yield a slope equal to the reciprocal of the slope yielded by fitting x versus y. First shift the points so the average values of x and y are zero. Then the formulas listed here apply to the slope, since slope is translation-invariant. You can then shift the points back and trivially calculate the intercept.
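The shift, fit, shift-back recipe might look like the sketch below. It assumes x and y are positively correlated. After centering, the (x + y) weights can be negative, and the denominator can vanish or change sign for anti-correlated data, so the function refuses that case. The function name and test points are made up for illustration.

```python
def balanced_line(x, y):
    """Balanced fit of y = slope * x + intercept.

    Centers both variables on their means, applies the balanced
    slope formula to the centered points, then recovers the intercept.
    Assumes positively correlated data (an assumption of this sketch).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]

    numerator = sum((a + b) * b for a, b in zip(dx, dy))
    denominator = sum((a + b) * a for a, b in zip(dx, dy))
    if denominator <= 0:
        raise ValueError("balanced fit needs positively correlated data")

    slope = numerator / denominator
    intercept = mean_y - slope * mean_x  # the line passes through the means
    return slope, intercept


# Made-up points scattered around y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.2, 2.9, 5.1, 6.8]

slope_yx, intercept_yx = balanced_line(xs, ys)
slope_xy, intercept_xy = balanced_line(ys, xs)
print(f"y vs x: slope {slope_yx:.4f}, intercept {intercept_yx:.4f}")
print(f"x vs y: slope {slope_xy:.4f}, intercept {intercept_xy:.4f}")
print(f"slope product: {slope_yx * slope_xy:.6f}")
```

The two fitted lines are then the same line, just solved for a different variable.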

If the two axes have different units or if the uncertainty in the points is estimated to be different, then the following can be used:

kVP' = ∑(PP / σP + PV / σV) PV / ∑ (PP / σP + PV / σV) PP

where σP is the uncertainty in PP and σV is the uncertainty in PV. If σP = σV then the new terms obviously cancel.
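A sketch of the uncertainty-scaled version follows. The readings and the two σ values are hypothetical, picked only to exercise the formula. They are not the rated accuracies of either power meter.

```python
def balanced_slope_scaled(x, y, sigma_x, sigma_y):
    """Balanced slope of y versus x with each variable scaled by its uncertainty.

    Weighting term is (x / sigma_x + y / sigma_y).
    """
    numerator = sum((a / sigma_x + b / sigma_y) * b for a, b in zip(x, y))
    denominator = sum((a / sigma_x + b / sigma_y) * a for a, b in zip(x, y))
    return numerator / denominator


# Hypothetical readings in watts and hypothetical uncertainties.
powertap = [100.0, 200.0, 300.0, 400.0]
vector = [99.0, 192.0, 297.0, 384.0]
sigma_p = 1.0  # assumed uncertainty in PP (illustrative value)
sigma_v = 2.0  # assumed uncertainty in PV (illustrative value)

k_vp_scaled = balanced_slope_scaled(powertap, vector, sigma_p, sigma_v)
k_pv_scaled = balanced_slope_scaled(vector, powertap, sigma_v, sigma_p)
k_vp_equal = balanced_slope_scaled(powertap, vector, 1.0, 1.0)

print(f"scaled kVP' = {k_vp_scaled:.6f}")
print(f"scaled kPV' = {k_pv_scaled:.6f}")
print(f"product     = {k_vp_scaled * k_pv_scaled:.6f}")
print(f"equal-sigma kVP' = {k_vp_equal:.6f}")
```

The reciprocal property survives the scaling. As σV grows large relative to σP, the weighting term tends to PP alone and the result approaches the traditional least-square kVP.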


Michael Barnes said...

Dan, Michael Barnes here, former LKHC participant. Sorry to miss you at Diablo TT, I was sick that day. BTW, nice mention of you in new Summerson book.

OK, down to business. I taught Econometrics about 20 years ago, but much of it has stuck with me. A few thoughts:

1) You are assuming that the relationship here really is linear, i.e. percent power loss is consistent throughout a wide power range. That may not be true. This can matter when you are trying to tease apart differences when true values are close to one.

2) Least squares parameter estimators are BLUE (best linear unbiased estimator), where "best" means minimum variance, if dependent variable error terms are i.i.d.--independent and identically distributed.

3) If you add the additional assumption that the y-error terms are also normally distributed, then least squares estimators are BLU (best unbiased estimators) because then the least squares estimators are MLE (maximum likelihood estimators) and those achieve the Cramer-Rao lower bound, and you can't do any better than that.

4) The distinction between BLUE and BLU tends to go away with large sample sizes--that's the law of large numbers.

5) If the independent variables are measured with error, your parameter estimates are not consistent. That means they don't tend to get better (more precise) with more data. That's a real no-no. Errors in independent variables also tend to reduce the size of parameter estimates.

6) More on errors in variables here:

djconnel said...

Thanks, Michael! Good comments. The key factor here is the classic linear least-square fit treats the x and y axes differently, and if you have no basis to pick one versus the other, a balanced fit such as the one I proposed makes more sense.

Yes -- I assume the relationship is linear (more than that -- strictly proportional). But this is really just an attempt to disprove the null hypothesis, that being that both measure power perfectly, and that since there is a reasonable estimate for the power lost to the drivetrain, the Vector should generally measure a few % higher than the Powertap. I don't observe that, but the deviation is consistent with the claimed accuracies of the units.

I'm curious to compare the Vector with my (as opposed to the borrowed one I used here) Powertap. Static testing had demonstrated that one to read low. But Powertap claimed static testing is unreliable. I don't believe them. A comparison with Vector compared to the comparison of this unit with the same Vector will be insightful. However, I'm obviously a lot slower than DCRainmaker in conducting experiments :).