Tuesday, April 29, 2014

Maximal Power curves: thoughts on deweighting, depreciating, or retiring old data and the "do no harm" principle

Back in January and February I had a series of posts here on fitting maximal power curves using a heuristic model and a weighted nonlinear least-squares procedure to do an envelope fit: the curve passes through a series of "quality" power points rather than through the middle of the points. The argument was that we don't produce our best possible power for all durations, but rather typically for a few durations, and so to predict what our maximal power is for a given duration, we need to interpolate or extrapolate based on the durations for which our efforts represented the best we could do. The weighting scheme was to assign a high weight (for example, 10 thousand) to errors of points falling above the modeled curve, and a lower weight (1) to points falling below the curve. This caused the curve, after an appropriate number of iterations, to float to the top of the data point cloud, where it would essentially balance on the points consistent with the highest predicted powers.
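To make the weighting scheme concrete, here is a minimal sketch of an asymmetric, iteratively reweighted least-squares fit. It is not the heuristic model from the earlier posts: as a stand-in I assume the simple two-parameter CP model, P(t) = CP + AWC / t, which is linear in its parameters. The function name, the iteration limit, and the stopping rule are my own assumptions for illustration.

```python
import numpy as np

def envelope_fit_cp(t, p, w_above=1.0e4, w_below=1.0, n_iter=50):
    """Asymmetric weighted least-squares fit of P(t) = CP + AWC / t.

    The CP model is an assumed stand-in for the heuristic model.
    Points above the current curve get weight w_above, points below
    get w_below, and the fit is repeated until the weights stop changing.
    """
    t = np.asarray(t, dtype=float)
    p = np.asarray(p, dtype=float)
    # Design matrix: the model is linear in (CP, AWC).
    A = np.column_stack([np.ones_like(t), 1.0 / t])
    w = np.ones_like(p)  # first pass is an ordinary least-squares fit
    for _ in range(n_iter):
        sw = np.sqrt(w)
        # Weighted least squares: scale rows by sqrt(weight).
        coef, *_ = np.linalg.lstsq(A * sw[:, None], p * sw, rcond=None)
        resid = p - A @ coef
        new_w = np.where(resid > 0, w_above, w_below)
        if np.array_equal(new_w, w):
            break  # same points above/below as last time: converged
        w = new_w
    cp, awc = coef
    return cp, awc
```

With a weight ratio of 10 thousand to 1, points above the curve are pulled down to within a tiny fraction of a watt of it, while the points below barely tug the curve toward them.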

The obvious issue with this is that best-efforts from long in the past may no longer represent present fitness.

The way Paul Mach @ Strava dealt with this in implementing a maximal power curve for that website was to put an adjustable expiration date on points, for example a default of 6 weeks, which is the time constant for determining chronic training stress in the Coggan formulation. This makes enormous sense, but it's a bit arbitrary. All of a sudden a power point goes from being 41.9 days old to being 42.1 days old and the entire maximal power curve undergoes a big jump. Obviously there's been no change in my fitness over those 4.8 hours. Yet the maximal power curve might have predicted I was capable of substantially harder efforts at the earlier time than the later time.
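A toy version of the hard cutoff shows the jump. This is only a sketch for a single duration, not Strava's code; the function name and the (age, power) data layout are assumptions.

```python
def windowed_best(efforts, window_days=42.0):
    """Best power among efforts no older than window_days.

    efforts: list of (age_days, power_watts) for one duration
    (a hypothetical layout chosen for this example).
    Returns None if every effort has expired.
    """
    recent = [power for age, power in efforts if age <= window_days]
    return max(recent) if recent else None

# The same two efforts, viewed 4.8 hours (0.2 days) apart.
earlier = [(41.9, 400.0), (10.0, 340.0)]
later = [(42.1, 400.0), (10.2, 340.0)]
print(windowed_best(earlier), windowed_best(later))
```

In the earlier view the 400 W effort still counts; in the later view it has expired and the best drops to 340 W.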

The obvious approach is to follow Coggan's example in the CP calculation of depreciating old points exponentially. An exponential weighting scheme with a 42-day time constant, for example, would be consistent with the CTS formula, which uses the same weighting.
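As a sketch of what such an age weight looks like, assuming the weight is simply exp(-age / 42 days) (the function name is mine):

```python
import math

def age_weight(age_days, tau_days=42.0):
    """Exponential age weight: 1 for a fresh point, 1/e at age tau_days."""
    return math.exp(-age_days / tau_days)

# No cliff at six weeks: the weight changes smoothly across day 42.
for age in (0.0, 41.9, 42.1, 84.0):
    print(age, round(age_weight(age), 4))
```

Going from 41.9 to 42.1 days old changes the weight by less than 0.002, instead of switching the point off entirely.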

But the devil's in the details.

One goal in the fitting of maximal power curves is the "do no harm" principle. That is, if I add additional activities to an existing data set, the resulting maximal power curve should never be unambiguously lower (it may be lower at certain points and higher at other points, however: for example, in the CP model, a strong short-duration effort may increase AWC but decrease CP). This is the justification for an envelope fit. If I go out for an easy recovery spin, with low power numbers, that doesn't imply that my maximal power is lower; it simply fails to prove that it's higher.

Suppose I have a recent ride with some high power numbers for a given duration, contributing to my maximal power curve. Then I add an old ride with slightly higher numbers at the same duration. If these old data replace the newer by virtue of being higher, but are then assigned a low weight due to being old, the result may actually be a lower maximal power curve, because the recently uploaded old data have little influence. So if you are going to deweight old activities, you need to fit to more than just the highest-value points for each duration: you need to fit to all values for each duration, weighting each by an age term.

But then my easy recovery spin, all of whose points would be ignored under the existing system, has a finite influence on the maximal power curve, dragging it down by some amount. The weighting used in the envelope fit will reduce the influence, but it will still be there.

Another example: if no activities are registered for a long interval, the existing points all become deweighted, but since they deweight together, the maximal power curve stays unchanged. Then a new activity is added, for example the same easy ride, and all of a sudden the maximal power curve is dramatically reduced. The rider has lowered his maximal power curve by adding a ride. This may be an undesired result.
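Here is a toy illustration of that effect at a single duration, where the "curve" is just one power level. I assume each point's weight is the envelope weight (10 thousand above the level, 1 below) multiplied by an exponential age weight with a 42-day time constant. The function name and the numbers are made up for the example.

```python
import math

def weighted_level(points, tau_days=42.0, w_above=1.0e4, w_below=1.0,
                   n_iter=100):
    """Asymmetric, age-weighted fit of a single power level.

    points: list of (age_days, power_watts) at one duration
    (a hypothetical layout for this sketch).
    """
    level = max(power for _, power in points)  # start at the top
    for _ in range(n_iter):
        num = den = 0.0
        for age, power in points:
            asym = w_above if power > level else w_below
            w = asym * math.exp(-age / tau_days)
            num += w * power
            den += w
        new_level = num / den
        if abs(new_level - level) < 1e-9:
            break
        level = new_level
    return level

old_only = [(500.0, 400.0)]                 # one strong effort, long ago
with_spin = old_only + [(0.0, 150.0)]       # plus today's easy ride
print(weighted_level(old_only), weighted_level(with_spin))
```

With only the old effort, its weight cancels out and the level stays at 400 W. Adding a fresh 150 W spin at full weight pulls the level most of the way down toward 150 W, even though the spin says nothing about what I could do at maximal effort.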

An alternate possibility is to depreciate powers. With this approach, old activities contribute at full weight, but at lower power. Exponential is the simplest approach, because the entire curve can be depreciated before a new activity is added (the amount to incrementally reduce a given value is proportional to the value, independent of age). However, this is rather unsatisfying, as even activities just a few days old are significantly attenuated. Other depreciation rates could be considered, for example cosine-squared, which stays flat and then more rapidly attenuates to zero, although in a fashion smoother than the Strava approach of simply discarding activities.
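The two depreciation shapes can be compared with a short sketch. The 42-day time constant for the exponential and the 84-day cutoff for the cosine-squared taper are assumed values picked for illustration, as are the function names.

```python
import math

def exp_factor(age_days, tau_days=42.0):
    """Exponential depreciation of power (tau_days is an assumed value)."""
    return math.exp(-age_days / tau_days)

def cos2_factor(age_days, cutoff_days=84.0):
    """Cosine-squared depreciation: flat at first, zero at cutoff_days.

    cutoff_days is a hypothetical parameter for this sketch.
    """
    if age_days >= cutoff_days:
        return 0.0
    return math.cos(0.5 * math.pi * age_days / cutoff_days) ** 2

# A 400 W effort at various ages under each scheme.
for age in (3.0, 21.0, 42.0, 84.0):
    print(age, round(400.0 * exp_factor(age), 1),
          round(400.0 * cos2_factor(age), 1))
```

At 3 days old the exponential has already taken off about 7% of the power, while the cosine-squared factor is still above 0.99. The exponential does keep one convenience: depreciating by 10 days and then by 5 more is the same as depreciating by 15, so the stored curve can be aged in place without tracking each point's date.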

But basically I'm not seeing that any of these techniques are worth the complexity. The Strava approach of using a fixed time window for activities is transparent, simple, and computationally efficient. That's probably the best approach.
