### introduction

I spent 2 weeks analyzing and fitting power-duration curves, inspired by a Veloclinic post on an analytic form for an energy source with both rate and capacity limits (I don't know why I found this enlightening, as you get the same equation from simple capacitor-resistor networks in electrical engineering, but I never made that bridge). Anyway, Veloclinic has continued to move ahead with feedback models, but he remains committed to a model where the aerobic energy system has fixed capacity. I have stuck with more heuristic approaches where the aerobic system has time-dependent capacity. I still have a limited anaerobic capacity, but that's because attempts to replenish the anaerobic system over time would become hopelessly entangled with the aerobic power component. So simplify.

I used two components with some success: an aerobic component and an anaerobic component. This is all that the CP model uses, and it oversimplifies each. Yet despite this it is the most commonly used model. Simplicity yields popularity. I tried adding a third constraint to better match "sprint" power in the 0 - 10 second range, but I gave up on this since the addition of coefficients for this time range resulted in numerical instabilities. So I'll say the model is limited in its ability to match results in this range.

The key improvement in my approach over what I've seen elsewhere is the use of an envelope fit. The envelope fit says measured points are never the best possible effort at a given time, but rather represent imperfect efforts, and the goal is to find the power which could be produced with a perfect, or at least near-perfect, effort. So I encourage the fitting algorithm to fit the peaks of the power-duration data rather than to cut through the center, as conventional least-square fitting does. I did this in GoldenCheetah by searching for two "quality points" to do the CP model fit.

But for more complex models, this doesn't work as well. So I worked out an iteratively weighted least square fitting algorithm, where points falling above the model curve were assigned a weight of 10 thousand, while points falling under the model curve were assigned a weight of close to 1 (exactly 1 if the spacing to surrounding points is 2.5% of time). This worked extremely well.
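The reweighting rule itself is trivial; here's a minimal sketch in Python (the function name and the 10,000 weight are illustrative, taken from the example above):

```python
def envelope_weights(measured, predicted, base_weights, heavy=10_000.0):
    """Assign a heavy weight to points above the model curve, so the
    next fit iteration is pushed up to envelope the peaks of the data."""
    return [heavy if p_meas > p_pred else w
            for p_meas, p_pred, w in zip(measured, predicted, base_weights)]

# Only the point above the current model prediction gets the heavy weight:
w = envelope_weights([300, 250, 220], [280, 260, 225], [1.0, 1.0, 1.0])
# w == [10000.0, 1.0, 1.0]
```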

For the CP model a linear fit can be done on work versus time. However, since I'm interested in fitting power, not work, I need to additionally weight this fit by a factor 1/time. This more strongly weights points at short duration which have lower values of work but higher values of power.
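As a sketch (Python for illustration; `cp_fit` is a hypothetical helper, and this shows a single weighted pass — the envelope reweighting described above would wrap it in a loop):

```python
def cp_fit(times, powers, weights):
    """Weighted least-square fit of work W = P*t versus t.
    The slope is CP, the intercept is AWC; the weights passed in
    are assumed to already include the extra 1/t factor."""
    work = [p * t for p, t in zip(powers, times)]
    sw = sum(weights)
    mt = sum(w * t for w, t in zip(weights, times)) / sw
    mW = sum(w * W for w, W in zip(weights, work)) / sw
    cov = sum(w * (t - mt) * (W - mW) for w, t, W in zip(weights, times, work))
    var = sum(w * (t - mt) ** 2 for w, t in zip(weights, times))
    cp = cov / var
    awc = mW - cp * mt
    return cp, awc

# Synthetic data generated from CP = 250 W, AWC = 20000 J:
ts = [60.0, 300.0, 600.0, 1200.0]
ps = [250 + 20000 / t for t in ts]
w = [1.0 / t for t in ts]  # the 1/t power weighting
cp, awc = cp_fit(ts, ps, w)
# cp ≈ 250, awc ≈ 20000
```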

For the Veloclinic-based model linear fits won't work so I used a weighted nonlinear least square fit. Here I fit power directly, not work. But nonlinear least square fits are done iteratively and require good initial guesses. So I used the CP model to generate initial guesses.

### fitting procedure

- Decimate the original data: go through and prune any point whose time falls within 2.5% of the previous retained point's time. This speeds up the numerics enormously with negligible effect on the result.
- Assign weighting values to points. For point n at time t_{n}, followed by point n + 1 at time t_{n+1} and preceded by point n - 1 at time t_{n-1}, the weighting factor is ln(t_{n+1} / t_{n-1}) / (2 ln(1.025)). For the first point the weighting factor is ln(t_{n+1} / t_{n}) / ln(1.025); for the last point it is ln(t_{n} / t_{n-1}) / ln(1.025).
- Do a CP fit:
  - Divide the weighting factor, for the CP fit only, by t_{n}.
  - Calculate the mechanical work for each point, W_{n} = P_{n} t_{n}, where P_{n} is the measured power.
  - Initialize a list of effective weights to the base weighting factors already described.
  - Iterate until the result stops changing or until an iteration limit is reached:
    - Do a weighted least-square fit of W_{n} versus t_{n}.
    - Update the effective weighting factors w_{n} to be the base weighting factor if the prediction is over P_{n}, or some larger value (for example 10 thousand) if the point is above the prediction.
  - The fitting coefficients are CP (slope) and AWC (intercept).
- From the CP fitted values derive initial guesses for the nonlinear model:
  - P_{2} = CP
  - τ_{2} = 25 thousand seconds
  - α_{2} = 0.5
  - P_{1} = 1-second power (or whichever power has the shortest time)
  - τ_{1} = AWC / (2 P_{1})
- Initialize a list of effective weights to the base weighting factors already described (these have not been divided by t_{n}, only adjusted for time interval).
- Initialize a set of damping terms δ_{n} = 1.
- Iterate until the result stops changing or until an iteration limit is reached (for example, 256 iterations):
  - Construct a Jacobian matrix of derivatives for a function f of parameters P_{1}, τ_{1}, P_{2}, τ_{2}, and α_{2}, where the derivatives are for the dependence of power on the logarithm of the parameters rather than on the parameters directly. Rather than put the formatted equations here, I'll post my Perl code after this summary.
  - Calculate a list of modeled powers f_{n} = P_{1} ( τ_{1} / t_{n} ) ( 1 - exp[ -t_{n} / τ_{1} ] ) + P_{2} / ( 1 + t_{n} / α_{2} τ_{2} )^{α_{2}}.
  - Construct a list of residuals R_{n} = P_{n} - f_{n}.
  - Calculate the matrix J^{T} W J, where J^{T} is the transpose of the Jacobian matrix, W is the weighting matrix (diagonal with elements w_{n}, the effective weighting terms), and J is the Jacobian matrix itself.
  - Invert this matrix: (J^{T} W J)^{-1}.
  - Multiply by J^{T} W R to yield the correction vector for the vector of the log of the parameters: Δ = (J^{T} W J)^{-1} J^{T} W R.
  - If the number of iterations exceeds some number, for example 32, then reduce the damping terms δ_{n} using δ_{n} → δ_{n}^{η} δ_{min}^{1 - η}, where η controls the rate of damping adjustment and is for example 0.98, and δ_{min} is the asymptotic damping factor, for example 0.5.
  - Create an effective set of damping values δ'_{n} = δ_{n} / (1 + |Δ|), where |Δ| is the vector magnitude of the unitless vector Δ.
  - Multiply each parameter by exp[δ'_{n} Δ_{n}], where n is the index associated with the model parameter, to get revised parameter values.
- The fit is done! Be happy and reap the benefits of a well-fitted power-duration curve.

### Jacobian

The Perl code for calculating a row in the Jacobian matrix is as follows, where `$n` is the index of the point. There's a hash which contains flags for whether each component should be included, allowing for the solution of subsets of the complete model. This code also calculates the weighting factors, the "Veloclinic" power, and components of the power:

```perl
# calc some common values
my $t  = $time->[$n];
my $p  = $pmax->[$n];
my $r1 = $t / $tau1;
my $r2 = $t / ($alpha2 * $tau2);
my $e1 = exp(-$r1);
my $z1 = (1 - $e1) / $r1;
my $z2 = 1 / (1 + $r2) ** $alpha2;

# calculate VC power
my $pvc1 = $p1 * $z1;
my $pvc2 = $p2 * $z2;
my $pvc  = $pvc1 + $pvc2;
push @pvc1, $pvc1;
push @pvc2, $pvc2;
push @pvc,  $pvc;

# weight: heavy factor if the point sits above the model curve
push @w, my $w = (($pvc < $p) ? $weightFactor : $w0->[$n]);

# derivatives:
# solve for the log of the parameters, rather than
# the parameters themselves, to avoid negative values
# d ln x = dx / x => dx = x d ln x => d f / d ln x = x df / dx
my @r = (
    # P1 derivative:
    $parameter_hash{p1} ? $p1 * $z1 : (),
    # tau1 derivative:
    $parameter_hash{tau1} ? $p1 * ($z1 - $e1) : (),
    # P2 derivative:
    $parameter_hash{p2} ? $p2 * $z2 : (),
    # tau2 derivative:
    $parameter_hash{tau2} ? $p2 * ($alpha2 * $r2) / (1 + $r2) ** ($alpha2 + 1) : (),
    # alpha2 derivative:
    $parameter_hash{alpha2} ? $alpha2 * $p2 * ($r2 / (1 + $r2) - log(1 + $r2)) * $z2 : (),
);

# weighted row
my @rw = map { $_ * $w } @r;

# Jacobian matrix:
push @J, \@r;
# weighted Jacobian:
push @Jw, \@rw;
# residual
push @R, [ $pmax->[$n] - $pvc ];
```

### discussion

So why do this fit? Two reasons: one is to interpolate the attainable power of efforts for which previous quality efforts do not exist. The other is to have a quantifiable measure of fitness. The latter goal will always be a stretch, but if the data are of sufficient quality (good efforts at a sufficient number of sufficiently spaced durations) the interpretation might be:

parameter | description
---|---
P_{1} | peak anaerobic power
τ_{1} | sustainability of anaerobic power
P_{2} | peak aerobic power
τ_{2} | sustainability of peak aerobic power
α_{2} | rate of power loss in endurance limit

where the power at time t is predicted to be:

P_{t} = P_{1} ( τ_{1} / t ) ( 1 - exp[ -t / τ_{1} ] ) + P_{2} / ( 1 + t / α_{2} τ_{2} )^{α_{2}}
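Evaluated numerically, the curve behaves as expected: power falls monotonically with duration, from near P_{1} + P_{2} at very short times down toward the aerobic term alone. A quick check in Python (the parameter values are hypothetical, chosen only for illustration):

```python
import math

def predicted_power(t, p1, tau1, p2, tau2, alpha2):
    """P(t) = P1 (tau1/t)(1 - exp[-t/tau1]) + P2 / (1 + t/(alpha2 tau2))^alpha2"""
    anaerobic = p1 * (tau1 / t) * (1 - math.exp(-t / tau1))
    aerobic = p2 / (1 + t / (alpha2 * tau2)) ** alpha2
    return anaerobic + aerobic

# hypothetical parameter values, for illustration only
P1, TAU1, P2, TAU2, ALPHA2 = 1000.0, 20.0, 250.0, 25000.0, 0.5
curve = [predicted_power(t, P1, TAU1, P2, TAU2, ALPHA2)
         for t in (5, 60, 300, 1200, 3600)]
# predicted power decreases steadily with duration
```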

So what next? It's tempting to implement this in GoldenCheetah, but it seems Damien already has that ground well covered with his model.
