power model summary

introduction

I spent 2 weeks analyzing fitting power-duration curves, inspired by a Veloclinic post on an analytic form for an anergy source with both rate and capacity limits (I don't know why I found this enlightening, as you get the same equation from simple capacitor-resistor networks in electrical engineering, but I never made that bridge). Anyway, Veloclinic has continued to move ahead with feedback models, but he remains committed to a model where the aerobic energy system has fixed capacity. I have stuck with more heuristic approaches where the aerobic system has time-dependent capacity. I still have a limited anaerobic capacity, but that's because attempts to replenish the anaerobic system over time would become hopelessly entangled with the aerobic power component. So simplify.

I used two components with some success: an aerobic component and an anaerobic component. This is all that the CP model uses, and it oversimplifies each. Yet despite this it is the most commonly used model. Simplicity yields popularity. I tried adding a third constraint to better match "sprint" power in the 0 - 10 second range, but I gave up on this since the addition of coefficients for this time range resulted in numerical instabilities. So I'll say the model is limited in its ability to match results in this range.

The key improvement in my approach over what I've seen elsewhere is the use of an envelope fit. The envelope fit says measured points are never the best possible effort at a given time, but rather represent imperfect efforts, and the goal is to find the power which could be produced with a perfect, or at least near-perfect effort. So I encourage the fitting algorithm to fit the peaks on the power-duration power data rather than to cut through the center, as conventional least-square fitting does. I did this in GoldenCheetah by searching for two "quality points" to do the CP model fit.

But for more complex models, this doesn't work as well. So I worked out an iteratively weighted least square fitting algorithm, where points falling above the model curve were assigned a weight of 10 thousand, while points falling under the model curve were assigned a weight of close to 1 (exactly 1 if the spacing to to surrounding points is 2.5% of time). This worked extremely well.

For the CP model a linear fit can be done on work versus time. However, since I'm interested in fitting power, not work, I need to additionally weight this fit by a factor 1/time. This more strongly weights points at short duration which have lower values of work but higher values of power.

For the Veloclinic-based model linear fits won't work so I used a weighted nonlinear least square fit. Here I fit power directly, not work. But nonlinear least square fits are done iteratively and require good initial guesses. So I used the CP model to generate initial guesses.

fitting procedure

Decimate the original data: go through and prune all points falling closer than 2.5% of the time associated with the previous point. This speeds up the numerics enormously with negligible effect on the result.
Assign weighting values to points. For point n with time t_n, followed by point n + 1 at time t_n+1 and preceded by point n - 1 at time t_n-1, the weighting factor is ln(t_n+1 / t_n-1) / 2 ln(1.025). For the first point the weighting factor is ln(t_n+1 / t_n) / ln(1.025). For the last point the weighting factor is ln(t_n / t_n-1) / ln(1.025).
Do a CP fit:
1. Divide the weighting factor, for CP fit only, by t_n.
2. Calculate the mechanical work for each point, W_n = P_n t_n, where P_n is the measured power.
3. Initiate a list of effective weights to the base weighting factors already described.
4. Iterate until result stops changing or until an iteration limit is reached:
  1. Do a weighted least-square fit of W_n versus t_n
  2. Update the effective weighting factors w_n to be the base weighting factor if the prediction is over P_n, or some larger value (for example 10 thousand) if the point is above the prediction. The fitting coefficients are CP (slope) and AWC (intercept).
From the CP fitted values derive initial guesses for the nonlinear model:
- P₂ = CP
- τ₂ = 25 thousand seconds
- α₂ = 0.5
- P₁ = 1-second power (or whichever power has the shortest time)
- τ₁ = AWC / 2 P₁
Initialize a list of effective weights to the base weighting factors already described (these have not been divided by t₂, only adjusted for time interval).
Initialize a set of damping terms δ_n = 1.
Iterate until result stops changing or until an iteration limit is reached (for example, 256 iterations):
1. Construct a Jacobian matrix of derivatives for a function f of parameters P₁, τ₁, P₂, τ₂, and α₂, where the derivatives are for the dependence of power on the logarithm of the parameters rather than on the parameters directly. Rather than put the formatted equations here, I'll post my Perl code after this summary.
2. Calculate a list of modeled powers f_n = P₁ ( τ₁ / t_n) ( 1 - exp[ -t_n / τ₁ ] ) + P₂ / ( 1 + t_n / α₂τ₂ )^α₂
3. Construct a list of residuals R_n = P_n - f_n.
4. Calculate the matrix J^TW J, where J^T is the transpose of the Jacobian matrix, W is the weighting matrix (diagonal with elements w_n, the effective weighting terms), and J is the Jacobian matrix itself.
5. Invert this matrix: (J^TW J)^-1
6. Multiply by W J^TR to yield the correction vector for the vector of the log of the parameters: Δ = (J^TW J)^-1W J^TR
7. If the number of iterations exceeds some number, for example 32, then reduce the damping terms δ_n using: δ_n → δ_n^η δ_min^{1 - η}, where η controls the rate of damping adjustment and is for example 0.98, and δ_min is the asymptotic damping factor, for example 0.5.
8. Create an effective set of damping values δ'_n = δ_n / (1 + |Δ| ), where |Δ| is the vector magnitude of unitless vector Δ.
9. Multiply each parameter by exp[δ'_n Δ_n], where n is the index associated with the model parameter, to get revised parameter values.
The fit is done! Be happy and reap the benefits of a well-fitted power-duration curve.

Jacobian

The Perl code for calculating a row in the Jacobian matrix is as follows, where $n is the index of the point. There's a hash which contains flags for whether each component should be included, allowing for the solution of subsets of the complete model. This code also calculates the weighting factors, the "Veloclinic" power, and components of the power:

      # calc some common values
      my $t  = $time->[$n];
      my $p  = $pmax->[$n];
      my $r1 = $t / $tau1;
      my $r2 = $t / ($alpha2 * $tau2);
      my $e1 = exp(-$r1);
      my $z1 = (1 - $e1) / $r1;
      my $z2 = 1 / (1 + $r2) ** $alpha2;

      # calculate VC power
      my $pvc1 = $p1 * $z1;
      my $pvc2 = $p2 * $z2;
      my $pvc  = $pvc1 + $pvc2;

      push @pvc1, $pvc1;
      push @pvc2, $pvc2;
      push @pvc0, $pvc0;
      push @pvc, $pvc;

      # weight:
      push @w, my $w = (($pvc < $p) ? $weightFactor : $w0->[$n]);

      # derivatives:
      # solve for the log of the parameters, rather than
      # the parameters themselves, to avoid negative values
      # d ln x = dx / x => dx = x dln x => d f / d ln x = x df / dx

      my @r =
        (
         # P1 derivative:
         $parameter_hash{p1} ?
         $p1 * $z1 :
         (),

         # tau1 derivative:
         $parameter_hash{tau1} ?
         $p1 * ($z1 - $e1) :
         (),

         # P2 derivative
         $parameter_hash{p2} ?
         $p2 * $z2 :
         (),

         # tau2 derivative
         $parameter_hash{tau2} ?
         $p2 * ($alpha2 * $r2) / (1 + $r2) ** ($alpha2 + 1) :
         (),

         # alpha2 derivative
         $parameter_hash{alpha2} ?
         $alpha2 * $p2 * ( $r2 / (1 + $r2) - log(1 + $r2) ) * $z2 :
         (),
        );

      # weighted row
      my @rw = map { $_ * $w } @r;

      # Jacobian matrix:
      push @J, \@r;

      # weighted Jacobian:
      push @Jw, \@rw;

      # residual
      push @R,
        [
         $pmax->[$n] - $pvc,
        ];

discussion

So why do this fit? Two reasons: one is to interpolate the attainable power of efforts for which previous quality efforts do not exist. The other is to have a quantifiable measure of fitness. The latter goal will always be a stretch, but if the data are of sufficient quality (good efforts at a sufficient number of sufficiently spaced durations) the interpretation might be:

parameter	description
P₁	peak anaerobic power
τ₁	sustainability of anaerobic power
P₂	peak aerobic power
τ₂	sustainability of peak aerobic power
α₂	rate of power loss in endurance limit

where the power at time t is predicted to be:
P|_t = P₁ ( τ₁ / t) ( 1 - exp[ -t / τ₁ ] ) + P₂ / ( 1 + t / α₂τ₂ )^α₂

So what next? It's tempting to say implement this in GoldenCheetah, but it seems Damien has that ground already well covered with his model.

Search This Blog

On Bicycles, and.... what else is there?