Thursday, February 6, 2014

testing the 5-parameter vs 4-parameter forms of power-duration model

I wanted to test the robustness of the fitting algorithm a bit more. To review, my model is the following, based on a Veloclinic model:

P = P1 ( τ1 / t ) ( 1 - exp[-t / τ1] ) + P2 / ( 1 + t / [ α2 τ2 ] )^α2

The two parameter options are:

  1. a 5-parameter model, where I fit P1, τ1, P2, τ2, and α2 independently.
  2. a 4-parameter model, where I fit P1, τ1, P2, and τ2 independently, but assume a value for α2.

For this post, the assumed value of α2 in the 4-parameter model was 0.5.
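For reference, here is a minimal Python sketch of the model as written above, with the 4-parameter form as a thin wrapper that pins α2 at 0.5. The function names and argument order are my own choices for illustration, and t, τ1, and τ2 are assumed to share the same time units.

```python
import math


def power_model(t, p1, tau1, p2, tau2, alpha2):
    """5-parameter form: modeled power for duration t."""
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    term1 = -p1 * (tau1 / t) * math.expm1(-t / tau1)
    term2 = p2 / (1.0 + t / (alpha2 * tau2)) ** alpha2
    return term1 + term2


def power_model_4(t, p1, tau1, p2, tau2, alpha2=0.5):
    """4-parameter form: same model, with alpha2 assumed rather than fit."""
    return power_model(t, p1, tau1, p2, tau2, alpha2)
```

As t goes to zero both terms approach their prefactors, so the modeled power tends to P1 + P2, which makes a quick sanity check on any implementation.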

I needed test data against which to fit the model. In the first test, I used an older form of my model, where I set the coefficients the same for every test set. To generate the data I assumed a series of rides, each of constant power and each of a randomized duration, with the durations chosen to span the data range. Each ride was of equal "quality": the power for the full duration of the "ride" was the power from the model.
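A sketch of how such a synthetic data set might be generated is below. Several things here are assumptions rather than what I actually ran: it uses the present form of the model rather than the older one, it draws durations log-uniformly between hypothetical limits t_min and t_max, and the coefficient values in the usage note are made up for illustration.

```python
import math
import random


def power_model(t, p1, tau1, p2, tau2, alpha2):
    """Present form of the power-duration model."""
    term1 = -p1 * (tau1 / t) * math.expm1(-t / tau1)
    term2 = p2 / (1.0 + t / (alpha2 * tau2)) ** alpha2
    return term1 + term2


def make_rides(n_rides, coeffs, t_min=10.0, t_max=7200.0, rng=None):
    """Return a sorted list of (duration, power) pairs lying on the model curve."""
    rng = rng or random.Random()
    rides = []
    for _ in range(n_rides):
        # Log-uniform draw (an assumption) so short and long rides both appear.
        t = math.exp(rng.uniform(math.log(t_min), math.log(t_max)))
        # Every ride is "maximal": its power is exactly the model power.
        rides.append((t, power_model(t, *coeffs)))
    return sorted(rides)
```

Calling make_rides(4, (600.0, 30.0, 400.0, 3000.0, 0.5), rng=random.Random(1)) gives one reproducible 4-ride set; a different seed gives a different randomized set.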

For this first test, I wanted to test the robustness of the model. I used a relatively small number of rides (approximately 4), so there were an equal number of "quality points" to which the envelope fit had to optimize coefficients.

For this, first I show the results of the 5-parameter model. In each of these animations there are 30 frames: 30 fits to 30 randomized data sets. Ignore the "Pvc" versus "Pvc0" distinction: they are the same (that's a relic from the 6-parameter model).
[animation: 5-parameter model]

Then the 4-parameter model:
[animation: 4-parameter model]

Neither model should fit the data perfectly, since a different form of the model was used to generate it. When it works, the 5-parameter model does well, but not substantially better than the 4-parameter model. The 5-parameter model has a relatively high rate of crash-and-burn, however.

Next I tried a new example set. Here I used more rides (around 10 instead of 4), and I used the present form of the power-duration model to generate the points. Instead of just changing the ride times for each set, I also randomized coefficients τ1, τ2, and α2. Here are the results:

5-parameter model:
[animation: 5-parameter model]

4-parameter model:
[animation: 4-parameter model]

With more "quality" points, the 5-parameter model is able to fit the curves better when it works, but it too frequently fails to converge. The 4-parameter model makes a compromise in the long-time range, but still does "fairly" well there (by eye): within a few watts on the prominent points.

I then increased the number of rides to approximately 40. This is an extremely high-quality data set. If the 5-parameter model is able to match anything it should match these.

5-parameter model:
[animation: 5-parameter model]

4-parameter model:
[animation: 4-parameter model]

Here the fit quality improvement of the 5-parameter model is more evident. The shape of the power decay in the long-time range is better matched than with the 4-parameter version when the randomized α2 falls far enough below the assumed value of 0.5. But if I were fitting my own data, I'd consider these 4-parameter fits to be decent. And occasionally the 5-parameter model fit still fails, getting confused and wandering off on a wild goose chase in the 5-dimensional parameter space.
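The long-time behavior follows from the second term of the model. As a worked limit (plain algebra on the model above, not an additional result from the fits): for t much larger than both τ1 and α2 τ2,

P ≈ P1 τ1 / t + P2 ( α2 τ2 / t )^α2

For α2 < 1 the second term decays more slowly than 1 / t and dominates, so at long times log P versus log t approaches a line of slope -α2. A fit with α2 pinned at 0.5 cannot reproduce that slope when the generating α2 is different, which is consistent with the mismatch showing up mainly at long durations.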

Of course, the choice I made for α2 was arbitrary. I could have chosen something different than 0.5. So a hybrid approach is to fit the data, and if the fit doesn't look good, to try a different α2 and do it again. Based on my experiments, this is what I think I'd recommend. In some cases, for data of high quality extending out to long times, this may be important, but for many other cases the focus is on shorter times and one α2 will be substantially as good as another.
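The hybrid approach can be sketched as a loop over candidate α2 values. This is not my fitting code: in place of the envelope fit it uses a stand-in that is easy to verify, namely a brute-force grid over τ1 and τ2 with an ordinary least-squares solve for P1 and P2 (the model is linear in those two once the other coefficients are fixed). It also tries every candidate and keeps the lowest residual, rather than stopping at the first fit that looks good. The grids, candidate values, and function names are all hypothetical.

```python
import math


def basis(t, tau1, tau2, alpha2):
    """Both model terms with P1 = P2 = 1, so that P = P1*f1 + P2*f2."""
    f1 = -(tau1 / t) * math.expm1(-t / tau1)
    f2 = (1.0 + t / (alpha2 * tau2)) ** (-alpha2)
    return f1, f2


def fit_fixed_alpha(rides, alpha2, tau1_grid, tau2_grid):
    """Return (sse, p1, tau1, p2, tau2) for the best grid point, or None."""
    best = None
    for tau1 in tau1_grid:
        for tau2 in tau2_grid:
            rows = [basis(t, tau1, tau2, alpha2) + (p,) for t, p in rides]
            # 2x2 normal equations for the linear coefficients P1 and P2.
            s11 = sum(f1 * f1 for f1, _, _ in rows)
            s12 = sum(f1 * f2 for f1, f2, _ in rows)
            s22 = sum(f2 * f2 for _, f2, _ in rows)
            b1 = sum(f1 * p for f1, _, p in rows)
            b2 = sum(f2 * p for _, f2, p in rows)
            det = s11 * s22 - s12 * s12
            if det <= 1e-12 * s11 * s22:
                continue  # the two terms are nearly collinear; skip this point
            p1 = (b1 * s22 - b2 * s12) / det
            p2 = (b2 * s11 - b1 * s12) / det
            sse = sum((p - p1 * f1 - p2 * f2) ** 2 for f1, f2, p in rows)
            if best is None or sse < best[0]:
                best = (sse, p1, tau1, p2, tau2)
    return best


def scan_alpha(rides, alphas, tau1_grid, tau2_grid):
    """Repeat the fixed-alpha2 fit per candidate; return the lowest-residual one.

    Result is (sse, alpha2, p1, tau1, p2, tau2), or None if nothing could be fit.
    """
    results = []
    for alpha2 in alphas:
        fit = fit_fixed_alpha(rides, alpha2, tau1_grid, tau2_grid)
        if fit is not None:
            results.append((fit[0], alpha2) + fit[1:])
    return min(results) if results else None
```

Here rides is a list of (duration, power) pairs. Note that the least-squares solve does not constrain P1 or P2 to be positive, and on real ride data, where most points fall below the envelope, a plain least-squares residual would not be the right target.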
