Friday, December 7, 2012

CIM time comparison: 2011 versus 2012

I downloaded the 2011 and 2012 complete results via the CIM website so I could do a more detailed time analysis. After all, the averages were slower this year, but how did this compare for different types of runners? For example, were faster and slower runners both slower, and if so, by the same fraction?

Since there was a different number of runners in the two years, I converted ranking (based on chip time) into a number from 0 to 1. I created bins for each runner and gave him the number representing the center of the bin. So, for example, if there had been two runners, they would receive rankings 0.25 and 0.75. I chose 1000 values in the range 0 to 1 (0.0005, 0.0015, 0.0025, ..., 0.9985, 0.9995) and interpolated the times for each of these normalized rankings for each year. Then I took the difference, in minutes. I plot this versus the 2012 time here:
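The ranking normalization and interpolation can be sketched roughly as follows. This is a minimal illustration rather than the script actually used: the function names are hypothetical, times are assumed to be chip times in minutes, and clamping at the ends of the ranking range is an assumption, since the post does not say how points outside the first and last bin centers were handled.

```python
from bisect import bisect_left

def normalized_ranks(n):
    """Bin centers for n runners: (i + 0.5) / n, so 2 runners get 0.25 and 0.75."""
    return [(i + 0.5) / n for i in range(n)]

def interpolate(x, xs, ys):
    """Linearly interpolate ys as a function of xs at x.

    Assumption: values outside the range of xs are clamped to the end points.
    """
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    j = bisect_left(xs, x)  # first index with xs[j] >= x, so xs[j - 1] < x
    frac = (x - xs[j - 1]) / (xs[j] - xs[j - 1])
    return ys[j - 1] + frac * (ys[j] - ys[j - 1])

def time_differences(times_old, times_new, n_points=1000):
    """Return (new time, new time minus old time) at n_points normalized rankings."""
    old = sorted(times_old)
    new = sorted(times_new)
    ranks_old = normalized_ranks(len(old))
    ranks_new = normalized_ranks(len(new))
    pairs = []
    for r in normalized_ranks(n_points):  # 0.0005, 0.0015, ..., 0.9995 by default
        t_old = interpolate(r, ranks_old, old)
        t_new = interpolate(r, ranks_new, new)
        pairs.append((t_new, t_new - t_old))
    return pairs
```

Calling `time_differences(times_2011, times_2012)` with two lists of finish times would give the points for a plot of time difference versus 2012 time.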

[Figure: time comparison]

I noted the time difference tended to be proportional to time, and I fit a line through the data with a single fitting parameter. That line shows the proportionality: the time difference is 4.0% of the 2012 time.
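A single-parameter fit of this kind can be done with a line through the origin. The sketch below assumes ordinary least squares, which the post does not specify, and the function name is hypothetical:

```python
def fit_proportional(xs, ys):
    """Least-squares slope k for the model y = k * x (no intercept term)."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / sxx
```

Here `xs` would be the 2012 times and `ys` the time differences, so a slope of 0.040 corresponds to a difference of 4.0% of the 2012 time.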

But there are confounding effects. The average age of 2011 finishers was 40.74 years, while the average age for 2012 was 40.94 years, 0.2 years more. Ages were listed to only 1-year precision, which is coarser than this difference, but given the large number of runners the difference in the averages should be fairly accurate. Then there's the male-female ratio: 43.05% of the finishers were female in 2011, but 45.85% were female in 2012. Each of these trends is expected to increase the average finish time.

So I calculated the natural logarithm of each runner's time, then averaged these together for all the runners in each 5-year sex-based division (for example, M30-34). I assumed the age difference in the two years within a given division was insignificant. Two divisions (W75-79 and M80-84) had finishers in 2011 but not 2012, so I eliminated these. For each division, I calculated the difference in the average logarithm of time (I use logarithm instead of time to avoid over-weighting runners with slower times). Then I set the number of runners in each of these divisions to the average of the 2011 and 2012 values. I then calculated a weighted net average of the difference in the log of time for 2012 versus 2011.
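The division-weighted calculation could look something like the sketch below. The function names and the `(division, time)` input format are assumptions made for illustration, not the format of the CIM results files.

```python
import math
from collections import defaultdict

def mean_log_times(results):
    """Map each division to (finisher count, mean natural log of time).

    results: iterable of (division, time) pairs, e.g. ("M30-34", 205.6).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for division, time in results:
        sums[division] += math.log(time)
        counts[division] += 1
    return {d: (counts[d], sums[d] / counts[d]) for d in counts}

def weighted_log_shift(results_old, results_new):
    """Weighted average over divisions of (mean ln time, new) - (mean ln time, old).

    Divisions without finishers in both years are dropped. Each remaining
    division is weighted by the average of its finisher counts in the two years.
    """
    old = mean_log_times(results_old)
    new = mean_log_times(results_new)
    weighted_sum = 0.0
    total_weight = 0.0
    for division in set(old) & set(new):
        n_old, log_old = old[division]
        n_new, log_new = new[division]
        weight = 0.5 * (n_old + n_new)
        weighted_sum += weight * (log_new - log_old)
        total_weight += weight
    return weighted_sum / total_weight
```

For small shifts the result is close to the fractional time difference; `math.expm1(shift)` converts it exactly.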

First, every division except women 60-64 was slower in 2012, so that was fairly overwhelming evidence the course was slower this year. But part of the difference was still apparently due to the demographic shift: the percentage drops from 4.0% to 3.7% after this process.

In my particular case, it appears I might have expected to be 7.7 minutes faster last year, which would have put me at 3:17:54.

So I conclude the course conditions were responsible for a 3.7% time difference between 2011 and 2012. It will be interesting to try to estimate analytically how this difference might have come about.

So how does this compare with what I calculated last time? There I concluded the time difference was 4.07% of my total time, using just the mean and standard deviation reported by MarathonGuide. However, that analysis failed to consider the demographic shift between the two years.
