Tuesday, March 31, 2015

pro cycling victories: is the early season predictive?

The most excellent Inner Ring blog recently posted statistics on World Tour and Pro Continental professional cycling wins so far in 2015: the link is here. These numbers have come into focus due to Tinkoff-Saxo's owner, Alexei Tinkoff, sacking long-time team manager Bjarne Riis. Bjarne had been managing the team, and its predecessors, since it was founded as Team CSC in 2000.

Cycling is unusual among team sports in that there's no definitive overall standing. The UCI repeatedly cooks up various rankings, but these are always consigned to the depths of trivia. It's a sport which is very much in the moment: who wins the race today, with certain races mattering more than others. So what's important and what's less important is more a matter of consensus than anything else. No two people will fully agree, except perhaps those who think the Tour is the only game in town. And those "fans" are generally derided by the "purists".

It's an interesting matter: not many teams at the top level will say they're targeting wins in January and February. The races "which matter" run from mid-March through October. The season starts in earnest with Paris-Nice and Tirreno-Adriatico, then the first monument, the one-day Milan-San Remo. Races before this are either of primarily regional interest (for example, the Tour Down Under) or are considered preparatory races. In these a top-level rider may take the win if it's available, but may simply be training through the race: going into the first stage already tired from heavy training, for example, or even doing additional training after relatively short and easy stages. There are always riders who will cherry-pick relatively low-level races to build up UCI points or their win total, recognizing that the higher-level races are out of reach, but most of the top-tier riders have their focus later. This is taken to its extreme in the "Le Tour, C'est Tout" attitude popularized by Lance Armstrong. But no rider can peak all year, so you've got to pick your battles, and it's not likely that battle will be Dubai or Ruta del Sol.

Anyway, here are the numbers from the blog (plot stolen from the Inner Ring blog):

Team Victory Totals

You can see that Tinkoff-Saxo is indeed having a sobering season so far: only two wins. Meanwhile Team Sky, another team with aspirations to Tour de France victory, has sixteen, only two short of perennial classics juggernaut Etixx-Quickstep.

Even if you assume most of these races are of low significance, the question is whether they are predictive of performance later in the year. Obviously you'd expect some correlation: Sky and Quickstep aren't going to roll over and die. They're big budget teams with top-notch riders, while FDJ with its enormously lower budget isn't going to move to the front of the standings. But Riis, with his traditionalist focus on "the races which matter", no doubt is less concerned about the team's victory total in these early days of 2015.

So to see if there's a correlation between early wins and mid-to-late-season wins, I looked at data published on the blog last year. Here's that result:


What we see here is a dominant performance by Quickstep, leading the pack in both the early season and the mid-to-late season. But among the rest it's less clear. Notable is that Tinkoff-Saxo (TS) improved considerably after April. This is as you'd expect for a team focusing on the Tour and the Vuelta. Sky, on the other hand, underperformed later versus earlier. This in part reflected the team's disappointing results in the Grand Tours. Astana (A) was another team with Grand Tour ambitions which improved dramatically from April onward, and indeed it won the biggest of all, the Tour de France, with Nibali producing a super-impressive performance in that race, including its controversial cobbles stage. You also see Katusha and FDJ up there.

The correlation coefficient for this plot is 0.686 including Quickstep, but only 0.405 without. Clearly Quickstep was the most successful team at generating wins, but remove them and there's little evidence that a strong early season is necessarily predictive of how the season will evolve from there. If this were the plot from 2015 you'd say the sacking of Riis was a remarkable success. Yet last year Riis was the Tinkoff-Saxo team manager for the full season. The result reflects his focus on the mid-season races.
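For anyone wanting to try this on their own numbers, here's a minimal sketch of the Pearson correlation coefficient, computed with and without one dominant team. The win counts below are made-up placeholders, not the 2014 data from the plot, and the names `pearson`, `early` and `late` are just for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# HYPOTHETICAL win counts per team (early season, rest of season).
# The last team plays the role of a dominant outlier.
early = [3, 5, 2, 6, 4, 20]
late = [12, 8, 10, 9, 14, 45]

r_all = pearson(early, late)             # every team included
r_rest = pearson(early[:-1], late[:-1])  # dominant team removed

print(f"with outlier: {r_all:.3f}, without: {r_rest:.3f}")
```

The point of the comparison is that a single team far from the rest of the pack in both periods can account for most of the correlation on its own.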

Tinkoff wants a more scientific approach to team management, similar to the Sky model. That's fine. But the message here is don't panic just because a favorite team isn't winning in Jan and Feb.

As an aside, I was curious if that correlation coefficient of 0.405 was statistically significant. So I pruned out all the races won by Quickstep, randomly assigned each of the remaining races in each group (early and mid-to-late season) to one of the remaining 16 teams, and calculated the correlation coefficient for the two series. I repeated this one million times. I got a fairly normal distribution of correlation coefficients, with values exceeding the target 0.405 in 6.1% of the trials. So while there's an apparent correlation here, at the conventional 5% level it fails to reject the null hypothesis that, except for Quickstep, there is no correlation. In any case the message is that the early season isn't so important after all.
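Here's a rough sketch of that kind of simulation, as one reading of the procedure described above rather than the exact code used. The race totals (`n_early=80`, `n_late=250`) are placeholders since the real counts aren't listed here, the function names are invented for illustration, and the trial count is kept small so it runs quickly in plain Python.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series has no variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def scatter_wins(n_wins, n_teams, rng):
    """Assign each win to a uniformly random team; return wins per team."""
    counts = [0] * n_teams
    for _ in range(n_wins):
        counts[rng.randrange(n_teams)] += 1
    return counts


def fraction_exceeding(n_early, n_late, n_teams, target_r, trials=10_000, seed=0):
    """Fraction of random trials whose correlation exceeds target_r."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    hits = 0
    for _ in range(trials):
        early = scatter_wins(n_early, n_teams, rng)
        late = scatter_wins(n_late, n_teams, rng)
        if pearson(early, late) > target_r:
            hits += 1
    return hits / trials


# ASSUMED race totals; only the 16 teams and the 0.405 target come from the post.
p = fraction_exceeding(n_early=80, n_late=250, n_teams=16,
                       target_r=0.405, trials=2000)
print(f"fraction of trials with r > 0.405: {p:.3f}")
```

Since wins are handed out with no regard to team, any correlation in a trial is pure chance, so the returned fraction estimates how often chance alone would produce a coefficient above the target.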
