Tuesday, December 31, 2013

blog posts per year: 2013 update

As the year ends, it's time to check on my post rate for 2013... I plot the accumulated posts per year by month:

posts per month

2013 is a purple line. There's a weak period following my injury in June, but then I had a burst of inspiration, and I ended up right up there with 2009 and 2011. 2010 was the most productive year, whereas 2012 was a slow year.

Here's the totals by year, along with a trend line I established last year, as well as a revised trend line:

posts per year

2013 was tied with 2009 for 2nd most after 2010, while 2011 was only one post less for 3rd. I recovered from the declining post trajectory I seemed to be following last year.

The new fit uses a somewhat different formula consisting of two components: a hyperbolic tangent for the upward transient followed by an exponential decay. My 2012 formula used a linear ramp for the upward transient, but a hyperbolic tangent is better, as it saturates:

posts = 169.2 tanh [ 1.13 ( year - 2007.89 ) ] exp [ -0.063 ( year - 2007.89 ) ]
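
A minimal Python sketch of evaluating this formula. The function name `predicted_posts` is hypothetical, and treating `year` as the plain calendar year is an assumption:

```python
import math

def predicted_posts(year):
    """Evaluate the revised trend line: tanh ramp times exponential decay."""
    elapsed = year - 2007.89            # years since the fit's time origin
    ramp = math.tanh(1.13 * elapsed)    # upward transient, saturating at 1
    decay = math.exp(-0.063 * elapsed)  # slow decline after the ramp
    return 169.2 * ramp * decay

# Print the trend value for each calendar year (assumed convention).
for year in range(2008, 2015):
    print(year, round(predicted_posts(year), 1))
```

Since the tanh term saturates at 1 after the first couple of years, the later part of the curve is essentially the exponential decay alone.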

Here's months where I've posted at least 15 times:


I'm looking forward to 2014...

Monday, December 30, 2013

some 2014 New Year's resolutions

It's approaching the end of 2013, and it's time for some New Year's resolutions.

I used to be against New Year's resolutions, because I felt if there was something worth doing, it should be done immediately, not held off until New Year's for the purpose of providing a resolution. Indeed, that may be true, but it's still worthwhile to take time to reflect at the end of the calendar year and think about changes worth making.

So with that in mind, some resolutions:

  1. to eat more apples
  2. to get my hair cut -- it's getting long
  3. to work from home more often. This requires definable goals for the day. I spend too much time on the Caltrain commuter rail.
  4. to ride from home to work at least 52 times during the year, barring issues such as the injury which got in the way of me doing so this year. This is not a top-priority goal, as riding to work gets in the way of other goals, such as running at lunch, or going to yoga class after work, since it gets me to work later. But 52 times is a good target.
  5. to continue attending yoga classes through the year. Holiday travel has me on a 2-week void in yoga classes. I'd been on a good run, inspired by recovering from my injury.
  6. to do a mountain bike event
  7. to run a 50 km trail race. My longest run so far is a marathon (42 km).
  8. to go camping more than in 2013.
  9. to do a self-supported bike tour with Cara.

Sunday, December 29, 2013

running km per day: more trend analysis

Yesterday I did a run followed by a hilly hike, each approximately 10.5 km. This put a 21 km point on my trend analysis plot. I then redid the plot, including my least-square regression of an exponential curve. Here's the result, along with two other curves I'll explain in a bit:


This was my longest day so far, but it was just one day. However, the result was a profound change in the exponential trend line.

Small aside: in the plot I did yesterday, I made a small error, which was to assume that when you fit a curve K exp(α t) to time-series data, if α is in units of 1/week, then this represents a 100×α% per week increase. This is a good approximation only for small values of α; the actual increase is 100×(exp(α) − 1)% per week. I corrected this error in the text of yesterday's blog (not the plot), and did it correctly in this plot.

Anyway, the problem with the exponential trend line is that least-squares fits are highly influenced by outliers, especially when they occur at the edge of the data. So yesterday's long day had an exceptional influence on the parameters. This suggests the parameters aren't so good.

So I returned to an established analysis mode: CTS and ATS. These are typically applied to data related to the work done during cycling events, but distance and work in running and hiking are strongly correlated, so applying them directly to distance seems suitable here. ATS (acute training stress) is a running exponential average with a time constant of 7 days, while CTS (chronic training stress) is a running exponential average with a time constant of 42 days. The shorter time constant represents fatigue and responds more rapidly to changes in daily distance; the longer time constant represents fitness and responds more slowly.
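
A running exponential average like this can be sketched in a few lines of Python. The per-day update weight 1 - exp(-1/τ) is one common discretization and is an assumption here, as are the function name and the made-up distances:

```python
import math

def running_exponential_average(daily_km, time_constant_days, start=0.0):
    """Return the exponentially weighted running average after each day."""
    # Assumed discretization: each day the average moves a fraction
    # 1 - exp(-1/tau) of the way toward that day's distance.
    weight = 1.0 - math.exp(-1.0 / time_constant_days)
    average = start
    history = []
    for km in daily_km:
        average += weight * (km - average)
        history.append(average)
    return history

# Hypothetical data: a month of rest followed by a week of 10 km days.
daily_km = [0.0] * 30 + [10.0] * 7
ats = running_exponential_average(daily_km, 7)    # acute: fatigue
cts = running_exponential_average(daily_km, 42)   # chronic: fitness
print(f"ATS {ats[-1]:.1f} km/day, CTS {cts[-1]:.1f} km/day")
```

With a sudden ramp like this one, the 7-day average runs well ahead of the 42-day average, which is the signature of building mileage rather than tapering.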

I see ATS spiked at San Diego 3 weeks ago. After that I recovered a bit, then ATS ramped up, hitting an even higher spike at the point 0 (yesterday). It's now at 7 km/day. CTS, representing fitness, is only at 3 km/day, however. This implies I'm still ramping up mileage, building endurance. Were I to taper for a marathon, for example, I'd want ATS to dip below CTS by race day.

Saturday, December 28, 2013

running km per day trend analysis

With December travel now complete following completion of my post-injury physical therapy, I've been focusing the past weeks on running versus cycling. I decided to analyze how my trend has been in running distance.

It's common to plot running distance versus week, but this can yield artifacts. For example, if the week starts on Sunday versus Monday, it could shift a Sunday long run a full week. This is too crude a time precision for good trend analysis.

So instead I plotted distance per day, including running and hiking but not walking (since walking is so ubiquitous in daily activity it's hopeless to try and track it without wearable sensors). I then fit an exponential curve using an unweighted least-squares fit to analyze the trend.
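
For illustration, here is a sketch of one simple way to fit an exponential trend in Python. Note this is a log-linear fit (ordinary least squares on the logarithm of distance), which is not necessarily the fit used for the plot: it has to drop zero-distance days, so it describes distance on active days rather than distance per calendar day. The function name and data are hypothetical:

```python
import math

def fit_exponential_loglinear(days, km):
    """Fit km ≈ K * exp(alpha * day) by linear regression on log(km).

    Zero-distance days are dropped, since log(0) is undefined.
    """
    points = [(t, math.log(y)) for t, y in zip(days, km) if y > 0]
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_log = sum(ly for _, ly in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (ly - mean_log) for t, ly in points)
    alpha = sxy / sxx                          # growth rate per day
    K = math.exp(mean_log - alpha * mean_t)    # value of the fit at day 0
    return K, alpha

# Made-up example: day 0 is "today", negative days are in the past.
days = [-28, -21, -14, -10, -7, -3, 0]
km = [2.0, 0.0, 3.5, 4.0, 5.0, 6.5, 8.0]
K, alpha = fit_exponential_loglinear(days, km)
print(f"K = {K:.2f} km, alpha = {7 * alpha:.3f} per week")
```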

The plotted data extend back to when I started running again post-injury. I had done a few treadmill workouts well before this, in August, but these left me hobbled, and so I basically restarted from scratch on 31 Oct due to the encouragement of my physical therapist (Dave @ Potrero Physical Therapy).

So I'm basically at 6.4 km per day right now with a rapid rate of increase over the trend period (it's shown as 23.8%/week, but that should really be 26.9%/week, since the plot makes a linearization assumption of the exponential function which is invalid at this rapid rate of increase). That increase makes sense with very intermittent running, but I won't be able to sustain it, obviously. But it's good to know the trend has been upward and the distance number gives me a good baseline to judge future running as I think about going after a trail run in January.
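
The gap between those two percentages is just the difference between α and exp(α) − 1. A quick Python check, assuming the fitted rate is α = 0.238 per week (the function name is made up for illustration):

```python
import math

def weekly_increase_percent(alpha_per_week):
    """Actual percent increase per week for growth exp(alpha * t)."""
    return 100.0 * (math.exp(alpha_per_week) - 1.0)

alpha = 0.238  # per week, as implied by the 23.8% shown on the plot
linearized = 100.0 * alpha
actual = weekly_increase_percent(alpha)
print(f"{linearized:.1f}% linearized vs {actual:.1f}% actual")
```

For small rates the two agree closely (α = 0.01 gives 1.005%), which is why the linearization is fine for slow trends and misleading for fast ones.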

Friday, December 27, 2013

2014 Low-Key Hillclimbs pages posted today


Every year it seems like more work than the year before, and that's probably because it is more work than the year before. But I just finished preparing a workable draft for the 2014 Low-Key Hillclimbs web pages. They're here. The default page remains 2013, as I want to keep the results from this year up at least through the end of the calendar year, which is just a few days away.

Why more work? Two reasons. One is the increased prevalence of new courses. In the past, when I'd select climbs for the next season (with rider input) I'd intentionally go for a mix: the annuals (Montebello and Hamilton), a few favorites on a roughly 3-year cycle (like Old La Honda, Kings Mountain Road, Sierra Road, Bohlman, Welch Creek, previously Diablo before we dropped it due to hassles from the rangers), and others which were good for rarer repeat visits (like Soda Springs, Page Mill, Highway 9). Then each year we'd add in a few, maybe two or three, fresh climbs just to stay creative. This way I'd just find old copies of the data for each climb, update as needed, and paste in the data. Add a new photo and I'm basically done.

But starting last year I went to "Coordinator's Choice" and coordinators like to exercise their creativity by picking new climbs. This is awesome: I love doing fresh climbs we've not visited before. But it means I have my work cut out for me, in particular creating a new route profile and generating climb stats.

The other time cost has been the "time saving" GPS climbs. These are for climbs where it's impractical or politically impossible to set up an actual event. So we invite riders to do the routes on their own and submit GPS data. But with these climbs, which are invariably fresh courses so far, I not only need to do the usual new course things, but then I need to set up the GPS checkpoints and test these against real-world data.

It's not that big a deal for any given week, but 9 weeks total in the series, it adds up. Still, I really do love it. I feel like I'm creating something good and valuable.

If I'm looking forward to any weeks the most, they are weeks 4, 5, and 7. Week 4 will be a fun tour of the Berkeley Hills. In the spirit of Paul McKenzie's brilliant Nifty Ten-Fifty, it's more of a "Nifty Lite": 3825 feet of timed climbing (5000 feet of climbing in all). These short, steep climbs are a blast, but they're typically too short for a full event, so combining six of them in this way is something enabled only by the GPS scoring (manual timing would be way too much work).

The next one I really look forward to is the Los Gatos Bike Racing Club's week: week 5. It's a fantastic climb in the Santa Cruz Mountains west of Loma Prieta. These are rugged, lightly trafficked roads which have a very different character than the more popularly traveled roads north of San Jose. They have a feeling of rugged remoteness which is fantastic.

And finally I'll point out my climb: week 7: San Bruno Mountain (W). This is a home climb for residents of San Francisco, is easily accessible by public transit, and offers fantastic views from the summit. It's short by Low-Key standards so results will be on the challenging side, but we always manage.

So I look forward to the fall! It should be a fun series. But in the meantime, don't forget the upcoming MegaMonster Enduro in February!

Friday, December 20, 2013

Marin Headlands: Miwok and Marincello

An amazingly nice mountain bike loop in the Marin Headlands, just across the Golden Gate from San Francisco, is the Miwok trail - Marincello trail climb combination. The two are connected by the Old Springs trail descent with its fun series of modest steps. The return from the Marincello summit is the Bobcat Trail descent.

A profile of the climbs with the Old Springs descent in between is here:


The smoothed grade versus distance is here: grade

The grades omit the transitions at the bottom and top. Still, they don't do full justice to the difference. Miwok is an undulating grade, with a series of steeper portions, while Marincello is more of a steady grind with a brief recovery followed by a final short, steep bit at the end. Marincello is a smoother surface: there's some ruts on Miwok. But both trails are easily rideable on a road bike. The Old Springs descent is a bit rough going on a road bike but it's still not a problem.

This is an awesome loop and easily extendable with Coastal Trail from/to Conzelman for Golden Gate Bridge access, or to the north via the more technical portion of Miwok and the steeper trails surrounding.

A Strava route is here.

Thursday, December 19, 2013

Mount San Bruno, Price to summit

Pen Velo's annual New Years Day San Bruno Hillclimb climbs San Bruno mountain from the east side. The climb has two sections, one up Guadalupe Canyon Road, then there's a loop down and under that road onto Radio Road, then the final, steeper climb to the summit. I did a profile of that road awhile ago, to describe that race:

San Bruno (E)

But there's multiple ways up the mountain, 3 (at present) completely paved, perhaps a fourth to kick in when an unfortunate housing development marring the north side of the mountain is completed (Mount San Bruno, except for the very top, has failed to enjoy the general level of protection from development that Bay Area mountainsides have received). The two major approaches are the two sides of Guadalupe Canyon Road, the other side being the west side, from Daly City. This climb begins in earnest at Price Road.

An advantage of the western approach is freedom from traffic lights. The eastern side has a traffic light at Carter Road. Additionally the western approach avoids the descent to pass under Guadalupe Canyon. You get on Radio Road by passing through a gate on the side of the road. With an event permit, hopefully that gate could be opened.

Here's the profile of the western approach. The top portion of the climb, Radio Road, is shared. In this profile I drew the grade for Radio Road as an average over a longer stretch, rather than as a tangential peak grade:

San Bruno (W)

My climb rating rewards continuous climbing, so this side gets a benefit versus the eastern side for the lack of the intermediate descent. But net climbing is less, and there's some steeper grades on the eastern side of Guadalupe Canyon, so the eastern approach comes out ahead in climb rating.

Wednesday, December 18, 2013

2013 was the steepest Low-Key Hillclimb series yet

On a short flight from Malta NY to Philadelphia, I decided to take a quick look at the net climbing statistics from Low-Key Hillclimb years. To do this, I took the stats for each climb from the last year the climb was used (rather than the stats claimed for each given year) since on some climbs there were revisions as better data became available. For each year, I summed the stats over every climb for which there were finishers, omitting "X" weeks.

Here's a plot of the result. I superpose lines representing constant average grades from 3% up to 10%. plot

1995 has the most total distance and climbing since there were 12 climbs that year, including Mt Hamilton, Mount Diablo, and Soda Springs, all long climbs. It falls fairly average on the average grade spectrum. 1998 had the least climbing despite climbing Mt Hamilton twice. The series started out short and had two climbs canceled, leaving only five.

Since we stopped doing Mount Diablo after 2009, climbing in the series has generally been less. 2013 wasn't exceptional in climbing, but distances were relatively short (2nd shortest total series in history), making average grade the highest so far by a small margin. What may be surprising is the average grade is still relatively modest at 6.48%. Flattish portions, or for that matter descending like on Mount Hamilton, provide a lot of dilution, even if the 15%+ grades which were plentiful this year provide a disproportional fraction of the pain.

Here's some numbers:

net meters | net km | avg meters | avg km | avg grade

Tuesday, December 17, 2013

In Malta, NY

I'm in Malta NY on a business trip. It was a cold night.

I wanted to get some food supplies. I knew there was a Price Chopper food market nearby, so I went to the desk. It's a cold morning. Weather Underground says 0F now (8am) but it was probably colder then. Fortunately I brought layers.

Even earlier, in the pre-dawn darkness at 6:15 am, I'd seen a woman going toward the lobby wearing cold-weather running gear. "You're running outside?" I'd asked. "If so, it's been good knowing you..."

"I used to live in Lake Placid. I'm used to it," she responded confidently.

So now it was my turn. I asked directions for Price Chopper.

"You go here, then around this traffic circle...." It was clear I was getting driving directions.

"No -- I'm walking."

She looked at me incredulously. "It's a half mile away. Do you want me to call you a taxi?"

"I'll be fine..." I responded, and left.

Sidewalks were partially shoveled from snow two days ago. This made them somewhat treacherous. There was one set of footsteps on the sidewalk I was on, despite heavy car traffic and a strip mall being directly across the road. I had to wait quite a while for a gap in traffic to scamper across, and I managed it only because a car stopped for me. Half the trip was through long access roads and across extended parking lots because land is cheap and squandered.

It was clear bipedal motion was considered an atavism, or at best a way to and from the parking lot.

At the check-out of the food store, each shopper was given a 10 second speech cheerily informing them that their shopping card resulted in them getting an "X-cent" discount on a gallon of gas. I have no interest in gallons of gas.

If 300 million people listen to a 10 second speech about savings on gas once per week, that's around 60 human lifetimes squandered per year. That made this speech around twice as deadly as lightning strikes.
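
The arithmetic behind that estimate, as a quick Python check. The 52 weeks per year and the 80-year lifetime are my assumptions for the sketch:

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

listeners = 300e6        # people hearing the speech
speech_seconds = 10      # length of one speech
weeks_per_year = 52      # one shopping trip per week (assumed)
lifetime_years = 80      # assumed length of a human lifetime

seconds_spent_per_year = listeners * speech_seconds * weeks_per_year
lifetimes_per_year = seconds_spent_per_year / (lifetime_years * SECONDS_PER_YEAR)
print(f"{lifetimes_per_year:.0f} lifetimes per year")
```

That comes out a bit over 60, consistent with the figure above.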

It was a nice walk. I was heavily dressed and was sweating from my combination run-walk. I don't get many opportunities to experience real cold in San Francisco, where we complain about sub-50F. 0F is a different beast. In a way, since it's viewed as a challenge and not an annoyance, it's easier to take, at least as a visitor. But the contrast between that runner and everything else was striking.

Monday, December 16, 2013

December travel

December for me this year is dominated by travel.

Last weekend and the last part of the preceding week was consumed by a trip to a company internal conference in San Diego. San Diego is cool: I'd previously been there for a Christmas bike tour supporting Hostels International maybe four years ago. But that was just in-and-out. This was my first time spending real time there. The hotel was near the convention center, so it had excellent access to a bike-ped trail along the bay. At 6 am the first day was a 4 km running race for conference attendees along the path. That was fun: my first "speed work" since my injury, and all things considered I did okay, finishing 5th. In total Wed PM - Sun AM I managed 3 yoga classes (two in a local studio, one affiliated with the conference) and 4 runs (the race, a stair climbing session in the hotel stairwells, and two long runs). This overindulgence in running took a certain toll, and my legs are still a bit tired a week later. I look forward to getting my running legs back.

Then after returning on Sunday, on the following Saturday (two days ago) it was back on a plane for an important family event in Philadelphia. I was on a Saturday night red-eye, since it required late-in-the-game flight changes, and changing flights during the holiday season is difficult, more difficult every year.

Today I'm on a plane again up the east coast for a work trip. I'll be in the frigid Albany area until Wednesday, then it's back to Philadelphia Wednesday night.

From there: a train to New Jersey for more family visits, back to Philadelphia, and eventually back to San Francisco.

It's great seeing family, of course, and the work trip is what it is. New York state has its charm, anyway. On the fitness side, however, all of this is a real challenge. At Philadelphia, instead of hopping in a taxi at the airport, I took the regional SEPTA train, then walked around 1.5 miles to my destination, reversing that this morning on the return. This is what I call "incidental exercise": getting exercise from tasks which need to be done anyway. It's gotten a lot of attention in the wearable activity sensor market, but no technology is needed. The walk to the main train station is a nice one along the river. Had I been in a taxi, it would have been a totally unrewarding trip.

The Albany area will be a bigger challenge. There's a yoga studio near my hotel. I'd like to get some running in but I'm afraid it will be simply too cold, with temperatures dropping as low as -4F in the forecast for my stay there. Maybe I'll find a treadmill in the hotel. Once back to Philadelphia, then in suburban New Jersey, I'll get some running in for sure: hopefully my legs suffered no sustained damage from my San Diego excesses, and I'll be able to get some aerobic work in.

You'd think the running would be better in "rural" New Jersey than in urban Philadelphia, but this is very much not the case. There's great trails along the river in Philadelphia, and the steps at the Art Museum are a must-do for anyone who's seen a Rocky movie. Philadelphia is a great place to run.

So running, some yoga... not the best preparation for the San Bruno Hillclimb on 1 Jan near San Francisco. I'm still on the fence about whether to do that. It's a great tradition, but going with only two decent rides in my legs in the previous month doesn't sound like a winning plan. But maybe I should just do it and not worry about how well I place.

Saturday, December 14, 2013

cumulative SF2G rides by year

With a huge amount of travel this month, I'll get only two real rides in, and no SF2Gs. So it's a good time to make an accounting of my SF2G totals.

I last did so at the end of 2011, so somehow 2012 slipped through the cracks. I have a rough goal of averaging one per week, but haven't attained that yet. It's important for me to have a goal to kick my butt out the pre-dawn door to ride the more than 70 km into work.

I try to keep a running total when I upload rides, but since I usually do this at work, I'm always in a rush and relying on memory to do so occasionally fails. So I'm forced to go through my Strava record and count.

Here's the plot, which starts when I signed up for Strava in 2010:

SF2Gs by month

2012 started with happy memories of New Zealand riding. I had a down-time in May when I had back pain, but overall it was a solid riding year until I redirected my focus on running for the Sacramento Marathon (CIM).

At the start of 2013 I was still running, in anticipation of running the Napa marathon in March, but my legs just weren't recovered from Sacramento. I was getting zinging pains kicking in around mile 15. So I took a break and refocused on riding. This resulted in a very solid SF2G March, then a bit of a taper as I started doing weekend events: Murphy Mack's Spring Classic in March, Devil Mountain Double in late April, the Berkeley Hills Road Race in May, then the Memorial Day tour (a training ride) in late May. But everything came crashing to a halt on a stupid bike-path crash dodging an erratic walker in June. I only really emerged from this in August, slowly getting back up to speed, with my last physical therapy session this past week. November saw my return to riding Low-Key Hillclimbs (ending Thanksgiving). Then December has been a mess, as it so often is, with a return to running, most notably a solid block last weekend at a conference in San Diego, including a 4 km "fun run" race.

Anyway, the moral of the story is I need more consistency in 2014. In particular, don't get injured. And to average 1/week, you need to target 2/week, because stuff always happens, and some weeks there will be none.

Friday, December 13, 2013

Montebello and Mount Hamilton: climbing speed trend in Low-Key Hillclimbs

Montebello Road and Mount Hamilton Road are the two climbs we've done pretty much every year in the Low-Key Hillclimbs. They are thus the best source of data on speed trends in the series.

For men and women solo riders, I took the geometric mean of rider times for each of the climbs each time they were done. Hamilton was done twice in 1998, while Montebello was skipped that year, but every other year Montebello was week 1, Hamilton on Thanksgiving.
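
A geometric mean is the exponential of the average of the logarithms, so it treats a rider who is 10% faster and one who is 10% slower symmetrically. A minimal Python sketch with made-up times (the function name and data are illustrative only):

```python
import math

def geometric_mean(times):
    """Geometric mean: exp of the arithmetic mean of log(time)."""
    return math.exp(sum(math.log(t) for t in times) / len(times))

# Hypothetical climb times in seconds for one division on one climb.
climb_times = [1650.0, 1800.0, 2100.0, 2700.0]
print(f"geometric mean: {geometric_mean(climb_times):.0f} s")
print(f"arithmetic mean: {sum(climb_times) / len(climb_times):.0f} s")
```

The geometric mean is never larger than the arithmetic mean, so a few very slow times pull it up less.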

Here's the result, with men in blue and women in pink (original, I know):

avg time data

There's some interesting trends. In 1995-1996-1997, as the series got more popular, the average speed dropped for both climbs. 1998 was a slight down year for turn-out, but there's no Montebello data. There were two Hamiltons that year: the first was week 1 and it went off as normal, with faster times for men and slower for women. The second one, on Thanksgiving, was even quicker, but that one was broken into two portions due to a motorcycle crash, with the times added, so riders got additional recovery.

When the series started up again in 2006, times were faster than even in 1995. Again turn-out was small. Series turnout built through 2009, and as it did, average times increased. Starting in 2009, however, times have come down every year for men and the trend has been downward for women. 2013 was the fastest year yet on both climbs for men, and was relatively fast for women (there are far fewer women than men, so the result depends more heavily on who happens to come that year, yielding substantially more variation).

Running regressions, the rate of improvement is substantial: between 1.1% and 2.0% per year, depending on the climb and whether you look at men or women.

So the end result is that just because you score less than you may have in the past, you're not necessarily slower. The fields have been getting faster. In general, as the Low-Key Hillclimbs get more popular, the growth comes preferentially from relatively slower riders, and average times increase. And, as has been the trend since 2009, as turn-out drops, the speeds increase.

Thursday, December 12, 2013

updated annual trends in Low-Key Hillclimb turnout

Part-way through the 2013 Low-Key series, I did a blog post on the downturn in attendance versus last year. For completeness, with the series done for the year, I wanted to update that plot.

Here's the numbers through end of 2013. I plot a trend line from the 2009 peak through 2013. There's a loss in average finishers of 6.5% per year, with the rate of loss visibly accelerated the previous two years:

But one change the past two years has been the GPS timed events. These have started out a bit slowly, with Kennedy Fire Trail last year attracting only 45 finishers, still above expectations and still what I'd consider a great success.

Then this year we extended it to two GPS timed events: Portola Valley Hills, week 4, had 69 finishers. Montara Mountain, week 8, had 51 finishers, despite a somewhat remote starting location (at the coast) and a quite challenging dirt climb (too hard for most cyclists to do on a road bike).

So the GPS climbs dragged the numbers down a bit. I plot the turnout for non-GPS climbs here:

The decline is less, down to 4.2% per year.

As enjoyable as the lower stress associated with smaller numbers may have been, with a schedule for next year which is less top-heavy on grade, I'd expect to see the numbers rebound a bit, barring regional trends in the popularity of road cycling.

One relative constant in the series has been Mount Hamilton on Thanksgiving. There was additionally an October Mt Hamilton to open the 1998 series. Mount Hamilton is a better example than Montebello, perhaps, because we've always been willing to relax the 150 rider limit at Mt Hamilton.

Results from Mount Hamilton also peaked in 2009, but have held fairly steady since. Finish rates are dependent on weather, however. This year the weather was excellent.

Wednesday, December 11, 2013

2013 Low-Key Hillclimbs: rider score variability and the scoring algorithm

One of the goals of the scoring system was that rider scores vary as little as possible from week to week. Of course, this is simply accomplished: just give each rider a score of 100 each week, then variation is zero. But of course that's not what's wanted. So an additional goal is that scores are roughly proportional to rider speed in a given week.

I'll consider three scoring schemes here for the Low-Key 2013 data:

  1. score 1 is 100 × median time / rider time
  2. score 2 is 100 × a reference time / rider time
  3. score 3 is 100 × (a reference time / rider time)^(slope factor)

Here the reference time for the week is a geometric average for all solo riders adjusted for the rider division (male, female, hybrid-electric) and the slope factors are calculated for each week based on how spread out the rider times are, but have a weighted average of one.
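
The three schemes can be sketched in Python as follows. This is only an illustration of the formulas in the list above: the function names are made up, and the reference time and slope factor are passed in as given numbers rather than derived the way the actual scoring code derives them.

```python
from statistics import median

def score_1(rider_time, all_times):
    """100 x median time / rider time."""
    return 100.0 * median(all_times) / rider_time

def score_2(rider_time, reference_time):
    """100 x reference time / rider time."""
    return 100.0 * reference_time / rider_time

def score_3(rider_time, reference_time, slope_factor):
    """100 x (reference time / rider time) raised to the slope factor."""
    return 100.0 * (reference_time / rider_time) ** slope_factor

# Hypothetical week: three finishing times in seconds.
times = [1500.0, 1800.0, 2400.0]
reference_time = 1850.0   # assumed value for illustration
slope_factor = 0.9        # assumed value: below one compresses the spread

for t in times:
    print(round(score_1(t, times), 1),
          round(score_2(t, reference_time), 1),
          round(score_3(t, reference_time, slope_factor), 1))
```

A slope factor of exactly one makes score 3 identical to score 2, so the slope factor only matters for weeks where times are unusually spread out or bunched up.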

I then calculated for each rider doing at least two climbs the standard deviation of their scores, for each score, and took the root-mean-square average of these standard deviations. The result of this was the following for the three scores:

  1. score 1 : 4.47
  2. score 2 : 4.07
  3. score 3 : 3.74

So the first score resulted in the most variability in scores for a given rider, the second (calculating a reference time adjusting for rider quality) reduced the variation, and the third score (adjusting for score slope) reduced the variation even more.
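
The variability measure itself is simple to sketch in Python. Whether the standard deviation used was the sample or the population form isn't stated; the sample form (`statistics.stdev`) is assumed here, and the function name and data are hypothetical:

```python
import math
from statistics import stdev

def rms_score_variability(scores_by_rider):
    """Root-mean-square of per-rider score standard deviations.

    Riders with fewer than two scores are skipped, since a single
    score gives no information about week-to-week variation.
    """
    deviations = [stdev(scores) for scores in scores_by_rider.values()
                  if len(scores) >= 2]
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))

# Made-up scores for three riders; the third did only one climb.
example = {
    "rider A": [100.0, 104.0],
    "rider B": [90.0, 90.0],
    "rider C": [95.0],
}
print(rms_score_variability(example))
```

Running this once per scoring scheme, on the same set of riders, gives numbers comparable to the three listed above.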

This comparison is related to an analysis of variance. The analysis of variance calculation is based on the assumption there are multiple, independent sources of variation. In this case, one source of variation for a given rider is how he rides from week to week. This is a desired source of variation: we want riders to score better when they ride better.

Another source of variation is who happens to show up for a given week. Mostly faster riders? Mostly endurance oriented riders? This is an undesired source of score variation. A rider shouldn't be penalized in a given week just because the endurance oriented riders stayed home.

Another source of variation is how much the climb spreads out the riders. If a hill is particularly steep, the faster riders will be proportionately faster than they would be if the road were flatter, or included descents where faster climbing ability failed to be of much benefit. This is another source of undesired variation.

The assumption is that since independent sources of variation are generally uncorrelated, each tends to increase the total variation, and so the scoring system with the least total variation for a given rider will generally have the least amount of undesired variation, and is thus preferred.

Tuesday, December 10, 2013

Low-Key Hillclimbs 2013: weekly score parameters

In the Low-Key Hillclimbs scoring, I calculate two parameters for each week's climb: a "rider quality" parameter which describes the average strength of riders in the climb, and a "score slope" parameter which describes how spread out the riders are. These parameters are determined from rider identification and score alone. Only riders who do more than one climb contribute to these calculations, because these riders provide a basis for comparing one climb to the next. After a single climb, if riders finish close together, then it could be due to the fact the riders are similar in ability. But if the same riders do two climbs, then assuming the riders don't naturally spread or converge in ability, if they score closer together in one of the climbs it might be assumed this is due to the nature of the climb: for example, that the climb where they finished closer together had shallower grades where wind resistance was more important, or maybe even descents where descending speed is only partially correlated with climbing speed.

Here's the results for the 2013 climbs. First I show "quality", where I've mapped the actual variable used in the code to something close to average rider score:


The week with the riders with the lowest score here is Montebello. This is historically typical. The first week tends to draw a broader variety of riders. More dedicated riders tend to be stronger, or if they weren't strong to start with, they get stronger than those who chose to ride only the early climbs.

The quality score increases through the first three weeks, the third week being the intimidating Bohlman climb. The score again peaks with Lomas Cantadas week 7, with an even more select group moving on to accept the 7x challenge and climb Marin Ave. Interestingly Montara's score wasn't much above 100. Then Mount Hamilton, week 9, was another popular favorite, attracting a broader range of riders.

Here's the slope score:


A slope score of less than one means the scores were spread out and need to be compressed. A slope score of more than one means the scores were compressed and need to be spread out. The conjecture that this is related to steepness of the climb is fairly well borne out. The 7X challenge had an extreme slope score: not only was there an issue with rider speed but also motivation, since for many it was challenge enough just to make it to the top of Marin Ave. Curiously Portola Valley, week 4, was next. It seems on the series of short, steep climbs, which challenged recovery as well as anaerobic power, the difference between the fastest and slowest was amplified substantially. Week 8, Montara, was next. This was the steepest point-scoring climb in the series. Next was Bohlman, week 3, which was next on the steepness scale. Lomas Cantadas, which has some very steep sections, ends up with a slope score more than one due to the dilution effect of the descent.

On the other end, weeks 5 (Black Road) and 6 (Patterson Pass) had slope scores well above one. These climbs both had some steepness, but also had extended sections of relatively gradual grade. Mount Hamilton was next, where the descents dilute the time cost of the climbing, and where drafting can be a considerable factor on the first climb.

It's encouraging when the numbers resulting from the analysis correlate with identifiable features of the climb in a way which was anticipated when the algorithm was first developed in 2011.

Sunday, December 8, 2013

2013 Low-Key Hillclimbs: examining the score algorithm

With the 2013 Low-Key Hillclimbs now over, it is a good chance to reflect on the scoring scheme and to see if it accomplished its goal of making similar relative performances on substantially different climbs score similarly.

To check this, I took the score from each week and adjusted it for the quality of the riders on that week. This should in theory result in a similar scoring distribution as if riders of similar speed showed up each week. The rider quality adjustment corrects scores for weeks in which primarily faster or primarily slower riders show up. For example, particularly challenging climbs like Montara tend to attract primarily stronger riders.

Then I plotted these scores versus rank. I used a normalized rank r which goes from 0 to 1, then applied a log-normal transformation to that number to map the interval from 0 to 1 onto the range from -infinity to +infinity.
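The exact transform isn't spelled out here, so the following is only a sketch of one way to stretch a normalized rank onto the whole real line. It is written in Python for brevity (the series code itself is Perl), and it assumes a probit mapping (the inverse of the normal cumulative distribution) and a rank convention of (i + 0.5)/n that keeps r strictly between 0 and 1. Both choices, and the function names, are assumptions for illustration.

```python
from statistics import NormalDist

def normalized_ranks(n):
    """Ranks strictly inside (0, 1) for n finishers (assumed convention)."""
    return [(i + 0.5) / n for i in range(n)]

def stretch_rank(r):
    """Map r in (0, 1) onto (-inf, +inf) using the inverse normal CDF."""
    return NormalDist().inv_cdf(r)

# Five finishers: the middle rank maps to 0, the ends map to opposite signs.
ranks = normalized_ranks(5)
stretched = [stretch_rank(r) for r in ranks]
```

With a mapping like this, a group of riders whose scores follow a bell-shaped distribution plots as something close to a straight line, which makes week-to-week curves easier to compare by eye.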

Each week is scored using two adjustable parameters: a reference time and a slope factor. The goal of these parameters is to make each rider's scores during the series as tight as possible. The slope factor is needed because on some climbs, like Mount Hamilton, riders tend to finish relatively closer together due to the influence of the descents and of packs riding together up the first of the three climbs. On other climbs, most notably the Portola Valley short hills, riders tend to finish with a relatively larger spread of times. The 7X challenge, including the super-steep Marin Ave, also had a relatively broad spread of times, due no doubt to the fact some riders were forced to climb in survival mode up Marin.

You might think the slope factor would result in each climb having a similar score-versus-rank curve. The curves are similar, but not identical. Nor should they be identical: if they overlapped that would mean rank and only rank counted. If riders finish in a group I want them to have similar scores. Groups create plateaus on a plot of score versus rank. But generally I want the curves to be such that it's hard to differentiate one climb from another, except for plateaus.

Here's the result.


Montara was remarkable for some very high scores. But you can see from the plot that it's only the top 2 scores which are unusual. Montara attracted two particularly good dirt riders, and they finished close together, so it's appropriate they scored highly.

Actually the only curve which looks a bit weird there is Lomas Cantadas. I'll need to look into that. Here's the curves for Lomas Cantadas, Montara, and the aggregate curve from combining all weeks. It seems Lomas is a bit flatter than the others.


I thought perhaps this was due to the influence of so many one-time riders (12 of 72), who didn't contribute to the slope term: having done only one climb, they gave the code no basis for comparison. So I recalculated the Lomas curve with only returning riders and plotted it as a dashed line. That is a bit better, but not much:


So I take a step back and look at the code, which I wrote back in 2011 for the "score slope" factors. Does it make sense?

sub iterate_score_slopes {
  for my $w ( @weeks ) {
    my $sum0 = 0;
    my $sum1 = 0;
    warn("iterating reference time for week $w.\n");
    # sum of the squares of the deviations of log scores from ratings
    for my $r ( @{$week_riders{$w}} ) {
      # only riders with statistical weight (more than one climb) contribute
      if ( $rider_statistical_weight{$r} > 0 ) {
        $sum0 += $rider_statistical_weight{$r} * $rider_rating{$r} ** 2;
        $sum1 += $rider_statistical_weight{$r} *
          log($reference_time_eff{$w} / $rider_time_eff{$w}->{$r}) ** 2;
      }
    }
    # slope = ratio of the spread of ratings to the spread of log times
    $score_slope{$w} = sqrt($sum0 / $sum1)
      if ($sum1);
  }

  # normalize score slopes so their weighted geometric mean is one
  my $sum0 = 0;
  my $sum1 = 0;
  for my $w ( @weeks ) {
    if ($score_slope{$w} > 0) {
      $sum0 += $week_statistical_weight{$w};
      $sum1 += $week_statistical_weight{$w} * log($score_slope{$w});
    }
  }
  if ($sum0) {
    for my $w ( @weeks ) {
      $score_slope{$w} = exp(log($score_slope{$w}) - $sum1 / $sum0)
        if ($score_slope{$w});
      warn("normalized slope for week $w = $score_slope{$w}\n");
    }
  }
}

The key here is it is calculating a rating for each rider based on all of his results, then checking that the deviation of the rider's scores from that rating is minimized. This explains the shallower slope of the curve for Lomas Cantadas. It's not that the algorithm failed to yield the same curve, but that the riders who showed up for the climb were more similar (as judged by how they did in other climbs) than riders for other climbs tended to be. So the algorithm is good. To have tried to spread out the scores for Lomas more than they were, comparable to the spreads of scores from other weeks, would have been artificial, since the riders were of more similar abilities than the riders in other weeks' climbs.
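To see the calculation on actual numbers, here is a toy Python port of the same two steps (the per-week ratio, then the normalization). It is a sketch, not the series code: the three riders, two weeks, ratings, and times are all invented, and ratings are assumed to be in the same log units as log(reference time / rider time).

```python
import math

def score_slopes(week_riders, rider_weight, rider_rating,
                 ref_time, rider_time, week_weight):
    """Per-week slope = sqrt(sum w*rating^2 / sum w*log(ref/time)^2), then normalized."""
    slopes = {}
    for w, riders in week_riders.items():
        sum0 = sum1 = 0.0
        for r in riders:
            wt = rider_weight.get(r, 0.0)
            if wt > 0:  # one-time riders carry no weight
                sum0 += wt * rider_rating[r] ** 2
                sum1 += wt * math.log(ref_time[w] / rider_time[w][r]) ** 2
        if sum0 > 0 and sum1 > 0:
            slopes[w] = math.sqrt(sum0 / sum1)
    # Normalize so the weighted geometric mean of the slopes is one.
    total = sum(week_weight[w] for w in slopes)
    if total <= 0:
        return slopes
    mean_log = sum(week_weight[w] * math.log(s) for w, s in slopes.items()) / total
    return {w: math.exp(math.log(s) - mean_log) for w, s in slopes.items()}

# Invented data: the same three riders finish spread out on "steep", bunched on "flat".
rating = {"a": 0.10, "b": 0.0, "c": -0.10}
weight = {"a": 1.0, "b": 1.0, "c": 1.0}
ref = {"steep": 1000.0, "flat": 1000.0}
times = {
    "steep": {"a": 1000 * math.exp(-0.20), "b": 1000.0, "c": 1000 * math.exp(0.20)},
    "flat": {"a": 1000 * math.exp(-0.05), "b": 1000.0, "c": 1000 * math.exp(0.05)},
}
slopes = score_slopes({"steep": ["a", "b", "c"], "flat": ["a", "b", "c"]},
                      weight, rating, ref, times, {"steep": 1.0, "flat": 1.0})
```

In this toy case the spread-out week gets a slope below one (its scores get compressed) and the bunched week gets a slope above one (its scores get stretched), because the ratings of the riders are the same in both weeks.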

So here's a plot of rider scores plotted versus the rider rating. This plot only makes sense for riders who've ridden at least two climbs, otherwise their rating equals their single score. The goal of the scoring is to have this cloud be as tight as possible with the two adjustable parameters per week.


It's a cluttered plot, so I isolate the two weeks of primary interest here:


Montara resulted in scores which deviated from rating more than most weeks, as expected, but here Lomas Cantadas seems quite typical. You can see David Collet's big score from Montara (brown point) but not Keith Hillier's. Keith isn't shown because this was the only Low-Key he did.

In any case, the conclusion is that the Montara scores in particular were not anomalously high. There were two exceptional dirt riders there and they scored exceptionally highly. Was dirt too much of an influence on the series this year? I think it was a relatively high influence relative to past years, but part of the fun of Low-Key, like other races such as the Tour de France, is it's a bit different year to year. Just like the 2011 Tour was affected by the cobbles of Paris-Roubaix, an atypical influence, and the Tour of 2014 will feature an exceptionally long time trial, we can't expect every year to favor the same riders in Low-Key Hillclimbs.

Saturday, November 30, 2013

Bike Brands at the Low-Key Hillclimbs

Starting in week 5 I began asking riders to describe the bike they were on when they RSVP'ed for the following weekend's Low-Key Hillclimb. We used to write this down at check-in, but it's a lot easier to search the data if I have it entered digitally. So if we had a rider on a red and yellow bike and couldn't identify him I could search for all descriptions with both "red" and "yellow" for the bike and realize there was only one such description. We also ask for jersey color, but still write that down at the start, since I for one have problems planning this sort of thing ahead of time. But maybe others plan their wardrobes better, so I may add jersey description to the RSVP form as well.

But anyway, I decided to check what bike brands people were riding this year. So for each rider number for which I had a bike description (one per rider, so not counting the same rider multiple times if he rode multiple weeks), I considered the bike description and wrote Perl code to try and identify the bike brand from it. This involved checking spelling ("Canondale") and mapping models to brands ("Tarmac" to "Specialized", "Ghisallo" to "Litespeed", "CAAD"-anything to "Cannondale", etc).

If I couldn't identify the make, the rider isn't counted. For example, some left the field blank, others put just colors, one person put "road bike"...
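The matching rules above can be sketched roughly as follows. This is a Python illustration rather than the Perl that was actually used, and the lookup tables are hypothetical: they hold only the examples named in the text plus a few brands from the results, where a real table would be much longer.

```python
import re
from collections import Counter

# Hypothetical lookup tables built from the examples in the text.
SPELLING_FIXES = {"canondale": "cannondale"}
KNOWN_BRANDS = {"specialized": "Specialized", "trek": "Trek",
                "cannondale": "Cannondale", "litespeed": "Litespeed"}
MODEL_TO_BRAND = {"tarmac": "Specialized", "ghisallo": "Litespeed"}
PREFIX_TO_BRAND = {"caad": "Cannondale"}  # "CAAD"-anything

def identify_brand(description):
    """Return a brand name for a free-text bike description, or None."""
    for word in re.findall(r"[a-z0-9]+", description.lower()):
        word = SPELLING_FIXES.get(word, word)
        if word in KNOWN_BRANDS:
            return KNOWN_BRANDS[word]
        if word in MODEL_TO_BRAND:
            return MODEL_TO_BRAND[word]
        for prefix, brand in PREFIX_TO_BRAND.items():
            if word.startswith(prefix):
                return brand
    return None

def count_brands(description_by_rider):
    """Count one bike per rider number, skipping unidentifiable descriptions."""
    counts = Counter()
    for description in description_by_rider.values():
        brand = identify_brand(description)
        if brand is not None:
            counts[brand] += 1
    return counts

# Invented sample descriptions keyed by rider number.
sample = {101: "red Canondale", 102: "black Tarmac", 103: "CAAD10, white",
          104: "road bike", 105: "", 106: "Trek Madone"}
counts = count_brands(sample)
```

Keying the input by rider number is what keeps a rider who showed up several weeks from being counted more than once.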

Here's the results. We're a lot closer to Morgan Hill than we are to Madison, and the results reflect this:

#  brand
60 Specialized
29 Trek
21 Cervelo
16 Cannondale
15 Giant
8  Khalsa
7  BMC
6  Calfee
6  Felt
6  Scott
5  Bianchi
5  Colnago
5  Willier
4  Fuji
4  Leopard
4  Look
3  Kuota
3  Pinarello
2  BH
2  CoMotion
2  Kestrel
2  Motobecane
2  Orbea
2  PedalForce
2  Serotta
2  Seven
2  Steelman
2  Volagi
1  Argon
1  Ducati
1  Faggin
1  Fondriest
1  GT
1  Guru
1  Ibis
1  Kharta
1  Klein
1  LeMond
1  Litespeed
1  Merlin
1  Moots
1  Neuvation
1  Redline
1  Ridley
1  Ritchey
1  Scattante
1  Soma
1  Teschner
1  Time
1  Tommasini

Thursday, November 28, 2013

Low-Key Hillclimbs 2013: personal report

Another year of Low-Key Hillclimbs has come and gone, ready or not.

When the series began in the first weekend of October I had been riding again for two months (Aug-Sep) after missing most of June + all of July to a groin injury from a bike crash. My focus during this period was on physical therapy, however, and my rides tended to be short and low-intensity. My progress from late Aug - early Sept took a step back when I devoted riding time to watching America's Cup racing. I started ramping up again at the end of September but I was nowhere close to where I needed to be.

I traditionally coordinate Montebello, week 1 of the series, and this year was as usual. Weeks 2 and 3 I volunteered, riding climbs both weeks in advance of the participants in order to take split times along the course (Montevina + dirt week 2, Bohlman week 3). These were good, solid, hard climbs, and were a nice boost to my fitness. I added an Old La Honda Wednesday Noon Ride late October, finishing just over 20 minutes, and another early November, finishing 19:14. I rode both of these on my steel Ritchey Breakaway with Gran Bois 26 mm tires inflated to something around 80 psi. There's a boost going to my Fuji SL/1 with carbon tubular wheels and time trial sew-ups pumped to 140-150 psi. So 19:14 wasn't where I'd like to see myself on the Wed ride, a minute faster would have been nicer, but it was close enough.

Week 4 I rode Lane Parker's 52nd century of 2013 and birthday ride instead of doing the GPS-timed Low-Key that week. Getting a 100 miler in, even if it was difficult and slow, was a confidence boost. Then week 5 was a transportation challenge, so I rode to the finish to help with results, hitting the course from the north. This was another solid training ride and I followed it up with more climbing (two San Bruno repeats on Sunday, then that Noon Ride OLH the following Wed).

So I was no longer "out of shape", but I wasn't exactly in shape either. That takes longer, especially at my advanced age. But I couldn't wait any longer, I had exceeded the scoring limits of volunteering, and I had to put some real times on the board.

I started with Patterson Pass, week 6. It felt really good to go out there and mix it up again, but it was a bit frustrating being relatively uncompetitive. Our group had difficulty gelling on the early, relatively flat portion of the climb, and we lost solid time there. The rest of the way I could ride tempo but had no punch, and I got gapped toward the end. I ended up with 106.8 points. I hoped to improve in future climbs.

Lomas Cantadas was next, and I got what I was after. There I started conservatively, saving something for the steep finish to the climb, and caught two riders I've for years treated as benchmarks for a good climb: James Porter and Rich Hill. My score showed it: up to 113.2 points. That was a 6.4 point improvement in just one week. Solid.

But it would be the best I'd do in the series. The next week was Montara Mountain, and I lost rear-wheel traction twice, unclipping and walking with disastrous results, since I needed to walk all the way to the next flattish section to get back on the bike. Worse was I tried to clip back in a few times and failed, wasting more time. Additionally I just don't climb as fast on rough, loose dirt as I do on the road: I'm too distracted by line and form. My score showed the damage: only 98.6 points.

But I was happy: the climb had revealed a personal weakness, and gave me a goal for next year of improving my speed on the dirt. After all, I had been responsible for putting dirt climbs into the series, with the goal of exposing riders to a broader range of riding.

At the summit I spoke to David Collet, the winner, and he advised keeping my weight back and low on these sections. Back keeps the traction on the rear wheel, low keeps the center of mass low which reduces the likelihood of lifting the front when the front end pitches upward. Additionally, Paul McKenzie emphasized the importance of picking the good line. That I'd understood, but my problem was more what I did when I missed the optimal line. I tended to focus on that fact, rather than deal with the line I had. Indeed, I went with Cara to China Camp the next day to work more on my mountain biking, then the following day rode the mountain bike to work via Caltrain (where my rear shifter cap was knocked off by someone stacking bikes on the bike car, dammit). At work I worked on doing tight turns, as I need to learn to ride switchbacks with confidence. I will improve my mountain biking. It is a moral imperative.

With my two volunteer credits, I needed 3 "real" scores to get the five qualifying scores I'd need to place well in the series. At the end of week 8, Montara, I was in 20th place with, at that point, my two scores counting from Patterson Pass and Lomas, but Montara getting discarded. After Hamilton, which was today, I'd get that third score counted, so I wanted to do substantially better than I had at Montara. If I didn't ride at all Montara would have been promoted to a scoring week.

So Montara was Saturday, Sunday I did an excellent 8-km run (fairly slow but my longest of 3 real runs since my injury) and then did the China Camp ride with Cara, Monday I worked on skills on the mountain bike, then Tuesday was an epic SF2G commute down the coast to wish riding buddy Daniel Chao a farewell to his commute down to the peninsula (he's starting work in San Francisco). So it was a solid block of riding. Wednesday, the day before Thanksgiving, I rode from Palo Alto to Mountain View in the morning, then went to a yoga class @ Yoga Tree in San Francisco that evening. It wasn't the best taper to Thursday, having done substantial mileage and climbing two days prior, but with Hamilton so long I wanted to acclimate my body to longer efforts, risking some fatigue.

I knew at Hamilton I'd face a challenge in scoring since, with the exception of Lane Parker's century ride, I was short on long rides going in, and Mt Hamilton is 30 km of mostly uphill. It's very easy to bonk there. A classic mistake I make there is to not take calories, and so I went against my normal policy of water-only in my bottles (sugar makes a polybotanical mess) and used some fairly concentrated Hammer Heed. This proved a smart move, as I never bonked on the climb, or even felt serious weakness.

But a bit of sugar water isn't going to overcome a deficiency in training, and my pace was simply slower on the climb than I wanted. Things went a bit south from the start, as I got stuck behind a rider who threw his chain just past the start line and unclipped. I chased back to the leaders, latching on to the tail end of the group, but then a rider two spots ahead of me let a gap open. It's really important on Hamilton to stick with the lead group on the first of the three major climbs which make up the route. I succeeded, the two riders behind me dropped, but then another rider apparently gave up and let a gap open. This one I couldn't close. I just had to dial it back and hope to find others to pace with.

And I did: it wasn't long before I was in a group of 4, and we were working well together. We crested the first climb. But here I failed: I let a gap open on the descent, and the riders behind me bridged up. I just didn't have any punch to follow, and was braking too much into the corners, which can be gravelly on this road. My Enve 250-based front wheel is a bit dicey on descents, and I'm not practiced in descending with carbon rims in any case, and I ended up gapped off at the bottom of this descent and the start of the second climb.

From here my ride was mostly solo, and I ended up spending the whole of the third climb trailing two other riders who held a 15 second gap which I just couldn't shut down. The group of two became three, and then they opened it up a bit. There was some wind here and it was advantageous to be in a group for when the wind was a headwind. Fortunately it wasn't always: the road winds up the mountainside in spectacular fashion, the switchbacks providing some tailwind to partially balance the headwind. I rode a steady, yet unimpressive tempo here, finishing the climb in 27th place, scoring 109.5 points.

That's not terrible, however. Since I'm working on a fitness deficit I had to be happy with the fact I didn't fade too badly on the extraordinarily long, 30 km Mt Hamilton climb. For the San Bruno Hillclimb on 1 Jan, it's actually good to be a bit underprepared for the Low-Keys, then come out with an upward fitness trajectory and top off with some good climbing intensity in the weeks between Thanksgiving and New Years. But that's not going to happen for me, since I travel for work one weekend, and will be traveling to the East Coast for family obligations in mid-December and again for the week including Christmas. This is a classic problem for me in December: travel is terrible for cycling. When I'm traveling I'll focus more on running. It's not what I want, however, for a confidence-boosting San Bruno result. I'm not complaining: family gets the priority, and these family trips are important to me. But it's just how it is.

So here I am sort of floating in this void. I have only one additional physical therapy session for my injury, then I'm done with that. I'm dedicated to continuing my yoga classes. I'm de-emphasizing the weight room which was a major focus from soon after my injury in June until October. I definitely want to get back to trail-running shape, as it's been way too long since I raced a trail run. So I'll see where I end up. I'm definitely not planning on slacking back and slipping into any sort of sedentary purgatory. I need some sort of competitive goal.

Friday, November 22, 2013

Low-Key hillclimbs consolidated results pages


I am a firm believer that old event results matter, and am always dismayed when I check for results of some past event or race and can't find them. The half-life for old results tends to be around 2 years: typically you can find last year, but go back two years, and it's 50-50. Since a big motivation behind Low-Key Hillclimbs was to show how things can be done better, I've made an effort to keep old results easy to find.

With this in mind, I've long had a vision that results should be in some sort of queryable database. For example, want to find the best scores for women in the 20+ category? No problem. I can do that: I have command-line tools which allow me to quickly search a CSV file of scores from the entire series history. But it would be nice to provide on-line access to that.

But that would be a lot of work. As an intermediate step, I decided to write Perl code which would generate static HTML of the consolidated results for every climb Low-Key has done. Many climbs have been done only once, for example 5 of this year's climbs were new. Others have a rich history, most notably Mt Hamilton and Montebello, which are ridden every year (except in 1998 when Hamilton was ridden twice, Montebello skipped).

My code was similar to, but far simpler than, the code I use to score the series. The increased simplicity was due to the scoring scheme and the lack of the need to calculate team rankings, overall score, most improved rider, etc: just calculate a score for each rider for each week. The scheme I used was a relatively simple one which I've used before: adjust women's time by a factor, find the median adjusted time for all men and women, calculate everyone's scores using that median as a reference for 100 points. This is simple and robust.
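A minimal Python sketch of that median-reference scheme follows (the real pages were generated with Perl). Two things are assumptions here because the text doesn't give them: the value of the women's time factor, set to a placeholder 0.9, and the score formula itself, taken to be 100 times the median adjusted time divided by the rider's adjusted time.

```python
from statistics import median

WOMEN_TIME_FACTOR = 0.9  # placeholder value; the actual factor isn't stated

def climb_scores(results, women_factor=WOMEN_TIME_FACTOR):
    """results: list of (name, sex, time_seconds). Returns {name: score}."""
    adjusted = {
        name: time_s * (women_factor if sex == "F" else 1.0)
        for name, sex, time_s in results
    }
    reference = median(adjusted.values())  # this time is worth 100 points
    # Assumed formula: score scales inversely with adjusted time.
    return {name: 100.0 * reference / t for name, t in adjusted.items()}

# Invented results for one climb.
scores = climb_scores([("A", "M", 1000.0), ("B", "M", 1100.0), ("C", "F", 1250.0)])
ranking = sorted(scores, key=scores.get, reverse=True)
```

Using the median as the reference is what makes the scheme robust: one very fast or very slow time barely moves it, whereas a reference based on the winner or on the mean would shift everyone's score.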

Then for each climb I just made a ranking. It's mostly fair: our start lines and finish lines haven't changed much, with the notable exceptions of Quimby, which shifted due to sprawl, and Kings Mt Road. For Quimby I assigned the two start lines as separate climbs, but for Kings Mt Rd, the start lines were closer, and 1995 times thus have an advantage versus subsequent years.

A lot of this work was done during a 4 hour 20 minute Caltrain commute where the train was halted due to a down power line. It was a nice focused time to get it done. "A bad commute on the train is better than a good commute in the car", assuming you don't have a deadline.

Then I indulged in a little JavaScript to randomize the banner for the page. This took a lot more time than I expected, but it's good to dust off my limited JavaScript skills. Here's the result. Low-Key started way before Strava and part of the motivation was to document good times for various climbs. Finally, with results indexed by climb name, mission accomplished. But with its exponential growth, Strava largely serves that function now.

Next it would be cool to make pages for each rider, or better yet, to provide a PHP form which will do the same rather than maintaining static HTML.

Wednesday, November 20, 2013

history of Low-Key Hillclimb banners

I've been organizing the Low-Key Hillclimbs, with one extended break, since 1995, and as a cycling event it has always had its existence firmly planted in the internet. It was relatively early in that regard.

Web design hasn't really kept up with the times, however. The pages are relatively simple HTML, although I indulged in some JavaScript in 1997, even with on-line ordering for T-shirts. I've only recently started using PHP, and only for Strava API interaction. I want to move toward more PHP in the future, but for now the HTML works fine.

One feature of every year is a banner image. Banner images are rather dated these days, but I still like them. Here's a brief history of the Low-Key Hillclimbs banner image. One note: the 1995-1998 pages were reconstructed, since they had been inadvertently lost. 1995-1997 were regenerated with original graphics, but 1998 was generated fresh in 2008 from 1998 result data. So the 1998 banner image is circa ten years later.

1995: Rendered with a crude X11 paint program, using silhouette of Marco Pantani

1996: Bolder font, tightened, but otherwise similar

1997: Intensity was the theme (slight irony), against green background

1998: Original web pages no longer available; this was basically a copy of 2008

2006: The Comic Sans was supposed to represent Low-Key. I alpha-channeled year.

2007: I scripted the year, and changed the text color

2008: Copy of 2007

2009: Another copy of 2007

2010: I graded the text color and reduced opacity on year

2011: New design, using a metallic text tutorial for Gimp

2012: Copy from 2011; I actually copied and pasted 2nd 2

2013: No "3" to copy, so new design: colors more plastic, with nova in background, using Gimp

2014: Fun with Gimp render filters, Mt Hamilton paceline in background, and improved year placement

I am a big fan of compact file size, as I can't expect everyone to have maximum bandwidth 24/7. But of course file size budgets have increased over the years. 1995 was 4-bit GIF mapped to the Netscape Color Cube. By 2014, it's 24-bit color with 8-bit alpha channel. Still, compared to the vast majority of the web, the Low-Key pages are exceptionally compact. This policy paid off with the explosion of mobile device web access. I have no need for mobile versions of the Low-Key pages: they were already mobile-compatible.
banner size vs year

The 2014 design I did yesterday afternoon on my train commute, and finished it off this morning sipping tea (decaf black, mixed with a protein-fiber-vitamin drink mix and a bit of Stevia... yum, yum). Every year I go into it thinking perhaps that year is the last year for Low-Key. Every year needs to have freshness, needs the support of my co-conspirators, and needs to address the key concerns of all involved in things which can be improved. Every year needs to be better than the one before.

Tuesday, November 19, 2013

VAM analysis of climbing Marin Ave in Low-Key Hillclimbs 7X Challenge

Marin Ave isn't like most other climbs. With Marin, it becomes as much a matter of survival as of speed. Speed is desired as much because it ends the suffering sooner as because of the desire for any target time, placing, or Low-Key Hillclimb points. Yet in the case of Saturday's 7X challenge of course the placing and the points were still important.

A nice thing about metrology data is they tell a rich story if examined closely enough. In this case I didn't have a power meter, my PowerTap being too heavy for timed hillclimbs, so I have to rely on other ways to judge my effort. On a climb this steep, VAM is a nice proxy for power, so I use that to judge my pacing.

VAM extracted by numerically differentiating measured altitude with respect to time is inherently noisy, especially on a Garmin Edge 500 where altitude is reported with 1-meter resolution. So to get meaningful numbers I convolved the result with a Gaussian of sigma 3 seconds. This smooths the VAM to something which works fairly well. A longer time constant would lose any resolution of the plateaus which are the ten cross streets delineating the 11 blocks which comprise Marin Ave.
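The smoothing step can be sketched like this in Python, using only the standard library. It assumes 1-second sampling and a forward difference for the derivative, and it renormalizes the Gaussian at the ends of the record where the kernel runs off the data; none of those details are stated above, so treat them as illustrative choices. The test data are synthetic: a steady 0.3 m/s climb (1080 m/h) logged at 1-meter altitude resolution.

```python
import math

def raw_vam(times, altitudes):
    """Forward-difference vertical speed in meters per hour between samples."""
    return [
        3600.0 * (altitudes[i + 1] - altitudes[i]) / (times[i + 1] - times[i])
        for i in range(len(times) - 1)
    ]

def gaussian_kernel(sigma_s, dt_s=1.0, half_width_sigmas=4.0):
    """Normalized Gaussian weights sampled every dt_s seconds."""
    half = int(round(half_width_sigmas * sigma_s / dt_s))
    weights = [math.exp(-0.5 * (k * dt_s / sigma_s) ** 2)
               for k in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(values, kernel):
    """Convolve values with kernel, renormalizing where the kernel is truncated."""
    half = len(kernel) // 2
    smoothed = []
    for i in range(len(values)):
        num = den = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(values):
                num += w * values[j]
                den += w
        smoothed.append(num / den)
    return smoothed

# Synthetic ride: 0.3 m/s for two minutes, altitude quantized to whole meters.
times = list(range(120))
altitudes = [math.floor(0.3 * t) for t in times]
vam = raw_vam(times, altitudes)                  # jumps between 0 and 3600 m/h
vam_smooth = smooth(vam, gaussian_kernel(sigma_s=3.0))
```

On this synthetic data the raw values flip between 0 and 3600 m/h, which is the quantization noise described above, while the smoothed values in the middle of the record sit near the true 1080 m/h.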

Here's the profile, in all of its nastiness:


Of the first eight blocks, only one, the block to Spruce, is really steep. Marin Ave's heart lies at the end. In a way the first eight blocks are there just to prepare you for that. I knew this, having ridden Marin twice before, and having studied the profile before that.

With this in mind, here's the VAM data from my ride on Saturday, where the data end where the final checkpoint is triggered, which is approximately 5 vertical meters below the finish (in order to provide a buffer for GPS error).

Dan Connelly

I hit the first block, to Shattuck, fairly hard. My legs felt good despite having made a maximal effort up Lomas Cantadas ending less than an hour earlier. But then I settled into a more sustainable effort. I was holding back, waiting, keeping something in reserve for what I knew to be coming.

On the steep block to Spruce my VAM naturally increased, as I tried to keep a reasonable cadence in what was already my lowest gear, 34-27. From here it was 3 more blocks to Euclid. I kept the gear spinning. I was gaining on the tandem a few blocks ahead. I wanted to make good time even if I was holding a little bit back to survive my approaching fate.

But when I hit Euclid, reality was still a sobering blow. You can see my VAM drop dramatically here as I first freewheeled, then softpedaled across the road. The grim inevitability of the pain which was to follow hit home. All I could do was postpone my fate, even if for only a small number of seconds.

Then I was there: on the brutal slopes of the first of the final three blocks, to Hilldale. My VAM jumped here simply because I had no other choice. To keep turning the pedals at a reasonable cadence meant my vertical speed had to increase.

I hit the next block without much hesitation. Again, I kept the pedals turning. Still brutal, it's slightly less steep than the preceding block, so my survival-mode VAM dropped a bit.

My legs were pretty much empty now, yet I had one block left. Again, I paused slightly at the cross-street, collecting the will for the final wall to freedom. My legs hurt, and they would soon hurt more, but only for a brief time. Then I would be done.

The last block I went at my maximum: the tandem was right there, having finished soon before. But I was near empty, I had nothing left. My VAM increased but not by much more than my cadence demanded.

So there it was. Of the 9 riders in my division (human-powered male solo riders) I had the third best time, a personal record of three tries. Success. Yet it's interesting to compare my effort to that of the fastest rider in the division, David Collet.

Dave Collet

His effort follows some similar patterns to mine. But his VAM over the first 8 blocks was more consistent than mine. He was less conservative, pushing himself harder on the less steep blocks, saving himself less. At Euclid, the massive drop in VAM in which I indulged to prepare mentally and physically for the final three blocks isn't seen in his data. He crossed Euclid with relatively little delay, throwing himself immediately into the final three blocks. The same was true on Keeler: no pause, just the beginning of an impressive sprint to Grizzly Peak.

The crux of Marin is the final three blocks which follow Euclid. These blocks are occluded from view by Euclid's width. All that's visible is the stop sign at Euclid's leading edge, and only when you crest this rise does the brazen monstrosity of the final three blocks confront you. I'd ridden Marin twice before, and studied the profile before that, so I knew what was coming. Yet knowing and experiencing aren't the same thing, and it intimidates me every time.

There's no way I'm going to match David's time, but it's clear from his data a bit less mental intimidation, a bit more confidence, would have improved my pacing. But easier said than done. There's not many riders who aren't scared of the challenge Marin provides. Such grades, if you can find them on urban roads, are usually found for one, two, or maybe three blocks. Eleven consecutive blocks of this is hard to grasp. But I hope to have another shot at the hill when I'm fitter than I am now, just getting my fitness back after a prolonged focus on rehabbing from my June crash.

Monday, November 18, 2013

Low-Key Hillclimbs week 7x: timing Marin Ave with GPS

Saturday was a double event of sorts for Low-Key Hillclimbs. We had the standard climb, up Lomas Cantadas in the Berkeley Hills, but we combined that with a bonus event, to do Lomas and Marin Ave, in that order, in the same day. We called it the 7X Challenge.

The climb of Lomas Cantadas was fun. I felt stronger than I had the week before, at Patterson Pass. The pace was very quick at the start, requiring a level of explosiveness I simply do not have now or maybe ever, and I drifted back. This cost me a bit on the short descent, where I was in slower traffic than the leaders, but I did well on the final steepest portion, passing riders who suffered from having stuck closer to the leaders in the early going. Among the riders without electric assist, I was 9th, a very good result for me this year, given my ongoing physical therapy to recover from my June crash and injury.

I helped with the finish line crew to get the numbers of riders finishing after me, then headed over to the parking lot near Grizzly Peak Boulevard where Cara had brownies and gingerbread in addition to water, juice, and other goodies. Paul McKenzie and Paul Chuck were there on their tandem, and they led a group to Marin Ave, to complete the 7X challenge.

Marin Ave on tired legs is a memorable experience. The first 8 blocks wear you down so when you crest Euclid, revealing the final three, steepest blocks ahead, it makes a rapid impression. But I survived these three blocks, ending up 4th out of 9 among men in the challenge standings (as I write this: riders may still upload their data until Monday evening). In addition there was one woman, one hybrid-electric bike, and, remarkably, Paul and Paul's tandem.

The timing of the event, not surprisingly, didn't go as smoothly as I'd hoped. It was better than Portola Valley Hills, for sure, but part of that was because there were so many fewer riders. The GPS timing was applied to both Lomas Cantadas and Marin Ave, the former to allow those who didn't do the Low-Key climb to still participate in the 7X Challenge. I'll look here at Marin Ave.

Here's the x-y position (the start line of Lomas is the origin) for all of the riders who did the Challenge course. Most of the data are excellent, but there are three clear outliers:

Marin positions

I plot only the data between the triggering of the start line at the traffic circle at the bottom of the climb and the triggering of the finish at the top of the climb. The obvious culprits are 409 (Edge 500), 405 (Edge 500), and 108 (Edge 800). This didn't completely follow the script that Edge 500s cause all the trouble, but they were still 2 of the 3. In the case of 405 and 108, the data recovered just in time to trigger the finish line. In the case of 409 I had to manually move the finish line to have him trigger it, which I did by referring to his altitude. (I could also have defined a manual trigger time for the rider: that's a code feature I want to add, since it would be faster than moving the line.) I could also have looked at his speed, but the top was fairly obvious from altitude.

This brings up the question whether altitude provides a way to automatically trigger timing events in GPS-timed competitions. For example, if position is poor, but altitude is accurate, and I know the finishing altitude of a climb, then I can trigger the timer when the rider approaches that altitude (leaving some margin for error).
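A minimal sketch of what such an altitude trigger could look like, assuming the ride is available as (time, altitude) samples. The function name, the sample layout, and the 5 meter default margin are all hypothetical, not taken from the actual timing code:

```python
from typing import Optional, Sequence, Tuple


def altitude_trigger_time(
    samples: Sequence[Tuple[float, float]],
    finish_altitude_m: float,
    margin_m: float = 5.0,
) -> Optional[float]:
    """Return the time of the first sample within margin_m of the finish altitude.

    samples holds (time_s, altitude_m) pairs in ride order (assumed layout).
    Returns None if the rider never gets that high.
    """
    threshold = finish_altitude_m - margin_m
    for time_s, altitude_m in samples:
        if altitude_m >= threshold:
            return time_s
    return None
```

This is only as good as the altitude data itself: a single altitude spike before the top would trigger the timer early.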

To investigate this, I plotted altitude versus distance for each of the riders. In this case, the distance starts when the rider first triggers the start of the climb. Two riders returned to the start and retriggered it, one after riding partway up Marin, the other up an alternate street. The timing code handles this fine: the time is taken from the last time the start line is crossed. But this isn't important to the purpose here: what's of interest is the start and finish altitude, and if the rider's altitude changes monotonically from start to finish.
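The monotonic test can be sketched as follows, assuming a list of altitudes sampled between the start and finish triggers. The function name and the 2 meter tolerance for sensor noise are hypothetical choices for illustration:

```python
from typing import Sequence


def is_monotonic_climb(altitudes_m: Sequence[float], tolerance_m: float = 2.0) -> bool:
    """Check that altitude never drops more than tolerance_m below its running maximum."""
    if not altitudes_m:
        return True
    highest = altitudes_m[0]
    for altitude in altitudes_m[1:]:
        if altitude < highest - tolerance_m:
            return False  # a real dip, or bad data
        highest = max(highest, altitude)
    return True
```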

Marin altitudes

The answer: clearly not. And interestingly it's some of the same suspects as last time: 405 (Edge 500) and 108 (Edge 800) have the worst altitude data. 409's profile is poor, but that's perhaps because distance is wrong: the altitude increases monotonically between limits consistent with the other riders.

So what we have is there's a good chance poor position is associated with poor altitude. Since the Garmin units use GPS-corrected barometric altimetry, that's not too surprising. If you're relying on GPS to calibrate the barometer, and the GPS is out on vacation, you're using an unreliable reference. Altitude can't naively be used as a backup for poor position data. Good altitude and good position go hand-in-hand.

Sunday, November 17, 2013

Low-Key scoring and X-weeks

Low-Key scoring got complicated when we started balancing the contributions from each week. The results from one week can affect the results of other weeks. This is fine. There are several goals in the scoring:

  1. the average of all rider scores should be 100.
  2. rider scores should be as consistent as possible week-to-week
  3. if I plot the logarithm of scores versus the logarithm of times in a given week, I get a straight line, the average slope for all weeks being one.

A trivial example for this might be the following:

Suppose rider A does climb 1 and gets a time. He's the only rider in climb 1. Climb 1 has been the only climb. He gets 100 points, consistent with the first goal.

Now rider A does climb 2 and gets a time. Again he's the only rider, but now there's been two climbs. He gets scores of 100 and 100. This is consistent with goals 1 and 2.

But suppose now I realize there was also a rider B in week 2. Rider B was 20% faster than rider A. Using the third goal, this means rider B should score 20% more than rider A in this week.

So what do I do? I could assign rider A scores of 100 and 100, as before, and give rider B a score of 120. But this would be inconsistent with the first goal. I want the average to be 100. Call each of rider A's two scores x. The score for rider B is then 1.2 x. For the three scores to average 100 they must sum to 300, which gives the following algebraic equation:

x + x + 1.2x = 300

The solution is rider A scores 93.75 for weeks 1 and 2, but rider B's score in week 2 is 112.5.

However, here's where I get into a problem when I use "X-weeks". X-weeks are climbs which are scored like regular climbs, but don't affect the regular series scores. We had examples in the 1990's, but the first "X-week" in the "modern era" with this sort of scoring scheme was yesterday: The Lomas Cantadas - Marin Ave double.

Suppose the second week was an X week. The score from the first week was affected by what happened in the X-week. I can't let week 1 scores change due to the presence of week 2.

So if week 2 is an X week, the scores should be 100, 100, and 120, as I initially calculated. The first goal, that the average score should be 100, and indeed the third goal, that the average slope of scores versus time on a log-log basis should be one, should exclude X weeks. I will still calculate a slope for the X week, consistent with the second goal (rider scores should be as consistent as possible), in a more complicated case (in this trivial case I don't have enough scores to do so). But the result shouldn't feed into a series average.
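The trivial two-rider example can be worked through in a few lines. This is a sketch of the example only, not the weighted algorithm the series actually uses. The function name and the (relative score, is X-week) input layout are hypothetical:

```python
from typing import List, Sequence, Tuple


def scale_scores(
    entries: Sequence[Tuple[float, bool]], target_mean: float = 100.0
) -> List[float]:
    """Scale relative scores so the non-X-week scores average target_mean.

    entries holds (relative_score, is_x_week) pairs. X-week scores are scaled
    by the same factor but do not contribute to the average.
    """
    regular = [score for score, is_x_week in entries if not is_x_week]
    scale = target_mean * len(regular) / sum(regular)
    return [scale * score for score, _ in entries]


# Week 2 as a regular week: rider A twice, rider B 20% higher in week 2.
regular_week = scale_scores([(1.0, False), (1.0, False), (1.2, False)])
print([f"{s:.2f}" for s in regular_week])  # ['93.75', '93.75', '112.50']

# Week 2 as an X-week: only rider A's week 1 score sets the scale.
x_week = scale_scores([(1.0, False), (1.0, True), (1.2, True)])
print([f"{s:.2f}" for s in x_week])  # ['100.00', '100.00', '120.00']
```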

This problem only manifested itself when I calculated the week 7x scores. I saw scores dropped for the week 7 climb, which was just Lomas Cantadas. That wasn't desirable.

The problem obviously gets more complicated when you have up to 10 sets of scores for on the order of 100 riders per week. I end up calculating statistical weights based on the number of riders per week, and the number of climbs per rider. I'm not sure the method I use formally optimizes to these stated goals. But in testing, it does much better than the previous, far simpler algorithm. I described some of this testing back in 2011.

But the "X-week" model may still have a few surprises in store in terms of unintended consequences. Hopefully I can isolate any before the end of the series.

Friday, November 15, 2013

polyline checkpoint enhancement for GPS timing during Low-Key Hillclimbs

As I initially described here, I have been developing an event model for GPS data for the Low-Key Hillclimbs. This allows us to do things which weren't possible before:

  1. dirt climbs: 2012 and 2013, where it's better to let riders do it on their own
  2. short-hills routes, where there's too many time points for practical hand-timing
  3. bonus climbs, supplementing the standard "event", in which riders get a chance to experience more challenges

A limitation of the model has been that checkpoints are defined as fixed line segments. The model provides a much better solution to event applications than does the Strava timing algorithm, which is optimized for users who within a few seconds define an arbitrary, abstract "segment" and the code is left to match rider data to the segment without much additional information. My model is set up for a course designer who is willing to carefully optimize the placement of a series of checkpoints, to improve timing accuracy and to reduce sensitivity to GPS errors and rider navigational choices consistent with the course goals. Strava's present algorithm will never serve the need of event organizers who want to verify course completions, potentially with time limits, and optionally timing riders over various portions: in events, there's simply too little margin for error. However, the simple model of a checkpoint as a single line segment can be limiting.

Checkpoints can serve two purposes:

  1. to provide optional split times
  2. to verify completion of a course, for example that the rider didn't take a short-cut

For split times, it makes sense to optimize the line strictly for the purposes of timing. If GPS data is flaky, and position errors are large, or GPS drops out during the ride, you don't want to assign split times to that rider associated with that checkpoint. But if the split times are just supplementary to the overall time, that's fine.

Additionally, for timing, you probably want the line to cross the road at a perpendicular angle. This way if the rider is to the left or right, the rider gains no advantage or disadvantage on splits associated with that line. This is basic stuff in start-finish line placement.

But if the point is just to verify the rider passed, there's no reason to worry about such things. The line can have generous width, angled to best avoid the roads presenting potential short-cuts. But with course verification, GPS data drops may still be an issue. And a single straight line may not be the best shape to catch the rider moving along the desired path.

To address this I extended the model to include polylines.

In the old code, a checkpoint was characterized by a line, which included a left and right boundary. The line needed to be crossed with the left boundary on the left, the right boundary on the right. It was encoded as a center point and a right point, with the left point a reflection of the right through the center. There was also an optional distance parameter, to allow tuning of the distance between the center point and the left and right boundaries, retaining the direction. This was implemented to facilitate extending the line if in practice GPS errors turned out to be larger than expected.
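A sketch of that single-line encoding, assuming flat local x-y coordinates in meters. The class and field names are hypothetical, and I'm assuming the optional distance rescales the center-to-boundary length along the original center-to-right direction:

```python
from dataclasses import dataclass
from math import hypot
from typing import Optional, Tuple

Point = Tuple[float, float]


@dataclass
class CheckpointLine:
    center: Point
    right: Point
    half_width: Optional[float] = None  # optional distance override (assumed meaning)

    def endpoints(self) -> Tuple[Point, Point]:
        """Return (left, right), with left the reflection of right through center."""
        dx = self.right[0] - self.center[0]
        dy = self.right[1] - self.center[1]
        if self.half_width is not None:
            # Rescale the offset to the requested length, keeping its direction.
            length = hypot(dx, dy)
            dx, dy = dx * self.half_width / length, dy * self.half_width / length
        right = (self.center[0] + dx, self.center[1] + dy)
        left = (self.center[0] - dx, self.center[1] - dy)
        return left, right
```

With this encoding, widening a line after seeing larger-than-expected GPS errors is a one-number change.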

With polylines, checkpoints are instead represented by a list of such lines. In every preceding case, the list was length one. But the list could be arbitrary length. The code then goes through the list and checks in order if any of the lines is crossed, again with its right boundary on the right and left boundary on the left. If any is crossed by the path between two points in the user trajectory, the checkpoint is considered triggered.
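The crossing test can be sketched with cross products, assuming flat local coordinates with x pointing east and y pointing north. The names here are hypothetical and this is not the actual timing code:

```python
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]
Line = Tuple[Point, Point]  # (left boundary, right boundary)


def _side(a: Point, b: Point, c: Point) -> float:
    """Cross product of (b - a) and (c - a): positive if c is left of a->b."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def crosses(line: Line, p: Point, q: Point) -> bool:
    """True if the step p->q crosses line with left on the left, right on the right."""
    left, right = line
    # The rider must start on the near side and end on (or past) the line.
    if not (_side(left, right, p) < 0 <= _side(left, right, q)):
        return False
    # The two boundaries must lie on opposite sides of the rider's step.
    return _side(p, q, left) * _side(p, q, right) <= 0


def first_trigger(polyline: Sequence[Line], track: Sequence[Point]) -> Optional[int]:
    """Return the index of the first track step crossing any line, else None."""
    for i in range(len(track) - 1):
        for line in polyline:  # lines are checked in list order
            if crosses(line, track[i], track[i + 1]):
                return i
    return None
```

For example, with two lines at y = 0 and y = 10, a track that only has valid points at y = 5 and y = 12 still triggers the checkpoint on the second line.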

Polylines represent a surprisingly broad range of applications. The one which inspired their implementation is a series of sequential checkpoints along an unambiguous section of the route the rider must traverse. It's not important which of the checkpoints is crossed as long as at least one of the checkpoints is crossed. If GPS signal integrity is unreliable, and the signal might drop or large position errors might occur, this application can be used to verify course completion. Obviously this wouldn't be best for timing, since the time could be taken to any of a series of checkpoints which lie at different distances. If the user went through more than one of these sequential checkpoints, going through following lines would simply retrigger the same checkpoint. This would have relevance only if there was a time budget to either the next or the prior checkpoint.

But there's other possibilities. Another would be a checkpoint which consists of what is effectively a curved line. Segments could be connected, the left boundary of one matching the right boundary of the next. While the data representation I'm using doesn't easily facilitate this, it would be a simple enhancement to support it so in the course definition phase only the series of vertex points would need to be specified.

An application of this would be a course where the goal was to hit a landmark, from any direction, then move to an additional landmark. I could construct a polygon around the landmark consisting of three or more sides. The user would need to enter, or optionally leave this region to trigger the checkpoint. If the points were arranged clockwise looking down, each segment ordered left then right, the checkpoint would be triggered upon leaving, while if they were arranged counterclockwise, the checkpoint would be triggered upon entering.
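A sketch of building such a polygon checkpoint from its vertices alone, assuming flat local coordinates with x east and y north. The function name is hypothetical. The winding direction comes from the sign of the shoelace sum, which is positive for counterclockwise vertices:

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Line = Tuple[Point, Point]  # (left boundary, right boundary)


def polygon_checkpoint(vertices: Sequence[Point]) -> Tuple[List[Line], str]:
    """Return the closed list of (left, right) segments and when they trigger."""
    n = len(vertices)
    if n < 3:
        raise ValueError("a polygon needs at least three vertices")
    # Each segment runs from one vertex (left) to the next (right), wrapping around.
    segments = [(vertices[i], vertices[(i + 1) % n]) for i in range(n)]
    # Shoelace sum: twice the signed area, negative for clockwise ordering.
    twice_area = sum(x0 * y1 - x1 * y0 for (x0, y0), (x1, y1) in segments)
    trigger = "enter" if twice_area > 0 else "leave"
    return segments, trigger
```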

Another would be a checkpoint where I wanted to avoid potential alternate paths, and space was tight. Consider roads A, B, and C. I want to make sure the rider is on road B which is between A and C. However, from a desired checkpoint the direction of minimum distance to A is in a different direction than the direction of minimum distance to C. I might put a two-part polyline, from between A and B, to the point on B, to between B and C. Or I could make it a 3-part line with a short segment perpendicular to road B. The idea is to shape the net to maximize the chance of catching the rider on road B without catching riders on alternate paths.

Another application is a bidirectional checkpoint. Imagine a course with an out-and-back, where the rider must circle a loop at the turnaround. Perhaps I don't care if the loop is made with a left turn or right turn. I could put a 2-segment checkpoint at the apex of the loop, one for clockwise riders, the other, identical except the left and right boundaries are swapped, for counterclockwise riders.

Another application might be routes with time limits (time budgets) for road segments where the rider might spend time at a rest stop. Suppose I have a route from A to B to C, and there is a 1-hour limit from A to B, and another 1-hour limit from B to C. If the rider were to take 45 minutes to go from A to B, then spend 10 minutes at B, then 55 minutes from B to C, then the rider would rather have the resting time at B assigned to the A-B portion. That would provide for 55 minutes against a 1-hour budget for both A-B and B-C. On the other hand, if the resting time were credited to the B-C portion, that would result in 45 and 65 minutes, the 65 minutes going over the 60-minute time budget, and there would be a 5-minute penalty against overall time. To do this, I would put in a two-segment polyline, one segment before the rest stop, the other after. Then riders would trigger it twice, and the code would pick the one which resulted in a minimum net time for the rider, as I described in my blog post on recursive course time determination.
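The rest-stop example can be sketched as picking, among the candidate trigger times at B, the one with the smallest net time. The function name is hypothetical, and I'm assuming net time is the elapsed time from A to C plus any minutes over each budget. This handles only a single intermediate checkpoint, not the full recursive algorithm:

```python
from typing import Sequence, Tuple


def best_split(
    t_a: float,
    b_candidates: Sequence[float],
    t_c: float,
    budget_ab: float,
    budget_bc: float,
) -> Tuple[float, float]:
    """Return (chosen B trigger time, net time), all in the same time units."""

    def net_time(t_b: float) -> float:
        over_ab = max(0.0, (t_b - t_a) - budget_ab)
        over_bc = max(0.0, (t_c - t_b) - budget_bc)
        return (t_c - t_a) + over_ab + over_bc

    t_b = min(b_candidates, key=net_time)
    return t_b, net_time(t_b)


# Minutes: arrive at B at 45, leave at 55, finish at 110, with 60-minute budgets.
print(best_split(0, [45, 55], 110, 60, 60))  # (55, 110.0)
```

Choosing the trigger at 45 minutes instead would give a net time of 115 minutes, the 5-minute penalty from the text.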

So this extension of the model, from checkpoints consisting of a single line segment with left and right boundaries, to multiple line segments each with its own left and right boundary where the rider triggers the checkpoint by crossing any of them, has considerable power, especially when used in conjunction with the flexible, recursive course timing algorithm I described previously.

It's still somewhat limited, however. I require checkpoints themselves to be crossed in order. I still have no way to say "checkpoints A and B must each be crossed but in either order". For example, it's popular in alleycat races to have riders need to visit a series of landmarks but optimizing the order is part of the competition. While the code could trivially be modified to support this, combining this functionality with the existing one in an elegant fashion would be additional work. A slight generalization of this idea would be "N checkpoints, of which M must be crossed, where 0 < M ≤ N". For example, I might want to define a series of checkpoints along a route with unreliable GPS, but make sure the rider hit at least half of them. This comes closer to the Strava path matching algorithm.
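The "M of N, in any order" test on its own is simple. A sketch, assuming each checkpoint has an identifier and the list of checkpoints a rider triggered is already known. The function and parameter names are hypothetical:

```python
from typing import Iterable, Set


def course_completed(
    checkpoint_ids: Set[str], crossed_ids: Iterable[str], minimum: int
) -> bool:
    """True if at least `minimum` distinct checkpoints were crossed, in any order."""
    if not 0 < minimum <= len(checkpoint_ids):
        raise ValueError("need 0 < M <= N")
    # Repeated crossings of the same checkpoint count once.
    hit = checkpoint_ids & set(crossed_ids)
    return len(hit) >= minimum
```

Setting the minimum equal to the number of checkpoints gives the alleycat case, where every landmark must be visited but the order is free.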

Okay, I'd better stop now, otherwise I'm going to want to overhaul my code again, and it's already working in advance of tomorrow's bonus route of Lomas Cantadas and Marin Ave in the Berkeley Hills. I'm excited to see how it goes.

Sunday, November 10, 2013

Low-Key Patterson Pass: small groups and the prisoner's dilemma

The prisoner's dilemma describes a problem in which two suspects are captured, accused of committing a crime, and isolated in separate cells. If either admits to the crime while the other does not, the prisoner admitting to the crime is set free, while the other is given the most severe punishment (10 years in prison). If they both confess, they are given a light, 2-year sentence. If neither confesses, they are held in prison for one year, but eventually freed for lack of evidence.

So if you're one of the prisoners, what do you do?

If they could collaborate, then the best approach would be for neither to confess. They'd each serve a year in prison, which isn't great, but overall they'd serve only two years. That's much better than the alternatives.

But they're isolated, so they can't collaborate. For each prisoner, what the other prisoner does is beyond their control. If the other prisoner confesses, then he is better off confessing as well, getting a 2-year sentence. If he does not, he'll get the maximum of ten years. On the other hand, if the other prisoner does not confess, then confessing means walking free, while not confessing means a year in prison.

So the best result for the group is cooperation. But if individuals act in their own selfish interest, then there will be no cooperation, and the net result is worse.
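The payoffs described here fit in a small table, and a few lines confirm that confessing is the best individual response whatever the other prisoner does. The names are hypothetical and the sentences are the ones given in the text:

```python
# (my choice, other's choice) -> (my years, other's years)
SENTENCE_YEARS = {
    ("confess", "confess"): (2, 2),
    ("confess", "silent"): (0, 10),
    ("silent", "confess"): (10, 0),
    ("silent", "silent"): (1, 1),
}


def best_response(other_choice: str) -> str:
    """Return the choice that minimizes my own sentence, given the other's choice."""
    return min(
        ("confess", "silent"),
        key=lambda mine: SENTENCE_YEARS[(mine, other_choice)][0],
    )


def total_years(mine: str, other: str) -> int:
    """Combined sentence for both prisoners."""
    return sum(SENTENCE_YEARS[(mine, other)])


print(best_response("confess"), best_response("silent"))  # confess confess
print(total_years("confess", "confess"), total_years("silent", "silent"))  # 4 2
```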

This is what happened in group 2 of the small-groups start at Patterson Pass yesterday. Groups were ordered by speed, based on prior scores or, in the case of a few brand-new riders, by considering results in other races, if any. The riders were thus fairly well matched within each group, but were still competing against each other. It was a mixture, therefore, of competing against the riders within your group and cooperating with the riders within your group.

The best overall result for the group was that everyone contribute on the early, relatively flat portion of the course. And by "contribute" I mean spend time at the front at a level of effort which will cause some fatigue, causing some loss of speed on the steep portion, but just a little. The payback in following the draft of other riders within the group, each of them also riding at a fatiguing pace when at the front, would more than compensate the loss.

However, without any way to enforce global cooperation, for each rider a better scenario is that everyone else does fatiguing pulls at the front, while that individual rider sits in and rests. Every rider in the group would prefer this to occur, so for every rider within the group the incentive is to sit in and rest, or if at the front to take a brief, low-power, non-fatiguing pull.

In group 1, one rider, Nils, hammered most of the way himself. This was good for the group, since Nils is freakin' strong on the flats, but it hurt Nils' result.

In group 3, there was a strong tandem with Paul McKenzie and Paul Chuck (this is Low-Key, so we allow all sorts of bikes to ride together). Tandems are well known for being strong on the flats, and in this case the tandem led the way along the flattish section, everyone else within the group benefitting.

But in group 2, we had no tandem, and we didn't have Nils. It was going to be up to each rider to contribute, and not enough did. I took the first pull, from the start line, and one other. That would have been more than enough given the length of the opening section and the number of riders (10) in the group.

But too many riders weren't working. They were playing to the prisoner's dilemma. As a result, group 2, which should clearly have been faster than group 3, lost substantial time to that group here, and of course we lost even more time to group 1. Our scores suffered as a result.

This is shown in the following plot. The plot uses only male solo scores, for consistency. I have two curves: the blue curve is the rider's best score in the most recent year for which that rider has a score, with older scores depreciated 1%/year. This is what I used for group assignments. The other curve is the average score for riders in the group on Patterson Pass.
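A sketch of that qualifying score, under my reading that the 1%/year depreciation compounds with each year of age. The function name and the input layout (a dictionary from year to that year's scores) are hypothetical:

```python
from typing import Dict, Sequence


def qualifying_score(
    scores_by_year: Dict[int, Sequence[float]],
    current_year: int,
    rate: float = 0.01,
) -> float:
    """Best score from the rider's most recent scored year, depreciated per year of age."""
    latest_year = max(scores_by_year)
    best = max(scores_by_year[latest_year])
    age_years = current_year - latest_year
    return best * (1.0 - rate) ** age_years  # compounding is an assumption
```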

group performance

group performance 2

Although group 2 still did better than group 3, the differential between group 2 and their qualifying score is the greatest among all groups. It's clear group 2 could have done better overall, even if the "selfish" strategy may have helped some of the individual riders. Group 3, in particular, did well at Patterson due to the hard work of the tandem there. (Note: I fixed this from an initial version which showed group 6 also scored low. In the revision, I restricted the comparison to riders for whom old scores existed. It's important the same riders contribute to both curves.)

Thanks to Paul McKenzie, here's the average speed of the groups on the first 1.3 miles:


From Paul:

My conclusion from this limited data is that Group 2 was indeed soft pedaling at the beginning, and conversely, Group 4 put in quite a cooperative effort, putting about 24 seconds on Group 2 during the first part of the climb. Those in our group (Group 3) received the benefit of a nice tow to the base leaving them with fresher legs for the climb.

This may seem unfair to members of group 2. But we dug our own graves. A great thing about bike racing is its multiple facets. It's more than just a physical contest, it's also a mental game. And the prisoner's dilemma is a classic mental game.

By the way, this situation was very similar to handicap racing, where slower riders are given a head start, and the first across the finish line wins. There the penalty is more obvious, however: you can see your competitors up the road, or approaching from behind. Here the results aren't obvious until the math is complete on the start and finish times. But they're just as real.