GPS accuracy comparison using Portola Valley Low-Key Hillclimb data

November 02, 2013

As I noted, when dealing with GPS problem cases in the Portola Valley Short-Hills version of the 2013 Low-Key Hillclimbs, I couldn't help but notice that every one of the cases I grappled with was an Edge 500. This is anecdotal, so I wanted to take a closer look at the problem.

The initial plan was to scrape the HTML from the Strava pages with a Perl app, since the API doesn't provide computer type. That didn't work out, since Strava requires user authentication to see this bit of info. I started thinking about PHP options, but finally, when I couldn't sleep last night, I just went to the pages sequentially and transcribed the computer identifier from the browser. Brute force. Not elegant. I feel so dirty.

There were 69 riders at Portola Valley who each reported the URLs of their Strava records. I then compared these using the root-mean-square average of the distance from the center of each checkpoint line to the point where the rider crossed it (units: meters). The ideal number isn't zero, because my lines are generally centered on the road and riders are to the right, and in any case there are other sources of error than rider GPS, but "perfect" would probably be 2 meters or so.

Here's the average by computer type:

computer	n	rms of rms_dl (m)
Osync Nav2Coach	1	2.90553
Forerunner 405	1	3.17032
Edge 305	2	3.78266
Strava iPhone app	5	6.5897
Edge 800	9	7.98625
Edge 705	4	8.95446
Mobile	1	9.02079
Edge 200	3	10.6593
Strava Android App	4	11.3408
Edge 510	8	18.5831
Edge 500	31	65.3486
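
Each row of the summary table appears to be the root-mean-square of the per-rider rms_dl values for that computer type. As a check, here is a minimal Python sketch using the two Edge 305 values from the detailed results further down; the function name `rms` is just illustrative, not taken from the original script.

```python
import math

def rms(values):
    """Root-mean-square of a list of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))

# Per-rider rms_dl values (meters) for the two Edge 305 riders,
# copied from the detailed results table.
edge_305 = [2.74391, 4.59216]

print(f"Edge 305: n={len(edge_305)}, rms of rms_dl = {rms(edge_305):.5f} m")
```

Because the values are squared before averaging, a single very bad ride dominates a group's score, which is worth keeping in mind when reading the Edge 500 row.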

I am told Scott Byers was using an Osync Nav2Coach, which Strava failed to identify. That unit clearly works extremely well: his score was close to ideal. Indeed, it's a testament to the superb registration of Google's satellite maps, based on which I placed the lines.

It's not as bad as it looks for the Edge 500, though. There are plenty of decent results from Edge 500's. It just has a virtual monopoly on the really bad results... curiously along with one screwed-up Joaquin from an Edge 510.

Detailed results (hopefully this renders properly) with numbers linked to Strava activities:

rank	num	rms_dl	name	computer
1	326	2.74391	Jeff Shute	Edge 305
2	49	2.85231	David Collet	Strava iPhone app
3	37	2.90553	Scott Byer	Osync Nav2Coach
4	58	3.11199	Andy Crews	Edge 705
5	132	3.17032	Stefano Profumo	Forerunner 405
6	156	3.22356	Todd Studenicka	Strava iPhone app
7	23	3.24592	Daniel Aminzade	Edge 800
8	204	3.34927	Bryn Dole	Edge 510
9	407	3.40899	Brandon Iles	Edge 500
10	65	3.41488	Giles Douglas	Edge 800
11	212	3.75249	Peter C Ingram	Strava iPhone app
12	125	3.90218	Frank Paysen	Strava Android App
13	83	3.9244	Rich Hill	Edge 500
14	53	4.12421	Tracy Colwell	Strava iPhone app
15	203	4.24395	Kevin Colagiovanni	Edge 500
16	12	4.29885	Will von Kaenel	Edge 800
17	411	4.32206	Lucas Pereira	Edge 705
18	412	4.32909	Kieran Sherlock	Edge 500
19	105	4.59216	Doug MacPherson	Edge 305
20	404	4.67041	Heidi Fraser	Edge 500
21	408	5.10658	Tom K.	Edge 510
22	152	5.14333	Daryl Spano	Edge 800
23	327	5.26018	Brandon Smith	Edge 510
24	150	5.27501	Gregory P. Smith	Strava Android App
25	413	5.36954	Liam Sherlock	Edge 705
26	414	5.59722	Tim Sullivan	Edge 500
27	415	5.68787	Jeff Weitzman	Edge 800
28	230	5.8452	Kris McQueen	Edge 500
29	171	5.94689	Phil Lovaglio	Edge 500
30	402	6.17513	Chris Evans	Edge 510
31	48	6.20068	John Clarke	Edge 510
32	147	6.41201	Marty Scott	Edge 200
33	401	6.42501	Gino Cetani	Edge 200
34	316	6.89468	Bogdan Marian	Edge 510
35	71	6.96707	Stephen Fong	Edge 500
36	166	7.1123	William Yee	Edge 510
37	135	7.35438	Mihai R.	Edge 500
38	207	7.46039	Robert Easley	Edge 800
39	95	8.01097	Mark King	Edge 500
40	27	8.66326	Kate Bergeron	Edge 800
41	223	8.72471	Eva Silverstein	Edge 500
42	410	8.8267	Paul McKenzie	Edge 800
43	403	8.96959	Scott Frake	Edge 500
44	35	8.97069	Sugar Brown	Edge 500
45	62	9.02079	Mike Davis	Mobile
46	133	11.6204	Alec Proudfoot	Strava Android App
47	14	11.7634	Rich McLovin Brown	Edge 500
48	406	12.9423	Martin Hyland	Strava iPhone app
49	328	13.2219	Ray Smith	Edge 500
50	160	13.2689	Luis Valente	Edge 500
51	151	13.9149	Kevin M. Smith	Edge 500
52	304	14.4409	Paul Cothenet	Edge 500
53	31	16.0769	Blue Brown	Edge 200
54	126	16.2337	Lisa Penzel	Edge 705
55	161	16.307	Greg Watson	Edge 800
56	114	17.2566	Shahram Moatazedi	Edge 500
57	79	18.3404	Bill Harkola	Strava Android App
58	73	19.1305	Chris Furgiuele	Edge 500
59	301	19.4436	Amy Bruski	Edge 500
60	300	20.7875	Billy Bob Brown	Edge 500
61	32	21.6406	Habañero Brown	Edge 500
62	400	30.3386	Michael Andalora	Edge 500
63	405	49.0466	Bruce Gardner	Edge 500
64	98	50.2294	Michael Kowalchuk	Edge 510
65	409	54.9151	Bill Laddish	Edge 500
66	122	109.659	Bart Niechwiej	Edge 500
67	209	146.149	Janet Gardner	Edge 500
68	318	186.928	Trish Pacheco	Edge 500
69	130	233.005	Mark Powers	Edge 500

So 9 of the worst 10 are Edge 500's. In contrast only 1 of the best 10 is an Edge 500. (Stretching the groups to 12 gives the same picture: 11 of the worst 12 and 1 of the best 12.)

31 of 69 are Edge 500's, so the probability of N out of 10 being Edge 500's, by luck alone, is (using the binomial distribution; Poisson statistics aren't good enough for Low-Key):

N	probability
0	0.257%
1	2.09%
2	7.69%
3	16.7%
4	23.9%
5	23.4%
6	15.9%
7	7.41%
8	2.27%
9	0.411%
10	0.0335%

So the probability, with luck alone, of no more than 1 of the first 10 being an Edge 500 would be 2.4%. The probability of at least 9 of the final 10 being Edge 500's is 0.44%. Treating the two as independent, the combined probability of both occurring is about 0.01%.

My pick of the number 10 was a biased pick, so this isn't really a fair comparison. But it's fairly clear the Edge 500 is particularly prone to position error. This is perhaps not representative of new Edge 500's.
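
The percentages in the table above match a binomial distribution with 10 draws and a success probability of 31/69. Here is a minimal Python sketch that reproduces them; the names are illustrative. Note that the binomial treats each slot as an independent draw, which is only an approximation when picking from a fixed pool of 69 riders.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent draws."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p_edge500 = 31 / 69   # fraction of riders using an Edge 500
n_slots = 10          # group size the table corresponds to

# Reproduce the table: probability that N of the slots are Edge 500's
for k in range(n_slots + 1):
    print(f"{k}\t{100 * binomial_pmf(k, n_slots, p_edge500):.3g}%")

# Tail probabilities quoted in the text
at_most_1 = sum(binomial_pmf(k, n_slots, p_edge500) for k in (0, 1))
at_least_9 = sum(binomial_pmf(k, n_slots, p_edge500) for k in (9, 10))
print(f"at most 1:  {100 * at_most_1:.2f}%")
print(f"at least 9: {100 * at_least_9:.2f}%")
```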

The Edge 500 was the most popular computer with 31. The Edge 800 was second, with 9. The third most popular was the Edge 510, with 8. If I do a ranking of all of the results, considering only Edge 500 and Edge 800, there are 40 total. In that ranking the Edge 800's rank 1, 3, 6, 9, 11, 16, 18, 20, and 28. So in that ranking, of the top 20 computers, 8 are Edge 800 and 12 are Edge 500. Of the bottom 20 computers 19 are Edge 500 and 1 is Edge 800.

Suppose I distribute 9 Edge 800's at random among 40 ranked slots. What's the probability that at most 1 would be in the 20 lowest-ranking slots (and at least 8 in the 20 highest-ranking slots)? The number of ways to put 0 in the bottom 20 and 9 in the top 20 is 167960. The number of ways to put 1 in the bottom 20 and 8 in the top 20 is 2519400. So the number of ways to do either of these is the sum: 2687360. The number of ways to distribute 9 among 40 is 273438880. The ratio is 0.983%, so that's the chance of this happening at random. This strongly suggests the Edge 800 is more accurate on average than the Edge 500. However, you can find plenty of good Edge 500 results.
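
The counting argument above is a hypergeometric calculation, and it can be checked in a few lines of Python with `math.comb`. This is only a sketch of the arithmetic described in the text; the variable names are made up for illustration.

```python
from math import comb

n_800 = 9              # Edge 800 units to place
top = bottom = 20      # slots in each half of the 40-slot ranking

# Ways to place 0 in the bottom half and 9 in the top half
ways_0_bottom = comb(bottom, 0) * comb(top, 9)
# Ways to place 1 in the bottom half and 8 in the top half
ways_1_bottom = comb(bottom, 1) * comb(top, 8)
# All ways to place 9 units among 40 slots
total_ways = comb(top + bottom, n_800)

prob = (ways_0_bottom + ways_1_bottom) / total_ways
print(ways_0_bottom, ways_1_bottom, total_ways)
print(f"P(at most 1 Edge 800 in bottom 20) = {100 * prob:.3f}%")
```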

So I've established the Edge 800 is likely better than the Edge 500. Is it better or worse than the Edge 510? 17 of the computers were either Edge 800 or Edge 510. Of those, the Edge 510's ranked 2, 5, 7, 9, 10, 11, 12, and 17. The Edge 800's ranked 1, 3, 4, 6, 8, 13, 14, 15, and 16. The Edge 800 did slightly better but it's too close to conclude anything from this.
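
One simple way to put a number on "slightly better" is to compare mean ranks. This is not a calculation from the original analysis, just a quick illustrative sketch using the ranks listed above.

```python
# Ranks within the combined Edge 510 + Edge 800 group (1 = best)
ranks_510 = [2, 5, 7, 9, 10, 11, 12, 17]
ranks_800 = [1, 3, 4, 6, 8, 13, 14, 15, 16]

mean_510 = sum(ranks_510) / len(ranks_510)
mean_800 = sum(ranks_800) / len(ranks_800)

print(f"Edge 510 mean rank: {mean_510:.2f}")
print(f"Edge 800 mean rank: {mean_800:.2f}")
```

The two means differ by well under one rank position out of 17, consistent with it being too close to call.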

There were 6 different Edge models at the Low-Key. There were 5 iPhones. The iPhones did better than 5 of the 6 Edge models; only the 2 Edge 305's did better than the iPhones. Between the Edge units and the iPhones, there were 62 activities. The 5 iPhone activities ranked 2, 4, 9, 11, and 42 of 62. So of the top 11, there were 7 Edges versus 4 iPhone apps. Of the bottom 51 there were 50 Edges and 1 iPhone app. I won't calculate the probability of this occurring by chance: it's small.
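
For the curious, the same kind of counting used for the Edge 800 versus Edge 500 comparison gives a rough number here. The sketch below assumes the 5 iPhones are dropped at random into 62 ranked slots and asks how often at least 4 land in the top 11; the names are illustrative and this is not a figure from the original analysis.

```python
from math import comb

n_iphone = 5
top, rest = 11, 51     # top 11 slots and the remaining 51 of 62

# Ways to get exactly 4 or exactly 5 iPhones in the top 11
favourable = comb(top, 4) * comb(rest, 1) + comb(top, 5) * comb(rest, 0)
total = comb(top + rest, n_iphone)

prob = favourable / total
print(f"P(at least 4 of 5 iPhones in top 11) = {100 * prob:.2f}%")
```

Under that assumption it comes out to roughly 0.27%, so "small" is a fair summary.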

It's interesting, because you'd expect a phone carelessly shoved into a pocket would be inferior to a specifically designed head unit mounted lovingly on the handlebars. On the other hand, the phones have two advantages. One is they are large. Larger = more room for an antenna. The first iPhone was infamous for its poor GPS antenna. I am told the antenna placement in the iPhone was late in the design process, so it was made to fit in available space, rather than being placed early in the process for better optimization. But iPhone users tend to upgrade their hardware, and I doubt there were any early-generation iPhones represented here. The other advantage phones have is they can access the cell towers and use those to help with position determination. Even if no GPS satellites are available, if the phone has access to at least 3 cell towers it can get a position fix. I don't know how much power the phones versus the Edge units are willing to devote to the GPS circuits.

Comparing the phones, the iPhone did a bit better than the Android, but I am reluctant to draw too many conclusions since Android runs on so many different hardware designs.

So lots of interesting stuff here. The conclusion is that among the Edge units, the 500 has the most trouble. The 800 is, with high probability, better than the 500, and the 800 and 510 are close. By inference the 510 is better than the 500. The iPhone app does well, even in comparison to the Edge units. And older Edge models (the 305 and 705) seem to do about as well as the newer ones. There were no Edge 810's in the mix.

Comments

U. Block said…

Hmmmm... I had a 1st-generation iPhone. There was no GPS. None.

So yes... it probably was infamous for poor GPS reception. :)

I also think it's possible the Edge 500 has MORE room inside of it for an antenna than a recent iPhone. Space is extremely limited in there.

Thanks for the analysis.

November 2, 2013 at 7:08 PM

djconnel said…

Thanks for the correction! iPhones in common circulation circa 2010 seemed to produce poor results. These were perhaps the iPhone 3G or 3GS (Wikipedia).

November 2, 2013 at 7:18 PM

Unknown said…

I wonder if the rate of sampling is relevant on the Garmins (ie. every second vs "smart" sampling). I'm guessing this isn't exposed in the data from Strava as they've already processed it.

Re. logging in to scrape pages, I've had luck with CasperJS. It's handy for most sites being webkit, although I don't think it copes with HTML5 stuff such as local storage.

November 2, 2013 at 7:18 PM

Tom Anhalt said…

Can you follow up to ask how many of those Edge units are set on (not so) "Smart Recording"? I'm thinking that might have an effect.

November 3, 2013 at 7:12 AM

djconnel said…

Tom: good idea. DC Rainmaker suggested the same thing. I of course could determine that fairly easily.

November 3, 2013 at 7:19 AM

Robert said…

Very nice.

Do the 800's have a different default setting for "smart recording" than the 500's?

November 3, 2013 at 9:44 AM

djconnel said…

Great comment emailed to me from Patrick, who doesn't have a Google account:

Fascinating analysis, Dan (as always)! Thanks so much. I have owned two Garmin 500s. The first was absolutely horrendous, dropping segments constantly. I ended up exchanging it at REI eventually and the replacement is also poor ("poor" compared to the 500s of some frequent ride companions) but barely serviceable.

It so happens that my regular ride partner tracks her rides with a Garmin Forerunner 310XT. And the GPS tracking is consistently better and more accurate than any Garmin 500 in the group. When I dealt with Garmin customer service a number of times, they tried to convince me that all of the dropped and lost segments must be a product of "trees and cloud cover" which might be plausible except that the rider next to me using a 310XT never had any such problems...

Garmin then informed me that the Forerunner 310XT uses a "totally different technology" to lock into satellites. I gather this is the "HotFix technology" which Garmin integrates into many of their automotive products. Per DC Rainmaker, "Yes, the FR310XT has a newer chip than the FR305, including hotfix technology for quicker pickups."

Why the Edge 500 does not incorporate this technology too, I could only speculate. But I am guessing that the Forerunner is primarily intended as a trail running and lake swimming watch and needed a higher grade of GPS signal detection than Garmin believed the Edge series needed for cycling applications. Of course, the wide range of 500 results shown in your test (from horrendous to good) probably points most directly to quality control problems in their production more than anything else.

Discussing these issues a bit with Paul Mach who developed the SNAP tool and now works for Strava, I also noted that a huge underlying problem with Strava segments is that many segments are originally drawn or created with a lot of inherent GPS drift.

On a segment I cover almost every week (~4 miles), we tested this hypothesis a bit by creating as exactly parallel a new segment as possible to an old segment using the Forerunner 310XT data. The new segment timing pretty much exactly corresponds to stop watch timing, whereas the original one consistently gives times about 10-12 seconds longer than the new Strava segment or the wrist watch. I assume this is a function of GPS drift. Even though the times are longer for the same segment, Strava calculates the avg. speeds significantly higher for the original segment, presumably because it believes that more distance was covered in the longer amount of time.

November 3, 2013 at 5:42 PM

djconnel said…

Robert: both the 500 and 800 default to smart recording unless power is being recorded. Power analysis software which calculates normalized power can be confused if data are not provided every second. Same deal with maximal power curves. So Garmin decided that power analysis required uniform sampling. But they didn't anticipate Strava, where position detection would also benefit from high resolution. They figured position was just to draw maps of where you'd been.

November 3, 2013 at 5:44 PM

Michael Barnes said…

Dan, Interesting, thanks. A few thoughts:

On 1-2 week Santa Rosa Cycling Club tours, I started bringing a deep-cycle 12 v. marine battery and an inverter so that people could recharge all their devices (phones, cameras, Garmins, etc.).

I always encouraged people to leave their phones turned off, because in remote areas, the phone cranks up its power to attempt to ping non-existent towers. iPhones that were left on needed to be recharged daily, while my Garmin 500 would last almost a week of 70-mile daily rides.

On the last tour, some people used their iPhones for GPS bike apps, and had to leave their phones on, leading to a huge crowd of people wanting to recharge every night.

I suspect part of the relatively good quality of iPhone GPS is that it might have more battery power available, in addition to bigger antenna. I know GPS is a power hog, because when I use my Garmin 500 with my powertap wheel on a trainer, and turn off GPS function, the battery lasts forever.

Personally, I think the Garmin 500 is terrible from an ergonomic standpoint. I'm not sure mine is a great GPS unit, either. For a while I used it while I ran laps on the local HS track (a tough test of a GPS unit, I admit) and in the tree-lined streets of the Berkeley Hills. The Garmin was all over the place--literally.

I always use an old-fashioned wired bike computer alongside the Garmin. Leonard Zinn says he just puts his Garmin in his jersey pocket. They are very useful as "black boxes" in case of an accident, which is a main reason I continue to use it.

It also really annoys me how bad the Garmin is at providing ride information, especially splits, on the road without having to close out the current ride file. I'll take my old Cateye ATC 3000s any day. I have three, two have been working for almost 20 years, the third one finally died.

So maybe Garmins, like low-end digital cameras, will be replaced by smart phones and their apps. Smart phones are becoming the digital equivalent of the Swiss Army knife.

Michael Barnes
former (yet still appreciative) LKHC'er

November 10, 2013 at 7:22 PM

djconnel said…

Great stuff, Michael!

I agree with everything.

When I use my Android-based Droid Incredible to run Strava for nontrivial rides, I always run in Airplane mode, both for the reason you cite and because my battery is fatigued. I'm surprised how well it does.

On the Edge 500: I agree. I like to believe it was my suggestion of "Last lap power" that was responsible for getting even that into the unit. You used to have to go through history, which was really terrible. As it is you can see only last lap power: no scrolling through laps. But it's still very useful: when doing an interval I want to see distance, time, power, lap-average power, and last lap power. For last-lap time or distance I need to wait. I do, however, like the form factor: it's light and small, unlike the clunkier Edge 510. So "black box" is close to true.

On black boxes: I wish cars all had them, not just me.

November 11, 2013 at 5:10 AM

jonah said…

Dan this is great stuff. I wanted to ask for more detail on how you computed the error in each person's GPS data. In your post you said, "I then compared these using the root-mean-square average of the distance from the center of the lines the riders triggered the lines (units: meters)."

I've read this a few times and it's not clear to me what exactly you did. I'm also interested because I'm going to have my hands on both an Edge 500 and 510 next week and I wanted to quantitatively compare their GPS accuracy.

December 4, 2013 at 5:47 PM

djconnel said…

Based on a Google map, I defined "lines" riders needed to cross to complete the course. These lines had a center point, a right-hand-edge point, and a left-hand edge the same distance from the center as the right but in the opposite direction. So I expected the rider trajectories to intersect these lines somewhere across the road. The lines were much wider than the road, however, to accommodate GPS error.

So after determining the intercept of the rider trajectories with these "lines", I determined how far from the center the rider crossed each line. Of course the "perfect" answer isn't zero. But it is some small number of meters.

So I calculated the root-mean-square such distance over the multiple checkpoint lines for each rider, then compared results based on the GPS unit the rider was using. The Edge 500's tended to be the largest crossing distances from the line-centers.

If there's an error along the direction of travel, I wouldn't detect it: just lateral, perpendicular to the road direction, since my lines crossed the roads, versus running along them.
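
The procedure described in this reply can be sketched in Python as follows. This is an illustration under stated assumptions, not the actual script: it assumes positions have already been projected to flat local x/y coordinates in meters, and the names `crossing_offset` and `rms_offset` are hypothetical.

```python
import math

def crossing_offset(center, right_edge, track):
    """Distance (m) from the line center to where the track first
    crosses the checkpoint line, or None if it never crosses."""
    cx, cy = center
    # Vector from center to right edge; the left edge is its mirror image
    hx, hy = right_edge[0] - cx, right_edge[1] - cy
    half_width = math.hypot(hx, hy)
    ux, uy = hx / half_width, hy / half_width   # unit vector along the line
    nx, ny = -uy, ux                            # unit normal to the line
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        # Signed distances of the two track points from the line
        d0 = (x0 - cx) * nx + (y0 - cy) * ny
        d1 = (x1 - cx) * nx + (y1 - cy) * ny
        if d0 == d1 or d0 * d1 > 0:
            continue  # segment parallel to the line, or both points on one side
        t = d0 / (d0 - d1)  # fraction along the segment where it meets the line
        px, py = x0 + t * (x1 - x0), y0 + t * (y1 - y0)
        along = (px - cx) * ux + (py - cy) * uy  # lateral offset from center
        if abs(along) <= half_width:
            return abs(along)
    return None

def rms_offset(lines, track):
    """RMS crossing distance over all checkpoint lines the track crosses."""
    hits = [d for d in (crossing_offset(c, r, track) for c, r in lines)
            if d is not None]
    return math.sqrt(sum(d * d for d in hits) / len(hits)) if hits else None

# Made-up example: road runs along +x, two 40 m wide lines across it,
# rider stays 3-4 m to the right of the center.
lines = [((0, 0), (0, -20)), ((100, 0), (100, -20))]
track = [(-10, -3), (10, -3), (90, -4), (110, -4)]
print(rms_offset(lines, track))
```

As noted above, only the lateral component is measured: an error along the direction of travel moves the crossing time but not the crossing point.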

December 4, 2013 at 8:07 PM

Diablo Scott said…

I realize this post is a couple months old, but it would be really interesting to do a similar comparison with the altitude numbers.

January 2, 2014 at 4:12 PM

djconnel said…

That's a really good idea. Unfortunately I didn't set the upload script to transfer altitude on this one. But I later fixed that. I have altitude in the Montara Mountain dataset. I'd need to collect the computer types for those, however.

January 2, 2014 at 4:21 PM
