New Study Predicts Marathon Performance from Wearable Data

Scientists use mathematical models to accurately predict marathon times — and when you're overtraining — from recorded training data. App to come. Watch this space.

When preparing for a marathon, one of the most critical steps is making an objective assessment of your training to determine what you are (and are not) likely to be capable of doing in the race. But unless you’ve run a fair number of marathons (and have done one or more benchmark workouts late in your training against which to assess your fitness), this can be more guesswork than science. Worse, it’s guesswork that is easily swayed by optimism, and if that optimism proves unwarranted, the latter part of the race can turn into a depressing slog.

Scientists from France and Finland, however, have come up with a way to make such predictions from smart-watch data logged during your last six months of training. In a study in today’s issue of Nature Communications (a companion to the prestigious journal Nature), they were able to estimate thousands of runners’ race times to within, on average, about 2 percent of the times they actually ran. To put that in perspective, that’s about 3½ minutes, one way or the other, for 3:00 marathoners: not perfect, but very good.
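The arithmetic behind that figure is easy to check. The snippet below is only a rough sketch, not anything from the study; the function name is made up for illustration, and the 2 percent default is the average error quoted above.

```python
def error_window_minutes(finish_minutes: float, relative_error: float = 0.02) -> float:
    """Return the plus-or-minus window, in minutes, implied by a relative error."""
    return finish_minutes * relative_error

# A 3:00 marathon is 180 minutes; 2 percent of that is 3.6 minutes.
window = error_window_minutes(180)
print(f"+/- {window:.1f} minutes")
```

By the same arithmetic, a 6:00 marathoner would be looking at a window of a little over 7 minutes either way.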

Nor is this model just for elites and fast people, says the study’s lead author, Thorsten Emig of the Université Paris-Saclay, France. His data, he says, includes runners as slow as 6:00.

There is, of course, plenty of scatter: the study applied to real-world runners in real-world marathons. Among other things, Emig’s model can’t account for factors such as weather, GI distress, terrain, or the fact that we don’t always run our best. Not to mention that sometimes we have one of those breakthrough days when everything goes right.

The prediction becomes even better if it is used to predict PRs, which presumably only happen on good terrain, without GI distress, and in good weather. Emig wasn’t able to do that for all of his runners, but in a smaller experiment, he used a variant on his model to estimate marathon PRs based on PRs in shorter races. When he applied that to the pros, it proved stunningly accurate, predicting Mo Farah’s 2:05:11 PR to within 9 seconds, Haile Gebrselassie’s 2:03:59 to within 8 seconds, and Eliud Kipchoge’s 2:01:39 to within 56 seconds.

Photo: Nature Communications

Leaving the Lab for the Roads

Normally, Emig says, if scientists want to evaluate a runner’s potential, they put them in the lab and measure such physiological factors as their VO2max. But VO2max on its own isn’t a good predictor of race performance. “The correlations are very bad,” Emig says. And while there are other variables such as running economy and lactate threshold that can be measured in the lab to fine-tune the estimate, the reality is simple: “Most people can’t go into the lab,” Emig says.

But with the advent of smart watches and websites such as Strava, he realized, you don’t need a lab to collect data. Instead, any runner who uses such devices already has a host of it from real-world performances, including hundreds of training runs.

Emig isn’t an exercise physiologist. He’s a theoretical physicist with a strong bent toward statistical physics. But he’s also a marathoner who ran a 2:50 marathon at age 48. In other words, he’s a dedicated runner, with a great deal of experience with complicated mathematics.

With the help of students and coauthor Jussi Peltonen of Polar Electro Oy, Kempele, Finland (the maker of Polar watches and other fitness trackers), he collected data from 1.6 million training sessions by 14,000 marathoners — a whopping 20 million kilometers of training data, in total.

“It’s a little mathematical,” he says — more than a bit of an understatement. But the bottom line, he says, is that two of the pieces of information he could tease out of each runner’s data proved to be critical.

Where Speed Meets Endurance

One is the runner’s speed when running at VO2max, sometimes referred to as maximum aerobic speed. “That’s typically the speed a runner can maintain for about six minutes,” he says, adding that even if you never actually run flat-out for that length of time, it can be estimated by how fast you are at longer distances.

The other critical bit of information is endurance, which Emig and Peltonen quantified with an endurance factor they call E1. That is simply the number of minutes you can run at 90 percent of maximum aerobic speed (roughly lactate-threshold pace), divided by 6. “This is a crucial second parameter,” Emig says.

It’s also highly variable. A runner with a maximum aerobic speed of 5 meters per second (5:22 per mile) could have an endurance factor of 12 (meaning they could hold 90 percent of that speed for 72 minutes), or as low as three, meaning they peter out after 18 minutes.
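For readers who want to plug in their own numbers, here is a minimal sketch of the definition as the article states it: minutes sustained at 90 percent of maximum aerobic speed, divided by 6. The function names are invented for illustration and are not from the study.

```python
METERS_PER_MILE = 1609.344
REFERENCE_MINUTES = 6.0  # the "about six minutes" a runner can hold maximum aerobic speed

def pace_per_mile(speed_m_per_s: float) -> str:
    """Convert a speed in meters per second to a min:sec-per-mile string."""
    total_seconds = round(METERS_PER_MILE / speed_m_per_s)
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}:{seconds:02d}"

def endurance_factor(minutes_at_90_percent: float) -> float:
    """Endurance factor: minutes held at 90 percent of max aerobic speed, divided by 6."""
    return minutes_at_90_percent / REFERENCE_MINUTES

def minutes_sustained(factor: float) -> float:
    """Invert the definition: how long a given endurance factor lasts at 90 percent."""
    return factor * REFERENCE_MINUTES

print(pace_per_mile(5.0))      # 5:22
print(endurance_factor(72))    # 12.0
print(minutes_sustained(3))    # 18.0
```

The same speed paired with factors of 12 and 3 gives 72 and 18 minutes at threshold effort, which is the spread the example describes.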

That, Emig says, spells the difference between a 2:40 marathoner and a 3:30 marathoner, even though both have the same maximum aerobic speed. “It’s all a matter of endurance,” he says.

To calculate these parameters and predict marathon times, he says, his model scanned running data from the six months prior to the marathon, looking for the fastest 5K, 10K, and half-marathon times. From these, it estimated both maximum aerobic speed and E1.
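The article doesn’t spell out how the model turns those times into the two parameters, so the sketch below covers only the first step: picking the fastest effort at each distance out of a training log. Everything about it is an assumption made for illustration, including the `(distance_km, minutes)` record layout, the 2 percent distance tolerance, and the simplification that each effort is a whole run rather than a segment inside a longer one.

```python
from typing import Dict, Iterable, Optional, Tuple

# Benchmark distances named in the article, in kilometers.
TARGET_DISTANCES_KM = {"5K": 5.0, "10K": 10.0, "half": 21.0975}

def fastest_efforts(
    runs: Iterable[Tuple[float, float]], tolerance: float = 0.02
) -> Dict[str, Optional[float]]:
    """Return the fastest time in minutes for each benchmark distance.

    A run counts toward a benchmark if its distance is within `tolerance`
    (as a fraction) of the target. Distances with no match stay None.
    """
    best: Dict[str, Optional[float]] = {name: None for name in TARGET_DISTANCES_KM}
    for distance_km, minutes in runs:
        for name, target in TARGET_DISTANCES_KM.items():
            if abs(distance_km - target) <= tolerance * target:
                if best[name] is None or minutes < best[name]:
                    best[name] = minutes
    return best

# Hypothetical six-month log: (distance in km, time in minutes).
log = [(5.0, 24.5), (5.05, 22.8), (10.0, 48.0), (21.1, 105.0), (8.0, 45.0)]
print(fastest_efforts(log))  # {'5K': 22.8, '10K': 48.0, 'half': 105.0}
```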

That assumes, of course, that the runner had actually done some hard training. But, Emig notes, if people never run hard in training, it’s likely that they may not push to the max in the marathon, either, so the predicted finish time might still be accurate.

When You’ve Trained Too Hard

But that’s not all the model can do. It can also spot at least one potential sign of overtraining.

To do that, Emig says, he used a metric called TRIMP (TRaining IMPulse) to assign a point value to every run in the preceding six months.

TRIMP isn’t a new concept, and there are a confusing number of versions of it online. But the one used by Emig (calculator here) is a research tool that assesses the potency of each run via a complex formula based on the duration of the run and the fraction of maximum aerobic velocity at which it is run, heavily weighted to favor hard workouts. “You get more points for 50 minutes hard, than one hour of jogging,” Emig says.

For example, a male running 90 minutes at marathon pace would earn about 220 points. The same male doing a 90-minute easy run would earn about 150 points. (The figures are slightly different for women.)
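The article doesn’t give the formula itself. One widely cited version is Banister’s TRIMP, which multiplies duration by an intensity fraction and an exponential weighting with different constants for men and women. The sketch below assumes that form, and assumes the intensity fraction is running speed as a fraction of maximum aerobic speed (Banister’s original uses heart-rate reserve), so it may not match the study’s calculation exactly. The intensities of 0.81 and 0.69 are guesses chosen to land near the figures above; they are not numbers from the study.

```python
import math

# Banister-style weighting constants (a, b) by sex. Assumed here, not taken from the study.
WEIGHTS = {"male": (0.64, 1.92), "female": (0.86, 1.67)}

def trimp(minutes: float, intensity: float, sex: str = "male") -> float:
    """Score one run: duration times intensity times an exponential weighting.

    `intensity` is the run's speed as a fraction of maximum aerobic speed
    (0 to 1). The exponential term is what favors hard workouts.
    """
    a, b = WEIGHTS[sex]
    return minutes * intensity * a * math.exp(b * intensity)

marathon_pace_run = trimp(90, 0.81)  # 0.81 is an illustrative guess at marathon-pace intensity
easy_run = trimp(90, 0.69)           # 0.69 is an illustrative guess at easy-run intensity
print(round(marathon_pace_run), round(easy_run))

# Fifty hard minutes outscore an hour of jogging under this weighting.
print(trimp(50, 0.9) > trimp(60, 0.6))
```

With those assumed intensities, the two 90-minute runs come out within a couple of points of the 220 and 150 quoted above.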

Based on that, Emig says, he found that runners who rack up more TRIMP points tend to have greater endurance…but only to a point. Somewhere around 20,000 points, the benefit plateaus, and above 25,000 it falls markedly (by about 25 percent).
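As a rough illustration of how those thresholds might be applied to a six-month total, here is a small sketch. The cutoffs are the approximate figures quoted above, treated as hard boundaries purely for simplicity; the function name and labels are made up, and nothing here comes from the study’s own code.

```python
from typing import Iterable

PLATEAU_POINTS = 20_000  # benefit levels off "somewhere around" this total
DECLINE_POINTS = 25_000  # endurance falls markedly above this total

def classify_training_load(run_points: Iterable[float]) -> str:
    """Label a six-month TRIMP total against the article's approximate thresholds."""
    total = sum(run_points)
    if total > DECLINE_POINTS:
        return "decline"   # past the point where endurance dropped off
    if total >= PLATEAU_POINTS:
        return "plateau"   # more points no longer buying more endurance
    return "building"      # still in the range where more load helped

# Hypothetical logs: 100 easy runs, 100 harder runs, 120 harder runs.
print(classify_training_load([150] * 100))  # building
print(classify_training_load([220] * 100))  # plateau
print(classify_training_load([220] * 120))  # decline
```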

Unfortunately, there is not currently a website or app that can do all of these calculations for you. But that might change soon; Emig’s team is thinking of putting their marathon predictor (and a TRIMP point counter) online, possibly for free, possibly for a fee.

“We are thinking to make this available hopefully by the end of the year,” he says, “at least in an experimental version.” When he does so, he says, he will announce details on his research website: