By Jordan Rapp
“Sport is always desperate to empower people who can give them information—statistics, diet sheets or training programs—as though information is the only form of advantage. But it isn’t. The greatest competitive advantage is the ability to use existing information better than the opposition, to be trained in critical thinking. This, of course, belongs to a much longer-standing tradition: philosophy.” This passage comes from Ed Smith, a former professional cricketer and gifted writer for the New Statesman, who is one of the more eloquent voices speaking out against the growing datafication of sport.
Smith recently wrote a very good article about Theo Epstein, the Major League Baseball general manager who helped deliver the Chicago Cubs their first World Series title in 108 years. Epstein is something of a curse-breaker, having previously helped crack the 86-year-old Curse of the Bambino and deliver the Boston Red Sox three World Series titles in a span of 10 years. In his profile of Epstein, Smith argues that coaches and athletes are, primarily, decision makers.
But although the role of sports scientists and medical professionals is increasing in prominence, the results very often do not support this shift in the balance of power. Toni Minichiello, coach of the Olympic Champion heptathlete Jessica Ennis-Hill from the U.K., was the focus of an article in The Telegraph titled “Society must not waste the wisdom of coaches.” Minichiello posted the article on Twitter with this comment: “Coaches in UK have been marginalized in funded sports. Medics/science leads and takes no accountability if it fails.” And this is increasingly true. There are ever more tools that claim to use “big data” to do “predictive analysis.” Virtually every tagline rehashes the old cliché “train smarter, not harder”—with a few of the latest buzzwords added.
The trader and statistician Nassim Nicholas Taleb offers this assessment of the idea of “intelligence”: “[Intellectual yet idiot]s think that intelligence is about noticing things are relevant (detecting patterns); in a complex world, intelligence consists in ignoring things that are irrelevant (avoiding false patterns).”
“Intellectual yet idiot” is a Taleb-ism used to criticize people who are supposed experts in a given field, yet who lack any sort of common sense or practical experience. Taleb is especially fond of the distinction “skin in the game.” Skin in the game means, fundamentally, that your mistakes cost you something. Sport—at least for athletes—is one of the great examples of literal “skin in the game.” But as Minichiello said, rarely do these increasingly prominent advisors bear any sort of responsibility when things go wrong. They are, of course, happy to claim a piece—a large piece—of the credit for success, claiming that it was better use of previously overlooked data, marginal gains and clever insights that won the race, not the athlete. But when the athlete fails, these same people disappear. Don’t hang the loss on them. It’s not their fault that [insert entirely predictable scenario here] couldn’t be accounted for in their modeling.
This isn’t to say that data isn’t valuable. Objective measurement and tracking of those objective measures over time can be hugely important, especially in a sport where you are racing against the clock. Timed events are great, because the winner is simply the person who goes the fastest. If it takes you less time to run up to the top of the same mountain, it’s much more likely that you are getting faster than that the mountain is getting smaller.
Power meters on the bike ushered in greater objectivity to training load measurement, since times for cycling—especially for flat-time efforts—are much more affected by environmental factors than laps around a track or, especially, laps in the pool.
But it’s also important to remember that podium places are not awarded to the person with the highest watts per kilogram or the highest watts per CdA. Ultimately, on the bike, it’s about hours, minutes and seconds, no different from swimming or running. But even running has started to shift away from the classic measure of simply timing how fast you ran.
Running power meters claim to offer vastly improved metrics, but so far there is very little evidence of any benefit. No world records have yet been set on the track or road by athletes using running power meters to guide their training. While that will almost inevitably change, it will take more than one such performance to convince me that it happened because of this device rather than in spite of it.
To me, far and away the single best use of data is in the moment. How fast are you swimming right now? In this regard, GPS watches have been a true boon to runners, allowing athletes (like me) who don’t have convenient access to a track to make their own track anywhere they have a clear view of the sky.
Overwhelmingly, I use GPS simply like a pace clock—to tell me how fast I am running in the moment. I like having the data, but knowing what I did is not nearly as useful as having a guide to help me with what I want to do—e.g., 5 × (1 km at 3:20, 1 km at 3:50), which is a staple pre-Ironman workout for me. Yes, I’ve done this workout a bunch of times, so it’s nice to be able to compare it to prior efforts. But the real value is knowing how I am executing in the moment.
I use a power meter in a similar way. Overwhelmingly, I use it to look at the data while I am riding. In certain cases, I do some analysis immediately after the workout. And very rarely, I reference it against past performances of the same type. But typically, I only compare simple measurements.
The most common is another staple workout—a 30-minute all-out time trial. What I overwhelmingly care about is my average power over the 30 minutes. I might compare pacing (how I metered out my efforts), but I’ve been doing this long enough that I pretty well know how to pace it to get the best out of myself. And in all cases, the second most important piece of “data” is how I felt.
I actually discovered that my power meter was miscalibrated because I felt quite good and rode pretty dang far on watts that were clearly too low. How you feel is a hugely important metric, and one that, in general, still can’t be logged as a number in many of the most popular and supposedly advanced training logs and data capture packages. I’ve heard rumors that TrainingPeaks is going to add a field for a numeric RPE (rating of perceived exertion) score, but the fact that it has lagged behind metrics like left–right power balance is shocking to me.
There is very strong research in support of tracking RPE. If you can run the same speed at a lower level of effort, that’s a pretty good sign of increasing fitness. In this way, what I like about data is not the patterns it reveals; it’s the things it allows me to ignore. I like data when it makes my life less complex. I say this as someone who has engaged in massive data analysis projects. I love looking for the signal in the noise. But I also try to surround myself with people who point out when I am overthinking things and simply torturing the data until it cries uncle and offers up something that can at least pretend to be a signal. While I have certainly found patterns, post hoc, I’ve found those patterns most useful because of the reflection they offer of my holistic approach to training. In other words, a consistent and thoughtful approach to training yields nice graphs that reflect consistent training load and progression. But it’s definitely not clear—especially in a sport like triathlon, where you have three individual sports to manage as well as their interaction with each other—that you can use this in a predictive way. While consistent training leads to nice graphs, nice graphs don’t necessarily lead to consistent training.
This is because overwhelmingly, the data that is generated during training is outputs, not inputs. The only input you have is subjective—how hard did it feel? How much effort were you putting in? Heart rate is the notorious one that confuses people here. A low heart rate is typically—but critically not always—reflective of low effort. But heart rate is an output; it is the reflection of a whole host of biological systems that manifest in a single number—how many times per minute your heart is beating. That’s it. This is not to say that heart rate data is not useful, but it’s important to remember that your heart rate is simply your heart rate. Now, you can use that in all kinds of ways. But it’s always an output. You can seek to adjust your input—effort—such that you produce a certain heart rate. But that’s not the same as making heart rate an input. Lowering your heart rate is not the same as lowering your effort. If you want to see this demonstrated most convincingly, hold your breath. Your heart rate will automatically drop (until you start breathing again). Your effort, most assuredly, will not.
I do think large data sets can be useful to identify limits. There is strong evidence for the benefit of using analytical tools to adjust your “ramp rate,” which is how quickly you increase training. When your short-term training load (ATL)—measured over about a week—exceeds 150 percent of your long-term training load (CTL)—measured over about six weeks—that’s a red flag.
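The red-flag rule above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not any training package's actual method: it assumes one training-load number per day (in whatever unit you track), it uses plain 7-day and 42-day averages to stand in for ATL and CTL, and the function name `ramp_rate_flag` is hypothetical. Commercial tools often use exponentially weighted averages instead, so their numbers will differ.

```python
from statistics import mean


def ramp_rate_flag(daily_loads, short_days=7, long_days=42, threshold=1.5):
    """Return (ratio, flagged) for the most recent day in daily_loads.

    daily_loads: one training-load value per day, oldest first.
    Assumption: simple averages over the last 7 and 42 days stand in
    for short-term (ATL) and long-term (CTL) training load.
    """
    if len(daily_loads) < long_days:
        raise ValueError("need at least long_days of history")
    short_term = mean(daily_loads[-short_days:])
    long_term = mean(daily_loads[-long_days:])
    if long_term == 0:
        raise ValueError("long-term load is zero; ratio is undefined")
    ratio = short_term / long_term
    # Flag when short-term load exceeds 150 percent of long-term load.
    return ratio, ratio > threshold


# Hypothetical history: five steady weeks at 50 units/day, then a week at 100.
history = [50] * 35 + [100] * 7
ratio, flagged = ramp_rate_flag(history)
print(f"short/long ratio: {ratio:.2f}, red flag: {flagged}")
```

In this made-up history, doubling the daily load for a single week pushes the ratio to about 1.71, which is over the 150 percent line. A steady history gives a ratio of 1.0 and no flag.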
But what’s most interesting to me is that one of the seminal studies on this phenomenon used a totally subjective metric, RPE, to determine training load. For example, if over the past week you’ve averaged a 7 out of 10 RPE but for the past two months, you’ve been averaging more like 3 out of 10, that should be a warning.
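To put a number on that example, here is a small Python sketch. It assumes the same 150 percent threshold is applied directly to average RPE, which is a simplification for illustration and not necessarily how the study computed load. The function name `rpe_warning` is hypothetical.

```python
def rpe_warning(recent_avg_rpe, baseline_avg_rpe, threshold=1.5):
    """Return (ratio, warning) comparing recent RPE to a longer baseline.

    Assumption: the 150 percent rule is reused on raw RPE averages.
    """
    if baseline_avg_rpe <= 0:
        raise ValueError("baseline RPE must be positive")
    ratio = recent_avg_rpe / baseline_avg_rpe
    return ratio, ratio > threshold


# Values from the example in the text: 7 out of 10 recently, 3 out of 10 before.
ratio, warning = rpe_warning(7, 3)
print(f"recent/baseline RPE ratio: {ratio:.2f}, warning: {warning}")
```

A week at 7 against a baseline of 3 is a ratio of about 2.33, well past 1.5, so no power meter is needed to see the problem.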
I love data. I love analyzing it. It has led me to positive conclusions much more often than it has led me astray—but there are absolutely times when it did lead me astray. And when it’s been the most valuable, it was because it allowed me to ignore other factors. I like racing with power because when I race, I have only two things to pay attention to—time and power. That’s it. I like data because it allows me to be more fully engaged when I am training, which—unless my brain is searching for something to latch onto, which it often does—allows me to be less engaged when I am not.
This is not in my nature. But this has been the lesson that data has beaten into me over and over again. Data works best when it allows me not to be a scientist, but to be a philosopher. In this way—and in this way alone—data becomes an input. What’s the output? My decisions.