Should Whoop change its name to Whoops?

Last month, I started using a Whoop. I hoped that it would guide me toward better recovery. Unfortunately, the Whoop's activity tracking is too inaccurate and imprecise to be of much use. I sent it back for a refund.

The Whoop is a 24-hour activity tracker marketed to athletes. It purports to quantify “strain” and recovery. As it observes increased strain, it prescribes increased recovery. In theory, it’s a great idea.

“In theory there is no difference between theory and practice. In practice there is.”
Benjamin Brewster

Working out with a Whoop

For each workout with the Whoop, I also wore a chest-strap heart rate monitor. The chest strap was a Garmin HRM-Run and I recorded the data with a Garmin Fenix 2. I wanted to compare the Whoop data to data from a traditional heart rate strap.

Test Workouts

Here’s a breakdown of the workouts I did and what each device recorded:

| Duration & % Max HR | Device | Avg HR | Max HR | Calories | RPE |
|---|---|---|---|---|---|
| ~2h, <= 70% | Garmin | 133 | 150 | 643 | Self: Easy |
| | Whoop | 135 | 169 | 1699 | Whoop: "Significant" |
| ~2h, <= 60% | Garmin | 123 | 145 | 457 | Self: Very Easy |
| | Whoop | 136 | 176 | 1664 | Whoop: "Significant" |
| ~90', 65-90% | Garmin | 140 | 188 | 615 | Self: Moderate |
| | Whoop | 120 | 165 | 966 | Whoop: "Reasonable" |
| ~60', 65-90% | Garmin | 139 | 189 | 255 | Self: Hard |
| | Whoop | 134 | 186 | 766 | Whoop: "Reasonable" |

As you can see, Whoop’s accuracy (gauged by average heart rate) is okay in all but the third workout. But its precision (gauged by maximum heart rate and the stochasticity in the charts below) is horrible.
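To make the gaps easier to see, here is a minimal Python sketch that works out the Whoop-minus-Garmin differences for each workout. The heart rate and calorie values are copied from the workout breakdown above; the names `workouts`, `compare`, and `comparison` are only illustrative.

```python
# Values copied from the workout breakdown above: (avg HR, max HR, calories).
workouts = {
    "~2h <= 70%": {"garmin": (133, 150, 643), "whoop": (135, 169, 1699)},
    "~2h <= 60%": {"garmin": (123, 145, 457), "whoop": (136, 176, 1664)},
    "~90' 65-90%": {"garmin": (140, 188, 615), "whoop": (120, 165, 966)},
    "~60' 65-90%": {"garmin": (139, 189, 255), "whoop": (134, 186, 766)},
}


def compare(garmin, whoop):
    """Return Whoop-minus-Garmin heart rate differences and the calorie ratio."""
    g_avg, g_max, g_cal = garmin
    w_avg, w_max, w_cal = whoop
    return {
        "avg_hr_diff": w_avg - g_avg,             # bpm, positive = Whoop reads higher
        "max_hr_diff": w_max - g_max,             # bpm, positive = Whoop reads higher
        "calorie_ratio": round(w_cal / g_cal, 2), # Whoop calories / Garmin calories
    }


comparison = {name: compare(d["garmin"], d["whoop"]) for name, d in workouts.items()}

for name, row in comparison.items():
    print(
        f"{name:12s} avg {row['avg_hr_diff']:+4d} bpm  "
        f"max {row['max_hr_diff']:+4d} bpm  calories x{row['calorie_ratio']}"
    )
```

Run against the numbers above, the Whoop calorie estimates come out at roughly 1.6 to 3.6 times the Garmin estimates.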

Heart Rate Recordings

Next up, here are the comparative heart rate recordings for the last three of the workouts above. (The first workout’s recordings are both too stochastic to compare.)

A very easy recovery session

On the left is the Whoop data. It's incredibly stochastic (i.e. jagged) and the calories are grossly overestimated. On the right is the Fenix data. It's fairly stable, similar to the workout, and the calorie estimate is in line with the perceived effort (which was very low).

An aerobic capacity lactate test

The blue line is the Whoop heart rate data. The red line is the data from a NordicTrack iFit heart rate strap recorded by a Garmin Fenix 2. The Garmin data makes it clear that the test was a step-test of increasing intensity. No such interpretation is possible with the Whoop data.

An anaerobic capacity lactate test

Note that accuracy and precision improve in the last workout. Is there hope? Not unless you want to wear the Whoop as a sleeve garter.

As above, the blue line is the Whoop data; the red, the Fenix and iFit strap. Here, the Whoop precision has improved due to changing the location of the device. Rather than wear it on my wrist, I positioned the monitor above my elbow over my brachial artery. However, it’s neither comfortable nor practical to wear it there 24 hours a day.

The precision improved because I changed the location of the Whoop. Instead of on my wrist, I wore it on my arm just above my elbow with the sensor on top of my brachial artery. It’s not the most comfortable location, and wearing it there isn’t something that I would do 24 hours a day.

Automatic activity tracking

Lastly, here’s an instance of the Whoop’s automatic activity tracking. The Whoop thought that this 38-minute period from Tuesday afternoon was worth tracking. It says that my heart rate averaged 110 bpm (52% of maximum) with a peak heart rate of 166 (79% of maximum).

The Whoop spontaneously recorded an activity while I was cleaning the house. Based on experience, it's highly unlikely that my average heart rate was 110 bpm with a peak of 166...

The only problem? That period was 38 minutes of house cleaning.

I didn’t wear my chest-strap monitor to compare. Why would I? It’s highly unlikely that I would hit those heart rates while cleaning the house.

The biggest problems with Whoop

Heart rate readings are roughly accurate but horribly imprecise

The average heart rates that the Whoop records are usually within a few beats of what my chest-strap monitor records. That’s accurate enough. But the precision (how narrow the range of the recording is) is so horrible that the overall intensity of activities is way off the mark.
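The difference between accuracy and precision can be sketched in a few lines of Python. The samples below are made up purely for illustration (they are not readings from the workouts above): a reference strap holding a flat 130 bpm and a hypothetical jagged optical trace. The sketch treats the mean error as a rough stand-in for accuracy and the standard deviation of the errors as a rough stand-in for precision, which is one common way to frame it, not the only one.

```python
from statistics import mean, stdev

# Made-up samples for illustration only, not recorded data.
reference = [130, 130, 130, 130, 130, 130]  # hypothetical chest-strap readings
optical = [112, 151, 124, 145, 109, 139]    # hypothetical jagged wrist readings

# Per-sample error of the optical sensor relative to the reference.
errors = [o - r for o, r in zip(optical, reference)]

bias = mean(errors)     # accuracy: how far off the readings are on average
spread = stdev(errors)  # precision: how widely individual readings scatter

print(f"bias {bias:+.1f} bpm, spread {spread:.1f} bpm")
```

In this invented example the average error is zero, so the optical trace looks accurate on paper, yet individual readings miss by up to about 20 bpm in either direction. That is the pattern described above: a plausible average sitting on top of a reading that is useless from moment to moment.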

The first two workouts were easy recovery sessions. They were so slow that I had to concentrate to stay within the desired zone. Yet Whoop described each of them as “significant cardiovascular load”.

The last two workouts were lactate tests, aerobic and anaerobic respectively. Whoop described both as “reasonable, enough to make you stronger, not enough to burn you out”. But doing a near-maximal and a maximal test every day would be sure to burn anybody out.

For the aerobic lactate test, I had both the Whoop and Fenix heart rates displayed in front of me. During my warm up, the Whoop was reading a heart rate 30 beats higher throughout. That’s crazy.

In addition, when the heart rates displayed were similar, the Whoop took noticeably longer to get to the same number. There’s always a lag between output and heart rate, but the lag for the Whoop was considerably longer. The longer the lag, the harder it is to train precisely.

The Whoop has no options for personalized thresholds

Heart rates are like fingerprints. Everyone is unique, and proper training needs to allow for that individual variation. All the heart rate software that I’ve ever used has personalization options to account for this. In most cases, a user can enter an anaerobic threshold heart rate or, at the very least, a maximum heart rate.

Whoop doesn’t allow for any personalization at all, which makes it of very questionable use for serious athletes.

This is a breakdown of the intensity zones that Whoop uses and what it detected during a ~2-hour recovery ski. The actual HR intensity was < 60% of maximum, but Whoop detected a "significant cardiovascular load". In addition to its data being useless, it uses generic zones that are of no use to a trained athlete.

Without thresholds, the “strain” metrics that Whoop calculates will be off the mark. For athletes with higher than average heart rates, Whoop will overestimate the strain. For those with lower than average heart rates, Whoop will underestimate the strain.
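A rough Python sketch shows why a generic maximum skews the picture. Every number here is hypothetical, and nothing below is Whoop's actual formula, which I don't know. It only assumes that intensity is judged as a percentage of some maximum heart rate.

```python
def percent_of_max(heart_rate, max_heart_rate):
    """Express a heart rate as a percentage of a given maximum."""
    return 100 * heart_rate / max_heart_rate


# Hypothetical numbers for illustration; none of them come from Whoop.
GENERIC_MAX = 180   # assumed one-size-fits-all maximum
PERSONAL_MAX = 200  # assumed tested maximum of a high-heart-rate athlete

easy_session_hr = 130  # hypothetical heart rate during an easy session

generic_pct = percent_of_max(easy_session_hr, GENERIC_MAX)
personal_pct = percent_of_max(easy_session_hr, PERSONAL_MAX)

print(f"against the generic maximum:  {generic_pct:.0f}% of max")
print(f"against the personal maximum: {personal_pct:.0f}% of max")
```

With these assumed values, the same easy session lands at about 72% of the generic maximum but only 65% of the athlete's own maximum. Being able to enter a maximum or threshold heart rate removes that skew.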

That’s a big problem. High-heart-rate athletes will tend to undertrain. Low-heart-rate athletes will tend to overtrain.

I’m in the former category, and Whoop misjudged all four of my sessions. It overestimated the strain from my recovery sessions and underestimated my lactate tests.

It’s strange for a device to market itself to athletes and not have customization options. It seems like an amateurish oversight.

Wrist-based heart rate monitoring is unreliable

Whether it’s Whoop or something else, optical heart rate monitors produce unreliable data. The heart rates recorded have wide swings in beats per minute, especially when worn on the wrist.

“The popularity of optical heart-rate monitors…is largely due to the convenience and low cost… [D]uring periods of physical activity, accurately estimating a heart rate using these devices remains challenging.”
~ The Conversation, How reliable is your wearable heart-rate monitor? (June 19th, 2018)

“If you need to know your heart rate with accuracy when exercising…wrist-worn monitors are less accurate than the standard chest strap.”
~ American College of Cardiology, Wrist-worn heart rate monitors less accurate than standard chest strap (March 8th, 2017)

“…if you’re serious about your sport and data today, then steer clear of the wrist monitor. It’s a great way for entry-level runners and fitness fans to take control of their plans, but for those using the data to train, it’s not up to the task.”
~ Wareable, Optical HR Accuracy: The experts speak (February 12th, 2016)

I can’t imagine trying to train with a heart rate monitor that displayed numbers so widely varied. Hitting a target intensity would be next to impossible. Without that precision, a workout would be pointless. You’d miss the desired stimulus and get an unplanned, and perhaps undesirable, response.

Should Whoop change its name to Whoops?

I had high hopes for the Whoop. But with how expensive it is, the quality of its data should be much, much better. I can see now that the thing that Whoop does better than the competition is marketing and graphic design.

The good news for Whoop

I assume that Whoop’s business plan is to go after users who don’t require precision in their heart rate data. This could include people new to exercise, those whose activity is of low aerobic intensity (e.g. golf or yoga), and those who only require a rough estimate of their output (e.g. team sports). If an average heart rate and a rough strain gauge are all that’s required, then Whoop may be the right tool for the job.

Most users won’t know what an anaerobic threshold is. Even worse, many athletes don’t know what an aerobic threshold is. That ignorance will work in Whoop’s favor. They can continue applying generic formulas to their user base. And then maybe Garmin will buy them, and it’ll be a win-win for everybody.

Everybody except athletes who need accurate and precise data.

The good news for everyone else

I did see one positive behavioral change while using Whoop. I started paying more attention to the amount of time in bed.

Whoop recommends how much sleep you need. And it also recommends how much time in bed you’ll need to get that amount of sleep. I never would have thought of using time in bed as a metric, but it works.

In general, Whoop recommended that I stay in bed for at least nine hours. Previously, when I woke up, I’d get up. But with the Whoop, if I woke up too early, I would stay in bed, and often I’d fall back to sleep. That extra sleep always helped.

That’s one lesson that I’m going to take with me: stay in bed longer. And the best part? It’s free.