I keep seeing the same claim go around: evening workouts wreck your sleep.
I can actually check that. So I went through 233,917 workouts and 27,609 nights of sleep from 745 athletes on athletedata.health, scored each night against that person’s own normal, and looked at what their sleep did after easy versus hard sessions.
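For anyone who wants to try the "scored against that person's own normal" step on their own export, here is a minimal sketch. It assumes the score is a plain z-score against each athlete's own mean and standard deviation, which is one common way to do it and not necessarily the exact pipeline behind the numbers above. The function name and the sample values are made up for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev


def score_against_baseline(nights):
    """Score each night as a z-score against that athlete's own nights.

    nights: list of (athlete_id, value) pairs, e.g. overnight HRV in ms.
    Returns a list of z-scores in the same order. Athletes with fewer
    than two nights, or with no spread at all, get None.
    """
    by_athlete = defaultdict(list)
    for athlete_id, value in nights:
        by_athlete[athlete_id].append(value)

    # Per-athlete baseline: (mean, sample standard deviation).
    baselines = {}
    for athlete_id, values in by_athlete.items():
        if len(values) >= 2:
            spread = stdev(values)
            if spread > 0:
                baselines[athlete_id] = (mean(values), spread)

    scores = []
    for athlete_id, value in nights:
        if athlete_id in baselines:
            centre, spread = baselines[athlete_id]
            scores.append((value - centre) / spread)
        else:
            scores.append(None)
    return scores


# Invented sample data: athlete "a" has three nights, athlete "b" only one.
sample_nights = [("a", 50), ("a", 60), ("a", 70), ("b", 100)]
sample_scores = score_against_baseline(sample_nights)
```

Scoring this way means a night only counts as "low" relative to that athlete's own history, so someone with naturally low HRV is not compared with someone whose numbers run high.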
The hour of the workout barely mattered. How hard they went mattered a lot.
After a hard session, HRV came in lower and resting heart rate higher than that athlete’s usual night. After an easy session, both looked better. That held whether the workout finished at lunchtime or late evening, because it follows the strain you put in, not the time on the clock.
The one place timing did bite: hard sessions that finished within about two hours of bed. Those were the worst nights in the whole dataset. REM and deep sleep dropped, and so did HRV. Give the same session three or four hours of room before bed and most of that went away.
Easy sessions were fine at any hour. An easy spin an hour before bed still came out a touch better than average.
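The comparison described above, intensity crossed with how long before bed the session ended, can be sketched roughly like this. The two-hour and four-hour cut-offs are taken from the figures in this post; the bucket labels, function names, and every number in the sample rows are invented to exercise the code and are not results from the dataset.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def gap_bucket(workout_end, bedtime):
    """Label how long before bed a workout ended (cut-offs are assumptions)."""
    hours = (bedtime - workout_end).total_seconds() / 3600
    if hours < 2:
        return "<2h"
    if hours < 4:
        return "2-4h"
    return ">=4h"


def mean_score_by_cell(rows):
    """Average the baseline-relative score per (intensity, gap bucket) cell.

    rows: iterable of (intensity, workout_end, bedtime, z_score).
    """
    cells = defaultdict(list)
    for intensity, workout_end, bedtime, z_score in rows:
        cells[(intensity, gap_bucket(workout_end, bedtime))].append(z_score)
    return {cell: mean(scores) for cell, scores in cells.items()}


# Invented rows, purely to show the shape of the input.
sample_rows = [
    ("hard", datetime(2025, 1, 1, 21, 30), datetime(2025, 1, 1, 22, 30), -1.0),
    ("hard", datetime(2025, 1, 1, 21, 0), datetime(2025, 1, 1, 22, 30), -0.5),
    ("hard", datetime(2025, 1, 1, 18, 0), datetime(2025, 1, 1, 22, 0), 0.0),
    ("easy", datetime(2025, 1, 1, 21, 30), datetime(2025, 1, 1, 22, 30), 0.25),
]
sample_cells = mean_score_by_cell(sample_rows)
```

Keeping intensity and timing as separate keys is the point: a single "evening vs. not" split would mix easy and hard sessions together and hide the interaction.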
None of this argues with the research. The big four-million-night study everyone screenshots (Leota 2025) found the same shape: a timing-by-intensity effect with roughly a four-hour line, not a blanket “no evening exercise” rule. The popular version just flattened it into something punchier and wrong.
So if you train late and sleep fine, you are not doing anything wrong. Keep the easy work wherever it fits your day. Move the hard intervals and the heavy lifting a few hours earlier when you can.
You are not taking a number of things into account:
- You simply state ‘HRV’, but HRV numbers differ hugely depending on whether they come from a ‘morning measurement’ or an ‘overnight measurement’. An overnight measurement averages HRV over the whole night and is heavily affected by ‘late’ training, while a morning measurement barely differs between a late-evening and an afternoon session. That is simply because HRV is significantly lower in the first hours after training, and a morning reading is almost unaffected by that.
- REM and deep sleep hours from a wearable are not to be trusted. They are the result of ‘correlations’ found in parameters that do not directly describe sleep. There’s enough evidence that sleep phases deduced by wearables are, at best, 80% reliable.
But you should indeed avoid hard training shortly before bedtime, because testosterone and adrenaline levels will disrupt good sleep. Give yourself at least 2 hours to wind down before going to sleep.
Really good pushback, and you’re right on the measurement points. Let me be straight about what we did and didn’t control for.
On HRV: we did not separate overnight from morning readings, and you’ve put your finger on a real weakness. The HRV in our dataset is mostly an overnight or sleep-window value (Oura, WHOOP, Garmin), plus a large block of Apple Watch SDNN samples scattered through the night, with only a minority of morning-logged readings from intervals.icu. So it is not a clean orthostatic morning number. That means our HRV result is exactly the case your critique applies to: a hard session finishing close to bed depresses HRV in the first hours of sleep, the overnight average sits inside that window, and the number drops. That is the acute autonomic response you describe, caught by the averaging window, more than it is proof of worse overall recovery. A cleaner test would restrict to a single source, or to morning-logged HRV only, and see how much of the effect survives. I suspect a chunk would not, and that is a fair correction to how I framed it.
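The "restrict to morning-logged HRV and see how much survives" check could look something like the sketch below. It assumes each night already carries a baseline-relative score and a label for when the reading was taken; the field names (`timing`, `intensity`, `z`) and all sample values are hypothetical, and nothing here has been run on the real data.

```python
from statistics import mean


def hard_minus_easy(rows, keep=lambda row: True):
    """Mean score after hard sessions minus mean score after easy ones.

    rows: list of dicts with hypothetical keys "timing", "intensity", "z".
    keep: predicate choosing which rows enter the comparison.
    Returns None when either group is empty after filtering.
    """
    hard = [r["z"] for r in rows if keep(r) and r["intensity"] == "hard"]
    easy = [r["z"] for r in rows if keep(r) and r["intensity"] == "easy"]
    if not hard or not easy:
        return None
    return mean(hard) - mean(easy)


# Invented rows, only to show the comparison.
sample_rows = [
    {"timing": "overnight", "intensity": "hard", "z": -1.0},
    {"timing": "overnight", "intensity": "easy", "z": 0.5},
    {"timing": "morning", "intensity": "hard", "z": -0.25},
    {"timing": "morning", "intensity": "easy", "z": 0.25},
]

effect_all = hard_minus_easy(sample_rows)
effect_morning = hard_minus_easy(
    sample_rows, keep=lambda r: r["timing"] == "morning"
)
```

If the morning-only gap comes out much smaller than the gap over all readings, that would point to the averaging window rather than overall recovery. The same `keep` argument could restrict the comparison to a single device instead.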
On wearable sleep staging: also fair. Consumer REM and deep classification against PSG sits about where you say, often 70 to 85 percent epoch agreement, and worse for REM than for total sleep time. Two things I’d offer. First, we scored every night against that same athlete on that same device, so a systematic staging bias largely cancels. We’re comparing a person to their own baseline, not Garmin to Oura. Second, staging noise pushes results toward the null, so it makes a real effect harder to find, not easier to invent. Total sleep time and the heart-rate signals are more trustworthy than the stage split, so that is what I’d lean on for the headline and treat REM and deep as supporting rather than load-bearing. Where I implied stage-level precision we don’t have, that’s on me.
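A small worked example of why a systematic staging bias largely cancels when a person is compared with their own baseline. It assumes the simplest case, a device that over-counts deep sleep by a constant number of minutes every night; a bias that scales with the true value or changes after hard sessions would not cancel this cleanly. All numbers are invented.

```python
from statistics import mean


def deviations_from_own_mean(values):
    """Each night's value minus that person's own average."""
    baseline = mean(values)
    return [value - baseline for value in values]


# Invented "true" deep sleep minutes for one athlete over four nights.
true_deep_minutes = [90, 80, 100, 70]

# Hypothetical device that over-counts deep sleep by a constant 15 minutes.
device_bias = 15
reported_deep_minutes = [m + device_bias for m in true_deep_minutes]

true_deviations = deviations_from_own_mean(true_deep_minutes)
reported_deviations = deviations_from_own_mean(reported_deep_minutes)

# The constant offset shifts the baseline by the same amount as every night,
# so the night-to-night deviations come out identical.
bias_cancels = true_deviations == reported_deviations
```

Random night-to-night staging error is a different matter: it does not cancel, it only blurs the comparison, which is the "pushes results toward the null" point.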
And we agree completely on the practical end. Hard work close to lights out is the case that bites, and a couple of hours to wind down is the right rule. Our data put the clear penalty inside roughly two hours of bed, which lines up with what you’re saying about adrenaline and not being settled yet.
How is this not a privacy policy breach? Your policy says HealthKit data, including HRV, is “not used for research of any kind,” and more broadly says user data is used exclusively for personalized coaching, proactive messages, digests, context, and account management. Yet you used Apple Watch SDNN/HRV and other athlete data across hundreds of users for a public dataset-level analysis. Where in the policy did users agree to their health data being used this way?