An interesting observation about Apple Watch VO₂max and training-based fitness metrics
Over the past three months, I’ve noticed a striking similarity between two independent estimates of my fitness:
• Apple Watch’s estimated VO₂max (“Cardio Fitness”)
• intervals.icu’s fitness/form metrics derived from training data
What caught my attention isn’t just that they move in the same direction — that’s expected. VO₂max, performance, and training load are obviously related.
What surprised me is how closely the patterns align over time, almost as if both systems are reacting to the same underlying signals.
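One way to put a number on "how closely the patterns align" is to correlate the two series, both as raw levels and as week-to-week changes. This is only a minimal sketch: the helper names (pearson, diffs, alignment) are hypothetical and the weekly values are made up for illustration, not exported from Apple Health or intervals.icu. Correlating the changes matters because two series that both trend upward will correlate on levels almost automatically.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(vx * vy)

def diffs(xs):
    """Step-to-step changes; this removes the shared long-term trend."""
    return [b - a for a, b in zip(xs, xs[1:])]

def alignment(vo2max, fitness):
    """Return (correlation of levels, correlation of changes)."""
    return pearson(vo2max, fitness), pearson(diffs(vo2max), diffs(fitness))

# Made-up weekly values, for illustration only (not real exports).
vo2max_weekly = [44.1, 44.3, 44.2, 44.8, 45.0, 45.4]
fitness_weekly = [38.0, 39.5, 39.0, 42.0, 43.5, 45.0]

levels_r, changes_r = alignment(vo2max_weekly, fitness_weekly)
print(f"levels r={levels_r:.2f}, changes r={changes_r:.2f}")
```

A high correlation on levels but a weak one on changes would suggest the two metrics merely share a trend; a high correlation on both would be the more surprising result.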
Apple states that its VO₂max value is an estimate, not a direct measurement. According to Apple’s own documentation, the calculation is based on:
• heart-rate response during exercise
• motion and GPS data (pace, distance, elevation)
• personal attributes like age, sex, weight, and height
• submaximal exercise behavior, accumulated over time
What Apple does not disclose is the exact algorithm or weighting. There’s also no official claim that Apple uses machine learning or generative models for VO₂max estimation — at least not in anything they’ve published.
Still, this raises an interesting question:
Is Apple’s VO₂max estimate really driven primarily by heart-rate-to-effort relationships,
or does it implicitly encode something much closer to performance-based fitness modeling — similar in outcome (if not in method) to platforms like intervals.icu?
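For context, here is roughly what a purely "heart-rate-to-effort" estimate could look like. This is a textbook-style submaximal extrapolation, not Apple's algorithm, which is undisclosed. It assumes the ACSM running equation for oxygen cost, a straight-line relationship between heart rate and oxygen cost, and the rough 220 minus age rule for maximum heart rate. All function names are hypothetical.

```python
def running_vo2(speed_m_per_min, grade=0.0):
    """ACSM running equation: oxygen cost in ml/kg/min at a steady speed.

    grade is a fraction (0.01 means a 1% incline).
    """
    return 0.2 * speed_m_per_min + 0.9 * speed_m_per_min * grade + 3.5

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct heart-rate values")
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def estimate_vo2max(samples, age):
    """Extrapolate submaximal efforts to an estimated VO2max.

    samples: list of (heart_rate_bpm, speed_m_per_min, grade) tuples
    taken from steady submaximal running.
    """
    heart_rates = [hr for hr, _, _ in samples]
    oxygen_costs = [running_vo2(speed, grade) for _, speed, grade in samples]
    slope, intercept = fit_line(heart_rates, oxygen_costs)
    hr_max = 220 - age  # assumption: crude age-predicted maximum heart rate
    return slope * hr_max + intercept

# Made-up steady efforts on flat ground, for illustration only.
samples = [(130, 150.0, 0.0), (150, 200.0, 0.0)]
print(estimate_vo2max(samples, age=40))
```

Note that even this simple sketch already depends on pace, not just heart rate, which is part of why a clean split between "heart-rate-based" and "performance-based" is hard to draw.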
One plausible explanation is overlap rather than copying:
• both systems ingest pace, training regularity, and physiological response
• both smooth data over time
• both try to infer aerobic capacity from imperfect real-world signals
If you train consistently, the two outputs may track each other closely, even if the internal models are completely different.
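To make the "smoothing over time" point concrete, here is a sketch of the widely used fitness/fatigue/form model, where fitness and fatigue are exponentially weighted averages of daily training load. The 42-day and 7-day time constants are the conventional defaults; that intervals.icu computes its metrics exactly this way is an assumption on my part, and the function names are hypothetical.

```python
from math import exp

def ewma(loads, time_constant_days, start=0.0):
    """Exponentially weighted moving average of a daily load series."""
    alpha = 1 - exp(-1 / time_constant_days)
    value = start
    smoothed = []
    for load in loads:
        # Move a fixed fraction of the way toward today's load.
        value += alpha * (load - value)
        smoothed.append(value)
    return smoothed

def fitness_fatigue_form(daily_loads, fitness_days=42, fatigue_days=7):
    """Return (fitness, fatigue, form) lists, one value per day.

    Assumption: form is same-day fitness minus fatigue; some tools use
    the previous day's values instead.
    """
    fitness = ewma(daily_loads, fitness_days)
    fatigue = ewma(daily_loads, fatigue_days)
    form = [fit - fat for fit, fat in zip(fitness, fatigue)]
    return fitness, fatigue, form

# Made-up example: a steady load of 50 per day for 12 weeks.
fitness, fatigue, form = fitness_fatigue_form([50.0] * 84)
print(f"fitness={fitness[-1]:.1f} fatigue={fatigue[-1]:.1f} form={form[-1]:.1f}")
```

Any estimate that averages weeks of workouts will respond to a steady training block in a similarly slow, smooth way, which alone could account for much of the visual similarity.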
I’m not claiming Apple “uses the same formula” or that there’s anything wrong here. Quite the opposite:
the alignment may actually be a sign that Apple’s model captures more than just raw heart rate — and that it’s better at reflecting functional fitness than we often give it credit for.
Still, the similarity is strong enough to make me curious.
And curiosity, after all, is how good models get questioned.