Any reviews or comparisons of the wealth of AI tools for intervals?

My 2c,

I’ve checked out a number of the “applications” being posted on the forums, to see what’s available. I’m not using a single one of them.

I initially felt I should keep my opinions to myself, but with the “spam-type” flood of yet-another AI application being advertised, it’s getting a bit much.

So, as everyone is unique and possibly has their own goal(s), it doesn’t mean they are all bad; in fact there might be some that are good for getting summary information quicker than it would take to go through a person’s data (wellness, activities, metrics, etc). But, and it’s a big but, it would have to be validated first to ensure it’s correctly returning information that is relevant and accurate. A quote often attributed to Einstein applies: “Not everything that can be counted counts, and not everything that counts can be counted”. The same applies to metrics: some can be measured but don’t matter, and some things that matter can’t be measured (accurately).

I also notice that some of them are suggesting the “next workout” based on fitness and fatigue. I guess it’s fine for someone just wanting to plod along and be maintenance-fit. I review the PMC (fitness, fatigue & form) at the beginning of the season, and perhaps 3-4 more times per year. For me, RPE, Feel, and subjective metrics (sleep quality, duration, soreness (DOMS), illness, tone of voice, style of writing text messages) are more important than most of the wearable metrics available on the market. Device manufacturers want to sell devices (plus subscriptions where applicable) with the promise that they can tell you how you feel.

Power is good, but not essential. It does make it easier to calibrate a session effort.

HR is a lagging metric, but together with some other metrics it can help to triangulate the effect of the workout.

As for the workouts, there are two parts to this.

  1. AI can predict what next to do based on keeping the person in the green
  2. It can generate workouts.

Neither of these tasks is difficult to do, and once there’s a library of basic workouts, there’s no need to get all fancy with the workouts. Either extend time in zone, for long term endurance, or intensify workouts to sharpen the pencil or quickly pull fitness up. Don’t forget to rest/recover to allow adaptations to occur. Okay, it’s not as simple as that but it’s not rocket science. There are no magic workouts. Maximise the basics before trying to refine the marginal stuff.

Intervals.icu has more than enough to get the basics done right.

  • Annual Training Plan - a long term planning tool
  • Calendar page - to record what’s important
  • Wellness pop up - to record subjective feelings, and some important objective metrics
  • RPE and Feel pop-up, after every workout
  • Workout builder and a library
  • Plans and Folders to help make medium and short term planning that aligns with the long term ATP.
  • Notes and comments, so that you don’t have to remember what you did last time, last week, last month or last year (and longer).

I’ll take a look, thanks.

Can you show me a worked example of an LLM doing this including the math?

I would disagree with this, at least in part. I understand the analogy, but first of all, every person would have their own model (obviously because of different “inputs”). However, a person always learns online, while an LLM learns offline. And an LLM “only” generates text based on probabilities or, more accurately, weights, while a person has memories, stores them in distributed neural patterns, and can recall them flexibly. These are fundamental differences.

So surely no LLM in the world will outperform me when it comes to cycling. Or cooking in the kitchen. And so on. No “real world task” can be done by AI.
Of course, they beat me when it comes to writing (a lot of) text. But even then, they make an enormous number of mistakes as soon as the number of tokens approaches their set limit.

There are specialized chess AIs that beat the best human chess players. But do you know how to defeat these AIs? By having one of the best chess players use an AI as support.
This analogy certainly applies to coaching as well. It can certainly be used as support, but not as a replacement.
How do you measure the outperformance of AI compared to a coach? And what do you use it for, if I may ask?


But to get back on topic. I’ve tried two or three of them, too, but wouldn’t pay a penny for them. The analysis sucks. Either it’s way too much text that repeats itself again and again, or it just tells me how long I rode and that I put in great efforts … I can see that myself in Intervals.icu, and with all the customization even much better.
And I find it kind of annoying to tell a chatbot how much time I have available on which day and so on, so that it can plan the training correctly. If you wanted to change a workout, it wasn’t possible or it didn’t work properly. The workouts sometimes had a certain progression, but sometimes they just did weird things.

Don’t get me wrong. I also use the support of AI where it’s appropriate. I’ve built a “workout generator” with my predefined progressions that creates hundreds of workouts of varying lengths for me. My library probably already contains over 1,000 workouts. It’s now much easier to drag and drop them onto the day than to tell the chatbot what to do and then have it get it wrong again. And you don’t need new workouts every week… once they’re generated, why change anything?

I’ve also built a “Plan Filler” that uses these workouts and assigns them to available days to meet the planned target load and target duration. For me, that’s now just two or three clicks, and I have the plan for the next 4 weeks (or 6 months, if I want). I doubt that any of the “AI apps” can do this as well, with my chosen progression, increase in duration, selection of hard units, etc., etc.
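
For readers wondering what such a “Plan Filler” could look like, here is a minimal sketch, not the poster’s actual tool. It assumes a hypothetical `Workout` record (name, minutes, load) and a greedy rule: each available day gets the workout that fits the free time and comes closest to an even share of the remaining target load. It deliberately ignores progression and the spacing of hard days, which a real tool would need.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workout:
    # Hypothetical library entry; field names are assumptions for this sketch.
    name: str
    minutes: int
    load: float


def fill_week(library, available_minutes, target_load):
    """Assign one workout per available day, aiming at target_load overall.

    available_minutes maps day name -> free minutes (insertion order is kept).
    Returns a dict mapping day name -> chosen Workout.
    """
    plan = {}
    remaining = target_load
    days = [day for day, minutes in available_minutes.items() if minutes > 0]
    for i, day in enumerate(days):
        # Spread what is left evenly over the days still to be planned.
        share = remaining / (len(days) - i)
        fitting = [w for w in library if w.minutes <= available_minutes[day]]
        if not fitting:
            continue  # nothing in the library fits this day
        choice = min(fitting, key=lambda w: abs(w.load - share))
        plan[day] = choice
        remaining -= choice.load
    return plan


library = [
    Workout("Z2 60", 60, 45),
    Workout("SST 45", 45, 55),
    Workout("Z2 180", 180, 130),
    Workout("VO2 60", 60, 75),
]
week = fill_week(library, {"Tue": 60, "Thu": 60, "Sat": 180, "Sun": 45}, 400)
for day, workout in week.items():
    print(day, workout.name, workout.load)
```

Changing a day then means changing one entry in `available_minutes` and re-running, which is the “two or three clicks” workflow described above.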

I like @RunTK.com’s idea of building or using his Prompt Generator. If I had to choose an AI app, I would start there … generating a prompt, putting it into the LLM of my choice, and chatting with it about my questions to maybe get some different “inputs”. But workout generation or analysis … I wouldn’t care about that at all in an AI app. I don’t want or need that.

Language models process text tokens so they are pretty terrible at actual calculations. In an agentic setup the AI is just the middleman. It realizes a calculation is needed, writes a quick Python script behind the scenes and runs it in an isolated environment to get the actual hard number.

It’s also not magically inventing workout routines. A properly built system just uses the LLM as an interface to understand your input and chat with you. The actual heavy lifting like deciding the sets, reps and load is handled by standard old-school code relying on established rules (think progressive overload algorithms). It’s not just spitting out mashed up fitness blogs.

And honestly it’s not “more exact than a human”. The math is identical. The real advantage is just pure automation. A machine can crunch your daily HRV, sleep data and total tonnage in milliseconds without making the careless tired mistakes a person might make if they had to run those numbers by hand every single morning.

And of course I’m trying to build something like this. Can I say that I have already achieved it? No, but I can say that the principles and the objective are clear to me.

@txuselo Which AI tool is yours?

The reason to look at a worked example is that the math is very unlikely to be identical. There is no universally agreed-upon mathematical equation to predict training response. The Banister impulse response model is one example, and it has no parameterization for a lot of the buzzwords thrown around, like HRV, sleep, readiness, etc. So yes, the LLM may be a middleman, but it is a middleman that must make all kinds of assumptions in order to write the Python code that does the math. A worked example would allow me to see the assumptions being made and help me understand when and how the model is likely to break. It would be a much more efficient and robust process than just having a ton of alpha testers who then have to try to interpret whether the output was reliable or not without a ground-truth comparison.
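
To make the point about hidden assumptions concrete, here is a minimal sketch of the Banister impulse response model in its common two-component form (a fitness term minus a fatigue term, each an exponentially decaying sum of past loads). The parameter defaults (`k1`, `k2`, `tau1`, `tau2`, `p0`) are illustrative assumptions, not fitted values; in practice they would have to be estimated per athlete.

```python
import math


def banister_performance(loads, p0=0.0, k1=1.0, k2=2.0, tau1=42.0, tau2=7.0):
    """Predicted performance for day 0..len(loads) from daily training loads.

    loads[s] is the load (e.g. TSS) done on day s; the prediction for day t
    only uses loads from days before t. All defaults are assumptions.
    """
    predictions = []
    for t in range(len(loads) + 1):
        fitness = sum(w * math.exp(-(t - s) / tau1) for s, w in enumerate(loads[:t]))
        fatigue = sum(w * math.exp(-(t - s) / tau2) for s, w in enumerate(loads[:t]))
        predictions.append(p0 + k1 * fitness - k2 * fatigue)
    return predictions


# One hard session on day 0, then rest: predicted performance dips, then rebounds.
curve = banister_performance([100.0] + [0.0] * 9)
for day, value in enumerate(curve):
    print(day, round(value, 1))
```

Note what is absent: the only input is a daily load number, so HRV, sleep and readiness can only enter through extra assumptions that someone (or some LLM-written script) has to add. The day on which the curve turns positive also depends entirely on the chosen time constants and gains.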

I definitely appreciate you keeping me honest here. I think the main difference is that I’m not actually trying to predict training response. PlanWatts isn’t trying to run Banister models or guess adaptation based on HRV, sleep, or readiness. I totally agree that doing that would require assumptions I’m nowhere near qualified to make.

What it does do is plan and distribute training load, and the math for that is actually a lot more standard:

  • TSS/NP/IF: Standard Coggan formulas (the exact same ones TrainingPeaks and Intervals.icu use)
  • Weekly progression: Just a standard exponential ramp (around 5-8% a week) with hard ceilings and baked-in recovery weeks. Basically textbook Bompa & Haff periodization.
  • Taper: A progressive ~40% volume drop based on Bosquet.
  • Intensity: Either Seiler’s polarized model (80/20) or pyramidal, depending on your CTL and how many hours you have available.
  • Fitness tracking: Just basic fatigue metrics (CTL/ATL/TSB). Intervals.icu actually calculates these, I just pull them in to spot trends and categorize things.
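
As a rough illustration of the first two bullets, the sketch below computes IF and TSS with the standard Coggan formulas and generates weekly TSS targets from an exponential ramp with a ceiling and periodic recovery weeks. The function names and the specific settings (`ramp`, `ceiling`, `recovery_every`, `recovery_factor`) are assumptions for illustration only; PlanWatts’ actual values are not given in this thread.

```python
def intensity_factor(np_watts, ftp_watts):
    """IF = normalized power / FTP."""
    return np_watts / ftp_watts


def tss(duration_s, np_watts, ftp_watts):
    """TSS = (seconds * NP * IF) / (FTP * 3600) * 100."""
    if_ = intensity_factor(np_watts, ftp_watts)
    return (duration_s * np_watts * if_) / (ftp_watts * 3600) * 100


def weekly_targets(start_tss, weeks, ramp=0.06, ceiling=700.0,
                   recovery_every=4, recovery_factor=0.6):
    """Weekly TSS targets: exponential ramp, hard ceiling, recovery weeks.

    Every `recovery_every`-th week is scaled down by `recovery_factor`
    and does not advance the build load. All defaults are assumptions.
    """
    targets = []
    build = start_tss
    for week in range(1, weeks + 1):
        if week % recovery_every == 0:
            targets.append(round(build * recovery_factor, 1))
        else:
            targets.append(round(build, 1))
            build = min(build * (1 + ramp), ceiling)
    return targets


print(tss(3600, 225, 250))                  # one hour at IF 0.9
print(weekly_targets(400.0, 8, ramp=0.05))  # eight weeks of targets
```

These explicit parameters are exactly the kind of settings a coach could inspect and change.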

After a workout, the app doesn’t try to analyze your adaptation or adherence. It basically just grabs the TSS from the actual ride data and plugs it right back into the load model. That’s pretty much it.

All of this is just hard-coded logic. The AI isn’t inventing the math, it just takes the TSS budget the algorithm already set and builds out the actual workout structure (intervals, durations, etc.). If the AI misses the mark, the code catches it and rescales the TSS to fit.
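
The “code catches it and rescales” step could be sketched roughly as follows. This is an assumption about how such a guard might work, not the app’s actual code: a workout is a list of hypothetical `(duration_s, fraction_of_ftp)` steps, TSS is approximated per step as hours × IF² × 100 (which is not the same as TSS computed from real normalized power), and out-of-tolerance workouts have their durations scaled while intensities stay fixed.

```python
def estimate_tss(steps):
    """Approximate TSS for steps given as (duration_s, fraction_of_ftp)."""
    return sum(duration / 3600 * frac ** 2 * 100 for duration, frac in steps)


def rescale_to_budget(steps, budget_tss, tolerance=0.05):
    """Return steps unchanged if within tolerance, else scale durations.

    Under the per-step approximation, TSS is linear in duration at fixed
    intensity, so one common factor brings the total onto the budget
    (up to rounding durations to whole seconds).
    """
    actual = estimate_tss(steps)
    if abs(actual - budget_tss) <= tolerance * budget_tss:
        return steps
    factor = budget_tss / actual
    return [(round(duration * factor), frac) for duration, frac in steps]


# Hypothetical LLM output: warm-up, 2 x 20 min sweet spot, cool-down.
proposed = [(600, 0.6), (1200, 0.9), (300, 0.55), (1200, 0.9), (600, 0.5)]
fixed = rescale_to_budget(proposed, budget_tss=80)
print(round(estimate_tss(proposed), 1), round(estimate_tss(fixed), 1))
```

Scaling durations is only one option; a real system might instead adjust interval count or intensity, and each choice is another explicit assumption.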

So the “assumptions” under the hood are really just explicit settings like ramp rates, recovery percentages, and zone thresholds. Exactly the kind of stuff a coach could look at and say, “Yeah, I’d set that differently.” Honestly, that’s the exact feedback I’m looking for. Let me know if you want to walk through an actual example, I’d be happy to show you.

It sounds plausible and reasonably safe to use an LLM to take care of the tedium. I’m not sure if that would be LLM-based coaching versus LLM-facilitated coaching. I’m interested in the prediction modeling side. In your case, your prediction model looks like: Performance ~ TSS, while TSS(week i+1) < 1.08*TSS(week i) and TSS < TSS(max) … that is, until you taper. During the taper it’s more complicated … once you add a taper and bring in ATL and CTL, you get into Banister assumptions or other lagged parameters.

Thanks for your analysis. When I said I wasn’t trying to run Banister models, I meant I’m not trying to dynamically adjust individual time constants based on sleep or HRV to predict an exact peak. But you are spot on: by relying on ATL and CTL to manage the taper, I am implicitly assuming that mathematical basis, even if I just stick to the industry defaults. I stand corrected.
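
To show where that implicit assumption sits, here is a minimal sketch of a CTL/ATL/TSB calculation as exponentially weighted averages of daily TSS, using the commonly quoted 42-day and 7-day time constants. The smoothing form (`1 - exp(-1/days)`) and the same-day definition of TSB are assumptions; platforms differ in these details (some use `1/days`, some use the previous day’s values for form).

```python
import math


def pmc(daily_tss, ctl_days=42, atl_days=7, ctl0=0.0, atl0=0.0):
    """Return a list of (CTL, ATL, TSB) tuples, one per day of daily_tss."""
    k_ctl = 1 - math.exp(-1 / ctl_days)
    k_atl = 1 - math.exp(-1 / atl_days)
    ctl, atl = ctl0, atl0
    rows = []
    for load in daily_tss:
        ctl += (load - ctl) * k_ctl   # slow-moving "fitness"
        atl += (load - atl) * k_atl   # fast-moving "fatigue"
        rows.append((ctl, atl, ctl - atl))
    return rows


# A week of complete rest starting from a steady state of 100 TSS/day.
taper = pmc([0.0] * 7, ctl0=100.0, atl0=100.0)
ctl, atl, tsb = taper[-1]
print(round(ctl, 1), round(atl, 1), round(tsb, 1))
```

Form rises during the taper only because fatigue is assumed to decay faster than fitness. Change `ctl_days` or `atl_days` and the predicted timing of the peak shifts, which is the Banister-style assumption being conceded here.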

Regarding the term “LLM facilitated training”, technically I agree with you, but realistically… If tech companies had to call their products by the strict names of the technologies they use, no one would understand them. Using terms like coach or AI is just necessary marketing to get the core concept across quickly. My app is built for self-trained athletes who need to organize their training weeks, not for a UCI pro team structure. For that user, the app serves the practical function of a coach by automating the tedious work.

Given your highly technical background in the training world, I’d love to explore if a tool like PlanWatts could be used by actual coaches to speed up their workflow. I’m not exactly sure what that would look like in practice since the professional coaching world is outside my direct experience, but as I mentioned in my first post, I genuinely believe that those who don’t adapt and leverage these tools in their own sector will likely get left behind.

I’d love to get your take on how someone with real physiological knowledge could actually use or tweak this system to their advantage. I would be happy to discuss something in that direction.


But that’s my point. Why use the LLM if you could also use that script (which does the math) yourself, without the chat interface?

I did create that myself and it is nearly fully automated. Get the fitness curve, get available time, grab target values and plan the workouts for given days. Just with some sliders and buttons.
If I don’t have time on Sunday for a three hour ride, just change it to a 45m sweet spot in less than a minute, without explaining it in four or five sentences to a chatbot.

So what’s the advantage of using the chat interface in this case?

You make a very fair point. If you already built a script that solves your scheduling problem in seconds, I completely understand why you wouldn’t see the appeal of typing into a chat interface.

To answer why I prefer the chat approach and where the advantage of this integration actually lies, it basically comes down to developer laziness. My mindset pushes me to systematically automate anything I have to do more than twice. With current AI tools, building these types of flexible integrations and automating workflows has become much more accessible.

I originally started coding because I was tired of the generic, pre-made workouts on Zwift, and I didn’t want to take on the recurring cost of a personal coach. However, the one thing Zwift got absolutely right was the integration. The real value it provided wasn’t just the plans themselves, but the ability to wake up every day and have the workout already waiting on my Garmin without me having to do anything at all.

I wanted that exact level of frictionless automation, but with total flexibility. For me, the real advantage of the chat is being able to avoid building or navigating a complex UI for every possible edge case. It’s incredibly convenient to just drop two sentences saying, “build me a 1-hour sweet spot workout for today with cadence changes and around 80 TSS.” The LLM understands the intent, applies the underlying logic, and the integration pushes the structured workout straight to my device. Tomorrow or next week, I can just ask for a completely different variation without touching a single slider or manually exporting a file.

In the past, before I discovered Intervals.icu, the friction of manually managing workouts was so high that I actually stopped training for a while just to avoid dealing with the Garmin editor. I figured if the friction of traditional editors was enough to derail my training, other cyclists were probably experiencing the exact same issue.

You solved that friction by building your own tool, which is a great approach. I opted to let the chat handle the friction of translating what I want into structured data and letting the API integrations do the heavy lifting.

Ultimately, we are both just trying to remove the barriers to getting on the bike, though I have to admit that in my case, all that “clicking” to build the software is actually another one of my passions right alongside riding :keyboard: :man_biking:


Important discussion – let’s see what we get out of all this for the next year, i.e., 2027😉
I have tried a few of the floating LLM-based “coaches” and some keep sending me info, analysis, planning emails. Sometimes useful reflections I take into consideration.
Moreover, I do have an active athletica.ai subscription because that provides me a good base for my short-term (1–4 weeks) training plan including tapering for races, etc. I’ve also copied some of its recovery arithmetic to intervals.icu, for instance in the (public) “HRV profile” or “rHR profile” charts…
I also have an opinion on how an AI coach should work for me, but I really want much more than a general LLM or a Pareto-fulfilling GPT behind …
For now, I am continuing with a “natural intelligence” (me) adjusted version of the athletica.ai plan, because it is convenient, esp. regarding my time restrictions. I want to spend time on the bike, not thinking about time on the bike. And I adjust based on feel and some simple plots like the standard ones in intervals.icu and the added ones above… :slight_smile:

But I am also sure that LLMs and agents will provide a lot more – also to training – in the very near future:-)

Honestly, I think AI is already more than capable of doing this well.

As I’ve said above, there’s nothing magical about what coaches do or about training theory in general. The issue is that most of these AI coach apps aren’t using the best model, just the most cost-effective one.

That’s the core problem with most current “AI-powered” coaching apps: poor consistency and poor execution. What makes it even worse is that a lot of these apps are optimizing for token efficiency. They’ll route your prompt through an initial cheap model first with a system prompt along the lines of “review this prompt, decide the complexity of the task, and return the cheapest model from the list below that can effectively answer it.” So one moment you’re getting Sonnet 4.6, the next you’re getting GPT-4o mini. The user experience ends up being wildly inconsistent, and people blame “AI” when the real problem is cost-cutting in the pipeline.

I am constantly blown away by the pricing of these AI coaching apps when under the hood they’re basically a few system prompts and a nice-looking wrapper. They look great in screenshots, but the moment you actually try to use the AI for something meaningful, the aggressive cost optimization kicks in and the results are terrible. I was actually just looking at one that costs $280/year and I think it was for some premium elite pro max tier that apparently doesn’t even give you Opus 4.6, only Sonnet at best. That’s crazy.

Any good ideas/suggestions on how I can connect my intervals.icu data and chat to a chatbot (API)? Maybe even an OpenWebUI LLM? :wink:
I completely follow your argument on the quality of high-end LLMs, at least for supplying input to (semi-)self-coached athletes such as myself, and I would really be happy to do all my analysis and planning in intervals.icu with an LLM advisor in the background. Pretty darn close to my “dream”.

Did you try any of the more recently developed AI coaches/assistants that have active threads on this forum? You mentioned you use Athletica.ai, but platforms like PlanWatts from @txuselo, LeCoach from @Rutger, and MyTrainPal from me fit your “dream” more, in my opinion. Sure, they have their own UIs, but because they sync both ways with Intervals.icu, you can treat the chat that each one of these apps has as the LLM advisor you mentioned.

It takes a lot of work to fine-tune the agent to be reliable in coaching/scheduling tasks, so if you want to go with something like OpenWebUI LLM, you need to be prepared to do this work yourself.


Yes, multiple. But several weeks ago. Wasn’t super-happy and decided to be conservative and stay where I am for the spring and re-evaluate what’s actually really working well for next season later this year😉
@RunTK.com’s argumentation seems to support my “conservativeness”…

I’ve tried about 30 different local LLM models up to 70B (and aggressive quantization) and do a lot of benchmarking between the models. All models leave so much to be desired.

Until LLMs can mimic intuition or gut feeling, they’ll never replace humans, in my opinion.

What are your criteria for “best”? Do you have any quantitative benchmarks you run to determine which model to use? I’m really only interested in running these models locally, though, so I’ll always be limited by context space, unfortunately.


I think you are missing a massive blind spot: convenience and the value of time.

The vast majority of athletes don’t care about the underlying LLM model, the context window, or setting up API keys. They are demanding these commercial apps for two simple reasons: immediacy and zero friction.

Yes, the DIY approach is technically superior and cheaper in the long run. But for the massive silent majority, the goal is to spend less time looking at screens and more time pedaling. These apps aren’t selling AI intelligence; they are selling execution and convenience.

And that is exactly why the debate in this thread is so interesting. It helps uncover which app, tool, or DIY approach best fits each specific use case and user profile.


I disagree with one statement: it’s not guaranteed that the DIY approach will be technically superior in the long run. Commercial apps will go through thousands of feedback loops, improving every small detail. That’s hard to reproduce on your own, even with AI.

I agree with earlier points, that the vast majority of AI coaching apps are just trying to serve you in the cheapest way possible. But this is the reason a couple of passionate people here are dedicating their time to offer something different. The pricing model can be transparent and usage-based, making it no different cost-wise for the end user than calling an LLM directly, in many cases even cheaper for a better result, because someone has put in the work to make context usage more efficient.
