What general-purpose AI gets wrong about your training plan
Two-thirds of gym-goers used AI-powered fitness software in 2025, according to an industry survey cited by The New York Times in April 2026. Most of them were not using dedicated coaching platforms. They were typing prompts into ChatGPT, Claude, or Gemini and getting back training blocks within seconds.
That is not inherently wrong. The plans these tools produce are often structurally coherent, periodized in a recognisable way, and better than nothing. But “better than nothing” is a low bar, and for intermediate triathletes who are already training consistently, the gap between a plausible-looking plan and one that is actually calibrated to them is where progress stalls or injury risk accumulates.
The question worth asking is not whether general-purpose AI can write a training plan. It clearly can. The question is what it cannot see, and whether that matters for your training.
What a general-purpose LLM actually produces
Ask ChatGPT, Claude, or Gemini to write a 12-week Olympic-distance triathlon block for an intermediate athlete training 10 hours per week, and you will get something back quickly. It will have a taper week, a build phase, a recovery week every third or fourth week, and a reasonable distribution across the three disciplines. The structure will look familiar to anyone who has read a triathlon training book.
What it will not have is any of the inputs that determine whether that structure is right for you specifically.
The model is working from patterns in its training data. It has absorbed the logic of periodization, the general principles of progressive overload, and enough triathlon literature to produce something that passes a surface inspection. But it has never seen your last six months of training. It does not know your current chronic training load or how much acute load you accumulated last week. It cannot tell whether your swim is your limiter or whether you are already carrying run fatigue from a half-marathon two weeks ago.
This is not a criticism of the models. It is a description of what they are. General-purpose LLMs are reasoning engines trained on text. They are not connected to your body.
The inputs that actually drive training outcomes
Iñigo San Millán, the physiologist who has worked with Tadej Pogačar and other elite athletes, has written extensively on the relationship between metabolic markers and training prescription. The core principle is that effective training is built on individual physiological data, not population averages. A generic plan is a population average by definition.
For triathletes specifically, the inputs that matter most include:
- Current fitness-fatigue balance. Banister’s fitness-fatigue model describes performance at any point as the difference between accumulated fitness and accumulated fatigue. The two decay at different rates, with fatigue fading faster than fitness. A plan that ignores where you currently sit on that curve is guessing.
- Sport-specific fatigue interaction. Research by Mujika et al. on triathlon training load shows that cumulative fatigue across swim, bike, and run does not aggregate simply. Run fatigue, in particular, carries over into cycling economy in ways that an aggregate Training Stress Score (TSS) does not capture. A plan that treats your weekly load as a single number misses this.
- Session-RPE calibration. Carl Foster’s work on session-RPE (the Borg scale applied to the full session rather than a single moment) demonstrated that subjective load ratings are a valid measure of internal training load, but only when they are tracked longitudinally. A single session’s RPE means relatively little. Six weeks of session-RPE data tells you a great deal. A general-purpose LLM has access to neither.
- Your actual limiter. The SAID principle (Specific Adaptation to Imposed Demands) is well understood in sports science: your body adapts to the specific demands you place on it. If your run is your limiter, a plan that distributes load evenly across three disciplines is not optimising for your race outcome. Identifying the limiter requires knowing your current performance across all three disciplines, which a fresh prompt session cannot establish.
A general-purpose AI tool, asked cold, will default to a balanced distribution. That is the statistically average answer. For many athletes, it will be the wrong one.
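To make the fitness-fatigue idea concrete, here is a minimal sketch of a Banister-style impulse-response calculation. The time constants (42 and 7 days) and the weights (1.0 and 2.0) are illustrative assumptions, not fitted values. In practice they are estimated per athlete from longitudinal data, which is exactly what a cold prompt lacks. Load units are arbitrary.

```python
import math

def banister(daily_loads, tau_fitness=42.0, tau_fatigue=7.0,
             k_fitness=1.0, k_fatigue=2.0):
    """Return a list of (fitness, fatigue, performance) tuples, one per day.

    All default parameters are illustrative assumptions, not fitted values.
    """
    fitness = 0.0
    fatigue = 0.0
    history = []
    for load in daily_loads:
        # Each component decays exponentially, then absorbs today's load.
        fitness = fitness * math.exp(-1.0 / tau_fitness) + load
        fatigue = fatigue * math.exp(-1.0 / tau_fatigue) + load
        # Modelled performance is weighted fitness minus weighted fatigue.
        performance = k_fitness * fitness - k_fatigue * fatigue
        history.append((fitness, fatigue, performance))
    return history

# Hypothetical block: four weeks of steady load, then ten days of rest.
block = [100.0] * 28 + [0.0] * 10
history = banister(block)
print(f"performance on last training day: {history[27][2]:.0f}")
print(f"performance after ten rest days:  {history[-1][2]:.0f}")
```

Because fatigue decays faster than fitness in this sketch, modelled performance rises during the rest days even though no training is done. The same plan applied to two athletes at different points on the curve would produce different outcomes.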
What purpose-built platforms handle differently
Platforms built specifically for adaptive training, such as Athletica.ai and Pelaris, approach the problem from a different starting point. Rather than generating a plan from a single prompt, they work from a continuously updated model of the athlete.
Athletica.ai’s published methodology references Banister’s model explicitly, using real-time fitness-fatigue tracking to adjust prescribed load based on where the athlete sits on the curve at any given point in the training cycle. The plan is not static. It responds to what actually happened in training, not what was supposed to happen.
This matters more than it might seem. Intermediate triathletes training 10-14 hours per week are operating close enough to their recovery ceiling that a week of accumulated fatigue, a disrupted sleep pattern, or an unexpected hard effort can shift the optimal training stimulus significantly. A static plan generated from a prompt cannot see any of that. An adaptive platform that is receiving your training data can.
The distinction is not about which tool is “smarter.” It is about what data each tool has access to. This is a data access problem, not an intelligence problem.
(The interference effect between concurrent strength and endurance training in triathlon is a related problem, and one worth understanding separately. We have covered the physiology of that in detail elsewhere.)
Where general-purpose AI tools genuinely add value
This is not an argument for abandoning ChatGPT, Claude, or Gemini in your training process. They are genuinely useful, just not for the job most people are trying to use them for.
Where they add real value:
- Explaining the physiology. Ask Claude to explain how lactate threshold training works, what happens to muscle protein synthesis during a taper, or why Zone 2 training drives mitochondrial adaptation, and you will get a clear, accurate answer. The reasoning is sound even when the prescription cannot be personalised.
- Structuring a training week conceptually. If you understand your own limiters and current load, a general-purpose LLM can help you think through how to structure a training week, what to prioritise, and how to balance competing demands. You are providing the context it cannot access itself.
- Interpreting your data. If you bring your own training data into the conversation, these tools become considerably more useful. Paste in your last four weeks of load data and ask Claude to help you identify patterns, and it can do that well.
- Answering specific questions. “What is the typical taper duration for an Olympic-distance triathlon?” or “How should I adjust training if I am also doing a long bike event three weeks out?” are questions that general-purpose AI generally handles well.
The limitation is not the tool. It is the missing context. When you supply the context, the tool performs. When you ask it to generate a plan without context, it is producing a population average dressed up as a personalised recommendation.
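If you want to bring your own load data into a conversation, it helps to summarise it by discipline first rather than as one weekly number. The sketch below assumes a hypothetical session log of (date, discipline, minutes, session-RPE) tuples and uses session load = minutes × RPE, the standard session-RPE calculation. The log entries and the function names are made up for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical session log: (day, discipline, duration in minutes, session RPE 0-10).
SESSIONS = [
    (date(2026, 3, 2), "swim", 60, 4),
    (date(2026, 3, 3), "bike", 90, 5),
    (date(2026, 3, 4), "run", 50, 6),
    (date(2026, 3, 9), "swim", 45, 5),
    (date(2026, 3, 10), "bike", 120, 6),
    (date(2026, 3, 12), "run", 40, 7),
]

def weekly_load_by_discipline(sessions):
    """Sum session-RPE load (minutes x RPE) per ISO week and discipline."""
    totals = defaultdict(lambda: defaultdict(int))
    for day, discipline, minutes, rpe in sessions:
        iso = day.isocalendar()
        week = f"{iso[0]}-W{iso[1]:02d}"
        totals[week][discipline] += minutes * rpe
    return totals

def as_paste_ready_text(totals):
    """Render the weekly totals as a small plain-text table."""
    lines = ["week       swim  bike   run"]
    for week in sorted(totals):
        row = totals[week]
        # Missing disciplines read as 0 because each row is a defaultdict(int).
        lines.append(f"{week}   {row['swim']:>4}  {row['bike']:>4}  {row['run']:>4}")
    return "\n".join(lines)

print(as_paste_ready_text(weekly_load_by_discipline(SESSIONS)))
```

Keeping swim, bike, and run in separate columns lets the model see where load is concentrated, which a single aggregate figure would hide.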
Using both: where the combination works
The most practical approach for intermediate triathletes is not to choose between general-purpose AI and adaptive platforms. It is to use each for what it does well.
General-purpose LLMs are good thinking partners. They can help you reason through structure, understand the science, and work through trade-offs. They are useful for the “why” and the “what” at a conceptual level.
Adaptive platforms are better suited to the “how much” and “when,” because those answers depend on longitudinal data that only accumulates over time. The ACWR (Acute:Chronic Workload Ratio) is a useful example here: it compares your recent load (commonly the last 7 days) with your longer-term baseline (commonly the last 28 days). It is a widely used tool for managing injury risk, but it requires weeks of consistent tracking before it becomes meaningful. A general-purpose LLM cannot compute your ACWR because it does not have your training history. An adaptive platform that has been receiving your data for six weeks can.
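As a rough illustration of why ACWR needs history, here is a minimal sketch of one common way to compute it: average daily load over the last 7 days divided by average daily load over the last 28 days. The window lengths are conventional defaults rather than anything prescribed here, and other variants exist (for example exponentially weighted averages). The function returns nothing useful until 28 days of data exist.

```python
def acwr(daily_loads, acute_days=7, chronic_days=28):
    """Acute:chronic workload ratio from a list of daily loads (oldest first).

    Returns None when there is not enough history, or when the chronic
    baseline is zero and the ratio is undefined.
    """
    if len(daily_loads) < chronic_days:
        return None
    acute = sum(daily_loads[-acute_days:]) / acute_days
    chronic = sum(daily_loads[-chronic_days:]) / chronic_days
    if chronic == 0:
        return None
    return acute / chronic

# Hypothetical loads: three steady weeks, then one week with a 50% jump.
steady = [300] * 28
spike = [300] * 21 + [450] * 7

print(acwr(steady))        # steady training keeps the ratio at 1.0
print(acwr(spike))         # the jump pushes the ratio above 1.0
print(acwr([300] * 10))    # too little history: None
```

The third call is the situation a fresh prompt session is in: without at least four weeks of recorded load, there is no baseline to compare against.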
That combination, LLM for reasoning and adaptive platform for load management, is more useful than either tool alone.
If you are going to use a general-purpose LLM for training, give it this
If you are going to use ChatGPT, Claude, or Gemini to help build a training plan, the quality of the output is almost entirely determined by the quality of the context you provide. A prompt that includes the following will produce something substantially more useful than a prompt that does not:
- Your current weekly training volume by discipline (swim, bike, run in hours or kilometres)
- Your most recent race result or benchmark effort for each discipline
- Your available training hours per week and which days are constrained
- Any injuries, limiters, or known weaknesses
- Your target event, distance, and date
- Your subjective sense of current fatigue (fresh, normal, tired, very tired)
That last point matters more than most people give it credit for. The model cannot measure your fatigue, but if you tell it where you are, it can at least factor that into the structure it produces.
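The checklist above can be turned into a reusable prompt preamble. The sketch below is one possible way to do it. The `AthleteContext` class, its field names, and the example values are all hypothetical, and the wording of the prompt is only a starting point.

```python
from dataclasses import dataclass

FATIGUE_LEVELS = ("fresh", "normal", "tired", "very tired")

@dataclass
class AthleteContext:
    """Hypothetical container for the context items in the checklist."""
    weekly_volume: dict    # discipline -> current weekly volume, e.g. "2.5 h"
    benchmarks: dict       # discipline -> most recent race result or benchmark
    available_hours: float
    constrained_days: list
    limiters: list
    target_event: str
    target_date: str
    fatigue: str           # one of FATIGUE_LEVELS

def build_prompt(ctx):
    """Return a plain-text context block to paste ahead of your question."""
    if ctx.fatigue not in FATIGUE_LEVELS:
        raise ValueError(f"fatigue must be one of {FATIGUE_LEVELS}")
    volume = ", ".join(f"{d} {v}" for d, v in ctx.weekly_volume.items())
    benchmarks = "; ".join(f"{d}: {b}" for d, b in ctx.benchmarks.items())
    lines = [
        "Help me build a triathlon training plan. Here is my context:",
        f"- Current weekly volume: {volume}",
        f"- Recent benchmarks: {benchmarks}",
        f"- Available hours per week: {ctx.available_hours}",
        f"- Constrained days: {', '.join(ctx.constrained_days) or 'none'}",
        f"- Injuries, limiters, weaknesses: {', '.join(ctx.limiters) or 'none'}",
        f"- Target event: {ctx.target_event} on {ctx.target_date}",
        f"- Current fatigue: {ctx.fatigue}",
    ]
    return "\n".join(lines)

# Example values are invented for illustration only.
example = AthleteContext(
    weekly_volume={"swim": "2.5 h", "bike": "5 h", "run": "2.5 h"},
    benchmarks={"swim": "400 m in 7:20", "bike": "40 km in 1:12", "run": "10 km in 48:30"},
    available_hours=10,
    constrained_days=["Tuesday", "Thursday"],
    limiters=["run durability"],
    target_event="Olympic-distance triathlon",
    target_date="2026-09-13",
    fatigue="tired",
)
print(build_prompt(example))
```

Restricting fatigue to the four levels from the checklist keeps the input consistent from one conversation to the next, which makes the model's responses easier to compare.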
Even a well-prompted LLM is producing a starting point, not a calibrated prescription. It is a reasonable first draft that still requires your judgment to apply. That is a useful thing. It is just not the same as a plan that adapts to what actually happens over the weeks that follow. The bridge is to give the assistant your real training context: through Tideway, ChatGPT or Claude can reach your training history and coach over the Pelaris MCP (Model Context Protocol), so the plan is grounded in what is actually true about you rather than in population averages.
What this means for your training
- General-purpose AI tools (ChatGPT, Claude, Gemini) can produce structurally sound training plans, but they are working from population averages, not your individual physiology or current training state.
- The inputs that determine whether a plan is right for you, including fitness-fatigue balance, sport-specific load interaction, and session-RPE history, require longitudinal data that a single prompt session cannot access.
- For triathletes, cumulative fatigue across three disciplines does not aggregate simply. A plan that treats weekly load as a single number will misallocate training stress, particularly in the run-to-bike fatigue interaction.
- If you use a general-purpose LLM for training, provide as much context as possible: volume by discipline, benchmark efforts, available hours, known limiters, and current fatigue level. The output quality scales directly with the context you supply.
- The most effective approach combines both: use LLMs for reasoning, conceptual structure, and answering specific questions; use an adaptive platform like Pelaris for ongoing load management, where the value compounds as your training history accumulates.