Usually regulated if clinical output

Is an AI health app a medical device?

The technology is irrelevant. What determines classification is whether the AI output influences a clinical decision.

General information only. This guide does not constitute legal, regulatory, or professional advice. Regulatory classification depends on the specific facts of your product, your intended market, and how your software is described and used. Rules change, and the information here may not reflect the most current guidance. If you are unsure whether your software is a medical device, seek advice from a qualified regulatory consultant or legal professional before making any compliance decisions. QualiHQ makes no representations about the accuracy or completeness of this content and accepts no liability for any decision made in reliance on it.

Founders building AI health products often frame the regulatory question as "is AI regulated?" It is not the right question. AI is a technology. What is regulated is what your product does with that technology. The same questions that apply to any health software apply to AI health software, and they produce the same answers.

Where AI makes things genuinely harder is not at the classification stage. It is everything after: validation, explainability, drift monitoring, retraining documentation, and the ongoing obligation to demonstrate that your model still performs as claimed. Those are problems that static software does not have, and regulation is only beginning to grapple with them.

The same test applies: what does the output do?

MDCG 2019-11 does not mention AI. It does not need to. The test is whether software produces outputs that are "used to take decisions with diagnosis or therapeutic purposes." A model that classifies a retinal scan as showing diabetic retinopathy is producing diagnostic output. It is a medical device. A model that schedules follow-up appointments based on calendar availability is producing administrative output. It is not.

The confusion with AI comes from the fact that large language models can do both things, sometimes in the same conversation. ChatGPT answering "what causes chest pain?" is providing general health information. The same model, configured to assess your specific symptoms and tell you whether to go to A&E, is functioning as a diagnostic tool. The model is identical. The product context and output determine the classification.

Google's Med-PaLM and similar large language models trained on medical data sit in the same position. A model that answers general clinical questions for a medical education platform is not a medical device. The same model embedded in a clinical decision support tool that influences how a doctor treats a patient very likely is. Google has been deliberate about this distinction in how they describe the intended use of their health AI products.

Tools like dxGPT, which are built specifically to assist with differential diagnosis, illustrate the other end of the spectrum. The intended purpose is explicit: help identify what condition a patient might have. That is a diagnostic purpose, and the output influences a clinical decision. The AI label does not change the analysis.

Neko Health, the preventive health company co-founded by Spotify's Daniel Ek, is another clear example. Their full-body scan service uses cameras and AI to detect anomalies across the skin, body composition, heart, and more. That output -- personalised detection of potential health issues for an individual -- is exactly what EU MDR is designed to cover. The technology is impressive. The classification is straightforward.

Same model, different product, different classification

Scenario -- Classification
General health Q&A (e.g. ChatGPT answering "what causes chest pain?") -- Not regulated
LLM configured to assess symptoms and recommend care (e.g. ChatGPT Health) -- Regulated
AI differential diagnosis tool (e.g. dxGPT) -- Regulated
Google Med-PaLM used for medical education and information -- Not regulated
Google Med-PaLM embedded in clinical decision support for patient care -- Regulated
Full-body preventive AI scan (e.g. Neko Health) -- Regulated
General fitness AI coaching (e.g. Samsung Health) -- Not regulated
AI summarising clinical notes for admin and billing -- Not regulated

Where AI creates unique compliance problems

Classification is the same. Everything after classification is harder.

Algorithm validation is ongoing, not one-time

Traditional medical device software is validated at a point in time. AI models can drift as the data they encounter in production diverges from the data they were trained on. EU MDR requires post-market surveillance for every device, and for AI this is not a passive audit -- it means active performance monitoring, and potentially re-validation whenever the model is retrained.
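
What active monitoring can look like in practice: the sketch below compares a frozen validation-time sample of one model input against a window of production data, assuming you snapshot such baselines at each release. The feature, threshold, and alerting logic are illustrative assumptions, not regulatory requirements.

    # A minimal drift check: two-sample KS test on one input feature
    # against a frozen validation-time baseline. Threshold is an assumption.
    import numpy as np
    from scipy.stats import ks_2samp

    DRIFT_P_VALUE = 0.01  # illustrative alert threshold

    def feature_has_drifted(baseline: np.ndarray, production: np.ndarray) -> bool:
        """Has this feature's production distribution shifted away from
        the distribution the model was validated on?"""
        result = ks_2samp(baseline, production)
        return result.pvalue < DRIFT_P_VALUE

    rng = np.random.default_rng(0)
    baseline = rng.normal(0.0, 1.0, size=5_000)    # snapshot taken at release
    production = rng.normal(0.4, 1.0, size=5_000)  # this month's shifted inputs

    if feature_has_drifted(baseline, production):
        # In a real QMS this event would be logged in post-market
        # surveillance records and could trigger re-validation.
        print("Drift detected: review model performance before next release")

A KS test is only one of many drift measures; the point is that the check runs on a schedule and its results are recorded as surveillance evidence.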

Technical files need to explain how the AI decides

Notified bodies want to understand how your device reaches its outputs. For a rule-based system, this is straightforward. For a neural network that classifies pathology images, "the model learned from 500,000 training examples" is not a sufficient explanation. You need to demonstrate how you validated the model's performance, how you assessed it for bias, and how you would detect if it stopped working correctly.
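
One way to produce that evidence, sketched below with invented data: compute the headline metrics, then recompute them per subgroup, so a performance gap between groups is something you detect and document rather than discover at audit. The subgroup labels and numbers are fabricated for illustration.

    # Overall performance plus the same metrics stratified by subgroup --
    # the shape of a bias assessment, with entirely fabricated data.
    import numpy as np
    from sklearn.metrics import confusion_matrix

    def sensitivity_specificity(y_true, y_pred):
        tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
        return tp / (tp + fn), tn / (tn + fp)

    y_true = np.array([1, 1, 0, 0, 1, 0, 1, 0, 0, 1])   # ground-truth labels
    y_pred = np.array([1, 0, 0, 0, 1, 0, 1, 1, 0, 1])   # model outputs
    groups = np.array(["A", "A", "A", "B", "B", "B", "A", "B", "A", "B"])

    print("overall:", sensitivity_specificity(y_true, y_pred))
    for g in np.unique(groups):
        mask = groups == g
        # A large gap between groups is exactly the finding a notified
        # body expects you to detect, explain, and address.
        print(g, sensitivity_specificity(y_true[mask], y_pred[mask]))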

The EU AI Act adds a second layer

AI medical devices that are also high-risk AI systems under the EU AI Act face dual compliance obligations. The AI Act requires conformity assessments, transparency documentation, human oversight mechanisms, and registration in the EU AI database. For most AI medical devices, this means two overlapping regulatory frameworks with partially shared and partially distinct requirements. Enforcement timelines differ: MDR applies now, while the AI Act's obligations phase in between 2025 and 2027 -- for high-risk AI embedded in regulated products such as medical devices, the key date is August 2027.

The practical implication for AI health founders

If your AI produces clinical output, you are in the same compliance position as any regulated medical device manufacturer, plus additional obligations that are still being defined. The honest position in 2025 is that full regulatory clarity on AI medical devices -- particularly around retraining, drift, and the AI Act overlap -- does not yet exist. Notified bodies are working it out in real time.

What this means practically: build your QMS to capture evidence about your model -- training data, validation results, performance metrics, bias assessments, retraining logs. These are not optional extras. They are the technical file evidence that a notified body will need, and collecting them retroactively is significantly harder than building them in from the start.
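
A minimal sketch of what "capture evidence" can mean in code, assuming an append-only release log. The schema and field names below are our own invention; adapt them to your QMS and technical file structure.

    # One structured record per model release, appended to an audit log.
    # Every field name here is a hypothetical example, not a standard.
    import json
    from dataclasses import dataclass, asdict

    @dataclass
    class ModelReleaseRecord:
        model_version: str
        training_data_ref: str       # dataset snapshot ID, not a live table
        validation_metrics: dict     # e.g. sensitivity/specificity
        bias_assessment_ref: str     # link to the subgroup analysis report
        retraining_trigger: str      # why this retrain happened
        approved_by: str

    record = ModelReleaseRecord(
        model_version="2.3.0",
        training_data_ref="dataset-snapshot-2025-04-01",
        validation_metrics={"sensitivity": 0.94, "specificity": 0.91},
        bias_assessment_ref="reports/bias-2.3.0.pdf",
        retraining_trigger="drift alert on age distribution (2025-03)",
        approved_by="clinical.lead@example.com",
    )

    # Append-only: each retrain adds a record a notified body can audit.
    with open("model_release_log.jsonl", "a") as f:
        f.write(json.dumps(asdict(record)) + "\n")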

Building the evidence base for an AI medical device

AI medical devices need the same QMS foundation as any regulated software -- ISO 13485, IEC 62304, clinical evaluation -- plus an evidence layer specific to the AI: algorithm validation reports, training data documentation, bias assessments, and post-market performance monitoring.

QualiHQ helps you build and maintain the requirements, test evidence, and release documentation that form this foundation. Getting it right from the start is the difference between a clean notified body submission and a remediation cycle.

Check your AI product →

Common questions

Does GDPR affect AI medical devices differently?

Yes. AI medical devices often involve decisions based solely on automated processing that significantly affect individuals, which can trigger GDPR Article 22 rights. You will need to document how the AI makes decisions, ensure appropriate human oversight, and address data minimisation in your AI training process.

What guidance exists for AI as a medical device in the EU?

The primary references are EU MDR 2017/745, MDCG 2019-11 (qualification and classification of software), and European Commission guidance on AI as a medical device. The EU AI Act also applies additional obligations for high-risk AI systems that overlap with medical device AI. This area is evolving quickly.

Is an LLM a medical device if I use it in healthcare?

A general-purpose LLM is not itself a medical device. If you build a product on top of an LLM that produces personalised clinical outputs -- diagnoses, treatment recommendations, risk predictions -- for individual patients, your product is likely a medical device. The classification attaches to the finished product and its intended use, not to the underlying model.

Not sure where your app sits? The tool takes two minutes.

Check my app →