The problem: variability in ASA-PS classification #
The ASA-PS classification is the gold standard for preoperative anesthetic risk assessment. Every patient scheduled for surgery receives a class from ASA I (healthy patient) to ASA V (moribund patient); a sixth class, ASA VI, is reserved for brain-dead organ donors. The class guides the anesthesiologist’s decisions on anesthesia type, monitoring and postoperative management.
A critical system with a known flaw #
The problem has been known for decades: inter-observer variability is high. Published studies show that anesthesiologists agree on the correct classification only about 70% of the time, roughly 7 cases out of 10 (the human benchmark used in the study below is 7.7 out of 10).
This means the same patient, examined by two different anesthesiologists, can receive different classifications. And the ASA-PS classification is not an academic exercise: it determines the level of monitoring, anesthetic precautions and resource allocation.
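Inter-observer agreement of this kind can be quantified directly. The sketch below is a minimal illustration, not the method used in the cited studies: it computes raw percent agreement and unweighted Cohen's kappa for two raters. The function names and the ten ratings are hypothetical and are not study data.

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Fraction of cases on which two raters assign the same ASA-PS class."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for chance (unweighted Cohen's kappa)."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same class at random.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical class throughout
    return (observed - expected) / (1 - expected)

# Hypothetical ratings for 10 patients (illustrative only).
anesthesiologist_1 = ["II", "III", "II", "I", "III", "II", "IV", "III", "II", "III"]
anesthesiologist_2 = ["II", "III", "III", "I", "III", "II", "IV", "II", "II", "II"]

agreement = percent_agreement(anesthesiologist_1, anesthesiologist_2)
kappa = cohens_kappa(anesthesiologist_1, anesthesiologist_2)
print(f"agreement={agreement:.2f}, kappa={kappa:.2f}")
```

Kappa is lower than raw agreement because some matches are expected by chance alone, which is why agreement studies usually report both.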
What happens when the classification is wrong #
| Error | Consequence |
|---|---|
| Risk underestimation (e.g. ASA II instead of ASA III) | Insufficient monitoring, unforeseen complications |
| Risk overestimation (e.g. ASA III instead of ASA II) | Wasted resources, delayed procedures, patient anxiety |
| Variability across hospitals | Unreliable clinical comparisons, distorted epidemiological data |
The HTX study: 11 AI models, 20 clinical cases #
HTX conducted a systematic study to assess whether large language models (LLMs) can improve the consistency of ASA-PS classification.
Methodology #
- 20 standardised clinical vignettes, selected from the most studied ASA-PS benchmarks in the scientific literature
- 11 language models tested, from first-generation models to advanced reasoning models
- Multilingual testing: each case evaluated in both English and Italian
- Repeated trials: each model tested multiple times to verify reproducibility
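The protocol above can be sketched as a nested loop over vignettes, languages and repeated trials. This is a minimal sketch under stated assumptions, not the HTX evaluation code: `Vignette`, `evaluate`, the `classify` callable and the default of three trials are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass

@dataclass
class Vignette:
    case_id: str
    text_by_language: dict  # e.g. {"en": "...", "it": "..."}
    reference_class: str    # benchmark ASA-PS class for this case

def evaluate(classify, vignettes, languages=("en", "it"), trials=3):
    """Accuracy of `classify` over every vignette, language and repeated trial.

    `classify` is any callable that maps vignette text to an ASA-PS class
    string; in practice it would wrap a call to the model under test.
    """
    correct = 0
    total = 0
    for vignette in vignettes:
        for language in languages:
            for _ in range(trials):
                predicted = classify(vignette.text_by_language[language])
                correct += int(predicted == vignette.reference_class)
                total += 1
    if total == 0:
        raise ValueError("nothing to evaluate")
    return correct / total

# Usage with a trivial stand-in classifier (always answers "II").
cases = [
    Vignette("case-01", {"en": "Mild hypertension.", "it": "Ipertensione lieve."}, "II"),
    Vignette("case-02", {"en": "Severe COPD.", "it": "BPCO grave."}, "III"),
]
print(evaluate(lambda text: "II", cases))
```

Counting every trial separately, rather than one majority vote per case, keeps unstable answers visible in the accuracy figure.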
Models tested #
| Category | Models | Average accuracy |
|---|---|---|
| First generation | GPT-4, LLaMA 2, LLaMA 3, Mistral | ~77% |
| Second generation | GPT-4o, Claude 3.5 Sonnet | ~85% |
| Advanced reasoning | GPT-o3, Claude Sonnet (latest), DeepSeek R1 | 97.5% |
Key results #
Advanced reasoning models achieve 97.5% accuracy (95% CI: 92.9%–99.1%), significantly outperforming:
- First-generation models (~77%)
- The human benchmark (7.7/10 = 77%)
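The article does not state how many classifications underlie the 97.5% figure or which interval method was used. As an assumption, the sketch below applies a Wilson score interval to 117 correct answers out of 120, counts chosen only because they give 97.5% and reproduce the reported bounds.

```python
import math

def wilson_interval(successes, n, z=1.959964):
    """Wilson score confidence interval for a binomial proportion.

    z defaults to the two-sided 95% normal quantile.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need 0 <= successes <= n and n > 0")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width

# Assumed counts: 117 correct out of 120 classifications (97.5%).
low, high = wilson_interval(117, 120)
print(f"{low:.1%} to {high:.1%}")
```

The Wilson interval is asymmetric around the point estimate, which is the usual choice when accuracy is close to 100% and the sample is small.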
The average error drops dramatically:
- Physicians: 2.3 misclassifications per 10 cases
- First-generation models: 2.3 misclassifications per 10 cases (similar to physicians)
- Advanced models: 0.25 misclassifications per 10 cases
This represents an 89% reduction in error compared with manual classification.
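The arithmetic behind the 89% figure follows from the accuracies quoted above. The helper names below are illustrative assumptions; the inputs are the article's own numbers.

```python
def errors_per_10(accuracy):
    """Expected misclassifications per 10 cases for a given accuracy."""
    return 10 * (1 - accuracy)

def relative_reduction(baseline, improved):
    """Fractional drop from the baseline error rate to the improved one."""
    return (baseline - improved) / baseline

physician_errors = errors_per_10(0.77)   # about 2.3 per 10 cases
advanced_errors = errors_per_10(0.975)   # about 0.25 per 10 cases
reduction = relative_reduction(physician_errors, advanced_errors)
print(f"{reduction:.0%}")
```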
DeepSeek R1: privacy without compromise #
A particularly relevant result: DeepSeek R1, an open-source model that can be deployed entirely on-premise, showed:
- Accuracy on par with the best commercial models
- Perfect reproducibility across repeated trials (same case, same result)
- Zero dependence on cloud servers
This suggests that private deployment, essential in healthcare, is feasible without sacrificing accuracy, at least on this 20-case benchmark.
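"Same case, same result" can be checked mechanically. The sketch below is one possible definition, assumed for illustration rather than taken from the study: a case counts as reproducible when every repeated trial returns the same class. The function name and sample data are hypothetical.

```python
from collections import defaultdict

def reproducibility(runs):
    """Fraction of cases whose repeated trials all returned the same class.

    `runs` is an iterable of (case_id, predicted_class) pairs, one per trial.
    """
    classes_by_case = defaultdict(set)
    for case_id, predicted in runs:
        classes_by_case[case_id].add(predicted)
    if not classes_by_case:
        raise ValueError("no runs supplied")
    stable = sum(len(classes) == 1 for classes in classes_by_case.values())
    return stable / len(classes_by_case)

# Hypothetical trial log: case-01 is stable, case-02 flips between classes.
log = [
    ("case-01", "II"), ("case-01", "II"), ("case-01", "II"),
    ("case-02", "III"), ("case-02", "II"), ("case-02", "III"),
]
print(reproducibility(log))
```

Under this definition, perfect reproducibility corresponds to a value of 1.0.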
How KOI works #
KOI is the clinical decision support system for anesthesiology developed by HTX. It turns research findings into a usable clinical tool.
The workflow #
Step 1 — Clinical record analysis
KOI receives patient data: medical history, physical examination, diagnostic tests. The AI model analyses the complete clinical picture using a chain-of-thought approach — the same type of structured reasoning that an expert anesthesiologist applies mentally.
Step 2 — ASA-PS classification
The model produces:
- An ASA-PS classification (I to V)
- Detailed clinical reasoning explaining why it chose that class
- A confidence score indicating how certain the model is about the classification
- The key diagnoses that influenced the decision
Every classification is explainable and verifiable. It is not a black box.
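The four outputs listed above map naturally onto a small record type. The following is a hypothetical sketch, not KOI's actual schema: the class name, field names and the 0 to 1 confidence range are assumptions made for illustration.

```python
from dataclasses import dataclass, field

ASA_CLASSES = ("I", "II", "III", "IV", "V")

@dataclass
class AsaProposal:
    asa_class: str        # proposed ASA-PS class, I to V
    reasoning: str        # clinical reasoning behind the class
    confidence: float     # assumed here to be a value between 0 and 1
    key_diagnoses: list = field(default_factory=list)

    def __post_init__(self):
        if self.asa_class not in ASA_CLASSES:
            raise ValueError(f"unknown ASA-PS class: {self.asa_class!r}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")
        if not self.reasoning.strip():
            # A proposal without reasoning cannot be verified by the physician.
            raise ValueError("reasoning must not be empty")

proposal = AsaProposal(
    asa_class="III",
    reasoning="Poorly controlled diabetes with reduced exercise tolerance.",
    confidence=0.82,
    key_diagnoses=["type 2 diabetes", "hypertension"],
)
print(proposal.asa_class, proposal.confidence)
```

Rejecting a proposal with empty reasoning at construction time is one way to enforce the "not a black box" requirement in code.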
Step 3 — Physician decision
The anesthesiologist:
- Reads the proposed classification and the reasoning
- Compares it with their own clinical judgement
- Decides whether to accept, modify or investigate further
KOI is a human-in-the-loop system: it supports the physician’s decision; it does not replace it.
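The three possible outcomes of the physician's review can be modelled so that the model's proposal never becomes the class of record on its own. This is a hypothetical sketch of that design idea, with names (`Decision`, `Review`, `final_class`) invented for illustration; it does not describe KOI's internals.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Decision(Enum):
    ACCEPT = "accept"
    MODIFY = "modify"
    INVESTIGATE = "investigate"

@dataclass
class Review:
    proposed_class: str                    # class suggested by the model
    decision: Decision                     # what the anesthesiologist chose
    physician_class: Optional[str] = None  # required when decision is MODIFY

def final_class(review):
    """Class of record, or None while the case is still under investigation."""
    if review.decision is Decision.ACCEPT:
        return review.proposed_class
    if review.decision is Decision.MODIFY:
        if review.physician_class is None:
            raise ValueError("a modified review needs the physician's class")
        return review.physician_class
    return None  # INVESTIGATE: no class is recorded yet

print(final_class(Review("III", Decision.MODIFY, physician_class="II")))
```

Because `final_class` requires an explicit `Decision`, there is no code path in this sketch where a proposal is recorded without a physician's action.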
Architecture: data never leaves the hospital #
KOI runs on PRISMA, the private AI infrastructure by HTX:
- On-premise: the model runs inside the hospital
- End-to-end encryption: data is protected in transit and at rest
- No data sent externally: not even metadata
- GDPR and AI Act compliant by design
In a sector where 38.4% of LLM studies fail to implement adequate data protections, KOI is designed for privacy from the ground up.
Regulatory status and roadmap #
Research Use Only (RUO) #
KOI is currently classified as Research Use Only — usable for:
- Clinical research
- Scientific validation
- Observational studies
- Education and training
It is not usable for clinical diagnostic practice.
Path to medical device certification #
HTX is following a structured certification path:
| Milestone | Timeline |
|---|---|
| Validation study (20 cases, 11 LLMs) | Completed |
| ISO 13485 certification (quality management system) | In progress |
| IEC 62304 certification (medical device software) | In progress |
| Clinical validation with Ospedale del Quadrante | Dec 2025 – Nov 2026 |
| Medical device marking | Planned 2027 |
The funded project #
KOI stems from the “ASA-PS Classification” project, funded by the Friuli Venezia Giulia Region (LR 22/2022, art. 7: support for TRL 6–8 validation projects; EUR 90,000 grant). In collaboration with the Ospedale del Quadrante (Ramsay Santé), the project is clinically validating the AI system for ASA-PS classification from December 2025 to November 2026.
Why AI in anesthesiology is different #
AI in medicine is often associated with excessive promises. ASA-PS classification is a different case, for three reasons:
1. The problem is well-defined #
ASA-PS classification has clear criteria, published case studies and established benchmarks. We are not asking AI to “diagnose cancer” — we are asking it to classify a patient on a standardised scale, using structured information.
2. Human error is documented and frequent #
The roughly 30% inter-observer variability is not disputed: it has been published and replicated. AI does not need to be perfect; it needs to be more consistent than physicians. At 97.5% accuracy on the benchmark, against 77% for the human baseline, it is.
3. The human-in-the-loop model mitigates risk #
KOI does not decide: it proposes. The physician always has the final say. The system adds an objective second opinion — like having an expert colleague always available.
Who is KOI for #
KOI is relevant for:
- Hospitals looking to reduce variability in preoperative assessment
- Hospital groups seeking consistency across different facilities
- Researchers in anesthesiology studying AI clinical decision support
- Universities training anesthesia residents
If you are interested in a research collaboration or an RUO pilot, contact us.
This article was written by the HTX team — Human Technology eXcellence. KOI is currently Research Use Only (RUO). The information in this article is for informational purposes and does not constitute medical advice.
Frequently asked questions #
What is the ASA-PS classification?
The ASA-PS classification (American Society of Anesthesiologists Physical Status) is the international standard for assessing preoperative anesthetic risk. It classifies patients from ASA I (healthy) to ASA VI (brain-dead organ donor), guiding decisions on anesthesia type and monitoring.
How accurate is AI in ASA-PS classification?
In the HTX study, advanced reasoning models (GPT-o3, Claude Sonnet, DeepSeek R1) achieved 97.5% accuracy on 20 standardised clinical cases. Compared with physicians, the average error drops from 2.3 to 0.25 misclassifications per 10 cases.
Does KOI replace the anesthesiologist?
No. KOI is a human-in-the-loop clinical decision support system. It proposes a classification with detailed clinical reasoning, but the final decision always rests with the physician. KOI supports the physician; it does not replace them.
Is KOI a medical device?
KOI is currently classified as Research Use Only (RUO) — usable for research and clinical validation, not for diagnostic practice. HTX is pursuing ISO 13485 and IEC 62304 certification, with medical device marking planned for 2027.
Can I use KOI in my hospital?
Yes, for research and clinical validation activities (RUO). KOI runs on PRISMA, which can be installed on-premise, so clinical data never leaves the hospital. The project is being developed with the Ospedale del Quadrante (Ramsay Santé).