Methodological Compliance for the Use of Loon AI® in Evidence Synthesis: How Loon's Confidence Calibration Meets NICE, CDA-AMC, and ISPOR Auditing Requirements

Technical Brief for Market Access and HEOR professionals: how Loon AI®'s confidence-calibrated technology supports compliant HTA submissions.
November 24, 2025 5 min read By Mara Rada


Introduction: The Mandate for Trustworthy AI in HTA Submissions

For HEOR professionals, the adoption of Large Language Models (LLMs) in evidence synthesis presents a paradox: immense efficiency gains versus compliance uncertainty. While AI can revolutionize manual Systematic Literature Review (SLR) processes—demonstrably reducing human review timelines from over 6,000 person-hours to approximately 85 hours for complex reviews—generic models are plagued by uncalibrated confidence and the risk of generating factually incorrect data (hallucination).1

This fundamental technical instability is the source of regulatory anxiety. Global HTA bodies, including the National Institute for Health and Care Excellence (NICE), the Canadian Drug Agency (CDA-AMC), and ISPOR, have established rigorous frameworks to govern AI use. This brief outlines how the Loon AI® confidence-calibrated technology, based on peer-reviewed validation, is specifically engineered to satisfy the explicit requirements of these bodies, thereby de-risking HTA submissions and ensuring methodological rigour.

I. The Foundational Requirement: Quantifiable Trust and Accuracy

The core technical failure of generic LLMs lies in their inability to provide reliable Uncertainty Quantification (UQ). They often report high confidence even when their answers are incorrect, making human oversight ineffective.2 Compliance with HTA standards demands that AI tools not just perform, but prove their reliability.

Loon addresses this through a validated, confidence-calibrated methodology, published in Value in Health (2025). The system provides a quantitative confidence score for every screening decision, which is then used to structure the Human-in-the-Loop (HITL) workflow.
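As a minimal sketch of what confidence-guided HITL routing can look like in practice, the Python snippet below sends each screening decision either to an auto-accept queue or to human review based on its calibrated confidence score. The threshold value, field names, and data structures are illustrative assumptions for this brief, not Loon's published interface.

```python
from dataclasses import dataclass

# Illustrative cutoff; the validated workflow routes roughly the
# lowest-confidence ~5.8% of citations rather than a fixed probability.
CONFIDENCE_THRESHOLD = 0.80  # assumption for demonstration only


@dataclass
class ScreeningDecision:
    citation_id: str
    include: bool        # model's include/exclude verdict
    confidence: float    # calibrated confidence score in [0, 1]
    rationale: str       # model's stated reason for the verdict


def route_decisions(decisions: list[ScreeningDecision]):
    """Split decisions into auto-accepted and human-review queues."""
    auto, human = [], []
    for d in decisions:
        (auto if d.confidence >= CONFIDENCE_THRESHOLD else human).append(d)
    return auto, human


decisions = [
    ScreeningDecision("PMID:001", True, 0.97, "Meets PICO criteria"),
    ScreeningDecision("PMID:002", False, 0.62, "Population unclear"),
]
auto, human = route_decisions(decisions)
print(f"{len(human)} of {len(decisions)} decisions routed to human review")
```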

Peer-Reviewed Compliance Metrics

The validation confirmed that by routing only low-confidence outputs for human review (≤5.8% of citations), the overall methodological quality of the SLR reached a superior level, satisfying HTA requirements for rigour.

| Performance Metric | Fully Automated (Baseline) | Calibrated HITL Performance (Post-Routing of ≤5.8% of Outputs) | Methodological Justification |
| --- | --- | --- | --- |
| Augmented Accuracy | 95.5% | 99.0% | Exceeds typical clinical data accuracy standards, eliminating data quality concerns. |
| Augmented Sensitivity (Recall) | 98.9% | 99.0% | Guarantees methodological completeness, mitigating the primary risk of excluding critical evidence. |
| Augmented Precision (PPV) | 63.0% | 89.9% | Confirms the effective triage of human expert effort, validating the tool's efficiency gain. |
| Negative Predictive Value (NPV) | 99.9% | N/A | Provides assurance that rejected evidence is truly non-relevant, strengthening the systematic nature of the review. |
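For readers who want to see how these four metrics relate to one another, the sketch below computes them from standard confusion-matrix counts. The counts used in the example are invented for illustration; they are not the validation study's data.

```python
def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard confusion-matrix metrics used in screening validation."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # recall: included studies found
        "precision": tp / (tp + fp),     # PPV: flagged studies truly relevant
        "npv": tn / (tn + fn),           # rejected studies truly non-relevant
    }


# Invented counts for illustration only (not the Value in Health data).
print(screening_metrics(tp=180, fp=20, tn=9790, fn=10))
```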

II. Regulatory Compliance: Meeting HTA Mandates

The Loon system is architecturally designed to provide the specific evidence required by major regulatory and scientific organizations to validate the use of AI in evidence synthesis.

2.1. NICE & CDA-AMC: Human Augmentation and Auditability

Both the NICE Position Statement 3 and the CDA-AMC Guidelines 4 require that AI must "augment human involvement, not replace it" and that users must report the risks (e.g., bias) and the steps taken to address them.

| HTA Mandate | Loon Compliance Mechanism | Technical Execution |
| --- | --- | --- |
| Augmentation, Not Replacement (NICE) | 100%, 20%, or confidence-guided sub-6% human review thresholds | The validated system restricts mandatory human oversight to the approximately 5% of decisions where confidence is lowest, ensuring expert time is focused on the highest-risk data, thereby fulfilling the HITL mandate. |
| Bias Mitigation (CDA-AMC) | Unsupervised/zero-shot methodology | The agentic AI performs screening based only on the provided inclusion/exclusion criteria, without prior training on potentially biased reviewer data. This zero-shot approach mitigates the risk of algorithmic bias in study selection.5 |
| Transparency and Auditing | Structured confidence logging | Every decision includes a confidence score and rationale, creating an auditable, progressive trail. This logging supports the CDA-AMC requirement to report and mitigate risks.4 |
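A hypothetical sketch of how zero-shot screening and structured confidence logging might fit together: the inclusion/exclusion criteria are supplied directly in the prompt rather than learned from labeled reviewer data, and every decision is appended to an audit log. The `call_llm` function is a stand-in for whatever model endpoint is used; nothing here reflects Loon's internal implementation.

```python
import json
import time


def call_llm(prompt: str) -> dict:
    # Stand-in for a real model endpoint; returns a canned response
    # so the sketch runs end-to-end.
    return {"include": True, "confidence": 0.93, "rationale": "Meets criteria"}


def screen_citation(abstract: str, criteria: str, log_path: str) -> dict:
    # Zero-shot: the decision is driven only by the stated criteria,
    # with no fine-tuning on prior reviewers' labels.
    prompt = (
        f"Inclusion/exclusion criteria:\n{criteria}\n\n"
        f"Abstract:\n{abstract}\n\n"
        "Return JSON with fields: include (bool), "
        "confidence (0-1), rationale (string)."
    )
    decision = call_llm(prompt)

    # Structured confidence logging: one append-only record per decision,
    # supporting the audit-trail expectations described above.
    record = {"timestamp": time.time(), "criteria": criteria, **decision}
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return decision


decision = screen_citation(
    abstract="Randomized trial of drug X in adults with condition Y...",
    criteria="Include RCTs in adults; exclude animal studies.",
    log_path="screening_audit.jsonl",
)
print(decision)
```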

2.2. ISPOR ELEVATE-AI LLMs Framework: Mitigating Hallucination Risk

The ISPOR ELEVATE-AI framework addresses the core technical flaws of LLMs in HEOR, specifically the challenges of hallucination, data inaccuracy, and the need for rigorous reporting.6

| ISPOR ELEVATE-AI Challenge | Loon Compliance Mechanism | Technical Execution |
| --- | --- | --- |
| Hallucination Risk | Quantitative confidence calibration | The peer-reviewed validation demonstrates that the model accurately discriminates reliable outputs (99.0% accuracy), directly countering the primary risk cited by ISPOR. |
| Need for Traceability | Verifiable source grounding | All data extractions and decisions are tied to their original source document, ensuring the information is traceable and verifiable, as required for transparent HEOR reporting.7 |
| Reporting Guidance | Integration of validation data | The platform's outputs are supported by publicly available, peer-reviewed validation metrics, simplifying completion of the ELEVATE-GenAI checklist for HEOR teams. |
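One way to represent verifiable source grounding is to attach provenance fields to every extracted value, so that any number cited in a dossier can be traced back to its source document. The record structure and field names below are assumptions made for illustration, not Loon's data model.

```python
from dataclasses import dataclass, asdict


@dataclass
class GroundedExtraction:
    """An extracted data point carrying its provenance, so the value
    can be traced back to and verified against the source document."""
    value: str            # the extracted datum, e.g. an effect size
    field: str            # what the datum represents
    source_doc: str       # identifier of the source publication
    location: str         # where in the document it was found
    confidence: float     # calibrated confidence in the extraction


# Hypothetical example record; the PMID is a placeholder.
extraction = GroundedExtraction(
    value="HR 0.72 (95% CI 0.60-0.86)",
    field="overall survival hazard ratio",
    source_doc="PMID:0000000",
    location="Results, Table 2",
    confidence=0.95,
)
print(asdict(extraction))
```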

III. The Business Case: De-Risking Submissions and Optimizing Resources

For Directors and VPs making purchasing decisions, the adoption of this technology hinges on three factors: compliance assurance, cost efficiency, and time savings.

  1. Compliance Assurance: By adopting a peer-reviewed, confidence-calibrated solution, the HEOR department gains quantifiable evidence (99.0% accuracy, 99.9% NPV) to defend against regulatory challenges regarding AI usage. This directly protects the integrity of market access dossiers.

  2. Resource Optimization: The validated ability to focus expert review only on the lowest-confidence decisions achieves unparalleled efficiency. This optimization frees up Methodologists and clinicians from low-value screening, reallocating their time to critical tasks like economic modeling and interpretation.8

  3. Time-to-Market Acceleration: The reduction of a complex systematic review's workload from over 6,000 person-hours to approximately 85 hours provides an immediate and material advantage in compressing evidence generation timelines, which is critical for supporting accelerated submissions and achieving faster time-to-market.8

IV. Conclusion: A De-Risked and Fast-tracked Path to Market Access

The strategic advantage of adopting Loon's confidence-calibrated AI is the removal of the methodological and regulatory risk associated with generic LLMs or unvalidated, uncalibrated tools. The high-stakes environment of HTA demands validation and the highest scientific rigour. The peer-reviewed metrics confirm that Loon's technology provides the quantitative confidence necessary to satisfy the global regulatory standards for quality, transparency, and human oversight.

Loon's pioneering confidence-calibrated methodology, combined with peer-reviewed validation, provides the technical and scientific foundation for full compliance with the requirements of NICE, CDA-AMC, and ISPOR. It gives the biopharmaceutical industry the first commercially validated technological pathway to move AI from an exploratory tool to a verified, core component of the evidence generation function.

References

  1. Janoudi, G., et al. Validating an Agentic Artificial Intelligence Abstract Screener across 8 Reviews Showed 99% Recall, Calibrated Confidence Scores, and a Sub-6% Human Check, Lifting Precision to 90%. Value in Health, 2025. (Source for 99.0% accuracy/sensitivity, sub-6% human review threshold, and core efficiency data).

  2. Azam, F., et al. Evaluating the Confidence Levels of Large Language Models in Answering Medical Questions: A Multi-Specialty Analysis. Journal of Medical Internet Research, 2025. (Addresses the paradox of worse-performing LLMs exhibiting higher confidence).

  3. National Institute for Health and Care Excellence (NICE). Position Statement: Use of AI in Evidence Generation. November 2024.

  4. Canada’s Drug Agency (CDA-AMC). Position Statement on the Use of AI Methods in Health Technology Assessment. April 2025.

  5. Janoudi, G., et al. Evaluating Loon Lens Pro, an AI-Driven Tool for Full-Text Screening in Systematic Reviews: A Validation Study. medRxiv, 2025. (Describes the zero-shot/unsupervised methodology for bias mitigation).

  6. ISPOR Working Group on Generative AI. The ELEVATE-AI LLMs Framework: An Evaluation Framework for Use of Large Language Models in HEOR. ISPOR Working Group Report, 2025.

  7. ISPOR Working Group on Generative AI. ELEVATE-GenAI: Reporting Guidelines for the Use of Large Language Models in Health Economics and Outcomes Research. Value in Health, 2025.

  8. Al-Najjar, A., et al. The Evolving Role of AI in Systematic Literature Reviews: From Automation to Augmentation. Journal of Medical Systems, 2024. (Addresses the assistive role of AI and enabling researchers to focus on complex tasks/optimization).
