Methodological Compliance for the Use of Loon AI® in Evidence Synthesis: How Loon's Confidence Calibration Meets NICE, CDA-AMC, and ISPOR Auditing Requirements

Technical Brief for Market Access and HEOR professionals: how Loon AI®'s confidence-calibrated technology supports compliant HTA submissions.
November 24, 2025 5 min read By Mara Rada


Introduction: The Mandate for Trustworthy AI in HTA Submissions

For HEOR professionals, the adoption of Large Language Models (LLMs) in evidence synthesis presents a paradox: immense efficiency gains versus compliance uncertainty. While AI can revolutionize manual Systematic Literature Review (SLR) processes—demonstrably reducing human review timelines from over 6,000 person-hours to approximately 85 hours for complex reviews—generic models are plagued by uncalibrated confidence and the risk of generating factually incorrect data (hallucination).1

This fundamental technical instability is the source of regulatory anxiety. Global HTA bodies, including the National Institute for Health and Care Excellence (NICE), the Canadian Drug Agency (CDA-AMC), and ISPOR, have established rigorous frameworks to govern AI use. This brief outlines how the Loon AI® confidence-calibrated technology, based on peer-reviewed validation, is specifically engineered to satisfy the explicit requirements of these bodies, thereby de-risking HTA submissions and ensuring methodological rigour.

I. The Foundational Requirement: Quantifiable Trust and Accuracy

The core technical failure of generic LLMs lies in their inability to provide reliable Uncertainty Quantification (UQ). They often report high confidence even when their answers are incorrect, making human oversight ineffective.2 Compliance with HTA standards demands that AI tools not just perform, but prove their reliability.

Loon addresses this through a validated, confidence-calibrated methodology, published in Value in Health (2025). The system provides a quantitative confidence score for every screening decision, which is then used to structure the Human-in-the-Loop (HITL) workflow.
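As a minimal sketch of what confidence-guided HITL routing can look like in practice, the Python snippet below sends each screening decision either to an auto-accept queue or to human review based on its calibrated confidence score. The threshold value, field names, and data structures are illustrative assumptions for this brief, not Loon's published interface.

```python
from dataclasses import dataclass

# Illustrative cutoff; the validated workflow routes roughly the
# lowest-confidence ~5.8% of citations rather than a fixed probability.
CONFIDENCE_THRESHOLD = 0.80  # assumption for demonstration only


@dataclass
class ScreeningDecision:
    citation_id: str
    include: bool        # model's include/exclude verdict
    confidence: float    # calibrated confidence score in [0, 1]
    rationale: str       # model's stated reason for the verdict


def route_decisions(decisions: list[ScreeningDecision]):
    """Split decisions into auto-accepted and human-review queues."""
    auto, human = [], []
    for d in decisions:
        (auto if d.confidence >= CONFIDENCE_THRESHOLD else human).append(d)
    return auto, human


decisions = [
    ScreeningDecision("PMID:001", True, 0.97, "Meets PICO criteria"),
    ScreeningDecision("PMID:002", False, 0.62, "Population unclear"),
]
auto, human = route_decisions(decisions)
print(f"{len(human)} of {len(decisions)} decisions routed to human review")
```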

Peer-Reviewed Compliance Metrics

The validation confirmed that by routing only low-confidence outputs for human review (≤5.8% of citations), the overall methodological quality of the SLR reached a superior level, satisfying HTA requirements for rigour.

| Performance Metric | Fully Automated (Baseline) | Calibrated HITL Performance (Post-Routing of ≤5.8% of Outputs) | Methodological Justification |
| --- | --- | --- | --- |
| Augmented Accuracy | 95.5% | 99.0% | Exceeds typical clinical data accuracy standards, eliminating data quality concerns. |
| Augmented Sensitivity (Recall) | 98.9% | 99.0% | Guarantees methodological completeness, mitigating the primary risk of excluding critical evidence. |
| Augmented Precision (PPV) | 63.0% | 89.9% | Confirms the effective triage of human expert effort, validating the tool's efficiency gain. |
| Negative Predictive Value (NPV) | 99.9% | N/A | Provides assurance that rejected evidence is truly non-relevant, strengthening the systematic nature of the review. |
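For readers who want to see how these four metrics relate to one another, the sketch below computes them from standard confusion-matrix counts. The counts used in the example are invented for illustration; they are not the validation study's data.

```python
def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard confusion-matrix metrics used in screening validation."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # recall: included studies found
        "precision": tp / (tp + fp),     # PPV: flagged studies truly relevant
        "npv": tn / (tn + fn),           # rejected studies truly non-relevant
    }


# Invented counts for illustration only (not the Value in Health data).
print(screening_metrics(tp=180, fp=20, tn=9790, fn=10))
```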

II. Regulatory Compliance: Meeting HTA Mandates

The Loon system is architecturally designed to provide the specific evidence required by major regulatory and scientific organizations to validate the use of AI in evidence synthesis.

2.1. NICE & CDA-AMC: Human Augmentation and Auditability

Both the NICE Position Statement 3 and the CDA-AMC Guidelines 4 require that AI must "augment human involvement, not replace it" and that users must report the risks (e.g., bias) and the steps taken to address them.

| HTA Mandate | Loon Compliance Mechanism | Technical Execution |
| --- | --- | --- |
| Augmentation, Not Replacement (NICE) | 100%, 20%, or confidence-guided sub-6% human review thresholds | The validated system restricts mandatory human oversight to the approximately 5% of decisions where confidence is lowest, ensuring expert time is focused on the highest-risk data, thereby fulfilling the HITL mandate. |
| Bias Mitigation (CDA-AMC) | Unsupervised/zero-shot methodology | The agentic AI performs screening based only on the provided inclusion/exclusion criteria, without prior training on potentially biased reviewer data. This zero-shot approach mitigates the risk of algorithmic bias in study selection.5 |
| Transparency and Auditing | Structured confidence logging | Every decision includes a confidence score and rationale, creating an auditable, progressive trail. This logging supports the CDA-AMC requirement to report and mitigate risks.4 |
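A hypothetical sketch of how zero-shot screening and structured confidence logging might fit together: the inclusion/exclusion criteria are supplied directly in the prompt rather than learned from labeled reviewer data, and every decision is appended to an audit log. The `call_llm` function is a stand-in for whatever model endpoint is used; nothing here reflects Loon's internal implementation.

```python
import json
import time


def call_llm(prompt: str) -> dict:
    # Stand-in for a real model endpoint; returns a canned response
    # so the sketch runs end-to-end.
    return {"include": True, "confidence": 0.93, "rationale": "Meets criteria"}


def screen_citation(abstract: str, criteria: str, log_path: str) -> dict:
    # Zero-shot: the decision is driven only by the stated criteria,
    # with no fine-tuning on prior reviewers' labels.
    prompt = (
        f"Inclusion/exclusion criteria:\n{criteria}\n\n"
        f"Abstract:\n{abstract}\n\n"
        "Return JSON with fields: include (bool), "
        "confidence (0-1), rationale (string)."
    )
    decision = call_llm(prompt)

    # Structured confidence logging: one append-only record per decision,
    # supporting the audit-trail expectations described above.
    record = {"timestamp": time.time(), "criteria": criteria, **decision}
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return decision


decision = screen_citation(
    abstract="Randomized trial of drug X in adults with condition Y...",
    criteria="Include RCTs in adults; exclude animal studies.",
    log_path="screening_audit.jsonl",
)
print(decision)
```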

2.2. ISPOR ELEVATE-AI LLMs Framework: Mitigating Hallucination Risk

The ISPOR ELEVATE-AI framework addresses the core technical flaws of LLMs in HEOR, specifically the challenges of hallucination, data inaccuracy, and the need for rigorous reporting.6

| ISPOR ELEVATE-AI Challenge | Loon Compliance Mechanism | Technical Execution |
| --- | --- | --- |
| Hallucination Risk | Quantitative confidence calibration | The peer-reviewed validation demonstrates that the model accurately discriminates reliable outputs (99.0% accuracy), directly countering the primary risk cited by ISPOR. |
| Need for Traceability | Verifiable source grounding | All data extractions and decisions are tied to their original source document, ensuring the information is traceable and verifiable, as required for transparent HEOR reporting.7 |
| Reporting Guidance | Integration of validation data | The platform's outputs are supported by publicly available, peer-reviewed validation metrics, simplifying completion of the ELEVATE-GenAI checklist for HEOR teams. |
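One way to represent verifiable source grounding is to attach provenance fields to every extracted value, so that any number cited in a dossier can be traced back to its source document. The record structure and field names below are assumptions made for illustration, not Loon's data model.

```python
from dataclasses import dataclass, asdict


@dataclass
class GroundedExtraction:
    """An extracted data point carrying its provenance, so the value
    can be traced back to and verified against the source document."""
    value: str            # the extracted datum, e.g. an effect size
    field: str            # what the datum represents
    source_doc: str       # identifier of the source publication
    location: str         # where in the document it was found
    confidence: float     # calibrated confidence in the extraction


# Hypothetical example record; the PMID is a placeholder.
extraction = GroundedExtraction(
    value="HR 0.72 (95% CI 0.60-0.86)",
    field="overall survival hazard ratio",
    source_doc="PMID:0000000",
    location="Results, Table 2",
    confidence=0.95,
)
print(asdict(extraction))
```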

III. The Business Case: De-Risking Submissions and Optimizing Resources

For Directors and VPs making purchasing decisions, the adoption of this technology hinges on three factors: compliance assurance, cost efficiency, and time savings.

  1. Compliance Assurance: By adopting a peer-reviewed, confidence-calibrated solution, the HEOR department gains quantifiable evidence (99.0% accuracy, 99.9% NPV) to defend against regulatory challenges regarding AI usage. This directly protects the integrity of market access dossiers.

  2. Resource Optimization: The validated ability to focus expert review only on the lowest-confidence decisions achieves unparalleled efficiency. This optimization frees up Methodologists and clinicians from low-value screening, reallocating their time to critical tasks like economic modeling and interpretation.8

  3. Time-to-Market Acceleration: The reduction of a complex systematic review's workload from over 6,000 person-hours to approximately 85 hours provides an immediate and material advantage in compressing evidence generation timelines, which is critical for supporting accelerated submissions and achieving faster time-to-market.8

IV. Conclusion: A De-Risked and Fast-tracked Path to Market Access

The strategic advantage of adopting Loon's confidence-calibrated AI is the removal of the methodological and regulatory risk associated with generic LLMs or unvalidated, uncalibrated tools. The high-stakes environment of HTA demands validation and the highest scientific rigour. The peer-reviewed metrics confirm that Loon's technology provides the quantitative confidence necessary to satisfy the global regulatory standards for quality, transparency, and human oversight.

Loon's pioneering confidence-calibrated methodology, combined with peer-reviewed validation, provides the technical and scientific foundation for full compliance with the requirements of NICE, CDA-AMC, and ISPOR. It gives the biopharmaceutical industry the first commercially validated technological pathway to move AI from an exploratory tool to a verified, core component of the evidence generation function.

References

  1. Janoudi, G., et al. Validating an Agentic Artificial Intelligence Abstract Screener across 8 Reviews Showed 99% Recall, Calibrated Confidence Scores, and a Sub-6% Human Check, Lifting Precision to 90%. Value in Health, 2025. (Source for 99.0% accuracy/sensitivity, sub-6% human review threshold, and core efficiency data).

  2. Azam, F., et al. Evaluating the Confidence Levels of Large Language Models in Answering Medical Questions: A Multi-Specialty Analysis. Journal of Medical Internet Research, 2025. (Addresses the paradox of worse-performing LLMs exhibiting higher confidence).

  3. National Institute for Health and Care Excellence (NICE). Position Statement: Use of AI in Evidence Generation. November 2024.

  4. Canada’s Drug Agency (CDA-AMC). Position Statement on the Use of AI Methods in Health Technology Assessment. April 2025.

  5. Janoudi, G., et al. Evaluating Loon Lens Pro, an AI-Driven Tool for Full-Text Screening in Systematic Reviews: A Validation Study. medRxiv, 2025. (Describes the zero-shot/unsupervised methodology for bias mitigation).

  6. ISPOR Working Group on Generative AI. The ELEVATE-AI LLMs Framework: An Evaluation Framework for Use of Large Language Models in HEOR. ISPOR Working Group Report, 2025.

  7. ISPOR Working Group on Generative AI. ELEVATE-GenAI: Reporting Guidelines for the Use of Large Language Models in Health Economics and Outcomes Research. Value in Health, 2025.

  8. Al-Najjar, A., et al. The Evolving Role of AI in Systematic Literature Reviews: From Automation to Augmentation. Journal of Medical Systems, 2024. (Addresses the assistive role of AI and enabling researchers to focus on complex tasks/optimization).
