Foundation Model Risks: Unique Challenges of Large Pre-Trained Models (2026)
Foundation Model Risks
Foundation model risks include emergent capabilities, dual-use concerns, evaluation difficulty, provider concentration, environmental impact, and systemic risk classification.
Regulatory Requirements
Article 15 of the EU AI Act requires high-risk AI systems to achieve appropriate levels of accuracy, robustness, and cybersecurity. Article 9 requires ongoing risk identification and mitigation. The NIST AI RMF provides complementary technical guidance. ISO/IEC 42001 Annex C lists potential AI-related organisational objectives and risk sources.
Risk Assessment Process
- Identify the specific technical risk area relevant to your system
- Map the risk to applicable regulatory requirements
- Assess probability and severity in your deployment context
- Evaluate existing technical controls and effectiveness
- Implement additional mitigation where gaps exist
- Document assessment, controls, and residual risk
- Establish continuous monitoring for the identified risk
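The steps above can be captured in a simple risk register. The following is a minimal sketch only: the 1 to 5 scales, the `control_effectiveness` factor, and the tolerance value are illustrative assumptions, not a scoring scheme prescribed by the EU AI Act, NIST, or ISO.

```python
from dataclasses import dataclass


@dataclass
class RiskEntry:
    """One row of a hypothetical risk register."""
    risk_area: str                # e.g. "distribution shift"
    requirement: str              # regulatory mapping, e.g. "EU AI Act Art. 15"
    probability: int              # assumed scale: 1 (rare) .. 5 (almost certain)
    severity: int                 # assumed scale: 1 (negligible) .. 5 (critical)
    control_effectiveness: float  # 0.0 (no control) .. 1.0 (fully effective)

    def inherent_score(self) -> int:
        # Risk before controls: probability times severity.
        return self.probability * self.severity

    def residual_score(self) -> float:
        # Risk remaining after existing controls are taken into account.
        return self.inherent_score() * (1.0 - self.control_effectiveness)


def needs_mitigation(entry: RiskEntry, tolerance: float = 6.0) -> bool:
    """Flag entries whose residual risk exceeds the (assumed) tolerance."""
    return entry.residual_score() > tolerance


register = [
    RiskEntry("distribution shift", "EU AI Act Art. 15", 4, 3, 0.25),
    RiskEntry("hallucination", "EU AI Act Art. 15", 3, 4, 0.75),
]
flagged = [e.risk_area for e in register if needs_mitigation(e)]
print(flagged)  # ['distribution shift']
```

Keeping the inherent and residual scores separate makes it easier to document what each control contributes, which is the information the documentation step asks for.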
Documentation Requirements
Technical risk management must be documented as part of the risk management system (Article 9) and technical documentation (Article 11, Annex IV). Documentation should cover risk identification methodology, assessment results, mitigation measures, residual risk levels, and monitoring procedures. It must be kept for 10 years after the system is placed on the market or put into service (Article 18).
Technical Risk Management for AI Systems
Technical risks in AI systems differ fundamentally from those in traditional software. Conventional software follows deterministic logic: given the same input, it produces the same output, and failures typically manifest as crashes or incorrect calculations traceable to specific code. AI systems, by contrast, learn statistical patterns from data, making their behaviour inherently probabilistic, context-dependent, and often difficult to predict at the boundaries of their training distribution.
This probabilistic nature creates unique risk categories. Model hallucination produces plausible but fabricated outputs. Distribution shift causes performance degradation when real-world data diverges from training data. Adversarial inputs can cause targeted misclassification through small, often imperceptible perturbations. Emergent capabilities in large models can produce behaviours that were not observed during testing. These risks require specialised assessment and mitigation approaches that go beyond traditional software quality assurance.
Article 15 Requirements
The EU AI Act recognises these unique characteristics. Article 15 requires high-risk AI systems to achieve appropriate levels of accuracy, robustness, and cybersecurity throughout their lifecycle. Accuracy levels and the metrics used to measure them must be declared in the instructions for use. Robustness must address errors, faults, and inconsistencies within the system or its operating environment. Cybersecurity must protect against attacks exploiting AI-specific vulnerabilities. These are not aspirational goals but enforceable requirements that take effect on the Act's phased application dates.
ISO/IEC 42001 Annex C lists potential AI-related organisational objectives and risk sources. Organisations can use it as a starting point for their risk identification process, grouping what they find into data-related risks (quality, bias, drift), model-related risks (accuracy, robustness, interpretability), and system-related risks (integration, deployment, monitoring).
Testing and Validation
Testing AI systems requires approaches beyond traditional software testing. Unit tests and integration tests remain necessary for the software components, but the learned model requires additional validation. This includes performance evaluation on held-out test sets, stress testing on edge cases and out-of-distribution inputs, adversarial robustness testing, fairness evaluation across demographic groups, and A/B testing in controlled production environments.
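As one illustration of fairness evaluation across demographic groups, the sketch below computes accuracy per group and the largest gap between groups. The data, the group labels, and the choice of accuracy as the fairness metric are assumptions for illustration; a real evaluation would pick metrics suited to the use case.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: accuracy} for parallel lists of labels, predictions, and groups."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {g: correct[g] / total[g] for g in total}


def max_accuracy_gap(per_group):
    """Largest difference in accuracy between any two groups."""
    values = list(per_group.values())
    return max(values) - min(values)


# Toy data: two groups of four examples each.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

per_group = accuracy_by_group(y_true, y_pred, groups)
gap = max_accuracy_gap(per_group)
print(per_group, gap)  # {'a': 0.75, 'b': 0.5} 0.25
```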
The NIST AI RMF MEASURE function provides guidance on metrics and measurement approaches. Key considerations include selecting metrics that are meaningful for the specific use case, establishing acceptable thresholds before testing, ensuring test data is independent from training data, and documenting all testing methodology and results for audit purposes.
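Two of these considerations, fixing thresholds before testing and keeping test data independent of training data, lend themselves to a short sketch. The threshold values, metric names, and record layout below are hypothetical, and the overlap check only catches exact duplicates, not near-duplicates.

```python
import hashlib
import json

# Hypothetical acceptance thresholds, agreed and recorded before any test run.
THRESHOLDS = {"accuracy": 0.90, "recall": 0.85}


def fingerprint(record: dict) -> str:
    """Stable hash of a record, independent of key order."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def leaked_records(train, test):
    """Return test records that also appear verbatim in the training set."""
    train_hashes = {fingerprint(r) for r in train}
    return [r for r in test if fingerprint(r) in train_hashes]


def evaluate(metrics: dict, thresholds: dict = THRESHOLDS) -> dict:
    """Map each metric name to True (meets its threshold) or False."""
    return {name: metrics[name] >= floor for name, floor in thresholds.items()}


train = [{"id": 1, "text": "loan approved"}, {"id": 2, "text": "loan denied"}]
test = [{"id": 3, "text": "loan pending"}, {"text": "loan denied", "id": 2}]

leaks = leaked_records(train, test)
results = evaluate({"accuracy": 0.93, "recall": 0.81})
print(leaks)    # [{'text': 'loan denied', 'id': 2}]
print(results)  # {'accuracy': True, 'recall': False}
```

Storing the leak list and the pass/fail map alongside the test report gives auditors a record of both the methodology and the outcome.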
Monitoring in Production
Production monitoring for AI systems extends beyond traditional application monitoring. In addition to standard infrastructure metrics (availability, latency, error rates), AI-specific monitoring must track:
- Input data distributions, to detect data drift with statistical tests such as the Kolmogorov-Smirnov (KS) test or the Population Stability Index (PSI)
- Prediction distributions, as an early signal of concept drift when ground-truth labels arrive late
- Model performance against ground truth where available, to detect accuracy degradation
- Fairness metrics across protected groups, to detect emerging bias
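To make the drift checks concrete, here is a minimal standard-library sketch of PSI and the two-sample KS statistic for a single numeric feature. The bin count, the `eps` floor for empty bins, and the sample data are assumptions; in practice a library routine such as `scipy.stats.ks_2samp`, which also returns a p-value, may be preferable.

```python
import math
from bisect import bisect_right


def psi(reference, current, bins=10, eps=1e-6):
    """Population Stability Index using equal-width bins fitted on the reference sample."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # guard against a constant reference sample

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        # eps keeps empty bins from causing log(0) or division by zero
        return [max(c / len(values), eps) for c in counts]

    ref_p, cur_p = proportions(reference), proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in set(a) | set(b)
    )


# Toy data: a reference feature and a production sample shifted upwards.
reference = [i / 100 for i in range(100)]
shifted = [0.5 + i / 200 for i in range(100)]

print(psi(reference, reference))        # 0.0, identical distributions
print(psi(reference, shifted) > 0.25)   # True, well past a commonly cited rule of thumb
print(round(ks_statistic(reference, shifted), 2))
```

The 0.25 cut-off used in the example is a widely quoted rule of thumb, not a regulatory value; thresholds should be set for the specific system and documented.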
Automated alerting should trigger investigation workflows when metrics breach defined thresholds. The response procedure should include model performance evaluation, root cause analysis, impact assessment, and a decision on corrective action (retraining, rollback, or manual review). Article 72 requires providers to operate post-market monitoring systems for high-risk AI, making this not just best practice but a regulatory obligation.
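A threshold check of this kind might look like the sketch below. The metric names, limits, and alert structure are hypothetical, and a production setup would normally route alerts through an existing monitoring or incident tool instead of printing them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    metric: str
    limit: float
    alert_when: str  # "above" or "below"


# Response steps taken from the procedure described in the text.
RESPONSE_STEPS = (
    "model performance evaluation",
    "root cause analysis",
    "impact assessment",
    "decision on corrective action (retraining, rollback, or manual review)",
)


def check_metrics(metrics, thresholds):
    """Return one alert per breached threshold or unreported metric."""
    alerts = []
    for t in thresholds:
        value = metrics.get(t.metric)
        if value is None:
            # A metric that stops arriving is treated as an alert in its own right.
            alerts.append({"metric": t.metric, "reason": "metric not reported"})
        elif (t.alert_when == "above" and value > t.limit) or (
            t.alert_when == "below" and value < t.limit
        ):
            alerts.append(
                {"metric": t.metric, "reason": f"{value} is {t.alert_when} {t.limit}"}
            )
    return alerts


thresholds = [
    Threshold("input_psi", 0.25, "above"),
    Threshold("accuracy", 0.90, "below"),
    Threshold("fairness_gap", 0.10, "above"),
]
alerts = check_metrics({"input_psi": 0.31, "accuracy": 0.92}, thresholds)
for alert in alerts:
    print(alert["metric"], "->", alert["reason"], "| first step:", RESPONSE_STEPS[0])
```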
Governance and Accountability
Effective AI risk governance requires clear accountability structures. Designate named individuals responsible for AI risk at board, management, and operational levels. The EU AI Act places primary obligations on providers (those developing AI systems or placing them on the market) and separate obligations on deployers (those using AI in professional contexts). Providers of high-risk systems must maintain a quality management system under Article 17 that encompasses risk management processes, data governance, record-keeping, post-market monitoring, and corrective actions. Deployers have their own duties under Article 26, including using the system according to its instructions for use, assigning human oversight, monitoring its operation, and keeping the automatically generated logs.
Internal accountability should be supported by appropriate training. All personnel involved in AI development, deployment, and oversight should understand the risk framework relevant to their role. This includes not only technical staff but also product managers, legal counsel, procurement teams, and senior management. Regular training updates are necessary as regulatory requirements evolve and organisational AI maturity develops.
Record-Keeping and Audit Readiness
Maintain comprehensive records of all risk management activities. This includes risk identification workshops, assessment results, treatment decisions, monitoring data, incident reports, and periodic reviews. These records serve as evidence of due diligence for regulatory inspections and conformity assessments. Article 12 requires high-risk AI systems to be designed for automatic logging of events during operation, providing a technical audit trail that complements procedural records.
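The automatic logging mentioned above could be approached as structured, append-only event records. The sketch below writes one JSON line per prediction using Python's standard `logging` module; the field names, the system identifier, and the choice to store a hash of the input instead of the raw input are illustrative assumptions, not fields mandated by Article 12.

```python
import hashlib
import io
import json
import logging
from datetime import datetime, timezone


def build_event(system_id, model_version, raw_input, output, operator=None):
    """Assemble one audit event; all field names here are hypothetical."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "system_id": system_id,
        "model_version": model_version,
        # Store a digest so the log can reference the input without copying it.
        "input_sha256": hashlib.sha256(raw_input.encode("utf-8")).hexdigest(),
        "output": output,
        "operator": operator,
    }


def make_audit_logger(stream) -> logging.Logger:
    """Logger that writes one bare JSON line per event to the given stream."""
    logger = logging.getLogger("ai_audit")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep audit events out of the application log
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    return logger


stream = io.StringIO()  # stand-in for a file or log pipeline
audit = make_audit_logger(stream)
event = build_event("credit-scoring", "2.3.1", "applicant payload", "refer to human")
audit.info(json.dumps(event, sort_keys=True))

record = json.loads(stream.getvalue().splitlines()[0])
print(record["system_id"], record["model_version"], record["output"])
```

Recording the model version with every event makes it possible to tie a logged decision back to the documentation and test results for that exact model.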
Prepare for regulatory scrutiny by organising documentation in a readily accessible structure. Under Article 21, providers must supply the information and documentation needed to demonstrate conformity when a national competent authority makes a reasoned request. A well-organised documentation management system that allows rapid retrieval by topic, system, or date significantly reduces the burden of responding to such requests and demonstrates mature governance.
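Retrieval by topic, system, or date can be as simple as a filterable index over document metadata. The following in-memory sketch assumes a hypothetical `DocRecord` layout and file paths; a real deployment would more likely sit on a document management system or database.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DocRecord:
    path: str      # hypothetical storage location
    system: str    # AI system the document relates to
    topic: str     # e.g. "risk-assessment", "testing", "incident"
    created: date


def find(records, system=None, topic=None, since=None, until=None):
    """Return matching records, oldest first; None means 'do not filter on this'."""
    matches = [
        r for r in records
        if (system is None or r.system == system)
        and (topic is None or r.topic == topic)
        and (since is None or r.created >= since)
        and (until is None or r.created <= until)
    ]
    return sorted(matches, key=lambda r: r.created)


index = [
    DocRecord("docs/scoring/risk-2025.pdf", "credit-scoring", "risk-assessment", date(2025, 3, 1)),
    DocRecord("docs/scoring/tests-2025.pdf", "credit-scoring", "testing", date(2025, 4, 12)),
    DocRecord("docs/chat/risk-2025.pdf", "support-chatbot", "risk-assessment", date(2025, 5, 20)),
]

hits = find(index, system="credit-scoring", since=date(2025, 4, 1))
print([r.path for r in hits])  # ['docs/scoring/tests-2025.pdf']
```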
This article is for informational purposes only and does not constitute legal advice. Regulatory requirements change frequently; verify current rules with official sources. Built by Sawai Gyoseishoshi Office, Hiroshima, Japan.