Large language models (LLMs) and generative AI tools are reshaping how organizations operate, but they also introduce a category of risk most existing frameworks were never built to handle.
Hallucinations, data poisoning, prompt injection, regulatory non-compliance, and reputational exposure are now board-level concerns. This guide walks through a practical
AI risk assessment framework grounded in ISO 31000, NIST AI RMF, and EU AI Act principles, so risk professionals can move from vague concern to structured, defensible evaluation. Expect a step-by-step methodology, ready-to-adapt tools, and the key pitfalls that derail AI risk programs before they gain traction.
1. What Makes AI Risk Different From Traditional IT Risk
Traditional IT risk frameworks focus on availability, integrity, and confidentiality: clear, auditable, mostly binary. AI risk does not behave the same way.
Generative models produce probabilistic outputs, which means the same input can generate a different answer on a different day. That non-determinism is the root cause of most governance headaches.
Consider how a large bank using an LLM-powered compliance chatbot discovered, after six months of deployment, that the model was occasionally citing regulatory provisions that had been superseded.
The model was not hallucinating in the dramatic sense. The outputs were plausible, well-formatted, and confidently stated. They were simply wrong.
Traditional UAT and change-management controls would never have caught this because the failure mode does not occur at deployment, but episodically at inference.
Key differentiators that every AI risk assessment must address:
- Non-determinism: outputs vary across identical prompts, making regression testing harder.
- Opacity: LLM internals are black-box; root-cause analysis is limited.
- Data dependency: model risk is inherited from training data quality, bias, and provenance.
- Prompt attack surface: adversarial prompts can override instructions, expose data, or generate harmful content.
- Third-party concentration: most organizations rely on a handful of model providers, creating significant single-point-of-failure risk.
- Speed of change: model versions change faster than policy cycles.
2. Regulatory and Standards Landscape
Any credible AI risk assessment framework must be anchored to the evolving regulatory environment. The three most relevant references in 2024 and 2025 are:
| Standard / Regulation | Scope | Key Risk Implication |
| NIST AI RMF 1.0 | US voluntary framework: Govern, Map, Measure, Manage | Provides governance structure and measurement criteria |
| EU AI Act (2024) | Risk-tiered regulation: unacceptable, high, limited, minimal | High-risk use cases require conformity assessment and audit trails |
| ISO 42001:2023 | ISMS for AI management systems | Management system standard; integrates with ISO 27001 |
| ISO 31000:2018 | General risk management principles and guidelines | Foundation framework; AI risk is a subset of enterprise risk |
| GDPR / CCPA | Data privacy; includes AI-generated profiling decisions | Automated decision-making rights; data subject explainability obligations |
For US-focused organizations, the NIST AI RMF is the most actionable starting point. The EU AI Act matters even for US companies operating or selling services in the European Union. ISO 42001 is worth tracking, especially organizations already certified to ISO 27001.
3. The Five-Phase AI Risk Assessment Framework
This framework adapts the ISO 31000 risk process to the specific characteristics of LLM and generative AI deployments. Each phase maps to concrete activities and outputs.
Phase 1: Establish Context
Define the scope before doing anything else. Unscoped AI risk assessments either miss critical risks or generate noise that paralyzes decision-making.
- Inventory AI use cases: catalog every LLM or generative AI tool in use, including sanctioned tools, shadow AI, and third-party vendor integrations.
- Classify by risk tier: apply EU AI Act or internal risk tier criteria. High-risk use cases (HR screening, credit scoring, health triage) require deeper assessment.
- Map stakeholders: identify data owners, model owners, business process owners, compliance, and end users.
- Define risk appetite: explicit tolerance statements reduce scope-creep and prioritization arguments later.
Related reading: How to Build a Risk Appetite Statement and Enterprise Risk Management Framework.
Phase 2: Risk Identification
Structured identification prevents the common failure mode of only listing what is already top-of-mind. Use at least two of the following techniques:
- Cause-and-effect analysis: map from threat source to risk event to consequence.
- Red-teaming: adversarial prompt testing to surface prompt injection, jailbreaking, and data leakage scenarios.
- Control gap analysis: compare existing controls against NIST AI RMF subcategories.
- Third-party risk review: assess model provider security posture, SLAs, data processing terms, and API rate-limit risks.
- Review of AI incident databases: the AI Incident Database (AIID) and AVID catalog real-world failures across sectors.
Phase 3: Risk Analysis
Analyze identified risks across two dimensions: likelihood and impact. Supplement qualitative scoring with scenario analysis where stakes are high.
Quantification note: where possible, attach financial or operational metrics to impact scores. A hallucination in a low-stakes customer FAQ is a reputational nuisance.
A hallucination in a clinical decision support system is a patient safety event with regulatory consequence. The same risk category, different materiality.
Phase 4: Risk Evaluation and Treatment
Evaluate each risk against appetite thresholds. Assign one of four treatment options: avoid, reduce, transfer, or accept.
Generative AI introduces a fifth treatment option worth naming explicitly: constrain, meaning architectural or guardrail-level controls that limit what the model can do or say.
Phase 5: Monitor, Review, and Report
AI risk is not a point-in-time assessment. Model behavior drifts, providers update base models without notice, and the regulatory landscape shifts quarterly. Establish:
- Periodic re-assessment cadence (minimum annual; quarterly for high-risk use cases).
- KRI monitoring with automated alerts where feasible.
- Model performance dashboards surfacing accuracy, drift, and anomalous output volume.
- Board-level reporting: a one-page AI risk summary in the quarterly risk report.
4. AI Risk Categories: Taxonomy and Examples
Structured taxonomy prevents gaps. The following eight categories cover the primary risk domains in LLM and generative AI deployments:
| Risk Category | Description | Example Event | Primary Owner |
| Model Accuracy / Hallucination | Model generates plausible but false outputs | LLM cites non-existent legal case in customer-facing advice | AI/Model Owner |
| Data Privacy & Leakage | PII or confidential data exposed via prompt or output | User prompt causes model to regurgitate training data with PII | Data Protection Officer |
| Prompt Injection | Adversarial input overrides model instructions | Malicious user bypasses content policy to generate harmful content | Security / CISO |
| Bias & Fairness | Model outputs reflect discriminatory patterns from training data | Resume screening LLM systematically scores female applicants lower | HR / Compliance |
| Regulatory Non-Compliance | AI use violates sector-specific or data protection law | AI-generated credit decision lacks required explainability under ECOA | Compliance |
| Third-Party / Vendor Risk | Concentration risk in model providers; SLA breaches; data residency violations | API provider outage halts operations; provider updates model with no notice | Procurement / Risk |
| Reputational Risk | Public harm from AI outputs undermines stakeholder trust | AI chatbot produces offensive content that goes viral | Communications / Risk |
| Operational Dependency | Critical processes become over-reliant on AI without fallback | Staff cannot process claims manually after AI tool outage | Business Continuity |
See also: Operational Risk Assessment and Third-Party Risk Management.
5. AI Risk Register Template
A risk register turns the assessment into a living document. Each row should represent one discrete risk event, not a broad category.
The template below can be adapted directly into your organization’s existing risk register format.
| Risk ID | Risk Event | Likelihood (1-5) | Impact (1-5) | Risk Score | Controls | Residual Score | Owner & Due Date |
| AI-001 | LLM hallucination in customer-facing content | 4 | 3 | 12 – High | Output review workflow; human-in-loop sign-off | 6 – Medium | Product Owner / Q1 2025 |
| AI-002 | Prompt injection bypasses content policy | 3 | 4 | 12 – High | Prompt hardening; input validation layer; red-team testing quarterly | 6 – Medium | CISO / Q2 2025 |
| AI-003 | Third-party model provider API outage | 2 | 4 | 8 – Medium | Fallback to secondary provider; manual override procedure | 4 – Low | IT / Ongoing |
| AI-004 | Training data bias causes discriminatory output | 3 | 5 | 15 – Critical | Bias audit pre-deployment; ongoing fairness monitoring dashboard | 9 – High | Compliance / Q1 2025 |
Download the full risk register template at riskpublishing.com. See also: Key Risk Indicators Framework.
6. KRIs and Early Warning Indicators
Key Risk Indicators (KRIs) are quantitative signals that a risk is trending toward a threshold. For AI systems, many traditional KRIs do not apply, and organizations need to build AI-specific early-warning sets.
| KRI | Measurement | Green Threshold | Red Threshold | Escalation Action |
| Hallucination rate (sampled output review) | % flagged outputs / total sampled | < 1% | > 5% | Suspend use case; trigger root-cause review |
| Prompt injection incidents | Confirmed bypass events per month | 0 | > 2 | Immediate red-team retest; CISO notification |
| Model drift score | Statistical divergence from baseline output distribution | < 0.05 (PSI) | > 0.20 | Re-evaluation of outputs; possible rollback |
| Bias metric (demographic parity gap) | Disparity in positive outcome rates across protected groups | < 5% | > 10% | Pause deployment; bias audit within 5 business days |
| Unreviewed AI-generated content published | % AI content published without human review | 0% | > 2% | Process audit; workflow control reinforcement |
| Third-party provider SLA adherence | API uptime % | > 99.5% | < 98% | Activate contingency provider; review contract terms |
For a broader KRI library across risk domains, see ESG Key Risk Indicators, Healthcare KRIs, and Operational Resilience KRIs.
7. Governance and the Three Lines Model
AI risk governance fails when accountability is unclear. The IIA Three Lines Model provides a clean structure, but AI requires some deliberate adaptation because many organizations have no designated AI risk owner.
| Line | Role | AI-Specific Responsibilities |
| First Line | Business / Product / Technology teams deploying AI | Use case classification; prompt governance; output review; incident reporting; maintaining AI use case inventory |
| Second Line | Risk, Compliance, Data Protection functions | AI risk framework ownership; policy; KRI monitoring; fairness and bias oversight; regulatory mapping; vendor due diligence standards |
| Third Line | Internal Audit | Independent assurance on AI controls; audit of model documentation; testing of red-team processes and incident logs |
| Board / Audit Committee | Governance oversight | Approve AI risk appetite; receive periodic AI risk reports; challenge management on emerging AI exposures |
A key governance gap in most organizations: no one owns the AI incident log. Assign this explicitly to a named second-line function. See GRC Framework Implementation and Internal Audit Risk Management.
8. 90-Day Implementation Roadmap
Most organizations do not need a multi-year AI governance program. They need a credible first 90 days that produces tangible outputs and builds momentum.
| Phase | Timeline | Key Activities | Output |
| Foundation | Days 1-30 | AI use case inventory; stakeholder mapping; risk appetite draft; standards alignment (NIST AI RMF, ISO 42001) | AI inventory; risk appetite statement |
| Assessment | Days 31-60 | Risk identification workshops; red-team sessions; vendor due diligence reviews; bias audit on high-risk use cases | AI risk register; vendor risk assessments |
| Controls & Monitoring | Days 61-90 | KRI dashboard build; control framework design; governance RACI; board report template; training rollout | KRI dashboard; first AI risk board report |
For project-level AI risk considerations, see Project Risk Assessment and Monte Carlo Simulation in Risk Analysis.
9. Common Pitfalls to Avoid
Risk professionals with deep ERM experience sometimes make avoidable mistakes when pivoting to AI risk. The following failures appear repeatedly across industries:
- Treating AI risk as a pure IT risk. Hallucination and bias are business risks with legal and reputational consequences. The risk owner should never be exclusively the CTO.
- Assessing once and filing. Model behavior drifts. Provider updates happen without notice. Quarterly review cycles are the minimum standard for high-risk deployments.
- Skipping red-team testing. Internal testing under favorable conditions misses the adversarial scenarios that actually cause incidents. Dedicated red-team exercises are not optional.
- No AI incident log. Organizations that do not record near-misses and minor failures cannot learn from them or demonstrate due diligence to regulators.
- Ignoring shadow AI. Staff use of unapproved tools (ChatGPT, Gemini, Perplexity) to process work data is often the highest-likelihood data leakage vector. The inventory must include unauthorized use.
- Over-relying on vendor assurances. Model provider security certifications (SOC 2, ISO 27001) address infrastructure risk, not model-level hallucination, bias, or prompt injection risk. These are different risk categories.
- No fallback procedure. Operational dependency on an AI tool without a documented manual fallback creates a business continuity vulnerability. See Business Continuity Planning and Disaster Recovery Plan.
10. Forward Look: Emerging AI Risks
The risk landscape is moving faster than governance cycles. These emerging themes warrant monitoring in the next 12 to 24 months:
- Agentic AI: LLM agents that autonomously take actions (send emails, execute code, make API calls) dramatically expand the blast radius of a single error or compromise.
- Synthetic media and deepfakes: generative AI lowers the cost of disinformation campaigns targeting organizations, executives, and financial systems.
- AI-to-AI attacks: adversarial models designed to probe and manipulate other AI systems create attack vectors with no human-facing equivalent.
- Regulatory velocity: the EU AI Act enforcement timeline, US state-level AI bills, and SEC guidance on AI in financial disclosures will create compliance obligations that arrive faster than many governance programs can adapt.
- Model supply chain risk: open-source base models with unknown training data provenance introduce bias and security risks that closed-source models partially mitigate through vendor accountability.
Related: Emerging Technology Risk, Shadow AI Risk Management, AI ML Key Risk Indicators.
Key Takeaways
1. AI risk is structurally different from IT risk because of non-determinism, opacity, and the adversarial attack surface. Traditional controls do not transfer without adaptation.
2. Anchor to recognized standards: NIST AI RMF, ISO 42001, and ISO 31000 provide the governance skeleton. EU AI Act compliance matters even outside Europe.
3. Start with a use case inventory. You cannot assess what you have not catalogued, and shadow AI is often the highest-risk category in the inventory.
4. Risk categories matter: hallucination, prompt injection, bias, data leakage, vendor concentration, regulatory exposure, and operational dependency each require distinct controls.
5. KRIs are not optional. Qualitative heatmaps are insufficient governance for AI. Quantitative early-warning indicators with defined thresholds and escalation paths are the standard.
6. The Three Lines Model works for AI, but only if accountability is explicit. Designate named owners for the AI incident log, model inventory, and KRI monitoring dashboard.
7. The 90-day roadmap produces results. Organizations do not need a two-year program. A disciplined 90-day sprint delivers an inventory, risk register, KRI dashboard, and first board report.
References
1. National Institute of Standards and Technology. AI Risk Management Framework (AI RMF 1.0). January 2023.
2. European Parliament. EU AI Act. 2024.
3. International Organization for Standardization. ISO 42001:2023 — Artificial Intelligence Management Systems. 2023.
4. International Organization for Standardization. ISO 31000:2018 — Risk Management Guidelines. 2018.
5. Institute of Internal Auditors. Three Lines Model. 2020.
6. AI Incident Database. aiincidentdatabase.org.
7. AI Vulnerability Database (AVID). avidml.org.
8. KPMG. AI Risk and Governance Survey 2024. 2024.
9. Deloitte. Generative AI Risk: What Boards Need to Know. 2024.
10. McKinsey Global Institute. The State of AI in 2024.
11. riskpublishing.com — IEC 62443 Risk Assessment, Pension Fund Risk Management, Definition of Risk Assessment, Risk Management Process.
Ready to build your AI risk framework? Explore the full library at riskpublishing.com — covering ERM, BCM, KRIs, and compliance frameworks designed for risk professionals.

Chris Ekai is a Risk Management expert with over 10 years of experience in the field. He has a Master’s(MSc) degree in Risk Management from University of Portsmouth and is a CPA and Finance professional. He currently works as a Content Manager at Risk Publishing, writing about Enterprise Risk Management, Business Continuity Management and Project Management.
