Large language models now underpin customer service chatbots, medical triage systems, legal research tools, and financial advisory platforms across every major industry.

Yet an estimated 73 percent of organisations deploying AI in production still lack a formal risk framework tailored to generative models.

Definition: A taxonomy of risks posed by language models is a structured classification system that categorises the potential harms, vulnerabilities, and failure modes of large language models (LLMs) into logical groupings.

It enables organisations to systematically identify, assess, and manage risks across technical, human-use, and societal dimensions.

Two landmark efforts now anchor the field. The NIST AI 600-1 Generative AI Profile maps 13 distinct risk categories and prescribes more than 400 mitigation actions.

The MIT AI Risk Repository catalogues 1,612 classified risks drawn from over 700 academic papers.

Language Model Risk Landscape: Key Figures

  • 73%: organisations that lack a formal AI risk framework
  • 13: risk categories identified in NIST AI 600-1
  • 400+: mitigation actions prescribed in the NIST framework
  • 1,612: classified risks in the MIT AI Risk Repository

Summary

  • A taxonomy of risks posed by language models organises threats into three pillars: technical/model risks, human misuse risks, and ecosystem/societal risks.
  • The NIST AI 600-1 Generative AI Profile identifies 13 risk categories with 400+ prescribed mitigation actions for generative AI systems.
  • Key technical risks include confabulation (hallucination), harmful bias, data privacy violations, and homogenisation of outputs.
  • Human misuse risks span disinformation campaigns, CBRN information generation, social engineering, and deepfake creation.
  • Building an organisational risk taxonomy requires five steps: identify, classify, assess, mitigate, and monitor.
  • Integration with enterprise risk management (ERM) frameworks ensures language model risks are governed alongside traditional operational, strategic, and financial risks.

What Is a Risk Taxonomy for Language Models?

A risk taxonomy for language models is a hierarchical classification system that organises the full spectrum of threats, failure modes, and harms associated with large language models into structured categories. It functions as the foundational reference document for risk identification, assessment, and mitigation across the AI lifecycle.

A well-designed taxonomy serves three core functions:

  1. Systematic identification: Ensures no significant risk category is overlooked during assessment
  2. Consistent communication: Provides a shared vocabulary across technical, legal, compliance, and executive teams
  3. Prioritised mitigation: Enables risk-based allocation of resources to the highest-severity threats

The concept extends established risk management principles from ISO 31000 into the AI domain and adapts them to the unique characteristics of generative models.

The Three Pillars of Language Model Risk

Drawing on the NIST AI Risk Management Framework and the MIT AI Risk Repository, language model risks can be organised into three pillars:

Figure: The Three Pillars of Language Model Risk, a structured framework for classifying LLM risks into technical and model risks, human misuse risks, and ecosystem and societal risks.

Pillar 1: Technical and Model Risks

Technical risks originate from the model itself, its training data, architecture, and inference behaviour. These include:

Risk | Description
Confabulation (Hallucination) | The model generates plausible but factually incorrect information, presenting fabricated data as truth
Harmful Bias and Homogenisation | Training data imbalances produce outputs that perpetuate stereotypes or underrepresent minority perspectives
Data Privacy Violations | Models may memorise and reproduce personal data from training corpora, violating GDPR, CCPA, and other regulations
Value Chain Integration Risks | Third-party model components introduce opaque dependencies and supply-chain vulnerabilities

Pillar 2: Human Misuse Risks

These risks arise when human actors deliberately or negligently misuse language model capabilities:

  • CBRN information generation: Models may provide detailed instructions for creating chemical, biological, radiological, or nuclear weapons
  • Disinformation and deepfakes: Generating convincing false narratives, synthetic media, or impersonation content at scale
  • Social engineering: Crafting personalised phishing emails, scam scripts, or manipulation campaigns
  • Obscene and degrading content: Producing harmful, explicit, or abusive material

Pillar 3: Ecosystem and Societal Risks

Broader systemic effects that emerge from widespread language model deployment include:

  • Environmental impact: Training large models consumes significant energy, with a GPT-4-scale training run estimated at around 50 GWh
  • Intellectual property: Models trained on copyrighted material raise unresolved legal questions
  • Labour displacement: Automation of knowledge-work tasks affects writing, coding, analysis, and customer service roles
  • Power concentration: A small number of organisations control the most capable models, creating dependency risks
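
The three-pillar hierarchy above can be sketched as a small data structure. This is only an illustration: the `Pillar` class, the `TAXONOMY` list, and the `find_pillar` helper are hypothetical names, and the risk labels are shortened from the lists in this section.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Pillar:
    """One top-level branch of the taxonomy (hypothetical structure)."""
    name: str
    description: str
    risks: List[str] = field(default_factory=list)


# Risk names are taken from the three pillars described in this article.
TAXONOMY = [
    Pillar(
        "Technical and model risks",
        "Risks from the model itself: training data, architecture, inference",
        ["Confabulation", "Harmful bias and homogenisation",
         "Data privacy violations", "Value chain integration risks"],
    ),
    Pillar(
        "Human misuse risks",
        "Risks from deliberate or negligent misuse of LLM capabilities",
        ["CBRN information generation", "Disinformation and deepfakes",
         "Social engineering", "Obscene and degrading content"],
    ),
    Pillar(
        "Ecosystem and societal risks",
        "Systemic effects from widespread LLM deployment",
        ["Environmental impact", "Intellectual property",
         "Labour displacement", "Power concentration"],
    ),
]


def find_pillar(risk: str) -> Optional[str]:
    """Return the name of the pillar that contains `risk`, ignoring case."""
    wanted = risk.strip().lower()
    for pillar in TAXONOMY:
        if any(wanted == known.lower() for known in pillar.risks):
            return pillar.name
    return None


print(find_pillar("labour displacement"))
```

A flat list of pillars keeps the sketch short; a real taxonomy would likely add subcategories and causal mechanisms beneath each risk.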

NIST AI 600-1: The 13 Risk Categories

The NIST AI 600-1 Generative AI Profile defines 13 distinct risk categories for generative AI systems. Each category maps to specific mitigation actions:

NIST AI 600-1: 13 Risk Categories (illustrative risk severity scale)

  • Critical: CBRN Information, Confabulation, Information Security
  • High: Data Privacy, Harmful Bias, Information Integrity, Dangerous Recommendations
  • Medium: Obscene Content, IP / Copyright, Value Chain, Human-AI Configuration
  • Lower: Homogenisation, Environmental
Category | Description | Key Mitigation
CBRN Information | Generation of weapons-related content | Output filtering, use-case restrictions
Confabulation | Factually incorrect but plausible outputs | Retrieval augmentation, confidence scoring
Data Privacy | Memorisation of training data PII | Differential privacy, data scrubbing
Environmental | Energy and resource consumption | Efficient architectures, carbon tracking
Harmful Bias | Discriminatory or stereotyping outputs | Bias audits, diverse training data
Homogenisation | Reduced diversity of perspectives | Model diversity, plurality metrics
Human-AI Configuration | Over-reliance or misattribution | Transparency, human-in-the-loop
Information Integrity | Misinformation at scale | Provenance tracking, watermarking
Information Security | Prompt injection, model theft | Input validation, access controls
IP / Copyright | Training on copyrighted material | Licensing, opt-out mechanisms
Obscene Content | Generating harmful material | Content moderation, RLHF
Value Chain | Third-party component risks | Vendor assessment, model cards
Dangerous Recommendations | Harmful advice or instructions | Safety classifiers, red-teaming

Building a Language Model Risk Taxonomy: A 5-Step Process

Figure: Building a Language Model Risk Taxonomy, a five-step cyclical process (identify, classify, assess, mitigate, monitor) that is updated continuously as the threat landscape evolves.

  1. Identify: Catalogue all potential risks through threat modelling, red-teaming, incident analysis, and stakeholder workshops. Use NIST AI 600-1 and the MIT AI Risk Repository as starting references.
  2. Classify: Organise identified risks into the three-pillar hierarchy (technical, human misuse, societal). Assign subcategories and map causal mechanisms.
  3. Assess: Evaluate each risk for likelihood, impact severity, velocity, and detectability. Use a risk matrix aligned with your organisation’s risk appetite.
  4. Mitigate: Develop controls for each risk category: technical controls (guardrails, RLHF), governance controls (policies, oversight), and operational controls (monitoring, incident response).
  5. Monitor: Establish key risk indicators (KRIs), continuous monitoring dashboards, and regular reassessment cycles. Update the taxonomy as the threat landscape evolves.

Document all findings in a risk register and align with your enterprise risk management framework.
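
The Assess step can be sketched as a simple scoring routine for risk register entries. The article does not prescribe a formula, so everything here is an assumption: the 1 to 5 scales, the likelihood × impact base score, the uplift of up to 50% for velocity and detectability, and the rating cut-offs are all hypothetical and should be replaced by values that reflect your organisation's risk appetite.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskAssessment:
    """One assessed risk, scored on four hypothetical 1-5 scales."""
    name: str
    likelihood: int     # 1 = rare, 5 = almost certain
    impact: int         # 1 = negligible, 5 = severe
    velocity: int       # 1 = slow onset, 5 = near-instant
    detectability: int  # 1 = easy to detect, 5 = hard to detect

    def __post_init__(self) -> None:
        for attr in ("likelihood", "impact", "velocity", "detectability"):
            value = getattr(self, attr)
            if not 1 <= value <= 5:
                raise ValueError(f"{attr} must be between 1 and 5, got {value}")

    @property
    def base_score(self) -> int:
        """Classic risk-matrix score: likelihood times impact (1 to 25)."""
        return self.likelihood * self.impact

    @property
    def priority(self) -> float:
        """Base score raised by up to 50% for fast, hard-to-detect risks.

        The weighting is an illustrative assumption, not a standard.
        """
        uplift = 0.25 * ((self.velocity - 1) / 4 + (self.detectability - 1) / 4)
        return self.base_score * (1 + uplift)


def rating(score: float) -> str:
    """Map a priority score to a band; the cut-offs are assumptions."""
    if score >= 20:
        return "Critical"
    if score >= 12:
        return "High"
    if score >= 6:
        return "Medium"
    return "Low"


register = [
    RiskAssessment("Confabulation", likelihood=4, impact=4, velocity=3, detectability=4),
    RiskAssessment("Homogenisation", likelihood=2, impact=2, velocity=1, detectability=1),
]
for entry in sorted(register, key=lambda r: r.priority, reverse=True):
    print(f"{entry.name}: {entry.priority:.1f} ({rating(entry.priority)})")
```

Treating velocity and detectability as modifiers rather than extra multiplicands keeps the score close to a familiar likelihood-by-impact matrix while still ranking fast, hard-to-spot risks higher.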

Industry-Specific Applications

Industry Risk Heatmap: relative severity of LLM risk categories across key industries

Industry | Hallucination | Bias | Privacy | Disinformation | IP | Security
Finance | Critical | Critical | High | Medium | Low | Critical
Healthcare | Critical | High | Critical | Low | Medium | High
Legal | Critical | Medium | High | Low | Critical | Medium
Media | High | Medium | Medium | Critical | Critical | Medium
Education | High | High | Medium | Medium | High | Low
Government | High | Critical | Critical | Critical | Medium | Critical

Industry | Key LLM Risks | Mitigation Focus
Financial Services | Algorithmic bias in lending, credit scoring, AML false positives | Fair lending audits, model validation (SR 11-7)
Healthcare | Diagnostic bias, hallucinated treatment recommendations, patient privacy | Clinical validation, HIPAA compliance, human oversight
Legal | Hallucinated case citations, confidentiality breaches | Source verification, attorney review requirements
Media / Content | Misinformation at scale, deepfakes, copyright violation | Content provenance, watermarking, editorial review
Education | Plagiarism enablement, assessment undermining, bias in feedback | Academic integrity policies, pedagogical AI guidelines
Government | Surveillance concerns, democratic manipulation, accessibility bias | Transparency mandates, algorithmic impact assessments
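
The heatmap ratings can also be held as data so that teams can query them. The sketch below encodes the ratings from the heatmap in this section; the names `HEATMAP`, `SEVERITY_ORDER`, and `risks_at_or_above` are hypothetical.

```python
# Column order matches the heatmap in this article.
RISKS = ("Hallucination", "Bias", "Privacy", "Disinformation", "IP", "Security")

HEATMAP = {
    "Finance":    ("Critical", "Critical", "High", "Medium", "Low", "Critical"),
    "Healthcare": ("Critical", "High", "Critical", "Low", "Medium", "High"),
    "Legal":      ("Critical", "Medium", "High", "Low", "Critical", "Medium"),
    "Media":      ("High", "Medium", "Medium", "Critical", "Critical", "Medium"),
    "Education":  ("High", "High", "Medium", "Medium", "High", "Low"),
    "Government": ("High", "Critical", "Critical", "Critical", "Medium", "Critical"),
}

# Ordinal ranking so that ratings can be compared.
SEVERITY_ORDER = {"Low": 0, "Medium": 1, "High": 2, "Critical": 3}


def risks_at_or_above(industry: str, threshold: str = "Critical") -> list:
    """List the risk categories rated at or above `threshold` for an industry."""
    floor = SEVERITY_ORDER[threshold]
    return [
        risk
        for risk, level in zip(RISKS, HEATMAP[industry])
        if SEVERITY_ORDER[level] >= floor
    ]


print(risks_at_or_above("Legal"))
print(risks_at_or_above("Education", "High"))
```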

Integrating the Risk Taxonomy into Enterprise Risk Management

ERM Integration Framework: how the LLM risk taxonomy connects to enterprise risk management

The AI / LLM risk taxonomy layer sits within the enterprise risk management framework:

  • Technical risks map to operational and technology risk registers
  • Misuse risks map to compliance and reputational risk registers
  • Societal risks map to strategic and ESG risk registers

Supporting elements: risk appetite, KRI monitoring, board reporting, and alignment with ISO 31000 and the NIST RMF.

A language model risk taxonomy should not exist in isolation. It must integrate with the organisation’s broader enterprise risk management framework to ensure AI risks are governed alongside operational, strategic, and financial risks.

Key integration points:

  • Map taxonomy categories to existing risk appetite statements and tolerance thresholds
  • Assign AI-specific key risk indicators (KRIs) alongside traditional operational KRIs
  • Include language model risks in the enterprise risk register with consistent scoring
  • Report AI risk exposure through established governance channels (risk committees, board reports)
  • Align with ISO 31000 risk management principles and the NIST AI RMF Govern function
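
As a rough illustration of the first and third integration points, the pillar-to-register mapping from the integration framework can be expressed as a lookup that routes each taxonomy entry to the enterprise registers that should carry it. The pillar keys and the `route_to_registers` helper are hypothetical names.

```python
# Pillar-to-register mapping as shown in the ERM integration framework.
REGISTER_MAP = {
    "technical": ("Operational", "Technology"),
    "misuse": ("Compliance", "Reputational"),
    "societal": ("Strategic", "ESG"),
}


def route_to_registers(entries):
    """Group (risk_name, pillar) pairs by the enterprise register they map to."""
    registers = {}
    for risk_name, pillar in entries:
        if pillar not in REGISTER_MAP:
            raise KeyError(f"unknown pillar: {pillar!r}")
        for register in REGISTER_MAP[pillar]:
            registers.setdefault(register, []).append(risk_name)
    return registers


routed = route_to_registers([
    ("Confabulation", "technical"),
    ("Social engineering", "misuse"),
    ("Labour displacement", "societal"),
])
for register, risks in routed.items():
    print(register, risks)
```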

Risk Mitigation Strategies

Figure: Defence-in-Depth Risk Mitigation Framework, three layers of controls working together: technical controls (Layer 1), governance controls (Layer 2), and operational controls (Layer 3).

Technical Controls

  • Reinforcement Learning from Human Feedback (RLHF) to align model behaviour with human values
  • Red-teaming and adversarial testing to identify vulnerabilities before deployment
  • Guardrails and output filters to block harmful, biased, or policy-violating content
  • Retrieval-augmented generation (RAG) to ground outputs in verified source material
  • Differential privacy techniques to prevent training data memorisation
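
To make the guardrail idea concrete, here is a deliberately minimal output filter. It is a sketch under strong assumptions: production guardrails typically rely on trained safety classifiers rather than keyword lists, the blocked phrase is a hypothetical placeholder, and the email regular expression is a simplification that will miss some valid addresses.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Simplified email pattern (an assumption; not a full RFC 5322 matcher).
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical placeholder; a real policy list would be far richer.
BLOCKED_PHRASES = ("build a bomb",)


@dataclass
class GuardrailResult:
    allowed: bool
    text: str
    reasons: List[str] = field(default_factory=list)


def apply_guardrail(model_output: str) -> GuardrailResult:
    """Block policy-violating output, otherwise redact email addresses."""
    lowered = model_output.lower()
    for phrase in BLOCKED_PHRASES:
        if phrase in lowered:
            # Withhold the whole response rather than try to repair it.
            return GuardrailResult(False, "", [f"blocked phrase: {phrase}"])

    redacted, count = EMAIL_PATTERN.subn("[REDACTED_EMAIL]", model_output)
    reasons = [f"redacted {count} email address(es)"] if count else []
    return GuardrailResult(True, redacted, reasons)


result = apply_guardrail("Contact jane.doe@example.com for details.")
print(result.allowed, result.text, result.reasons)
```

Returning the reasons alongside the filtered text gives monitoring dashboards something to count, which ties this control to the operational layer.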

Governance Controls

  • AI ethics committees with cross-functional representation
  • Model risk management policies aligned with SR 11-7 and NIST AI RMF
  • Mandatory impact assessments before deploying LLMs in high-risk use cases
  • Transparency requirements including model cards and system documentation

Operational Controls

  • Continuous monitoring dashboards tracking bias metrics, hallucination rates, and security incidents
  • Incident response procedures specific to AI failures and adversarial attacks
  • Regular model revalidation cycles aligned with the PDCA improvement cycle
  • User training programmes on responsible AI use and limitation awareness
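
A monitoring dashboard ultimately reduces to key risk indicators compared against thresholds. The sketch below computes one such indicator, a hallucination rate over human-reviewed responses, and maps it to a traffic-light status. The function names and the 5% and 15% thresholds are assumptions chosen for illustration only.

```python
from typing import Sequence


def hallucination_rate(flags: Sequence[bool]) -> float:
    """Share of reviewed responses that reviewers flagged as hallucinated."""
    if not flags:
        raise ValueError("need at least one reviewed response")
    return sum(flags) / len(flags)


def kri_status(value: float, amber: float, red: float) -> str:
    """Return 'green', 'amber', or 'red' for a KRI where higher is worse."""
    if not amber < red:
        raise ValueError("amber threshold must be below red threshold")
    if value >= red:
        return "red"
    if value >= amber:
        return "amber"
    return "green"


# One flagged response out of ten reviewed (made-up sample data).
reviewed = [True] + [False] * 9
rate = hallucination_rate(reviewed)
print(f"hallucination rate: {rate:.0%}, status: {kri_status(rate, amber=0.05, red=0.15)}")
```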

Challenges and Limitations

Figure: Key Challenges in LLM Risk Taxonomy Development, the obstacles that organisations must navigate: rapid evolution, measurement difficulty, interdependency, resource intensity, regulatory flux, and model opacity.

Challenge | Description | Mitigation
Rapid evolution | New model capabilities create risks faster than taxonomies update | Continuous monitoring and quarterly taxonomy reviews
Measurement difficulty | Many risks (bias, fairness) lack agreed quantitative metrics | Combine quantitative and qualitative assessment methods
Interdependency | Risks interact and cascade in unpredictable ways | Systems thinking and scenario analysis
Resource intensity | Comprehensive taxonomy development requires significant expertise | Start with NIST AI 600-1 as baseline, customise incrementally
Regulatory flux | EU AI Act, NIST RMF updates, and sector-specific rules continue evolving | Build adaptable frameworks with regular compliance reviews
Opacity | Model internals are not fully interpretable | Explainability techniques, external audits

Frequently Asked Questions

What is the taxonomy of risk models?

A taxonomy of risk models is a structured classification system that organises the potential harms and failure modes of AI models into logical categories. For language models specifically, it typically groups risks into three pillars: technical/model risks (hallucination, bias, data privacy), human misuse risks (disinformation, CBRN, social engineering), and ecosystem/societal risks (environmental impact, IP issues, labour displacement). The NIST AI 600-1 framework identifies 13 distinct categories with over 400 prescribed mitigation actions.

What is the taxonomy of AI risks?

The taxonomy of AI risks encompasses the full spectrum of threats across the AI lifecycle. The MIT AI Risk Repository, one of the most comprehensive databases available, catalogues 1,612 classified risks across two taxonomies: a causal taxonomy (what causes the risk) and a domain taxonomy (what domain the risk affects). Key domains include discrimination, misinformation, security breaches, environmental harm, and societal disruption.

What are the risks of language models?

Language model risks include: perpetuating biases and stereotypes from training data; generating hallucinated (fabricated) information; breaching data privacy through memorised personal information; enabling disinformation campaigns at scale; providing dangerous instructions (CBRN content); facilitating social engineering and fraud; producing offensive or harmful content; and concentrating power among a small number of AI providers.

What is the risk factor taxonomy?

A risk factor taxonomy identifies and classifies the root causes that give rise to language model risks.

These factors include biased or unrepresentative training data, lack of diverse development teams, insufficient safety testing, inadequate guardrails, poor human-AI configuration, and absence of governance frameworks. Understanding risk factors enables organisations to address risks at their source rather than only managing symptoms.

How does the NIST AI RMF address LLM risks?

The NIST AI Risk Management Framework addresses LLM risks through four core functions: Govern (establishing policies and oversight), Map (identifying and categorising risks), Measure (assessing risk severity and likelihood), and Manage (implementing controls and monitoring).

The companion document NIST AI 600-1 provides a generative AI-specific profile with 13 risk categories and detailed mitigation guidance.

What is the difference between AI risk assessment and AI risk taxonomy?

An AI risk taxonomy is a classification system that organises all possible risks into categories and subcategories. An AI risk assessment is the process of evaluating specific risks within that taxonomy for a particular system, use case, or deployment context. The taxonomy provides the framework; the assessment applies it. Think of the taxonomy as the map and the assessment as the journey through it.
