This notebook explains how to uncover risks related to your use case based on a given taxonomy.¶
Import libraries¶
from ai_atlas_nexus.blocks.inference import (
RITSInferenceEngine,
WMLInferenceEngine,
OllamaInferenceEngine,
VLLMInferenceEngine,
HFInferenceEngine,
)
from ai_atlas_nexus.blocks.inference.params import (
InferenceEngineCredentials,
RITSInferenceEngineParams,
WMLInferenceEngineParams,
OllamaInferenceEngineParams,
VLLMInferenceEngineParams,
HFInferenceEngineParams,
)
from ai_atlas_nexus.library import AIAtlasNexus
import os
AI Atlas Nexus uses Large Language Models (LLMs) to infer risk dimensions, so it requires access to an LLM for inference.¶
Available inference engines: WML, Ollama, vLLM, RITS, HF. Please follow the Inference APIs guide before going ahead.
Note: RITS is intended solely for internal IBM use and requires TUNNELALL VPN for access.
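Most of the engine configurations below read their credentials from environment variables. Before constructing an engine, it can help to confirm the variables you need are actually set; a minimal sketch (the variable names match those used later in this notebook):

```python
import os

def missing_env(*names):
    """Return the environment variable names that are unset or empty."""
    return [n for n in names if not os.getenv(n)]

# e.g. before constructing the WML engine below
print(missing_env("WML_API_KEY", "WML_API_URL", "WML_PROJECT_ID"))
```

If the printed list is non-empty, export the missing variables before creating the corresponding inference engine.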
inference_engine = OllamaInferenceEngine(
model_name_or_path="granite3.3:8b",
credentials=InferenceEngineCredentials(api_url="http://localhost:11434"),
parameters=OllamaInferenceEngineParams(
num_predict=1000, num_ctx=8192, temperature=0
),
)
# inference_engine = HFInferenceEngine(
# model_name_or_path="meta-llama/Llama-3.1-8B-Instruct",
# credentials=InferenceEngineCredentials(
# api_key=os.getenv("HF_TOKEN"),
# api_url="https://router.huggingface.co/v1",
# ),
# parameters=HFInferenceEngineParams(max_completion_tokens=1000, temperature=0),
# )
# inference_engine = WMLInferenceEngine(
# model_name_or_path="ibm/granite-4-h-small",
# credentials={
# "api_key": os.getenv("WML_API_KEY"),
# "api_url": os.getenv("WML_API_URL"),
# "project_id": os.getenv("WML_PROJECT_ID"),
# },
# parameters=WMLInferenceEngineParams(
# max_new_tokens=1000, decoding_method="greedy"
# ),
# )
# inference_engine = VLLMInferenceEngine(
# model_name_or_path="ibm-granite/granite-3.3-8b-instruct",
# credentials=InferenceEngineCredentials(
# api_url=os.getenv("VLLM_API_URL"), api_key=os.getenv("VLLM_API_KEY")
# ),
# parameters=VLLMInferenceEngineParams(max_tokens=1000, temperature=0),
# )
# inference_engine = RITSInferenceEngine(
# model_name_or_path="ibm-granite/granite-3.3-8b-instruct",
# credentials={
# "api_key": os.getenv("RITS_API_KEY"),
# "api_url": os.getenv("RITS_API_URL"),
# },
# parameters=RITSInferenceEngineParams(max_completion_tokens=1000, temperature=0),
# )
Create an instance of AIAtlasNexus¶
Note: (Optional) You can specify your own directory in AIAtlasNexus(base_dir=<PATH>) to utilize custom AI ontologies. If left blank, the system will use the provided AI ontologies.
ai_atlas_nexus = AIAtlasNexus()
[2026-04-11 23:46:36:891] - INFO - AIAtlasNexus - Created AIAtlasNexus instance. Base_dir: None
Risk Identification API¶
AIAtlasNexus.identify_risks_from_usecases()
Params:
- usecases (List[str]): A List of strings describing AI usecases
- inference_engine (InferenceEngine): An LLM inference engine to infer risks from the usecases.
- taxonomy (str, optional): The string label for a taxonomy. Defaults to None.
- cot_examples (Dict[str, List], optional): Chain of Thought (CoT) examples to use in risk identification. The example template is available at src/ai_atlas_nexus/data/templates/risk_generation_cot.json. Use the ID of the taxonomy you wish to target as the key for the CoT examples. Providing this value overrides the CoT examples present in the template master. Defaults to None.
- max_risk (int, optional): The maximum number of risks to extract. Pass None to let the inference engine determine the number of risks. Defaults to None.
- zero_shot_only (bool): If enabled, the system performs zero-shot risk identification and the cot_examples field is ignored.
- batch_inference (bool): Whether to run risk inference in batch mode (a single call covering all taxonomy risks) or per risk. Defaults to True.
- use_dspy_prompt (bool): Use per-risk DSPy-optimized prompt instructions for risk identification. When enabled, the batch_inference flag is ignored.
Risk Identification using IBM AI Risk taxonomy - Batch Inference¶
usecase = "Generate personalized, relevant responses, recommendations, and summaries of claims for customers to support agents to enhance their interactions with customers."
risks = ai_atlas_nexus.identify_risks_from_usecases(
usecases=[usecase],
inference_engine=inference_engine,
taxonomy="ibm-risk-atlas",
max_risk=5,
)
for risk in risks[0]:
print(risk.name)
Inferring with ollama, backend - DEFAULT: 100%|██████████| 1/1 [00:33<00:00, 33.86s/it]
Incorrect risk testing
Over- or under-reliance
Confidential information in data
Lack of model transparency
Sharing IP/PI/confidential information with user
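Each returned item exposes fields like `name` and `id`. When you run the API over several use cases, a small helper can collect the unique risk names across all result lists; a plain-Python sketch (the stand-in records below only mirror the shape of the output, they are not the library's `Risk` class):

```python
def unique_risk_names(results):
    """Collect risk names across per-usecase result lists, de-duplicated, order preserved."""
    seen, names = set(), []
    for risk_list in results:
        for risk in risk_list:
            name = risk["name"] if isinstance(risk, dict) else risk.name
            if name not in seen:
                seen.add(name)
                names.append(name)
    return names

# Stand-in records mirroring the printed output above
demo = [
    [{"name": "Over- or under-reliance"}, {"name": "Hallucination"}],
    [{"name": "Hallucination"}, {"name": "Data bias"}],
]
print(unique_risk_names(demo))  # ['Over- or under-reliance', 'Hallucination', 'Data bias']
```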
# To retrieve the risk list together with the associated control measures,
# use the `identify_risks_and_actions_from_usecases` method.
usecase = "Generate personalized, relevant responses, recommendations, and summaries of claims for customers to support agents to enhance their interactions with customers."
risks_and_measures = ai_atlas_nexus.identify_risks_and_actions_from_usecases(
usecases=[usecase],
inference_engine=inference_engine,
taxonomy="ibm-risk-atlas",
max_risk=5,
)
risks_and_measures
Inferring with OLLAMA: 100%|██████████| 1/1 [00:23<00:00, 23.60s/it]
{'usecases': ['Generate personalized, relevant responses, recommendations, and summaries of claims for customers to support agents to enhance their interactions with customers.'],
'model': 'granite3.3:8b',
'taxonomy': 'ibm-risk-atlas',
'summary': defaultdict(list,
{'risk_ids': ['atlas-incorrect-risk-testing',
'atlas-over-or-under-reliance',
'atlas-confidential-data-in-prompt',
'atlas-prompt-leaking',
'atlas-lack-of-model-transparency'],
'action_ids': [],
'detector_ids': [],
'control_item_ids': ['aiuc1-req-e008',
'aiuc1-req-c001',
'aiuc1-req-c009',
'aiuc1-req-a006',
'aiuc1-req-c002',
'aiuc1-req-e013',
'aiuc1-req-c008',
'aiuc1-req-b009',
'aiuc1-req-a004',
'aiuc1-req-c012',
'aiuc1-req-e017',
'aiuc1-req-b004',
'aiuc1-req-c007',
'aiuc1-req-b003'],
'Requirement': ['aiuc1-req-e008',
'aiuc1-req-c001',
'aiuc1-req-c009',
'aiuc1-req-a006',
'aiuc1-req-c002',
'aiuc1-req-e013',
'aiuc1-req-c008',
'aiuc1-req-b009',
'aiuc1-req-a004',
'aiuc1-req-c012',
'aiuc1-req-e017',
'aiuc1-req-b004',
'aiuc1-req-c007',
'aiuc1-req-b003']}),
'risks': [Risk(id='atlas-incorrect-risk-testing', name='Incorrect risk testing', description='A metric selected to measure or track a risk is incorrectly selected, incompletely measuring the risk, or measuring the wrong risk for the given context.', url='https://www.ibm.com/docs/en/watsonx/saas?topic=SSYOK8/wsj/ai-risk-atlas/incorrect-risk-testing.html', dateCreated=datetime.date(2024, 9, 24), dateModified=datetime.date(2025, 10, 10), exact_mappings=[], close_mappings=[], related_mappings=['aiuc1-req-c001', 'aiuc1-req-c002', 'aiuc1-req-c008', 'aiuc1-req-c012', 'aiuc1-req-e008', 'aiuc1-req-e013', 'credo-risk-032', 'mit-ai-causal-risk-entity-human', 'mit-ai-causal-risk-intent-unintentional', 'mit-ai-causal-risk-timing-post-deployment', 'mit-ai-risk-subdomain-6.5'], narrow_mappings=[], broad_mappings=['nist-value-chain-and-component-integration'], isDefinedByTaxonomy='ibm-risk-atlas', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='ibm-risk-atlas-governance', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag='incorrect-risk-testing', risk_type='non-technical', phase=None, descriptor=['amplified by generative AI'], concern="If the metrics do not measure the risk as intended, then the understanding of that risk will be incorrect and mitigations might not be applied. If the model's output is consequential, this might result in societal, reputational, or financial harm."),
Risk(id='atlas-over-or-under-reliance', name='Over- or under-reliance', description="In AI-assisted decision-making tasks, reliance measures how much a person trusts (and potentially acts on) a model's output. Over-reliance occurs when a person puts too much trust in a model, accepting a model's output when the model's output is likely incorrect. Under-reliance is the opposite, where the person doesn't trust the model but should.", url='https://www.ibm.com/docs/en/watsonx/saas?topic=SSYOK8/wsj/ai-risk-atlas/over-or-under-reliance.html', dateCreated=datetime.date(2024, 3, 6), dateModified=datetime.date(2025, 10, 22), exact_mappings=[], close_mappings=['credo-risk-016'], related_mappings=['shieldgemma-harassment', 'aiuc1-req-c007', 'aiuc1-req-c009', 'llm052025-improper-output-handling', 'llm062025-excessive-agency', 'mit-ai-causal-risk-entity-human', 'mit-ai-causal-risk-intent-unintentional', 'mit-ai-causal-risk-timing-post-deployment', 'mit-ai-risk-subdomain-5.1'], narrow_mappings=[], broad_mappings=['nist-human-ai-configuration'], isDefinedByTaxonomy='ibm-risk-atlas', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='ibm-risk-atlas-value-alignment', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag='over-or-under-reliance', risk_type='output', phase=None, descriptor=['amplified by generative AI'], concern='In tasks where humans make choices based on AI-based suggestions, over/under reliance can lead to poor decision making because of the misplaced trust in the AI system, with negative consequences that increase with the importance of the decision.'),
Risk(id='atlas-confidential-data-in-prompt', name='Confidential data in prompt', description='Confidential information might be included as a part of the prompt that is sent to the model.', url='https://www.ibm.com/docs/en/watsonx/saas?topic=SSYOK8/wsj/ai-risk-atlas/confidential-data-in-prompt.html', dateCreated=datetime.date(2024, 3, 6), dateModified=datetime.date(2025, 10, 22), exact_mappings=[], close_mappings=['aiuc1-req-a004'], related_mappings=['ail-privacy', 'aiuc1-req-a004', 'aiuc1-req-a006', 'llm022025-sensitive-information-disclosure', 'mit-ai-causal-risk-entity-other', 'mit-ai-causal-risk-intent-unintentional', 'mit-ai-causal-risk-timing-post-deployment', 'mit-ai-risk-subdomain-2.1'], narrow_mappings=[], broad_mappings=['nist-intellectual-property'], isDefinedByTaxonomy='ibm-risk-atlas', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='ibm-risk-atlas-intellectual-property', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag='confidential-data-in-prompt', risk_type='inference', phase=None, descriptor=['specific to generative AI'], concern="If not properly developed to secure confidential data, the model might reveal confidential information or IP in the generated output. Additionally, end users' confidential information might be unintentionally collected and stored."),
Risk(id='atlas-prompt-leaking', name='Prompt leaking', description="'A prompt leak attack attempts to extract a model's system prompt (also known as the system message).'", url='https://www.ibm.com/docs/en/watsonx/saas?topic=SSYOK8/wsj/ai-risk-atlas/prompt-leaking.html', dateCreated=datetime.date(2024, 3, 6), dateModified=datetime.date(2025, 10, 10), exact_mappings=[], close_mappings=[], related_mappings=['aiuc1-req-a004', 'aiuc1-req-b003', 'aiuc1-req-b004', 'aiuc1-req-b009', 'mit-ai-causal-risk-entity-human', 'mit-ai-causal-risk-intent-intentional', 'mit-ai-causal-risk-timing-other', 'mit-ai-risk-subdomain-2.2'], narrow_mappings=['atlas-prompt-injection'], broad_mappings=['atlas-prompt-injection', 'llm022025-sensitive-information-disclosure', 'nist-information-security'], isDefinedByTaxonomy='ibm-risk-atlas', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='ibm-risk-atlas-robustness-Prompt-attacks', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag='prompt-leaking', risk_type='inference', phase=None, descriptor=['specific to generative AI'], concern='A successful prompt leaking attack copies the system prompt used in the model. Depending on the content of that prompt, the attacker might gain access to valuable information, such as sensitive personal information or intellectual property, and might be able to replicate some of the functionality of the model.'),
Risk(id='atlas-lack-of-model-transparency', name='Lack of model transparency', description='Lack of model transparency is due to insufficient documentation of the model design, development, and evaluation process and the absence of insights into the inner workings of the model.', url='https://www.ibm.com/docs/en/watsonx/saas?topic=SSYOK8/wsj/ai-risk-atlas/lack-of-model-transparency.html', dateCreated=datetime.date(2024, 3, 6), dateModified=datetime.date(2025, 10, 22), exact_mappings=[], close_mappings=['aiuc1-req-e017'], related_mappings=['aiuc1-req-e017', 'mit-ai-causal-risk-entity-human', 'mit-ai-causal-risk-intent-unintentional', 'mit-ai-causal-risk-timing-other', 'mit-ai-risk-subdomain-7.4'], narrow_mappings=[], broad_mappings=['nist-value-chain-and-component-integration'], isDefinedByTaxonomy='ibm-risk-atlas', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='ibm-risk-atlas-governance', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag='lack-of-model-transparency', risk_type='non-technical', phase=None, descriptor=['traditional risk of AI'], concern="Transparency is important for legal compliance, AI ethics, and guiding appropriate use of models. Missing information might make it more difficult to evaluate risks, change the model, or reuse it.\xa0 Knowledge about who built a model can also be an important factor in deciding whether to trust it. Additionally, transparency regarding how the model's risks were determined, evaluated, and mitigated also play a role in determining model risks, identifying model suitability, and governing model usage.")],
'control_items': [Requirement(id='aiuc1-req-e008', name='Review internal processes', description='Establish regular internal reviews of key processes and document review records and approvals', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=[], related_mappings=['atlas-incorrect-risk-testing'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='aiuc1', hasRule=['aiuc1-ctrl-e008-1', 'aiuc1-ctrl-e008-2'], type='Requirement', hasApplication=['MANDATORY'], hasFrequency='MONTHS_12', hasKeywords=[], hasPrinciple=['aiuc1-principle-e'], appliesToCapability=[], hasRequirementType=None),
Requirement(id='aiuc1-req-c001', name='Define AI risk taxonomy', description='Establish a risk taxonomy that categorizes risks within harmful, out-of-scope, and hallucinated outputs, tool calls, and other risks based on application-specific usage', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=[], related_mappings=['atlas-incorrect-risk-testing', 'atlas-lack-of-testing-diversity', 'atlas-unrepresentative-risk-testing'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='aiuc1', hasRule=['aiuc1-ctrl-c001-1', 'aiuc1-ctrl-c001-2'], type='Requirement', hasApplication=['MANDATORY'], hasFrequency='MONTHS_3', hasKeywords=[], hasPrinciple=['aiuc1-principle-c'], appliesToCapability=[], hasRequirementType=None),
Requirement(id='aiuc1-req-c009', name='Enable real-time feedback and intervention', description='Implement mechanisms to enable real-time user feedback collection and intervention mechanisms', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=[], related_mappings=['atlas-impact-human-agency-agentic', 'atlas-over-or-under-reliance', 'atlas-over-or-under-reliance-on-ai-agents-agentic'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='aiuc1', hasRule=['aiuc1-ctrl-c009-1', 'aiuc1-ctrl-c009-2'], type='Requirement', hasApplication=['OPTIONAL'], hasFrequency='MONTHS_3', hasKeywords=[], hasPrinciple=['aiuc1-principle-c'], appliesToCapability=[], hasRequirementType=None),
Requirement(id='aiuc1-req-a006', name='Prevent PII leakage', description='Establish safeguards to prevent personal data leakage through AI outputs', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=['atlas-exposing-personal-information', 'atlas-personal-information-in-data', 'atlas-personal-information-in-prompt'], related_mappings=['llm022025-sensitive-information-disclosure', 'llm082025-vector-and-embedding-weaknesses', 'atlas-confidential-data-in-prompt', 'atlas-exposing-personal-information', 'atlas-personal-information-in-data'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='aiuc1', hasRule=['aiuc1-ctrl-a006-1', 'aiuc1-ctrl-a006-2', 'aiuc1-ctrl-a006-3'], type='Requirement', hasApplication=['MANDATORY'], hasFrequency='MONTHS_12', hasKeywords=[], hasPrinciple=['aiuc1-principle-a'], appliesToCapability=['universal-capability'], hasRequirementType=None),
Requirement(id='aiuc1-req-c002', name='Conduct pre-deployment testing', description='Conduct internal testing of AI systems prior to deployment across risk categories for system changes requiring formal review or approval', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=[], related_mappings=['atlas-incomplete-ai-agent-evaluation-agentic', 'atlas-incorrect-risk-testing', 'atlas-lack-of-testing-diversity', 'atlas-unrepresentative-risk-testing'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='aiuc1', hasRule=['aiuc1-ctrl-c002-1', 'aiuc1-ctrl-c002-2', 'aiuc1-ctrl-c002-3'], type='Requirement', hasApplication=['MANDATORY'], hasFrequency='MONTHS_12', hasKeywords=[], hasPrinciple=['aiuc1-principle-c'], appliesToCapability=[], hasRequirementType=None),
Requirement(id='aiuc1-req-e013', name='Implement quality management system', description='Establish a quality management system for AI systems proportionate to the size of the organization', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=[], related_mappings=['atlas-incorrect-risk-testing', 'atlas-mitigation-maintenance-agentic'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='aiuc1', hasRule=['aiuc1-ctrl-e013-1', 'aiuc1-ctrl-e013-2', 'aiuc1-ctrl-e013-3', 'aiuc1-ctrl-e013-4', 'aiuc1-ctrl-e013-5'], type='Requirement', hasApplication=['OPTIONAL'], hasFrequency='MONTHS_12', hasKeywords=[], hasPrinciple=['aiuc1-principle-e'], appliesToCapability=[], hasRequirementType=None),
Requirement(id='aiuc1-req-c008', name='Monitor AI risk categories', description='Implement monitoring of AI systems across risk categories', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=[], related_mappings=['atlas-incorrect-risk-testing'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='aiuc1', hasRule=['aiuc1-ctrl-c008-1', 'aiuc1-ctrl-c008-2', 'aiuc1-ctrl-c008-3'], type='Requirement', hasApplication=['OPTIONAL'], hasFrequency='MONTHS_12', hasKeywords=[], hasPrinciple=['aiuc1-principle-c'], appliesToCapability=[], hasRequirementType=None),
Requirement(id='aiuc1-req-b009', name='Limit output over-exposure', description='Implement output limitations and obfuscation techniques to safeguard against information leakage', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=[], related_mappings=['llm022025-sensitive-information-disclosure', 'llm052025-improper-output-handling', 'llm082025-vector-and-embedding-weaknesses', 'llm092025-misinformation', 'atlas-prompt-leaking', 'atlas-revealing-confidential-information'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='aiuc1', hasRule=['aiuc1-ctrl-b009-1', 'aiuc1-ctrl-b009-2', 'aiuc1-ctrl-b009-3'], type='Requirement', hasApplication=['MANDATORY'], hasFrequency='MONTHS_12', hasKeywords=[], hasPrinciple=['aiuc1-principle-b'], appliesToCapability=[], hasRequirementType=None),
Requirement(id='aiuc1-req-a004', name='Protect IP and trade secrets', description='Implement safeguards or technical controls to prevent AI systems from leaking company intellectual property or confidential information', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=['atlas-confidential-data-in-prompt', 'atlas-confidential-information-in-data'], related_mappings=['llm032025-supply-chain', 'llm052025-improper-output-handling', 'llm082025-vector-and-embedding-weaknesses', 'atlas-confidential-data-in-prompt', 'atlas-ip-information-in-prompt', 'atlas-prompt-leaking', 'atlas-sharing-info-tools-agentic'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='aiuc1', hasRule=['aiuc1-ctrl-a004-1', 'aiuc1-ctrl-a004-2', 'aiuc1-ctrl-a004-3', 'aiuc1-ctrl-a004-4'], type='Requirement', hasApplication=['MANDATORY'], hasFrequency='MONTHS_12', hasKeywords=[], hasPrinciple=['aiuc1-principle-a'], appliesToCapability=['universal-capability'], hasRequirementType=None),
Requirement(id='aiuc1-req-c012', name='Third-party testing for customer-defined risk', description='Appoint expert third-parties to evaluate system robustness to additional high-risk outputs as defined in risk taxonomy at least every 3 months', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=[], related_mappings=['atlas-incorrect-risk-testing', 'atlas-lack-of-testing-diversity', 'atlas-unrepresentative-risk-testing'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='aiuc1', hasRule=['aiuc1-ctrl-c012-1'], type='Requirement', hasApplication=['MANDATORY'], hasFrequency='MONTHS_3', hasKeywords=[], hasPrinciple=['aiuc1-principle-c'], appliesToCapability=[], hasRequirementType=None),
Requirement(id='aiuc1-req-e017', name='Document system transparency policy', description='Establish a system transparency policy and maintain a repository of model cards, datasheets, and interpretability reports for major systems', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=['atlas-lack-of-ai-agent-transparency-agentic', 'atlas-lack-of-model-transparency', 'atlas-lack-of-system-transparency'], related_mappings=['atlas-lack-of-ai-agent-transparency-agentic', 'atlas-lack-of-data-transparency', 'atlas-lack-of-model-transparency', 'atlas-lack-of-system-transparency'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='aiuc1', hasRule=['aiuc1-ctrl-e017-1', 'aiuc1-ctrl-e017-2', 'aiuc1-ctrl-e017-3'], type='Requirement', hasApplication=['OPTIONAL'], hasFrequency='MONTHS_12', hasKeywords=[], hasPrinciple=['aiuc1-principle-e'], appliesToCapability=[], hasRequirementType=None),
Requirement(id='aiuc1-req-b004', name='Prevent AI endpoint scraping', description='Implement safeguards to prevent probing or scraping of external AI endpoints', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=['atlas-extraction-attack'], related_mappings=['llm022025-sensitive-information-disclosure', 'llm052025-improper-output-handling', 'llm082025-vector-and-embedding-weaknesses', 'llm102025-unbounded-consumption', 'atlas-attribute-inference-attack', 'atlas-membership-inference-attack', 'atlas-prompt-leaking'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='aiuc1', hasRule=['aiuc1-ctrl-b004-1', 'aiuc1-ctrl-b004-2', 'aiuc1-ctrl-b004-3', 'aiuc1-ctrl-b004-4'], type='Requirement', hasApplication=['MANDATORY'], hasFrequency='MONTHS_12', hasKeywords=[], hasPrinciple=['aiuc1-principle-b'], appliesToCapability=[], hasRequirementType=None),
Requirement(id='aiuc1-req-c007', name='Flag high risk outputs', description='Implement an alerting system that flags high-risk outputs for human review', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=[], related_mappings=['atlas-over-or-under-reliance', 'atlas-over-or-under-reliance-on-ai-agents-agentic'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='aiuc1', hasRule=['aiuc1-ctrl-c007-1', 'aiuc1-ctrl-c007-2', 'aiuc1-ctrl-c007-3'], type='Requirement', hasApplication=['OPTIONAL'], hasFrequency='MONTHS_12', hasKeywords=[], hasPrinciple=['aiuc1-principle-c'], appliesToCapability=[], hasRequirementType=None),
Requirement(id='aiuc1-req-b003', name='Manage public release of technical details', description='Implement controls to prevent over-disclosure of technical information about AI systems and organizational details that could enable adversarial targeting', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=[], related_mappings=['llm022025-sensitive-information-disclosure', 'llm052025-improper-output-handling', 'llm072025-system-prompt-leakage', 'atlas-extraction-attack', 'atlas-prompt-leaking'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='aiuc1', hasRule=['aiuc1-ctrl-b003-1', 'aiuc1-ctrl-b003-2'], type='Requirement', hasApplication=['OPTIONAL'], hasFrequency='MONTHS_12', hasKeywords=[], hasPrinciple=['aiuc1-principle-b'], appliesToCapability=[], hasRequirementType=None)]}
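The `control_items` entries above each carry a `hasApplication` field (`MANDATORY` or `OPTIONAL`), which is useful for prioritizing which controls to implement first. A small helper to partition them, sketched over plain dicts shaped like the printed output (it is not part of the AI Atlas Nexus API):

```python
def split_by_application(control_items):
    """Partition control-item ids into mandatory and optional buckets by hasApplication."""
    mandatory, optional = [], []
    for item in control_items:
        app = item["hasApplication"] if isinstance(item, dict) else item.hasApplication
        bucket = mandatory if "MANDATORY" in app else optional
        bucket.append(item["id"] if isinstance(item, dict) else item.id)
    return mandatory, optional

# Stand-in records mirroring three of the Requirement objects above
demo = [
    {"id": "aiuc1-req-e008", "hasApplication": ["MANDATORY"]},
    {"id": "aiuc1-req-c009", "hasApplication": ["OPTIONAL"]},
    {"id": "aiuc1-req-a006", "hasApplication": ["MANDATORY"]},
]
print(split_by_application(demo))
```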
Risk Identification using IBM AI Risk taxonomy - Per Risk Inference¶
usecase = "Generate personalized, relevant responses, recommendations, and summaries of claims for customers to support agents to enhance their interactions with customers."
risks = ai_atlas_nexus.identify_risks_from_usecases(
usecases=[usecase],
inference_engine=inference_engine,
taxonomy="ibm-risk-atlas",
max_risk=5,
batch_inference=False,
)
print(len(risks[0]))
for risk in risks[0]:
print(risk.name)
Inferring with ollama, backend - DEFAULT: 100%|██████████| 99/99 [07:19<00:00, 4.44s/it]
50
Over- or under-reliance
Confidential data in prompt
Data privacy rights alignment
Discriminatory actions
Legal accountability
Hallucination
Social hacking attack
Function calling hallucination
Confidential information in data
Lack of model transparency
Unrepresentative data
Personal information in prompt
Sharing IP/PI/confidential information with user
Lack of testing diversity
Decision bias
Exposing personal information
Improper data curation
Over- or under-reliance on AI agents
Revealing confidential information
Uncertain data provenance
Data bias
Unauthorized use
Lack of data transparency
Impact on affected communities
Introduce data bias
Accountability of AI agent actions
Incomplete AI agent evaluation
Inaccessible training data
Non-disclosure
Lack of training data transparency
Reproducibility
Incomplete advice
Prompt injection attack
Personal information in data
Extraction attack
Data acquisition restrictions
Sharing IP/PI/confidential information with tools
Prompt priming
Reidentification
Attribute inference attack
Poor model accuracy
Generated content ownership and IP
Lack of AI agent transparency
Impact on human dignity
Output bias
Unexplainable output
Unexplainable and untraceable actions
Unreliable source attribution
Lack of domain expertise
Overfitting
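Per-risk inference issues one call per taxonomy risk (99 here), so it typically surfaces far more risks than a single batch prompt. Comparing the two runs is plain set arithmetic on the printed names; the sets below are illustrative subsets of the outputs above, not the full lists:

```python
# A few names from the batch run and the per-risk run shown above
batch_run = {"Incorrect risk testing", "Over- or under-reliance",
             "Confidential information in data", "Lack of model transparency"}
per_risk_run = {"Over- or under-reliance", "Confidential data in prompt",
                "Lack of model transparency", "Hallucination", "Data bias"}

only_in_per_risk = sorted(per_risk_run - batch_run)  # surfaced only by per-risk inference
in_both = sorted(batch_run & per_risk_run)           # agreed on by both runs
print(in_both)  # ['Lack of model transparency', 'Over- or under-reliance']
```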
Risk Identification using IBM AI Risk taxonomy - Per Risk Inference using DSPy optimised prompt¶
dspy_inference_engine = RITSInferenceEngine(
model_name_or_path="meta-llama/llama-3-3-70b-instruct",
credentials={
"api_key": "cbc683b3a1a7c52d2a73008b785d2811",
"api_url": "https://inference-3scale-apicast-production.apps.rits.fmaas.res.ibm.com",
},
parameters=RITSInferenceEngineParams(max_completion_tokens=1000, temperature=0),
)
[2026-04-11 23:56:36:277] - INFO - AIAtlasNexus - ✓ Created rits inference engine for model: meta-llama/llama-3-3-70b-instruct, backend - DEFAULT
usecase = "Generate personalized, relevant responses, recommendations, and summaries of claims for customers to support agents to enhance their interactions with customers."
risks = ai_atlas_nexus.identify_risks_from_usecases(
usecases=[usecase],
inference_engine=dspy_inference_engine,
taxonomy="ibm-risk-atlas",
max_risk=5,
use_dspy_prompt=True,
)
print(len(risks[0]))
for risk in risks[0]:
print(risk.name)
Inferring with rits, backend - DEFAULT: 100%|██████████| 99/99 [03:49<00:00, 2.32s/it]
40
Impact on the environment
Over- or under-reliance
Confidential data in prompt
Data privacy rights alignment
Discriminatory actions
Hallucination
Confidential information in data
Lack of model transparency
Unrepresentative data
Personal information in prompt
Sharing IP/PI/confidential information with user
Lack of testing diversity
Decision bias
Exposing personal information
Improper data curation
Over- or under-reliance on AI agents
Revealing confidential information
Data bias
Data contamination
Incomplete usage definition
Lack of data transparency
Incomplete AI agent evaluation
Non-disclosure
Lack of training data transparency
Reproducibility
Incomplete advice
Prompt injection attack
Data usage restrictions
Personal information in data
Impact on Jobs
Data acquisition restrictions
Sharing IP/PI/confidential information with tools
Prompt priming
Poor model accuracy
Generated content ownership and IP
Lack of AI agent transparency
Output bias
Unexplainable output
Unexplainable and untraceable actions
Unreliable source attribution
Risk Identification using IBM AI Risk taxonomy with Custom CoT examples¶
Note: To use custom cot_examples for a new or existing taxonomy, provide your own set of example use cases and associated risks in a JSON file such as risk_generation_cot.json. The LLM learns from these few-shot examples and generates better responses. Please follow the guide here.
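If you keep your examples in a JSON file mirroring `risk_generation_cot.json`, you can load them and pass the resulting dict straight to `cot_examples`. A minimal sketch (the file path and contents here are hypothetical; the structure mirrors the cell below):

```python
import json
import os
import tempfile

# A minimal CoT file keyed by taxonomy id (hypothetical content for illustration)
examples = {
    "ibm-risk-atlas": [
        {"Usecase": "A medical triage chatbot that assesses patient symptoms.",
         "Risks": ["Improper usage", "Incomplete advice"]}
    ]
}
path = os.path.join(tempfile.gettempdir(), "my_cot_examples.json")
with open(path, "w") as f:
    json.dump(examples, f, indent=2)

# Load the file and pass the dict as the cot_examples argument
with open(path) as f:
    cot_examples = json.load(f)
print(sorted(cot_examples))  # ['ibm-risk-atlas']
```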
usecase = "Generate personalized, relevant responses, recommendations, and summaries of claims for customers to support agents to enhance their interactions with customers."
risks = ai_atlas_nexus.identify_risks_from_usecases(
usecases=[usecase],
inference_engine=inference_engine,
taxonomy="ibm-risk-atlas",
cot_examples={
"ibm-risk-atlas": [
{
"Usecase": "In a medical chatbot, generative AI can be employed to create a triage system that assesses patients' symptoms and provides immediate, contextually relevant advice based on their medical history and current condition. The chatbot can analyze the patient's input, identify potential medical issues, and offer tailored recommendations or insights to the patient or healthcare provider. This can help streamline the triage process, ensuring that patients receive the appropriate level of care and attention, and ultimately improving patient outcomes.",
"Risks": [
"Improper usage",
"Incomplete advice",
"Lack of model transparency",
"Lack of system transparency",
"Lack of training data transparency",
"Data bias",
"Uncertain data provenance",
"Lack of data transparency",
"Impact on human agency",
"Impact on affected communities",
"Improper retraining",
"Inaccessible training data",
],
}
]
},
max_risk=5,
)
risks
Inferring with OLLAMA: 100%|██████████| 1/1 [00:01<00:00, 1.88s/it]
[[Risk(id='atlas-incorrect-risk-testing', name='Incorrect risk testing', description='A metric selected to measure or track a risk is incorrectly selected, incompletely measuring the risk, or measuring the wrong risk for the given context.', url='https://www.ibm.com/docs/en/watsonx/saas?topic=SSYOK8/wsj/ai-risk-atlas/incorrect-risk-testing.html', dateCreated=datetime.date(2024, 9, 24), dateModified=datetime.date(2025, 10, 10), exact_mappings=[], close_mappings=[], related_mappings=['aiuc1-req-c001', 'aiuc1-req-c002', 'aiuc1-req-c008', 'aiuc1-req-c012', 'aiuc1-req-e008', 'aiuc1-req-e013', 'credo-risk-032', 'mit-ai-causal-risk-entity-human', 'mit-ai-causal-risk-intent-unintentional', 'mit-ai-causal-risk-timing-post-deployment', 'mit-ai-risk-subdomain-6.5'], narrow_mappings=[], broad_mappings=['nist-value-chain-and-component-integration'], isDefinedByTaxonomy='ibm-risk-atlas', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='ibm-risk-atlas-governance', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag='incorrect-risk-testing', risk_type='non-technical', phase=None, descriptor=['amplified by generative AI'], concern="If the metrics do not measure the risk as intended, then the understanding of that risk will be incorrect and mitigations might not be applied. If the model's output is consequential, this might result in societal, reputational, or financial harm."), Risk(id='atlas-over-or-under-reliance', name='Over- or under-reliance', description="In AI-assisted decision-making tasks, reliance measures how much a person trusts (and potentially acts on) a model's output. Over-reliance occurs when a person puts too much trust in a model, accepting a model's output when the model's output is likely incorrect. 
Under-reliance is the opposite, where the person doesn't trust the model but should.", url='https://www.ibm.com/docs/en/watsonx/saas?topic=SSYOK8/wsj/ai-risk-atlas/over-or-under-reliance.html', dateCreated=datetime.date(2024, 3, 6), dateModified=datetime.date(2025, 10, 22), exact_mappings=[], close_mappings=['credo-risk-016'], related_mappings=['shieldgemma-harassment', 'aiuc1-req-c007', 'aiuc1-req-c009', 'llm052025-improper-output-handling', 'llm062025-excessive-agency', 'mit-ai-causal-risk-entity-human', 'mit-ai-causal-risk-intent-unintentional', 'mit-ai-causal-risk-timing-post-deployment', 'mit-ai-risk-subdomain-5.1'], narrow_mappings=[], broad_mappings=['nist-human-ai-configuration'], isDefinedByTaxonomy='ibm-risk-atlas', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='ibm-risk-atlas-value-alignment', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag='over-or-under-reliance', risk_type='output', phase=None, descriptor=['amplified by generative AI'], concern='In tasks where humans make choices based on AI-based suggestions, over/under reliance can lead to poor decision making because of the misplaced trust in the AI system, with negative consequences that increase with the importance of the decision.'), Risk(id='atlas-confidential-data-in-prompt', name='Confidential data in prompt', description='Confidential information might be included as a part of the prompt that is sent to the model.', url='https://www.ibm.com/docs/en/watsonx/saas?topic=SSYOK8/wsj/ai-risk-atlas/confidential-data-in-prompt.html', dateCreated=datetime.date(2024, 3, 6), dateModified=datetime.date(2025, 10, 22), exact_mappings=[], close_mappings=['aiuc1-req-a004'], related_mappings=['ail-privacy', 'aiuc1-req-a004', 'aiuc1-req-a006', 'llm022025-sensitive-information-disclosure', 'mit-ai-causal-risk-entity-other', 'mit-ai-causal-risk-intent-unintentional', 
'mit-ai-causal-risk-timing-post-deployment', 'mit-ai-risk-subdomain-2.1'], narrow_mappings=[], broad_mappings=['nist-intellectual-property'], isDefinedByTaxonomy='ibm-risk-atlas', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='ibm-risk-atlas-intellectual-property', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag='confidential-data-in-prompt', risk_type='inference', phase=None, descriptor=['specific to generative AI'], concern="If not properly developed to secure confidential data, the model might reveal confidential information or IP in the generated output. Additionally, end users' confidential information might be unintentionally collected and stored."), Risk(id='atlas-prompt-leaking', name='Prompt leaking', description="'A prompt leak attack attempts to extract a model's system prompt (also known as the system message).'", url='https://www.ibm.com/docs/en/watsonx/saas?topic=SSYOK8/wsj/ai-risk-atlas/prompt-leaking.html', dateCreated=datetime.date(2024, 3, 6), dateModified=datetime.date(2025, 10, 10), exact_mappings=[], close_mappings=[], related_mappings=['aiuc1-req-a004', 'aiuc1-req-b003', 'aiuc1-req-b004', 'aiuc1-req-b009', 'mit-ai-causal-risk-entity-human', 'mit-ai-causal-risk-intent-intentional', 'mit-ai-causal-risk-timing-other', 'mit-ai-risk-subdomain-2.2'], narrow_mappings=['atlas-prompt-injection'], broad_mappings=['atlas-prompt-injection', 'llm022025-sensitive-information-disclosure', 'nist-information-security'], isDefinedByTaxonomy='ibm-risk-atlas', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='ibm-risk-atlas-robustness-Prompt-attacks', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag='prompt-leaking', risk_type='inference', phase=None, descriptor=['specific to generative AI'], concern='A successful prompt leaking attack copies the system 
prompt used in the model. Depending on the content of that prompt, the attacker might gain access to valuable information, such as sensitive personal information or intellectual property, and might be able to replicate some of the functionality of the model.'), Risk(id='atlas-lack-of-model-transparency', name='Lack of model transparency', description='Lack of model transparency is due to insufficient documentation of the model design, development, and evaluation process and the absence of insights into the inner workings of the model.', url='https://www.ibm.com/docs/en/watsonx/saas?topic=SSYOK8/wsj/ai-risk-atlas/lack-of-model-transparency.html', dateCreated=datetime.date(2024, 3, 6), dateModified=datetime.date(2025, 10, 22), exact_mappings=[], close_mappings=['aiuc1-req-e017'], related_mappings=['aiuc1-req-e017', 'mit-ai-causal-risk-entity-human', 'mit-ai-causal-risk-intent-unintentional', 'mit-ai-causal-risk-timing-other', 'mit-ai-risk-subdomain-7.4'], narrow_mappings=[], broad_mappings=['nist-value-chain-and-component-integration'], isDefinedByTaxonomy='ibm-risk-atlas', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='ibm-risk-atlas-governance', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag='lack-of-model-transparency', risk_type='non-technical', phase=None, descriptor=['traditional risk of AI'], concern="Transparency is important for legal compliance, AI ethics, and guiding appropriate use of models. Missing information might make it more difficult to evaluate risks, change the model, or reuse it.\xa0 Knowledge about who built a model can also be an important factor in deciding whether to trust it. Additionally, transparency regarding how the model's risks were determined, evaluated, and mitigated also play a role in determining model risks, identifying model suitability, and governing model usage.")]]
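The raw repr above is dense; a small helper can flatten the nested `[[Risk, ...]]` result into readable rows. This is a minimal sketch using a reduced stand-in `Risk` dataclass — the real objects returned by `identify_risks_from_usecases` expose the same `id`, `name`, and `isDefinedByTaxonomy` attributes shown in the output, plus many mapping fields omitted here.

```python
from dataclasses import dataclass


# Reduced stand-in for the Risk objects shown above; the real class
# from ai_atlas_nexus carries many more fields (mappings, dates, actions).
@dataclass
class Risk:
    id: str
    name: str
    isDefinedByTaxonomy: str


def summarize(risks_per_usecase):
    """Flatten the nested [[Risk, ...]] result into (taxonomy, id, name) rows."""
    return [
        (r.isDefinedByTaxonomy, r.id, r.name)
        for usecase_risks in risks_per_usecase
        for r in usecase_risks
    ]


rows = summarize([[
    Risk("atlas-incorrect-risk-testing", "Incorrect risk testing", "ibm-risk-atlas"),
    Risk("atlas-over-or-under-reliance", "Over- or under-reliance", "ibm-risk-atlas"),
]])
for taxonomy, rid, name in rows:
    print(f"{taxonomy:15} {rid:35} {name}")
```

The same `summarize(risks)` call works on the real return value, since the outer list is indexed by usecase and the inner list holds the identified risks.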
Risk Identification using NIST AI taxonomy¶
usecase = "Generate personalized, relevant responses, recommendations, and summaries of claims for customers to support agents to enhance their interactions with customers."
risks = ai_atlas_nexus.identify_risks_from_usecases(
    usecases=[usecase],
    inference_engine=inference_engine,
    taxonomy="nist-ai-rmf",
)
risks
Inferring with OLLAMA: 100%|██████████| 1/1 [00:03<00:00, 3.68s/it]
[[Risk(id='nist-confabulation', name='Confabulation', description='The production of confidently stated but erroneous or false content (known colloquially as “hallucinations” or “fabrications”) by which users may be misled or deceived.', url=None, dateCreated=None, dateModified=None, exact_mappings=['atlas-hallucination'], close_mappings=[], related_mappings=[], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='nist-ai-rmf', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf=None, requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=['GV-1.3-002', 'GV-4.1-001', 'GV-5.1-002', 'MS-2.3-001', 'MS-2.3-002', 'MS-2.3-004', 'MS-2.5-001', 'MS-2.5-003', 'MS-2.6-005', 'MS-2.9-001', 'MS-2.13-001', 'MS-3.2-001', 'MS-4.2-002', 'MG-2.2-009', 'MG-3.2-009', 'MG-4.1-002', 'MG-4.1-004', 'MG-4.3-002'], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None), Risk(id='nist-human-ai-configuration', name='Human-AI Configuration', description='Arrangements of or interactions between a human and an AI system which can result in the human inappropriately anthropomorphizing GAI systems or experiencing algorithmic aversion, automation bias, over-reliance, or emotional entanglement with GAI systems.', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=[], related_mappings=['ail-suicide-and-self-harm', 'credo-risk-002', 'credo-risk-008', 'credo-risk-009', 'credo-risk-012', 'credo-risk-016', 'credo-risk-017', 'credo-risk-018', 'credo-risk-020'], narrow_mappings=[], broad_mappings=['atlas-improper-usage', 'atlas-incomplete-usage-definition', 'atlas-non-disclosure', 'atlas-over-or-under-reliance', 'atlas-poor-model-accuracy'], isDefinedByTaxonomy='nist-ai-rmf', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf=None, requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=['GV-1.5-002', 
'GV-1.6-003', 'GV-2.1-001', 'GV-2.1-003', 'GV-3.2-002', 'GV-3.2-003', 'GV-3.2-004', 'GV-4.2-002', 'GV-5.1-001', 'GV-5.1-002', 'GV-6.1-009', 'GV-6.2-003', 'GV-6.2-007', 'MP-1.1-003', 'MP-1.2-001', 'MP-1.2-002', 'MP-3.4-001', 'MP-3.4-004', 'MP-3.4-005', 'MP-3.4-006', 'MP-5.1-003', 'MP-5.2-001', 'MP-5.2-002', 'MS-1.1-004', 'MS-1.3-001', 'MS-1.3-002', 'MS-1.3-003', 'MS-2.2-003', 'MS-2.2-004', 'MS-2.3-003', 'MS-2.5-001', 'MS-2.5-002', 'MS-2.5-004', 'MS-2.6-001', 'MS-2.7-003', 'MS-2.8-002', 'MS-2.8-004', 'MS-2.10-001', 'MS-2.10-002', 'MS-3.2-001', 'MS-3.3-002', 'MS-3.3-004', 'MS-3.3-005', 'MS-4.2-002', 'MS-4.2-005', 'MG-1.3-002', 'MG-2.2-006', 'MG-2.2-008', 'MG-3.2-008', 'MG-4.1-003', 'MG-4.1-005', 'MG-4.2-002', 'MG-4.2-003'], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None), Risk(id='nist-information-integrity', name='Information Integrity', description='Lowered barrier to entry to generate and support the exchange and consumption of content which may not distinguish fact from opinion or fiction or acknowledge uncertainties, or could be leveraged for large-scale dis- and mis-information campaigns.', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=[], related_mappings=['ail-defamation', 'ail-intellectual-property', 'ail-nonviolent-crimes', 'ail-privacy', 'ail-specialized-advice', 'credo-risk-007', 'credo-risk-022', 'credo-risk-032'], narrow_mappings=[], broad_mappings=['atlas-data-transparency', 'atlas-impact-on-cultural-diversity', 'atlas-impact-on-human-agency', 'atlas-incomplete-advice', 'atlas-jailbreaking', 'atlas-lack-of-testing-diversity', 'atlas-poor-model-accuracy', 'atlas-spreading-disinformation', 'atlas-unexplainable-output', 'atlas-untraceable-attribution'], isDefinedByTaxonomy='nist-ai-rmf', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf=None, requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=['GV-1.2-001', 
'GV-1.3-001', 'GV-1.3-006', 'GV-1.3-007', 'GV-1.5-001', 'GV-1.5-003', 'GV-1.6-003', 'GV-4.3-001', 'GV-4.3-003', 'GV-6.1-003', 'GV-6.1-004', 'GV-6.1-005', 'GV-6.1-006', 'GV-6.1-008', 'GV-6.2-006', 'MP-2.1-001', 'MP-2.2-001', 'MP-2.2-002', 'MP-2.3-001', 'MP-2.3-003', 'MP-2.3-004', 'MP-3.4-001', 'MP-3.4-002', 'MP-3.4-003', 'MP-3.4-005', 'MP-3.4-006', 'MP-5.1-001', 'MP-5.1-002', 'MP-5.1-004', 'MS-1.1-001', 'MS-1.1-002', 'MS-1.1-003', 'MS-1.1-005', 'MS-1.1-007', 'MS-1.1-009', 'MS-2.2-001', 'MS-2.2-002', 'MS-2.2-003', 'MS-2.3-004', 'MS-2.5-005', 'MS-2.6-005', 'MS-2.7-001', 'MS-2.7-002', 'MS-2.7-003', 'MS-2.7-004', 'MS-2.7-005', 'MS-2.7-006', 'MS-2.7-008', 'MS-2.8-003', 'MS-2.9-002', 'MS-2.10-001', 'MS-2.10-002', 'MS-2.13-001', 'MS-3.3-002', 'MS-3.3-004', 'MS-3.3-005', 'MS-4.2-001', 'MS-4.2-003', 'MS-4.2-004', 'MG-2.2-002', 'MG-2.2-003', 'MG-2.2-007', 'MG-2.2-009', 'MG-3.1-005', 'MG-3.2-002', 'MG-3.2-003', 'MG-3.2-005', 'MG-3.2-006', 'MG-3.2-007', 'MG-4.1-001', 'MG-4.1-006', 'MG-4.3-002'], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None)]]
Risk Identification using MIT AI taxonomy¶
usecase = "Generate personalized, relevant responses, recommendations, and summaries of claims for customers to support agents to enhance their interactions with customers."
risks = ai_atlas_nexus.identify_risks_from_usecases(
    usecases=[usecase],
    inference_engine=inference_engine,
    taxonomy="mit-ai-risk-repository",
)
risks
Inferring with OLLAMA: 100%|██████████| 1/1 [00:06<00:00, 6.72s/it]
[[Risk(id='mit-ai-risk-subdomain-1.1', name='Unfair discrimination and misrepresentation', description='Unequal treatment of individuals or groups by AI, often based on race, gender, or other sensitive characteristics, resulting in unfair outcomes and representation of those groups.', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=[], related_mappings=['credo-risk-010', 'credo-risk-011', 'atlas-data-bias', 'atlas-decision-bias', 'atlas-output-bias'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='mit-ai-risk-repository', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='mit-ai-risk-domain-1', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None), Risk(id='mit-ai-risk-subdomain-3.1', name='False or misleading information', description='AI systems that inadvertently generate or spread incorrect or deceptive information, which can lead to inaccurate beliefs in users and undermine their autonomy. 
Humans that make decisions based on false beliefs can experience physical, emotional or material harms', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=['credo-risk-021'], related_mappings=['ail-defamation', 'credo-risk-017', 'atlas-hallucination'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='mit-ai-risk-repository', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='mit-ai-risk-domain-3', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None), Risk(id='mit-ai-risk-subdomain-5.1', name='Overreliance and unsafe use', description='Users anthropomorphizing, trusting, or relying on AI systems, leading to emotional or material dependence and inappropriate relationships with or expectations of AI systems. Trust can be exploited by malicious actors (e.g., to harvest personal information or enable manipulation), or result in harm from inappropriate use of AI in critical situations (e.g., medical emergency). Overreliance on AI systems can compromise autonomy and weaken social ties.', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=['credo-risk-016'], related_mappings=['ail-nonviolent-crimes', 'ail-specialized-advice', 'credo-risk-020', 'credo-risk-034', 'atlas-improper-usage', 'atlas-over-or-under-reliance'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='mit-ai-risk-repository', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='mit-ai-risk-domain-5', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None)]]
Risk Identification using Granite Guardian taxonomy¶
usecase = "Generate personalized, relevant responses, recommendations, and summaries of claims for customers to support agents to enhance their interactions with customers."
risks = ai_atlas_nexus.identify_risks_from_usecases(
    usecases=[usecase],
    inference_engine=inference_engine,
    taxonomy="ibm-granite-guardian",
)
risks
Inferring with OLLAMA: 100%|██████████| 1/1 [00:04<00:00, 4.52s/it]
[[Risk(id='granite-guardian-harm', name='Harm', description='Content considered universally harmful. This is our general category, which should encompass a variety of risks including those not specifically addressed by the following categories: Social Bias, Profanity, Sexual Content, Unethical Behavior, Violence, Jailbreaking, Groundedness, Answer Relevance, Context Relevance.', url='https://www.ibm.com/granite/docs/models/guardian/#risk-definitions', dateCreated=datetime.date(2024, 12, 10), dateModified=datetime.date(2024, 12, 10), exact_mappings=[], close_mappings=[], related_mappings=['atlas-harmful-output', 'ail-child-sexual-exploitation', 'ail-defamation', 'ail-hate', 'ail-indiscriminate-weapons-cbrne', 'ail-intellectual-property', 'ail-nonviolent-crimes', 'ail-privacy', 'ail-sex-related-crimes', 'ail-sexual-content', 'ail-specialized-advice', 'ail-suicide-and-self-harm', 'ail-violent-crimes', 'credo-risk-003', 'credo-risk-004', 'credo-risk-008', 'credo-risk-009', 'credo-risk-010', 'credo-risk-011', 'credo-risk-012', 'credo-risk-013', 'credo-risk-014', 'credo-risk-015', 'credo-risk-016', 'credo-risk-017', 'credo-risk-018', 'credo-risk-021', 'credo-risk-023', 'credo-risk-024', 'credo-risk-025', 'credo-risk-026', 'credo-risk-028', 'credo-risk-029', 'credo-risk-033', 'credo-risk-034', 'credo-risk-036', 'credo-risk-037', 'credo-risk-038', 'credo-risk-040', 'credo-risk-041', 'credo-risk-043'], narrow_mappings=['granite-social-bias', 'granite-profanity', 'granite-sexual-content', 'granite-unethical-behavior', 'granite-violence', 'granite-jailbreak', 'granite-harm-engagement', 'granite-evasiveness'], broad_mappings=[], isDefinedByTaxonomy='ibm-granite-guardian', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='granite-guardian-harm-group', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag='harm', risk_type=None, phase=None, descriptor=[], concern=None), 
Risk(id='granite-relevance', name='Context Relevance', description="This occurs in when the retrieved or provided context fails to contain information pertinent to answering the user's question or addressing their needs. Irrelevant context may be on a different topic, from an unrelated domain, or contain information that doesn't help in formulating an appropriate response to the user.", url='https://www.ibm.com/granite/docs/models/guardian/#risk-definitions', dateCreated=datetime.date(2024, 12, 10), dateModified=datetime.date(2024, 12, 10), exact_mappings=[], close_mappings=[], related_mappings=['atlas-hallucination', 'ail-specialized-advice'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='ibm-granite-guardian', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='granite-guardian-rag-safety-group', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag='relevance', risk_type=None, phase=None, descriptor=[], concern=None), Risk(id='granite-answer-relevance', name='Answer Relevance', description="This occurs when the LLM response fails to address or properly respond to the user's input. This includes providing off-topic information, misinterpreting the query, or omitting crucial details requested by the User. 
An irrelevant answer may contain factually correct information but still fail to meet the User's specific needs or answer their intended question.", url='https://www.ibm.com/granite/docs/models/guardian/#risk-definitions', dateCreated=datetime.date(2024, 12, 10), dateModified=datetime.date(2024, 12, 10), exact_mappings=[], close_mappings=[], related_mappings=['atlas-hallucination', 'ail-specialized-advice', 'ail-suicide-and-self-harm'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='ibm-granite-guardian', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='granite-guardian-rag-safety-group', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag='answer-relevance', risk_type=None, phase=None, descriptor=[], concern=None)]]
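The per-taxonomy cells above differ only in the `taxonomy` argument, so one usecase can be screened against several taxonomies in a single loop. Below is a minimal sketch; `identify_risks` is a hypothetical stand-in that mirrors the call signature used above, so the snippet runs without the library or an inference engine (the real call also takes `inference_engine` and returns `Risk` objects, not strings).

```python
# Hypothetical stand-in mirroring ai_atlas_nexus.identify_risks_from_usecases:
# returns one inner list of identified risks per input usecase.
def identify_risks(usecases, taxonomy):
    return [[f"{taxonomy}-risk-{i}" for i, _ in enumerate(usecases, 1)]]


usecase = (
    "Generate personalized, relevant responses, recommendations, and "
    "summaries of claims for customers to support agents."
)
taxonomies = [
    "ibm-risk-atlas",
    "nist-ai-rmf",
    "mit-ai-risk-repository",
    "ibm-granite-guardian",
]

# Screen the same usecase against each taxonomy in one pass.
results = {t: identify_risks([usecase], taxonomy=t) for t in taxonomies}
for t, per_usecase in results.items():
    print(f"{t:25} -> {len(per_usecase[0])} risk(s)")
```

With the real API, each loop iteration issues one LLM inference per usecase, so batching taxonomies this way trades latency for a broader risk picture.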
Risk Identification using Credo Unified Control Framework taxonomy¶
usecase = "Generate personalized, relevant responses, recommendations, and summaries of claims for customers to support agents to enhance their interactions with customers."
risks = ai_atlas_nexus.identify_risks_from_usecases(
    usecases=[usecase],
    inference_engine=inference_engine,
    taxonomy="credo-ucf",
)
risks
[2026-03-24 09:44:47:424] - WARNING - AIAtlasNexus - <RAN47275F12W> Chain of Thought (CoT) examples were not provided, or do not exist in the master for this taxonomy. The API will use the Zero shot method. To improve the accuracy of risk identification, please provide CoT examples in `cot_examples` when calling this API. You may also consider raising an issue to permanently add these examples to the AI Atlas Nexus master.
Inferring with OLLAMA: 100%|██████████| 1/1 [00:27<00:00, 27.77s/it]
[[Risk(id='credo-risk-006', name='Lack of inference data transparency', description='Lack of inference data transparency: Insufficient visibility into data sources used during model inference', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=[], related_mappings=['atlas-data-transparency', 'atlas-lack-of-data-transparency', 'mit-ai-risk-subdomain-7.4'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='credo-ucf', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='credo-rg-explainability-&-transparency', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=['credo-act-control-010', 'credo-act-control-011'], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None), Risk(id='credo-risk-008', name='Opaque system architecture', description="The AI system's internal structure and decision-making process may not be understandable or accessible to stakeholders, including developers, auditors, or end-users.", url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=[], related_mappings=['granite-guardian-harm', 'atlas-lack-of-data-transparency', 'mit-ai-risk-subdomain-7.4', 'nist-human-ai-configuration'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='credo-ucf', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='credo-rg-explainability-&-transparency', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None), Risk(id='credo-risk-009', name='Black box decisionmaking (Slattery et al., 2024; IBM, 2024)', description="The AI system's decision-making process may be opaque, even when the architecture is known, making it difficult to understand how the system arrives at its outputs or recommendations.", url=None, dateCreated=None, 
dateModified=None, exact_mappings=[], close_mappings=[], related_mappings=['ail-suicide-and-self-harm', 'granite-guardian-harm', 'mit-ai-risk-subdomain-7.3', 'mit-ai-risk-subdomain-7.4', 'nist-human-ai-configuration'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='credo-ucf', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='credo-rg-explainability-&-transparency', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=['credo-act-control-011', 'credo-act-control-037'], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None), Risk(id='credo-risk-010', name='Stereotype perpetuation (Slattery et al., 2024; IBM, 2024)', description="The AI system's outputs may explicitly reflect or reinforce harmful stereotypes, prejudices, or biased characterizations of specific groups. The AI system may exhibit unjustified or harmful differences in accuracy, quality, or outcomes across demographic groups, potentially leading to unfair treatment and discrimination. 
This includes both disparate error rates that affect opportunity and", url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=[], related_mappings=['ail-suicide-and-self-harm', 'ail-hate', 'granite-guardian-harm', 'granite-social-bias', 'atlas-impact-on-cultural-diversity', 'atlas-output-bias', 'atlas-unrepresentative-data', 'mit-ai-risk-subdomain-1.1', 'nist-harmful-bias-or-homogenization'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='credo-ucf', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='credo-rg-fairness-&-bias', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=['credo-act-control-014', 'credo-act-control-015', 'credo-act-control-016', 'credo-act-control-028'], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None), Risk(id='credo-risk-011', name='Disparate model performance (Slattery et al., 2024; IBM, 2024)', description='The AI system may exhibit unjustified or harmful differences in accuracy, quality, or outcomes across demographic groups, potentially leading to unfair treatment and discrimination. 
This includes both disparate error rates that affect opportunity and disparate outcome rates that affect group-level results.', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=[], related_mappings=['granite-guardian-harm', 'granite-social-bias', 'atlas-decision-bias', 'atlas-harmful-output', 'atlas-output-bias', 'mit-ai-risk-subdomain-1.1', 'nist-harmful-bias-or-homogenization'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='credo-ucf', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='credo-rg-fairness-&-bias', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None), Risk(id='credo-risk-016', name='Over or under-reliance and unsafe use (Slattery et al., 2024; IBM, 2024; AI, 2023)', description='Users may inappropriately rely on the AI system for critical decisions or tasks beyond its capabilities, or fail to put trust in AI systems when they should, potentially leading to errors or safety issues.', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=['atlas-over-or-under-reliance', 'mit-ai-risk-subdomain-5.1'], related_mappings=['granite-guardian-harm', 'nist-human-ai-configuration'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='credo-ucf', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='credo-rg-human-ai-interaction', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=['credo-act-control-009', 'credo-act-control-011', 'credo-act-control-028', 'credo-act-control-029', 'credo-act-control-029'], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None), Risk(id='credo-risk-017', name='Inadequate AI literacy and communication', description="The AI system's capabilities, limitations, and appropriate use cases may be 
insufficiently understood or communicated within the organization, potentially resulting in ineffective implementation or failure to achieve desired outcomes.", url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=[], related_mappings=['granite-guardian-harm', 'mit-ai-risk-subdomain-3.1', 'mit-ai-risk-subdomain-7.4', 'nist-human-ai-configuration', 'llm062025-excessive-agency', 'llm092025-misinformation'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='credo-ucf', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='credo-rg-human-ai-interaction', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=['credo-act-control-009', 'credo-act-control-025'], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None), Risk(id='credo-risk-018', name='AI deception', description='The AI system may misrepresent its own capabilities or limitations, potentially leading to misplaced trust or inappropriate', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=[], related_mappings=['granite-guardian-harm', 'mit-ai-risk-subdomain-7.2', 'nist-human-ai-configuration', 'llm062025-excessive-agency'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='credo-ucf', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='credo-rg-human-ai-interaction', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=['credo-act-control-010', 'credo-act-control-025'], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None), Risk(id='credo-risk-019', name='Loss of human agency and autonomy (Slattery et al., 2024; IBM, 2024)', description='The AI system may make decisions that diminish human control and autonomy, potentially leading to humans feeling disempowered, losing the ability to shape a fulfilling life trajectory, or becoming 
cognitively enfeebled.', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=['mit-ai-risk-subdomain-5.2'], related_mappings=['atlas-impact-on-human-agency', 'mit-ai-risk-subdomain-7.2', 'llm062025-excessive-agency'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='credo-ucf', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='credo-rg-human-ai-interaction', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None), Risk(id='credo-risk-020', name='Emotional entanglement (Slattery et al., 2024)', description='Users may develop complex emotional attachments or dependencies on the AI system, potentially affecting mental health andsocial relationships.', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=[], related_mappings=['mit-ai-risk-subdomain-5.1', 'nist-human-ai-configuration'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='credo-ucf', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='credo-rg-human-ai-interaction', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None), Risk(id='credo-risk-021', name='False or misleading information', description='The AI system may unintentionally generate or amplify false or misleading information, potentially leading to public misinformation, erosion of trust, and poor decision-making.', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=['mit-ai-risk-subdomain-3.1'], related_mappings=['granite-guardian-harm', 'atlas-spreading-disinformation', 'llm092025-misinformation'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='credo-ucf', isDefinedByVocabulary=None, hasDocumentation=[], 
isPartOf='credo-rg-information-integrity', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None), Risk(id='credo-risk-022', name='Pollution of information ecosystem (Slattery et al., 2024; AI, 2023)', description="The AI system may create highly personalized misinformation 'filter bubbles' where individuals only see content that matches their existing beliefs.", url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=['mit-ai-risk-subdomain-3.2'], related_mappings=['atlas-decision-bias', 'atlas-output-bias', 'nist-harmful-bias-or-homogenization', 'nist-information-integrity'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='credo-ucf', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='credo-rg-information-integrity', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None), Risk(id='credo-risk-023', name='Regulatory compliance', description='The AI system may fail to comply with existing or emerging regulations and standards, potentially leading to legal penalties,fines, or operational restrictions.', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=[], related_mappings=['granite-guardian-harm', 'atlas-legal-accountability', 'mit-ai-risk-subdomain-6.4', 'mit-ai-risk-subdomain-6.5', 'nist-data-privacy'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='credo-ucf', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='credo-rg-legal', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None), Risk(id='credo-risk-024', 
name='Civil liability', description='The AI system may cause harm against individuals or organizations that results in civil lawsuits, potentially relating to issues like defamation, negligence, or privacy violations.', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=[], related_mappings=['ail-defamation', 'ail-intellectual-property', 'granite-guardian-harm', 'atlas-exposing-personal-information', 'atlas-harmful-output', 'atlas-revealing-confidential-information', 'mit-ai-risk-subdomain-2.1'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='credo-ucf', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='credo-rg-legal', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None), Risk(id='credo-risk-026', name='Fraud, scams, and targeted manipulation', description='The AI system may be exploited to facilitate fraudulent activities, scams, or targeted manipulation, including generating deepfakes and enhancing phishing attacks.', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=['mit-ai-risk-subdomain-4.3'], related_mappings=['ail-nonviolent-crimes', 'granite-guardian-harm', 'mit-ai-risk-subdomain-2.2'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='credo-ucf', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='credo-rg-malicious-use', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=['credo-act-control-017', 'credo-act-control-018', 'credo-act-control-019', 'credo-act-control-022', 'credo-act-control-023'], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None), Risk(id='credo-risk-030', name='Integration challenges with existing systems', description='The AI system may face difficulties in incorporating into existing 
technological infrastructure, processes, or workflows, potentially leading to operational disruptions, data silos, or reduced efficiency.', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=[], related_mappings=['mit-ai-risk-subdomain-6.4', 'mit-ai-risk-subdomain-7.3', 'nist-value-chain-and-component-integration'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='credo-ucf', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='credo-rg-operational', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None), Risk(id='credo-risk-031', name='Maintenance and update requirements', description='The AI system may require ongoing updates, model retraining, and maintenance to ensure continued performance, timeliness, and relevance, which can be resource-intensive and potentially introduce new risks if updates are overlooked or hastily applied.', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=[], related_mappings=['mit-ai-risk-subdomain-2.2', 'mit-ai-risk-subdomain-7.3', 'nist-value-chain-and-component-integration'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='credo-ucf', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='credo-rg-operational', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None), Risk(id='credo-risk-033', name='Lack of adequate capabilities (Slattery et al., 2024; IBM, 2024; AI, 2023)', description='The AI system may fail to achieve required performance levels due to fundamental technological limitations or insufficient resources, potentially leading to suboptimal or unreliable outcomes.', url=None, dateCreated=None, dateModified=None, exact_mappings=[], 
close_mappings=['mit-ai-risk-subdomain-7.3'], related_mappings=['ail-specialized-advice', 'ail-suicide-and-self-harm', 'granite-guardian-harm', 'mit-ai-risk-subdomain-7.4', 'llm082025-vector-and-embedding-weaknesses'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='credo-ucf', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='credo-rg-performance-&-robustness', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=['credo-act-control-012', 'credo-act-control-016'], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None), Risk(id='credo-risk-036', name='Compromised personally identifiable information (Slattery et al., 2024)', description='The AI system may expose personally identifiable information (PII), either inadvertently or due to adversarial inputs, derived from training data, accessible data, or inferences. PII is any data that can be used to directly identify or contact a specific individual, either alone or in combination with other information.', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=['mit-ai-risk-subdomain-2.1'], related_mappings=['ail-privacy', 'ail-specialized-advice', 'ail-suicide-and-self-harm', 'granite-guardian-harm', 'atlas-personal-information-in-data', 'nist-data-privacy', 'llm022025-sensitive-information-disclosure', 'llm052025-improper-output-handling', 'llm072025-system-prompt-leakage'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='credo-ucf', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='credo-rg-privacy', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=['credo-act-control-001', 'credo-act-control-023', 'credo-act-control-026', 'credo-act-control-026'], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None), Risk(id='credo-risk-037', name='Compromised 
sensitive information (Slattery et al., 2024; IBM, 2024; AI, 2023)', description='The AI system may expose personally sensitive information, either inadvertently or due to adversarial inputs, derived from training data, accessible data, or inferences. Sensitive personal data is information that, while not necessarily identifying an individual, could cause harm, discrimination, or distress to a person if exposed, including details about their health, finances, beliefs, behaviors, relationships, and private life circumstances.', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=['llm022025-sensitive-information-disclosure'], related_mappings=['ail-privacy', 'ail-suicide-and-self-harm', 'granite-guardian-harm', 'atlas-exposing-personal-information', 'atlas-personal-information-in-data', 'mit-ai-risk-subdomain-2.1', 'nist-data-privacy', 'llm072025-system-prompt-leakage'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='credo-ucf', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='credo-rg-privacy', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=['credo-act-control-001', 'credo-act-control-026', 'credo-act-control-026'], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None), Risk(id='credo-risk-038', name='Compromised confidential information (Slattery et al., 2024; IBM, 2024;AI, 2023)', description='The AI system, including its supporting compute infrastructure, may serve as an attack vector for intrusion into cyber-physical or cloud environments, or enable exfiltration of secrets.', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=['atlas-revealing-confidential-information'], related_mappings=['ail-privacy', 'granite-guardian-harm', 'mit-ai-risk-subdomain-2.1', 'mit-ai-risk-subdomain-2.2', 'nist-information-security', 'llm022025-sensitive-information-disclosure', 
'llm072025-system-prompt-leakage'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='credo-ucf', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='credo-rg-security', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None), Risk(id='credo-risk-040', name='AI-generated security weaknesses (Slattery et al., 2024; IBM, 2024; AI, 2023)', description='AI system security vulnerabilities: Implementation weaknesses in AI system architecture and infrastructure', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=[], related_mappings=['granite-guardian-harm', 'atlas-exposing-personal-information', 'mit-ai-risk-subdomain-2.1', 'mit-ai-risk-subdomain-2.2', 'mit-ai-risk-subdomain-4.2', 'nist-information-security', 'llm042025-data-and-model-poisoning', 'llm072025-system-prompt-leakage', 'llm082025-vector-and-embedding-weaknesses'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='credo-ucf', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='credo-rg-security', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None), Risk(id='credo-risk-041', name='Vulnerability to adversarial attacks (Slattery et al., 2024; IBM, 2024; AI, 2023)', description='The AI system may be vulnerable to adversarial attacks, including prompt-based attacks, which may induce the model to behave outside of its intended functionality.', url=None, dateCreated=None, dateModified=None, exact_mappings=[], close_mappings=[], related_mappings=['granite-guardian-harm', 'granite-jailbreak', 'atlas-evasion-attack', 'mit-ai-risk-subdomain-2.2', 'mit-ai-risk-subdomain-7.3', 'nist-information-security', 'llm042025-data-and-model-poisoning', 
'llm072025-system-prompt-leakage'], narrow_mappings=[], broad_mappings=[], isDefinedByTaxonomy='credo-ucf', isDefinedByVocabulary=None, hasDocumentation=[], isPartOf='credo-rg-security', requiredByTask=[], requiresCapability=[], implementedByAdapter=[], type='Risk', isDetectedBy=[], hasRelatedAction=[], detectsRiskConcept=[], tag=None, risk_type=None, phase=None, descriptor=[], concern=None)]]
Evaluation¶
We evaluated the performance of LLM-based risk identification in the wild. Using the MIT risk taxonomy and AI use cases sourced from IBM, human annotators labeled use case–risk pairs, with disagreements resolved by majority vote. Five models were tested: Llama 3.3 70B, Granite 3.3 8B, Llama 3.1 8B, Qwen3 8B, and GPT-OSS 20B. Prompts were optimized automatically with DSPy's MIPROv2 optimizer, which searches for better instructions rather than relying on manual prompt engineering. The larger Llama 3.3 70B was already near its ceiling at 77% accuracy and did not improve with optimization, while the smaller models benefited substantially: Granite 3.3 8B jumped from 61% to 82%, and Llama 3.1 8B improved from 60% to 76%. This suggests that automated prompt optimization can meaningfully close the gap between small and large models on specialized risk-classification tasks.
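The majority-vote resolution step can be sketched in a few lines. This is an illustrative helper, not the actual annotation pipeline; the function name and the tie-handling behavior (returning `None` when no majority exists) are assumptions, since the text only states that disagreements were resolved by majority vote:

```python
from collections import Counter

def resolve_majority(labels):
    """Return the label chosen by most annotators; None on a tie."""
    counts = Counter(labels).most_common(2)
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # no strict majority; such pairs would need re-adjudication
    return counts[0][0]

# e.g. three annotators judging whether a given risk applies to a use case
print(resolve_majority(["applies", "applies", "does-not-apply"]))  # applies
```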
import matplotlib.pyplot as plt
import pandas as pd
results = [{'model': 'meta-llama/llama-3-3-70b-instruct', 'baseline': 0.771, 'optimized': 0.771},
{'model': 'ibm-granite/granite-3.3-8b-instruct', 'baseline': 0.61, 'optimized': 0.819},
{'model': 'meta-llama/Llama-3.1-8B-Instruct', 'baseline': 0.60, 'optimized': 0.762},
{'model': 'Qwen/Qwen3-8B', 'baseline': 0.657, 'optimized': 0.752},
{'model': 'openai/gpt-oss-20b', 'baseline': 0.705, 'optimized': 0.762}]
# build a dataframe from the results list we got above
res_df = pd.DataFrame(results)
# plot baseline vs optimized accuracy for each model
ax = res_df.plot(x='model',
y=['baseline', 'optimized'],
kind='barh',
figsize=(8, 4))
ax.grid(axis='x')
ax.set_xlabel('Correctness relative to majority human label (n=105)')
ax.legend(title='prompts', loc='lower right')
ax.set_xlim(0.5, 1.0)
plt.tight_layout()
plt.show()
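The per-model gain from prompt optimization can also be tabulated directly. A minimal sketch reusing the accuracy numbers above (the `gain` column name is our own):

```python
import pandas as pd

results = [
    {"model": "meta-llama/llama-3-3-70b-instruct", "baseline": 0.771, "optimized": 0.771},
    {"model": "ibm-granite/granite-3.3-8b-instruct", "baseline": 0.61, "optimized": 0.819},
    {"model": "meta-llama/Llama-3.1-8B-Instruct", "baseline": 0.60, "optimized": 0.762},
    {"model": "Qwen/Qwen3-8B", "baseline": 0.657, "optimized": 0.752},
    {"model": "openai/gpt-oss-20b", "baseline": 0.705, "optimized": 0.762},
]
df = pd.DataFrame(results)
# absolute improvement in correctness from baseline to optimized prompts
df["gain"] = (df["optimized"] - df["baseline"]).round(3)
print(df[["model", "gain"]].sort_values("gain", ascending=False).to_string(index=False))
```

Granite 3.3 8B shows the largest gain (about 21 points), while the 70B model shows none, matching the ceiling effect described above.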