The rapid evolution of Generative AI (GenAI) presents unprecedented opportunities across industries, from automating complex processes to revolutionising customer engagement. Yet, alongside this immense potential lies a complex web of risks that, if unaddressed, can lead to significant financial, reputation, and regulatory consequences. Before deploying any GenAI application, a structured and comprehensive risk assessment isn't just advisable—it's imperative.
This white paper proposes a guided approach to identifying the risk profile of GenAI implementations. Essentially, this is a structured set of questions and considerations designed to ensure that stakeholders systematically evaluate how an AI will be used, what could potentially go wrong, and what controls should be applied to mitigate those risks. This framework, adaptable to various GenAI use cases—from customer chatbots and advisory tools to internal process automation empowers organisations to make informed decisions, fostering responsible and secure AI deployment.
The Guided Framework for GenAI Risk Identification
Our framework is divided into eight key steps, each designed to uncover specific risk factors associated with GenAI applications.
A. Define the Use Case and Context
The foundational step involves clearly articulating the GenAI application's intended function, its affected parties, and its operational environment. A deep understanding of the context sets the appropriate risk tolerance. For instance, an AI assisting an internal developer has more room for error than one directly advising clients on investments.
Key Questions:
- What's the intended function of the GenAI? (e.g., Generate research reports, answer customer queries, assist in code generation, detect fraud patterns.)
- Who are the end-users or beneficiaries? (Internal staff like analysts, or external users like customers? Are they tech-savvy or the general public?)
- What business process or decision does it impact? (Is it making a recommendation, an automated decision, or just providing information?)
- What's the criticality of this process? (High impact like trading decisions or client advice vs. low impact like internal brainstorming.)
Considerations: This step scopes the assessment to GenAI-specific use cases, ensuring focus on the unique characteristics of generative models.
B. Identify Data Involved
Data serves as both the fuel and a major source of risk for GenAI models. A thorough evaluation of data inputs and outputs is crucial.
Key Questions:
- What data is the GenAI model trained on? (Proprietary financial data, public internet data, client transaction data? Identify any sensitive or regulated data in the training set, such as Personally Identifiable Information (PII) or account numbers.)
- Will the AI output or expose any of that training data? (Assess the risk of regurgitation of sensitive information. For example, if training included confidential documents, could the AI quote from them?)
- What data will be provided in prompts or queries to the AI? (List the fields or content an average query will contain. Mark any sensitive data, like customer names, balances, or trade details, as this indicates higher risk.)
- Does the use case involve personal or regulatory data? (e.g., Credit scores? If so, flag for privacy/General Data Protection Regulation (GDPR) and confidentiality risks. Determine if user consent or a legal basis exists for using that data with an AI tool.)
- How is output data used? (Could the AI generate data that needs protection, such as a file with customer information? Plan how that output will be stored and protected.)
Considerations: This step often reveals specific risks like "customer addresses supplied to GenAI model – privacy risk" or "AI trained on month-old market data – stale output risk."
C. Model and Technology
Assessing the type of model and its technical setup is vital for understanding inherent risks and potential vulnerabilities.
Key Questions:
- What type of GenAI model is it, and who built it? (A third-party hosted Large Language Model (LLM), a self-hosted open-source model, or an in-house developed model? Third-party models carry vendor and transparency risks, while in-house models bear development and validation burdens.)
- Is the model pre-trained (foundation model) or custom-trained for our data? (Pre-trained models may have general knowledge and biases not specific to your domain; fine-tuned models on your data might perform better but could expose your data if not handled carefully.)
- How accurate or tested is the model for this use case? (If available, review any benchmarks or test results. A known 5% hallucination rate in general Q&A is a red flag for high-stakes use cases.)
- Does the model support explainability or additional controls? (Some platforms offer features like traceability of sources, which are positive for risk mitigation. A lack of control features means greater reliance on external mitigations.)
- How is the AI hosted and integrated? (Is it an on-premises deployment or cloud API? Cloud introduces data transit and storage risks; on-prem requires strong internal security. Identify integration points—APIs, data feeds—to assess potential vulnerabilities.)
Considerations: This technical review identifies issues like "Using a black-box API from vendor X – need to address vendor risk and lack of explainability" or "Model hasn’t been validated on our data type – model risk present."
D. Output and Decision Impact
Consider the nature of the AI's output and its potential consequences, both intended and unintended.
Key Questions:
- What form do outputs take? (e.g., Free-form text answers, numerical predictions, classifications, images. Free-form text has a broad range and could include inappropriate content, while a numeric score might be easier to validate but still incorrect.)
- How will those outputs be used in decision-making? (Are they advisory, with a human making the final decision, or automated, where AI output directly triggers an action? Automated decisions carry higher risk and require stronger controls.)
- What’s the worst-case scenario of a bad output? (Could a hallucinated answer cause a major loss or compliance breach? For example, an incorrect regulatory filing could lead to a compliance violation; a misclassified legitimate transaction could unjustly freeze a customer account. Enumerate possible failure modes and their severity.)
- Could the output offend or harm someone? (For customer-facing AI, consider misuse or harmful content: Could it inadvertently use foul language, biased language, or share sensitive information? If so, content filtering is needed.)
- Is there a human fail-safe? (Will a human review the AI output in the intended workflow? If not, treat the risk as if the AI is "in charge," requiring a very high bar for accuracy and control. If yes, clarify the human's role—do they have the time and expertise to catch errors, or will they just rubber-stamp?)
Considerations: Evaluating output impact helps categorise the use case’s criticality (e.g., informative vs. decision-making, internal vs. customer-facing), which directly correlates with the risk level.
E. Regulatory Mapping
Cross-referencing the GenAI use case with relevant regulations and laws is crucial for ensuring compliance and avoiding legal repercussions.
Key Questions:
- Which regulations or guidelines apply to this use case? (For instance, a chatbot giving investment suggestions must align with securities advice regulations; using AI in credit underwriting invokes fair lending laws; generating marketing content triggers advertising rules.)
- Does the AI need to provide explanations or reasons due to regulations? (e.g., If the AI helps make credit decisions, can we extract reason codes? If not, that’s a compliance gap.)
- Are there privacy consent requirements? (If customer data is used, check if the privacy policy covers this use. For EU customers, determine if this use of AI could be considered automated profiling under GDPR that requires consent or disclosure.)
- Does the use case align with “acceptable AI uses” in regulators’ eyes? (Regulators have hinted at disallowing some high-risk AI uses. Review recent regulatory statements or guidance, such as FINRA’s notice on GenAI and the UK FCA’s AI principles.)
- Will this likely be audited or require notification? (For example, some regulators might expect notification if AI is used in fraud detection or capital models. Knowing this upfront aids in planning compliance documentation.)
Considerations: By mapping regulations, specific compliance risks are identified, such as "the system might generate unapproved marketing language – violate advertising rules" or "we can’t explain denials – potential fair lending issue." This step ensures no regulatory angle is overlooked.
F. Security Considerations
A robust security analysis examines how the AI could be attacked or could fail from a security perspective, ensuring resilience against malicious activities.
Key Questions:
- What are the entry points for an attacker? (If the AI is user-facing, could someone input malicious prompts (prompt injection)? If internal, could an insider misuse it to divulge sensitive data?)
- Could the AI divulge sensitive information if prompted cleverly? (Test some adversarial prompts in a non-production environment. If it tends to reveal things it shouldn't, note that.)
- Is the model protected from unauthorised access? (Ensure only the intended application can query it.)
- Are there any known vulnerabilities of the model or library? (Check vendor documentation for known issues; some models have had bugs where certain strings would dump system prompts.)
- What’s the impact of model unavailability? (If the AI service goes down, is it just an inconvenience or does it stop a critical process? If critical, plan redundancy.)
- Could model outputs be used maliciously? (e.g., If the AI writes code (like Copilot), could an attacker prompt it to write malware that bypasses checks? Or if it generates text, could it be tricked into creating social engineering content targeting executives? Consider if someone might use your AI against you or your customers.)
Considerations: This security analysis yields risks like "prompt injection could cause data leak" or "lack of monitoring could let misuse go undetected," which directly inform needed controls.
G. Controls and Safeguards Identification
With risks identified, the next step is to pinpoint and assess existing and necessary controls. Leveraging frameworks like the FINOS AI Governance Framework can provide valuable guidance.
Key Steps:
- For each risk noted, what controls do we have or need for this? (Clearly mark any risks with no current mitigation.)
- Is additional tooling required? (Perhaps a content filter, a bias detection tool, or an encryption module for data in transit?)
- Is the risk level versus control strength acceptable? (For example, if a high-risk use case like automated trading has only moderate controls, either bolster controls or rethink the design, potentially making it advisory instead of automatic.)
Considerations: This step culminates in a comprehensive risk control matrix for the use case, providing a clear overview of identified risks and their corresponding mitigations.
H. Decision and Documentation
The final stage involves leveraging the assessment findings to make an informed deployment decision and ensuring meticulous documentation.
Key Steps:
- Assign a risk rating to the use case (e.g., Low/Medium/High) based on the severity and likelihood of risks after controls. This can be subjective but should be justified, potentially using scoring models.
- Decide whether to proceed, adjust, or abort the deployment. High residual risk might mean postponing launch until more controls or model improvements are in place. Medium risk might be acceptable with monitoring.
- Document the assessment. Create a brief report or checklist output that records the questions and answers, identified risks, and chosen mitigations. This is invaluable for internal audits or regulatory inquiries, demonstrating due diligence. It also aids future reviews if the use case changes or scales up.
- Establish review intervals. Given the rapid evolution of AI, set a timeline (e.g., every 6 months) to revisit this use case and redo the risk assessment. New factors like updated regulations or new model versions may change the risk profile.
Conclusion
The guided framework outlined above provides a structured, systematic, and comprehensive approach to identifying and managing risks associated with Generative AI applications. By diligently working through each step, from defining the use case and identifying data to assessing security and regulatory implications, organisations can gain a holistic understanding of their GenAI risk landscape.
This proactive approach not only helps in mitigating potential negative consequences but also fosters responsible innovation, ensuring that GenAI deployments are secure, compliant, and aligned with organisational values. In the dynamic world of AI, continuous assessment and adaptation are key to harnessing the power of GenAI safely and effectively.