This article is sponsored by ActiveFence and was written, edited, and published in alignment with our Emerj sponsored content guidelines. Learn more about our thought leadership and content creation services on our Emerj Media Services page.
As generative AI systems enter mainstream enterprise workflows, the conversation around AI safety has shifted from theoretical concern to practical necessity.
A 2024 report from the OECD warns that generative models, also known as huge language models (LLMs), pose emerging risks related to misinformation, bias, and misuse — challenges that become more complex when these models are fine-tuned or deployed at scale in sensitive domains such as finance, healthcare, and public safety.
The report goes on to highlight that, unlike traditional software bugs, risks from generative AI can be unpredictable and context-dependent, amplifying the potential for systemic impact across multiple sectors.
This emerging risk landscape demands enterprise leaders rethink how to deploy, monitor, and govern generative AI to avoid unintended harms that could compromise reputation, customer trust, or regulatory compliance.
With growing regulatory scrutiny worldwide, such as the EU’s AI Act proposals and expanding U.S. guidance, enterprises face increasing pressure to demonstrate proactive risk management. Deploying generative AI responsibly is no longer optional; it’s integral to sustainable business operations.
Emerj Artificial Intelligence Research recently hosted a special podcast series with Akhil Khunger, VP of Quantitative Analytics at Barclays, and Tomer Poran, Chief Evangelist and VP of Strategy at ActiveFence, to explore what securing GenAI looks like inside a global bank and a trust and safety technology vendor. Their insights underline why safety needs to be domain-specific, adversarially tested, and embedded across business and technical functions—not just layered on after deployment.
Beyond direct industry-specific risks, enterprises must also consider the regulatory landscape governing their field. Compliance requirements in finance, healthcare, and content moderation differ significantly, necessitating a multilayered approach to AI governance. Neglecting these nuances can result in fines, reputational damage, or even legal consequences, reinforcing the need for robust, tailored safeguards.
This article summarizes key insights for enterprise risk, security, and AI leaders, including:
- Why internal red teaming is essential for enterprise-grade GenAI safety
- How trust and safety lessons from social platforms can inform LLM deployment
Emerj would like to thank our guests for sharing their experience securing GenAI in high-stakes environments.
Red Teaming and Risk: How Enterprises Are Securing GenAI Internally
Episode: Why Red Teaming is Critical for AI Security – with Tomer Poran of ActiveFence
Guest: Tomer Poran, Chief Evangelist and VP of Strategy at ActiveFence
Expertise: AI Safety, Content Risk, and Enterprise Trust and Safety
Brief Recognition: Tomer Poran leads AI strategy and enterprise solutions at ActiveFence. He brings deep experience in applying trust and safety methodologies to large language models and generative AI systems, with a focus on adversarial risk, content integrity, and consumer protection.
In this conversation, Tomer Poran outlines the emerging need for internal security teams at large enterprises to treat AI deployments with the same rigor as any other business-critical infrastructure — especially when models are fine-tuned for customer-facing use cases. His perspective highlights a fundamental shift in how organizations view AI risk management: from relying on vendors to owning the full risk lifecycle internally.
One of the clearest manifestations of this shift is the growing use of red teaming to assess and proactively mitigate GenAI risks. Red teaming originated in cybersecurity as a practice of simulating real-world attacks to expose system vulnerabilities. Applied to GenAI, red teaming means actively probing models with adversarial inputs designed to trigger unsafe or non-compliant outputs. This process reveals weaknesses that static testing or passive filters miss.
He begins by framing a key tension: while foundation model providers advance quickly in improving base model safety, enterprise risk teams still bear the responsibility for how those models behave in their specific contexts.
“There’s a gap between what the foundation model providers can offer in terms of safety and what you need to own as the enterprise using the model,” Tomer explains. The gap he describes creates significant responsibility for enterprises to actively engage in safeguarding AI outputs in a way that aligns with their unique risk profiles. For example, content or production considered benign by the base model may violate compliance or ethical standards when deployed in specific industries or regions.
For enterprises building on open-source or fine-tuned LLMs, ActiveFence argues that safety observability — monitoring outputs for problematic content, behaviors, or instructions — requires a more proactive approach than passive filter layers. One such approach is red teaming, a structured methodology for adversarial testing. According to Tomer, this isn’t just a one-time check but an ongoing, evolving practice:
“If you want to own the quality and safety of a model you built or fine-tuned, you need to be able to simulate misuse and test its boundaries… You can’t rely on guardrails from the base model alone. That’s not enterprise-grade.”
– Tomer Poran, Chief Evangelist and VP of Strategy at ActiveFence
He emphasizes that ActiveFence leverages its experience in handling real-world consumer risks — such as child safety violations or the detection of graphic content — in the generative AI space. This background, he argues, uniquely positions their team to surface edge cases that internal teams might not anticipate:
“The harms we simulate come from 10 years of managing content at scale,” says Tomer. “Understanding what humans can throw at a system, and what a model might inadvertently produce.”
This experience is critical because generative models can be manipulated or produce unexpected harmful outputs that standard filters miss.
One theme echoed by both Tomer and Akhil Khunger at Barclays is the critical importance of domain specificity when securing generative AI. While the underlying architecture of LLMs is general-purpose, the risks they pose vary widely depending on the application domain.
Cross-industry examples include:
- In financial services, models must avoid generating misleading investment advice or leaking sensitive customer information.
- In healthcare, hallucinated medical information could risk patient safety or violate privacy laws.
- In content platforms, harmful or manipulative outputs may lead to social harm or brand damage.
In either case, red teaming exposes failure modes before they reach end users, reducing risk and increasing trustworthiness.
Both experts agree that with agentic AI adoption on the horizon for so many industries, red teaming must be continuous and cross-functional, involving product, security, and compliance teams. It must also evolve in response to emerging risks and novel attack methods. A more proactive, adversarial mindset shifts organizations from reactive compliance to operational resilience, a crucial distinction as AI threats evolve rapidly.
A one-size-fits-all safety solution is insufficient. Instead, enterprises must embed risk controls and red teaming exercises tailored to the specific threats relevant to their industry, audience, and use cases. This domain specificity drives more effective detection, mitigation, and compliance.
Tomer concludes that the conversation about enterprise AI safety is no longer theoretical. The scale and sensitivity of generative AI use cases — especially in regulated industries — demand new risk standards. From ActiveFence’s experience, deploying generative AI responsibly is not just a compliance checkbox but a business imperative for sustaining customer trust and reputational integrity in an environment where unpredictable harms can have systemic consequences.
Emerging best practices suggest safety must be embedded at every stage of the generative AI lifecycle:
- Development: Integrate safety objectives early, with rigorous documentation and validation of model changes.
- Deployment: Implement real-time monitoring systems to flag anomalous or risky outputs.
- Operation: Maintain continuous red teaming and incident response capabilities.
- Governance: Establish clear accountability cross-team communication and conduct regular audits to ensure alignment on safety.
A successful governance framework should also emphasize transparency in AI decision-making. Regular audits must be paired with clear documentation of safety measures, enabling accountability at every level. Transparent policies foster trust among regulators, customers, and internal stakeholders.
By treating safety as a foundational design principle, not an afterthought, enterprises reduce risk and build customer trust. This lifecycle approach requires investment in both technology and internal culture to succeed.
“As we transition from generative AI to agentic systems, the risk profile fundamentally changes. With agents, we’re not just concerned about what information they produce, but what actions they can autonomously take.
This requires a significant elevation in our approach to safety and governance. Organizations need to build sustainable red teaming programs — not one-off projects — that can continuously evaluate these systems as they evolve. The stakes are simply higher when AI can act on its own, and our preparation must match that reality. For most enterprises, this means combining internal expertise with external adversarial perspectives to fully understand the expanded risk surface that agentic AI introduces.“
– Tomer Poran, Chief Evangelist and VP of Strategy at ActiveFence
As enterprises prepare for increasingly autonomous AI systems, red teaming takes on a new strategic role. Preparing for agentic AI requires more than reactive measures — it demands a proactive, deeply embedded safety capability. As models gain autonomy and the ability to perform complex actions, enterprises must continuously evaluate not only technical safeguards but also the broader operational impact of these systems.
Timer emphasizes that internal red teams bring a critical advantage here: they understand the organization’s unique risk landscape, workflows, and use cases. Internal teams also enable more targeted and agile testing as models evolve. For example, a company launching frequent updates of GenAI-powered assistant benefits from an in-house team capable of rapidly assessing emergent behaviors and safety gaps.
Ultimately, building this internal muscle supports safe and scalable adoption. It fosters a culture of AI ownership and accountability — essential for deploying autonomous systems that interact with diverse user groups and sensitive data environments.
Building Trustworthy AI Systems in Financial Services
Episode: How Financial Services Are Building Safer Customer-Facing AI – with Akhil Khunger of Barclays
Guest: Akhil Khunger, VP of Quantitative Analytics at Barclays
Expertise: Model Risk Governance, AI Controls, and Financial Regulation
Brief Recognition: Akhil Khunger leads AI risk and control strategy at Barclays. He specializes in designing governance frameworks for emerging technologies and has worked extensively on integrating compliance into the AI development lifecycle.
In his podcast appearance, Akhil Khunger provides a financial services perspective on deploying generative AI securely — especially in high-stakes, customer-facing applications. At Barclays, the focus is on establishing strong internal controls before model behavior becomes a public issue.
He explains that generative models bring unique risks to banking environments, where explainability, reliability, and compliance are paramount. Unlike deterministic systems, generative models are probabilistic and non-transparent — creating what Akhil describes as a “black box” problem. He notes that “traditional control frameworks are not sufficient. You need new methods for interpretability and guardrails that fit the generative model lifecycle.”
To address this, Barclays has developed a multilayered control framework that includes:
- Model validation and documentation tailored to generative AI’s complexities
- Usage monitoring to detect anomalous or risky outputs in real time
- Adversarial testing to proactively uncover vulnerabilities
- Outcome-based performance thresholds linked to business impact
Importantly, these controls are embedded in cross-functional processes—not isolated within data science teams. Akhil emphasizes the importance of integrating horizontal oversight (risk, compliance, and legal) with vertical accountability within business units.
“You can’t just push a policy and walk away,” he says. “AI safety needs to be a shared responsibility across teams.”
This cross-team collaboration ensures that controls are practical, scalable, and aligned with evolving regulatory expectations. Akhil also reflects on how the internal culture around AI risk is shifting.
As use cases transition from exploratory pilots to scaled deployments, senior leaders increasingly view safety not as a blocker but as a requirement for trust and usability. According to Akhil, “the more we involve our first-line teams — those closest to the customer — the better we get at designing AI systems that are both helpful and safe.”
By incorporating frontline perspectives, Barclays develops AI that strikes a balance between innovation and risk mitigation. He concludes by stressing that building trustworthy AI systems isn’t just about defending against misuse—it’s about designing for value: “When teams proactively identify risks, they create systems that customers can trust, and regulators can approve.”
Proactive risk management strategies help organizations like Barclays maintain their reputation and competitive advantage in a highly regulated environment.
Akhil also shares a forward-looking perspective on agentic AI — systems capable of independently completing tasks or coordinating with other agents. He notes that adoption in financial services will be cautious but inevitable, given the sector’s regulatory constraints.
He addresses challenges in driving agentic enterprise AI outright:
“It will take some time to fully embrace agentic AI in financial institutions due to regulatory requirements. But make no mistake—it is coming. The first challenge is understanding the model itself, as there are many hidden layers in agentic AI with agents working independently or interacting with each other.
When testing these systems, we need to decouple the processes and evaluate if individual components are producing the right outputs, rather than just examining the final results. This requires multiple test runs with various user prompts to ensure comprehensive coverage.
Initially, it’s essential to have actual people in customer-facing roles working alongside the AI to compare outputs. This benchmarking process helps minimize errors but requires significant investment not just in implementing the technology, but in thorough testing procedures as well.”
– Akhil Khunger, VP of Quantitative Analytics at Barclays
In their respective episodes, both speakers discuss how safety is not only a defensive measure but also a strategic enabler.
Yet, as Akhil notes directly, trustworthy AI systems that reliably meet compliance and ethical standards are more likely to:
- Gain regulatory approval faster
- Earn and retain customer trust
- Avoid costly incidents or reputational damage.
When embedded early and comprehensively, safety investments help organizations unlock new value through responsible innovation.