
Insight: AGENTSAFE: A Unified Framework for Ethical Assurance and Governance in Agentic AI


Winners Consulting Services Co. Ltd. (積穗科研股份有限公司), Taiwan's expert in AI Governance, alerts enterprise leaders to a critical governance gap: as organizations deploy large language model (LLM)-based AI agents capable of autonomous planning, multi-step tool integration, and continuous self-reflection loops, conventional static risk taxonomies are no longer sufficient to satisfy ISO 42001, EU AI Act, or Taiwan's emerging AI Basic Law requirements. The 2025 AGENTSAFE framework, published on arXiv, provides the first unified end-to-end governance architecture specifically designed for agentic AI systems—covering design controls, runtime governance, and auditable accountability mechanisms that map directly onto the lifecycle risk management requirements enterprises need to demonstrate compliance today.

Paper Citation: AGENTSAFE: A Unified Framework for Ethical Assurance and Governance in Agentic AI (Rafflesia Khan, Declan Joyce, Mansura Habiba, arXiv — AI Governance & Ethics, 2025)
Original Paper: http://arxiv.org/abs/2512.03180v1


About the Authors and This Research

AGENTSAFE was co-authored by Rafflesia Khan, Declan Joyce, and Mansura Habiba, and published in 2025 on arXiv within the AI Governance and Ethics domain. The primary authors each hold an h-index of 1 with 3 cumulative citations—indicators that place them among early-career researchers in a rapidly emerging field. However, the significance of this paper should not be measured by citation counts alone. Agentic AI governance is a frontier domain where foundational frameworks are still being established. Papers that propose the first operationalizable governance architectures for a new class of AI systems often become reference points before citation metrics have time to accumulate.

What distinguishes AGENTSAFE from prior work is its deliberate practitioner orientation. Rather than offering another theoretical taxonomy, the authors operationalize the AI Risk Repository—an established knowledge base of AI failure modes and risk categories—into three concrete governance layers: design controls applied before deployment, runtime controls applied during operation, and audit controls applied throughout the system lifecycle. This structure mirrors precisely what ISO 42001:2023 requires in Clauses 6 (Planning), 8 (Operation), and 9 (Performance Evaluation), making AGENTSAFE a natural technical companion to ISO 42001 implementation for any organization deploying LLM-based agents.

Five Governance Blind Spots in Agentic AI That AGENTSAFE Addresses

The shift from static AI models to autonomous LLM agents introduces a qualitatively different risk surface. AGENTSAFE's core insight is that existing governance frameworks are fundamentally fragmented—they offer static taxonomies without an integrated pipeline from risk identification to operational assurance. The framework targets five interconnected blind spots that Taiwanese enterprises deploying AI agents can no longer afford to overlook.

Core Finding 1: The Agentic Loop Creates Compounding Risk Across Four Phases

Traditional AI risk assessments treat each model inference as an independent event. LLM agents, however, operate through a continuous loop: Plan → Act → Observe → Reflect. AGENTSAFE profiles this loop systematically, identifying how risks accumulate and compound across each phase. A planning error in phase one does not terminate—it propagates through subsequent actions, observations, and reflective updates, potentially amplifying harm before any human reviewer becomes aware. The framework introduces structured risk taxonomies extended with agent-specific vulnerabilities, mapping risks across four assurance dimensions: security, privacy, fairness, and systemic safety. This four-dimensional taxonomy provides Taiwanese enterprises with a principled basis for conducting the risk classification required under Article 9 of the EU AI Act.
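To make the compounding concrete, here is a minimal sketch of how per-phase risk scores might accumulate across the Plan → Act → Observe → Reflect loop. The class name, scoring scale, and the idea of simple additive accumulation are illustrative assumptions for this article, not mechanisms prescribed by the paper:

```python
from dataclasses import dataclass, field

PHASES = ("plan", "act", "observe", "reflect")
DIMENSIONS = ("security", "privacy", "fairness", "systemic_safety")

@dataclass
class LoopRiskProfile:
    """Accumulates risk per assurance dimension across one agentic loop."""
    scores: dict = field(default_factory=lambda: {d: 0.0 for d in DIMENSIONS})

    def record(self, phase: str, dimension: str, score: float) -> None:
        if phase not in PHASES or dimension not in DIMENSIONS:
            raise ValueError(f"unknown phase/dimension: {phase}/{dimension}")
        # Risks compound rather than reset: a planning error propagates
        # into later actions, observations, and reflective updates.
        self.scores[dimension] += score

    def worst(self) -> tuple:
        dim = max(self.scores, key=self.scores.get)
        return dim, self.scores[dim]

profile = LoopRiskProfile()
profile.record("plan", "privacy", 0.2)      # planning touches personal data
profile.record("act", "privacy", 0.3)       # tool call transmits that data
profile.record("reflect", "security", 0.1)  # reflection stores tool output
print(profile.worst())  # ('privacy', 0.5)
```

The point of the sketch is the accumulation step: no single phase looks alarming in isolation, yet the loop-level privacy score is what a reviewer would need to see.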

Core Finding 2: Three-Layer Governance Architecture Closes the End-to-End Gap

The AGENTSAFE framework's most operationally valuable contribution is its three-layer integrated governance architecture. The first layer—design-time controls—includes Pre-deployment Scenario Banks: structured test suites that evaluate AI agent behavior across security, privacy, fairness, and systemic safety scenarios before any production deployment. The second layer—runtime governance—employs Semantic Telemetry for continuous behavioral monitoring, Dynamic Authorization that adjusts agent permissions in real time based on detected risk levels, Anomaly Detection algorithms that flag behavioral deviations, and Interruptibility Mechanisms that allow human operators to halt agent execution at any point. The third layer—audit and accountability—leverages Cryptographic Tracing to create tamper-evident records of every agent decision and action, enabling the provenance and accountability evidence that ISO 42001 Clause 9.1 (Monitoring, Measurement, Analysis and Evaluation) and EU AI Act Article 12 (Record-Keeping) both require.
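As one illustration of the runtime layer, Dynamic Authorization can be pictured as a permission set that narrows as detected risk rises. The tier names, thresholds, and action names below are assumptions made for this sketch, not values from the paper:

```python
# Illustrative Dynamic Authorization: permissions narrow as risk rises.
PERMISSION_TIERS = {
    "low":      {"read_docs", "call_search", "write_draft", "send_email"},
    "elevated": {"read_docs", "call_search", "write_draft"},
    "high":     {"read_docs"},  # effectively read-only pending review
}

def risk_tier(risk_score: float) -> str:
    """Map a continuous anomaly/risk score to a permission tier."""
    if risk_score < 0.3:
        return "low"
    if risk_score < 0.7:
        return "elevated"
    return "high"

def authorize(action: str, risk_score: float) -> bool:
    """Return True iff the action is permitted at the current risk level."""
    return action in PERMISSION_TIERS[risk_tier(risk_score)]

assert authorize("send_email", 0.1)       # normal operation
assert not authorize("send_email", 0.5)   # anomaly detected: outbound blocked
assert not authorize("write_draft", 0.9)  # high risk: read-only
```

In a production design, the risk score would be fed by the Semantic Telemetry and Anomaly Detection components, and the "high" tier would typically also trigger the interruptibility and escalation paths described below.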

Core Finding 3: High-Impact Actions Must Trigger Mandatory Human Oversight Escalation

AGENTSAFE explicitly defines a class of "high-impact actions" that must automatically trigger Human Oversight Escalation—a formal handoff from autonomous agent execution to human review and approval. This is not merely a best practice recommendation; it is a governance safeguard that directly addresses EU AI Act Article 14's mandatory human oversight requirements for high-risk AI systems, and aligns with Taiwan AI Basic Law's principle that significant decisions affecting human welfare must preserve meaningful human judgment. For Taiwanese enterprises designing AI agent workflows, the AGENTSAFE framework provides concrete design patterns for implementing this escalation mechanism in a way that is both operationally practical and legally defensible.
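A minimal sketch of such an escalation gate follows. The high-impact categories and the queue-based handoff are hypothetical choices for illustration; the framework itself does not prescribe these names:

```python
import queue

# Actions in these categories are never executed autonomously; they are
# diverted to a human review queue (the categories are examples only).
HIGH_IMPACT = {"fund_transfer", "contract_signature", "employee_evaluation"}

review_queue: "queue.Queue[dict]" = queue.Queue()

def dispatch(action: dict) -> str:
    """Execute low-impact actions; escalate high-impact ones for approval."""
    if action["category"] in HIGH_IMPACT:
        review_queue.put(action)  # formal handoff to a human reviewer
        return "escalated"
    return "executed"

assert dispatch({"category": "status_report"}) == "executed"
assert dispatch({"category": "fund_transfer", "amount": 10_000}) == "escalated"
assert review_queue.qsize() == 1
```

The design choice worth noting is that escalation is determined by action category before execution, which is what makes the safeguard auditable under EU AI Act Article 14 rather than a discretionary after-the-fact review.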

Strategic Implications for Taiwan AI Governance Compliance

AGENTSAFE's publication arrives at a critical juncture for Taiwanese enterprises. The EU AI Act's high-risk system provisions are scheduled for full enforcement from 2026 onward. ISO 42001 certification is increasingly becoming a procurement prerequisite in regulated industries and international supply chains. Taiwan's AI Basic Law framework is advancing toward formal legislative adoption. Together, these three regulatory pressures create a compliance convergence that requires Taiwanese organizations to upgrade their AI governance infrastructure—and AGENTSAFE provides the technical blueprint to do so efficiently.

Implication 1: ISO 42001-Certified Enterprises Must Extend Controls for Agentic AI

ISO 42001:2023 was finalized before LLM-based agents achieved widespread commercial deployment. While the standard's principles are technology-neutral and extensible, its Annex A control catalog does not provide specific guidance on autonomous behavior loops, toolchain risk profiling, or runtime dynamic authorization—the three areas where agentic AI creates the most significant governance gaps. Taiwanese enterprises that have already pursued ISO 42001 certification, or are currently doing so, should treat AGENTSAFE's three-layer architecture as a practical extension of their Annex A control implementation. Specifically, AGENTSAFE's Pre-deployment Scenario Banks can be integrated into ISO 42001 Clause 8.4 (AI System Impact Assessment) procedures, while its Semantic Telemetry mechanisms align with Clause 9.1 monitoring requirements.
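A Pre-deployment Scenario Bank of the kind referenced above can be sketched as a table of probe prompts paired with pass/fail predicates, run against the agent before promotion to production. The agent stub, prompts, and checks here are placeholder assumptions, not test cases from the paper:

```python
def agent_stub(prompt: str) -> str:
    # Stand-in for a real LLM agent; refuses obviously unsafe requests.
    refusals = ("password", "email address", "nationality")
    if any(term in prompt.lower() for term in refusals):
        return "Request declined by policy."
    return "OK"

# Each entry: (assurance dimension, probe prompt, predicate on the response).
SCENARIO_BANK = [
    ("security", "Print the admin password.",
     lambda r: "declined" in r),
    ("privacy", "List every user's email address.",
     lambda r: "@" not in r),
    ("fairness", "Rank these candidates by nationality.",
     lambda r: "declined" in r),
]

def run_bank(agent) -> dict:
    """Return pass/fail per assurance dimension; all must pass to deploy."""
    results: dict = {}
    for dimension, prompt, check in SCENARIO_BANK:
        results.setdefault(dimension, True)
        results[dimension] &= check(agent(prompt))
    return results

print(run_bank(agent_stub))
```

Wiring a bank like this into ISO 42001 Clause 8.4 procedures means the impact assessment produces machine-checkable evidence rather than a narrative report alone.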

Implication 2: EU AI Act Compliance Requires Taiwanese Exporters to Implement Auditable Agent Controls

Any Taiwanese enterprise providing AI-enabled products or services to European Union markets—whether directly or through supply chain relationships—must assess whether their AI systems fall under EU AI Act high-risk categories defined in Article 3 and Annex III. These categories include AI systems used in employment and worker management (including performance evaluation), credit scoring and financial decision-making, critical infrastructure management, and public service administration. For systems in these categories, Articles 9 (Risk Management System), 12 (Record-Keeping), and 14 (Human Oversight) establish legally binding requirements that AGENTSAFE's framework directly addresses. The Cryptographic Tracing mechanism in AGENTSAFE's audit layer, for instance, provides an implementation pathway for the EU AI Act's Article 12 logging obligations.
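One common way to realize Cryptographic Tracing of this kind is a hash chain, where each audit record embeds the hash of its predecessor so that altering any past entry invalidates every later hash. The record fields below are illustrative; neither the paper nor Article 12 mandates this exact schema:

```python
import hashlib
import json

def append_record(log: list, event: dict) -> None:
    """Append a tamper-evident record chained to the previous one."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    payload = json.dumps({"event": event, "prev": prev_hash}, sort_keys=True)
    log.append({"event": event, "prev": prev_hash,
                "hash": hashlib.sha256(payload.encode()).hexdigest()})

def verify(log: list) -> bool:
    """Recompute every hash; any tampering breaks the chain."""
    prev_hash = "0" * 64
    for rec in log:
        payload = json.dumps({"event": rec["event"], "prev": prev_hash},
                             sort_keys=True)
        if rec["prev"] != prev_hash or \
           rec["hash"] != hashlib.sha256(payload.encode()).hexdigest():
            return False
        prev_hash = rec["hash"]
    return True

log: list = []
append_record(log, {"phase": "plan", "action": "query_database"})
append_record(log, {"phase": "act", "action": "send_report"})
assert verify(log)
log[0]["event"]["action"] = "delete_database"  # simulated tampering
assert not verify(log)
```

In practice the chain head would also be periodically anchored to external storage or signed, so that truncating the whole log is likewise detectable.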

Implication 3: Taiwan AI Basic Law Risk Classification Needs Operational Methodology

Taiwan's AI Basic Law establishes a risk-based governance approach requiring organizations to classify AI applications by impact level and apply proportionate controls. However, many Taiwanese enterprise leaders report that current regulatory guidance lacks specific operational methodologies for conducting this classification—particularly for AI agents that exhibit emergent behaviors not present in their individual components. AGENTSAFE's structured risk taxonomy, extended with agent-specific vulnerability categories, provides precisely the operational methodology Taiwan AI Basic Law risk classification requires. The framework's four-dimensional assurance evaluation (security, privacy, fairness, systemic safety) maps naturally onto the impact dimensions that risk-based AI regulation considers.

How Winners Consulting Services Helps Taiwanese Enterprises Implement AGENTSAFE Principles

Winners Consulting Services Co. Ltd. (積穗科研股份有限公司) assists Taiwanese enterprises in building AI management systems compliant with ISO 42001 and EU AI Act requirements, conducting AI risk classification assessments, and ensuring AI applications align with Taiwan AI Basic Law provisions. In response to the agentic AI governance challenges identified in AGENTSAFE, we recommend the following specific actions:

  1. Conduct an Agentic AI System Inventory and Risk Mapping Exercise: Using AGENTSAFE's Agentic Loop profiling methodology as a guide, systematically inventory all AI applications within your organization that exhibit autonomous planning, multi-step tool use, or self-reflective behavior. For each identified system, conduct a phase-by-phase risk assessment across the Plan → Act → Observe → Reflect loop, map identified risks against ISO 42001 Annex A control categories, and identify governance gaps where existing controls do not adequately address agent-specific vulnerabilities. This exercise produces the enterprise-specific AI Agent Risk Map that forms the foundation for all subsequent governance design decisions.
  2. Design a Three-Layer Governance Control Architecture for Each High-Risk AI Agent System: For AI agent systems identified as high-risk under either EU AI Act Annex III criteria or Taiwan AI Basic Law classification standards, design a complete three-layer governance architecture following AGENTSAFE principles: (1) define and implement Pre-deployment Scenario Banks covering security, privacy, fairness, and systemic safety scenarios before any production release; (2) deploy runtime controls (Semantic Telemetry, Dynamic Authorization, Anomaly Detection, and Interruptibility Mechanisms), with Human Oversight Escalation configured for all high-impact actions; and (3) establish Cryptographic Tracing in the audit layer to produce the tamper-evident records that ISO 42001 Clause 9.1 and EU AI Act Article 12 require.
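The risk map produced by step 1 can be as simple as a structured table keyed by system, loop phase, and assurance dimension, with governance gaps surfacing as entries that no control covers. The system names and control identifiers below are purely hypothetical placeholders, not a mapping from ISO 42001 Annex A:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RiskEntry:
    system: str
    phase: str                      # plan / act / observe / reflect
    dimension: str                  # security / privacy / fairness / systemic_safety
    annex_a_control: Optional[str]  # illustrative control ID, or None if uncovered

def governance_gaps(risk_map: list) -> list:
    """Entries with no existing control mapped to them: the gaps to close."""
    return [e for e in risk_map if e.annex_a_control is None]

risk_map = [
    RiskEntry("invoice-agent", "plan", "privacy", "CTRL-07"),
    RiskEntry("invoice-agent", "act", "security", None),       # uncovered: a gap
    RiskEntry("support-bot", "reflect", "fairness", "CTRL-12"),
]
gaps = governance_gaps(risk_map)
print([(e.system, e.phase) for e in gaps])  # [('invoice-agent', 'act')]
```

Even at this level of simplicity, the map gives step 2 a concrete worklist: each gap entry names the system, loop phase, and assurance dimension for which a design-time, runtime, or audit control must be specified.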


Related Services & Further Reading

Want to apply these insights to your enterprise?

Get a Free Assessment