
Insight: Data-Centric Safety and Ethical Measures for Data and AI Governance


Winners Consulting Services Co. Ltd. (積穗科研股份有限公司), Taiwan's expert in AI Governance, urges business leaders to confront an uncomfortable truth: the majority of AI risk does not originate from the model itself, but from the data used to train it. A 2025 academic study published on arXiv reveals that responsible dataset design — the practice of embedding safety and ethical controls at every stage of the data lifecycle — remains a critical blind spot in current AI governance frameworks, including ISO 42001, the EU AI Act, and Taiwan's AI Basic Act. For companies pursuing certification or regulatory compliance, ignoring data-layer governance is no longer an option.

Paper Citation: Data-Centric Safety and Ethical Measures for Data and AI Governance (Srija Chakraborty, arXiv — AI Governance & Ethics, 2025)
Original Paper: http://arxiv.org/abs/2506.10217v3

Read Original Paper →

About the Author and This Research

This paper was authored by Srija Chakraborty and published on arXiv in 2025, within the rapidly expanding field of AI Governance and Ethics. Chakraborty, whose work has drawn 38 cumulative citations (h-index 4), writes in the emerging discipline of data-centric AI safety, a field that examines how the design and management of training datasets directly shape the safety profile of deployed AI systems.

arXiv, as one of the world's most influential open-access preprint repositories, serves as an early indicator of where policy and standards bodies are heading. Research published here frequently informs the deliberations of ISO/IEC JTC 1/SC 42 (the committee responsible for AI standards including ISO 42001), the European Parliament's AI Act implementation bodies, and national AI regulatory agencies. This paper's focus on data governance as the foundation of AI safety is therefore not merely an academic exercise: it is a preview of where compliance requirements are headed globally, and where Taiwanese enterprises must begin preparing today.

Data Is the Origin of AI Risk: Why Governance Must Start Before the Model

The central argument of Chakraborty's research is both simple and consequential: current AI governance discourse is excessively model-centric, focusing on what models output rather than on what they were built upon. The paper proposes a Responsible Dataset Design Framework that systematically addresses safety and ethical considerations across the full AI and dataset lifecycle — from data collection and annotation through to deployment and sharing. Critically, this framework is designed to be domain-agnostic, applicable equally to financial services, healthcare, manufacturing, legal, and other sectors where Taiwanese enterprises are actively deploying AI.

Core Finding 1: Dual-Use Risk in Foundation Models Is a Data Governance Problem

The paper identifies the "dual-use" risk inherent in AI foundation models — the same model trained for a beneficial downstream application can be adapted for harmful purposes. Chakraborty's analysis demonstrates that this risk is not primarily an architectural problem; it is a data problem. When training datasets lack appropriate safety labeling, content filtering, and ethical annotations, the resulting model carries embedded vulnerabilities that no amount of post-deployment monitoring can fully address. For Taiwanese companies deploying large language models (LLMs) or other foundation model-based applications, this finding directly implies that current AI risk classification frameworks — including those required under ISO 42001 and the EU AI Act — must incorporate an explicit data-layer risk assessment component.

Core Finding 2: Existing Frameworks Leave a Structural Gap at the Data Front-End

Despite the sophistication of frameworks like the EU AI Act and ISO/IEC 42001:2023, Chakraborty's research identifies a structural gap: these frameworks provide robust requirements for model transparency, human oversight, and output monitoring, but offer comparatively limited guidance on dataset creation, data quality assurance, and ethical annotation practices. The proposed Responsible Dataset Design Framework fills this gap by specifying concrete safety checkpoints at each stage of the data lifecycle, including red teaming protocols — systematic adversarial testing of training data to identify embedded risks before they propagate into deployed models. This is the kind of proactive, data-first governance posture that ISO 42001 certification auditors are increasingly expecting to see documented.

Implications for Taiwan AI Governance: The Hidden Compliance Gap in ISO 42001 Certification

For Taiwanese enterprise executives, this research surfaces a compliance gap that is both urgent and underappreciated. Many organizations believe that establishing an AI Ethics Committee, conducting model bias reviews, and publishing an AI use policy constitutes adequate governance. In reality, the hardest-to-audit and most frequently overlooked risks are embedded in the training data — risks that will surface during ISO 42001 certification reviews and EU AI Act conformity assessments.

ISO 42001 Requirements: ISO/IEC 42001:2023, the world's first international standard for AI management systems, contains Annex A controls that explicitly require organizations to establish traceable management records covering AI system data sources, data quality processes, and data handling procedures. Enterprises that cannot produce complete data lifecycle documentation during certification audits face direct risk of non-conformance findings — regardless of how robust their model-level governance appears.

EU AI Act Article 10: The EU AI Act, which entered into force in 2024, stipulates in Article 10 that training, validation, and testing data for high-risk AI systems must meet requirements of relevance, representativeness, freedom from errors, and completeness. Organizations must establish documented data governance practices covering these dimensions. For Taiwanese companies with European market exposure — whether through direct sales, partnership arrangements, or supply chain relationships with EU-regulated entities — this provision has direct legal force.

Taiwan's AI Basic Act: Taiwan's AI Basic Act framework centers on the principle of trustworthy AI grounded in human-centric values. Data safety and ethical data practices are foundational prerequisites for any meaningful claim of "trustworthy" AI operation. Organizations unable to demonstrate systematic data governance will face increasing difficulty substantiating compliance declarations as Taiwan's AI regulatory environment matures.
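To see what the Article 10 data-quality dimensions cited above (relevance, representativeness, freedom from errors, completeness) look like in practice, they can be expressed as a simple, automatable metadata check. The sketch below is a minimal illustration only: the field names and thresholds are hypothetical placeholders a governance team would set per system, not terms taken from the Act or from any standard tooling.

```python
from dataclasses import dataclass

# Hypothetical metadata record for one training dataset.
# Field names are illustrative, not prescribed by the EU AI Act.
@dataclass
class DatasetRecord:
    name: str
    intended_purpose: str          # relevance: why this data fits the use case
    population_coverage: float     # representativeness: share of target groups covered, 0-1
    known_error_rate: float        # freedom from errors: measured label/content error rate
    required_fields_present: bool  # completeness: mandatory attributes populated

def article10_gaps(rec: DatasetRecord,
                   min_coverage: float = 0.9,
                   max_error_rate: float = 0.05) -> list[str]:
    """Return the Article 10 dimensions this record fails to evidence.
    Thresholds are placeholders, set per AI system by the governance team."""
    gaps = []
    if not rec.intended_purpose.strip():
        gaps.append("relevance: no documented intended purpose")
    if rec.population_coverage < min_coverage:
        gaps.append("representativeness: coverage below target")
    if rec.known_error_rate > max_error_rate:
        gaps.append("freedom from errors: error rate above tolerance")
    if not rec.required_fields_present:
        gaps.append("completeness: mandatory attributes missing")
    return gaps
```

A record that returns an empty gap list has documented evidence on all four dimensions; any non-empty result points to the exact conformity-assessment finding an auditor would raise.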

How Winners Consulting Services Co. Ltd. Helps Taiwan Enterprises Build Data-Layer AI Governance

Winners Consulting Services Co. Ltd. (積穗科研股份有限公司) provides end-to-end support for Taiwanese enterprises building AI management systems that satisfy the requirements of ISO 42001, EU AI Act, and Taiwan's AI Basic Act — with particular expertise in addressing the data governance gaps identified in this research. Our approach integrates technical assessment, policy design, and organizational capability building to deliver governance mechanisms that are both audit-ready and operationally sustainable.

  1. Establish a Data Lifecycle Risk Register aligned with ISO 42001 Annex A: We conduct a comprehensive inventory of all AI training datasets in use within your organization, building traceable documentation of data sources, quality assessments, safety labeling, and ethical review records. This register serves as the primary evidence artifact for ISO 42001 certification audits and EU AI Act Article 10 conformity assessments. Without this foundation, other governance investments yield diminishing returns.
  2. Implement a Data-Centric Red Teaming Protocol: Drawing directly on the Responsible Dataset Design Framework proposed by Chakraborty, we design and execute systematic red teaming exercises targeting the training data of your AI systems — identifying dual-use risks, bias concentrations, and harmful content before they propagate into production models. Red teaming results are documented and integrated into your AI risk classification reports, strengthening both internal governance and external audit evidence.
  3. Integrate Data Governance into Your AI Governance Committee's Standing Review Agenda: Most Taiwanese enterprise AI governance committees currently focus exclusively on model outputs and business-level risk. We restructure committee review processes to incorporate data quality metrics, training data provenance updates, and data supplier compliance status — creating a governance posture that satisfies the full scope of ISO 42001, EU AI Act, and Taiwan AI Basic Act requirements simultaneously.
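The three steps above can be sketched as a minimal data-lifecycle risk register with a toy red-team screen attached. Everything in this sketch (the entry fields, the blocklist screening rule, the audit-readiness check) is a hypothetical illustration of the approach, not Winners Consulting's actual tooling and not the protocol from Chakraborty's paper; a real red-teaming exercise would use adversarial prompts and trained classifiers rather than keyword matching.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical risk-register entry for one training dataset;
# field names are illustrative, not taken from ISO 42001 Annex A.
@dataclass
class RegisterEntry:
    dataset: str
    source: str                # provenance: vendor, scrape, or internal system
    last_quality_review: date
    safety_labels_applied: bool
    ethics_review_passed: bool

def red_team_scan(records: list[str], blocklist: set[str]) -> list[int]:
    """Toy screening pass: return indices of records containing a
    blocklisted term, so they can be reviewed before training."""
    flagged = []
    for i, text in enumerate(records):
        if set(text.lower().split()) & blocklist:
            flagged.append(i)
    return flagged

def audit_ready(entry: RegisterEntry) -> bool:
    """An entry is audit-ready only when every control has evidence."""
    return entry.safety_labels_applied and entry.ethics_review_passed
```

In this sketch, the register entries become the evidence artifacts described in step 1, the scan output feeds the red-teaming documentation in step 2, and the audit-readiness status is what the governance committee reviews on its standing agenda in step 3.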

Winners Consulting Services Co. Ltd. offers a complimentary AI Governance Mechanism Diagnostic, helping Taiwan enterprises establish ISO 42001-compliant management mechanisms within 90 days.

Apply for Free Mechanism Diagnostic →

Frequently Asked Questions

Which department should own responsibility for AI training data safety within a Taiwan enterprise?
Responsibility for AI training data safety should be coordinated by the AI Governance Committee, with cross-functional ownership distributed across three functions: the data engineering team owns technical quality controls and annotation process governance; the legal and compliance function owns data source legality review and regulatory alignment; and the information security team owns data access controls and breach risk assessment. ISO 42001 requires organizations to designate a named AI Management System Owner with authority to enforce cross-departmental coordination, ensuring data governance does not fall into ungoverned organizational gaps.
Under what circumstances must Taiwan enterprises comply with EU AI Act data governance requirements?
EU AI Act requirements apply to Taiwanese enterprises when their AI systems or products are placed on the EU market, or when they engage in data exchange partnerships with EU-regulated entities. High-risk AI applications as defined by EU AI Act Annex III — including medical diagnostics, financial credit assessment, employment screening, and critical infrastructure management — are subject to the most stringent data governance requirements under Article 10. Taiwan's export-oriented enterprises should conduct a self-assessment against Annex III to determine whether their AI applications fall within the regulated scope, as enforcement has begun in 2025 with full application of high-risk provisions expected by August 2026.
What specific data management requirements does ISO 42001 impose?
ISO/IEC 42001:2023 Annex A specifies AI management system controls that include: identification and quality evaluation of data sources; ethical review of training data covering bias testing and harmful content screening; verification of consistency between declared data use purposes and actual deployment practices; and compliance requirements for data sharing and third-party data suppliers. ISO 42001 also requires ongoing AI system performance monitoring, with data quality serving as a core monitoring indicator. Organizations that simultaneously align with EU AI Act Article 10 and Taiwan AI Basic Act trustworthy AI principles will demonstrate stronger evidence of control effectiveness during ISO 42001 certification audits.
How long does it realistically take to build a data-centric AI governance mechanism?
Based on Winners Consulting Services Co. Ltd.'s implementation experience with Taiwanese enterprises, a mid-sized organization (under 500 employees, operating 3 to 5 AI systems) typically requires 90 to 120 days to establish a data governance mechanism meeting ISO 42001 requirements. The implementation proceeds in three phases: Phase 1 (30 days) covers current-state diagnostic and gap analysis; Phase 2 (45 days) covers mechanism design and documentation development; Phase 3 (30 days) covers staff training and internal audit rehearsal. Initial resource investment concentrates on data inventory and documentation; ongoing maintenance costs are comparatively modest and generate significant reduction in future audit preparation time.
Why engage Winners Consulting Services Co. Ltd. for AI governance advisory?
Winners Consulting Services Co. Ltd. (積穗科研股份有限公司) is one of Taiwan's few advisory organizations combining practical ISO 42001 implementation expertise, EU AI Act regulatory analysis capability, and active tracking of Taiwan's AI Basic Act policy development. Our consulting team integrates legal, information security, and AI engineering competencies to deliver end-to-end AI governance solutions spanning data-layer, model-layer, and organizational governance dimensions. We do not deliver generic template documents. Every engagement produces governance mechanisms calibrated to the client's industry sector, AI application portfolio, and existing management architecture — ensuring that certification readiness and operational sustainability are achieved simultaneously, not traded against each other.



Related Services & Further Reading

Want to apply these insights to your enterprise?

Get a Free Assessment