What does Article 10 of the EU AI Act require?

Article 10 requires providers of high-risk AI systems to ensure training, validation, and test data sets are relevant, sufficiently representative, free from errors to the extent possible, and complete in relation to the intended purpose. Providers must also document data governance practices — how data was collected, prepared, and labelled — and examine data for possible biases that could lead to risks for fundamental rights or health and safety.

Does Article 10 apply if I use a pre-trained model I did not build?

If you fine-tune, adapt, or substantially modify a pre-trained model for a high-risk use case, you become a provider under the EU AI Act and must comply with Article 10 for the modified system. If you deploy a third-party AI system without modification, you are a deployer, with a narrower data obligation under Article 26(4): to the extent you control the input data, you must ensure it is relevant and sufficiently representative for the intended purpose. However, if you add your own data pipeline or customise the model's outputs for a high-risk context, Article 10 applies.

What counts as a bias examination under Article 10?

A bias examination under Article 10(2)(f) and (g) must review training, validation, and test data for biases that could lead to risks to health, safety, or fundamental rights. In practice this means: identifying which protected characteristics are relevant to the use case (gender, age, ethnicity, disability, etc.); selecting bias detection methods appropriate to the model type and task (statistical fairness testing, subgroup performance analysis, counterfactual testing); documenting findings, including any biases identified; and recording what measures were taken. Bias examination is not the same as bias elimination: the Act requires documented analysis and a proportionate response.

How does Article 10 connect to Article 9 and Article 11?

Article 10 feeds directly into both Article 9 and Article 11. Any biases identified in the Article 10 examination must be carried into the Article 9 risk management system as identified risks, with corresponding mitigation measures and residual risk assessment. The Article 11 technical documentation must include a description of training datasets and data governance measures — so the Article 10 record is a required component of the Annex IV technical file. The three articles form a connected compliance chain: data quality (Art 10) → risk assessment (Art 9) → documented system (Art 11).

EU AI Act Article 10: Data Governance Requirements for High-Risk AI

What Is Article 10?

Article 10 of the EU AI Act sets out the data and data governance requirements for providers of high-risk AI systems. It applies before market placement — meaning your data quality documentation must exist before your AI system goes live, not after. The obligation covers three types of data:

Training data: the data used to develop and train the model
Validation data: the data used to tune hyperparameters and evaluate the model during development, for example to guard against underfitting or overfitting
Test data: the data used to evaluate final performance before deployment

If any of these data sets are inadequate, biased, or poorly documented, that is not just a technical problem — it is a compliance failure that regulators can act on from 2 August 2026.

Who Must Comply with Article 10

Article 10 applies to providers of high-risk AI systems — the organisations that develop and place them on the EU market. You are a provider if you:

Built the AI model and sell it as a product or service in the EU
Integrated a pre-built model into your own product and placed it under your name or brand
Substantially modified a third-party AI system for a high-risk use case

Deployers (organisations using high-risk AI without developing it) have a narrower obligation under Article 26(4): to the extent they control the input data, they must ensure it is relevant and sufficiently representative in view of the intended purpose. In practice, if inputs drift significantly from the conditions the system was built for, that should be flagged and reported to the provider.
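One way a deployer might watch for input drift is a population stability index (PSI) over a categorical input field. The sketch below is illustrative only: it assumes the provider has shared a reference distribution, and the field values, the `DRIFT_THRESHOLD` of 0.25, and all names are assumptions rather than anything the Act prescribes.

```python
import math
from collections import Counter


def population_stability_index(reference, live, categories):
    """PSI between two categorical samples; higher means more drift."""
    eps = 1e-6  # avoids log(0) when a category is absent from one sample
    ref_counts = Counter(reference)
    live_counts = Counter(live)
    psi = 0.0
    for cat in categories:
        ref_share = max(ref_counts[cat] / len(reference), eps)
        live_share = max(live_counts[cat] / len(live), eps)
        psi += (live_share - ref_share) * math.log(live_share / ref_share)
    return psi


# Hypothetical example: applicant employment type at training time vs. in production.
reference = ["employed"] * 70 + ["self_employed"] * 20 + ["unemployed"] * 10
live = ["employed"] * 40 + ["self_employed"] * 45 + ["unemployed"] * 15
categories = ["employed", "self_employed", "unemployed"]

psi = population_stability_index(reference, live, categories)
DRIFT_THRESHOLD = 0.25  # assumed rule of thumb, not a value taken from the Act
needs_escalation = psi > DRIFT_THRESHOLD
```

A result above the chosen threshold would be a reasonable trigger for notifying the provider; the threshold itself is a judgment call that should be recorded alongside the monitoring procedure.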

Fine-tuning makes you a provider. If you take a foundation model (GPT-4, Claude, Llama, etc.) and fine-tune it on your own data for a high-risk use case — HR screening, credit scoring, medical decision support — you become a provider for that system. Article 10 applies to your fine-tuning data, not just the base model's training data.

The Four Core Data Requirements

1. Relevance and Representativeness (Article 10(3))

Training, validation, and test data must be:

Relevant to the AI system's intended purpose — the data must reflect the actual task the model is being asked to perform
Sufficiently representative of the operational environment — the data must reflect the population, context, and conditions under which the system will be used in practice
As free from errors as possible — this does not require perfection, but it does require documented quality controls
Complete in relation to the characteristics of the intended purpose — relevant variables, edge cases, and demographic groups must be present in appropriate proportions

The regulation does not mandate specific dataset sizes or statistical thresholds. What it requires is documented evidence that these criteria were considered and addressed for your specific use case.
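Documented evidence of representativeness can be as simple as comparing group shares in the dataset against a reference population. The following is a minimal sketch under stated assumptions: the records, the `reference_shares` figures, and the `TOLERANCE` value are all hypothetical, and the tolerance is an internal choice, not a regulatory threshold.

```python
from collections import Counter


def representation_gaps(records, attribute, reference_shares):
    """Return dataset share minus reference share for each group."""
    counts = Counter(r[attribute] for r in records)
    total = len(records)
    return {
        group: counts.get(group, 0) / total - expected
        for group, expected in reference_shares.items()
    }


# Hypothetical training records and reference population shares.
records = (
    [{"age_band": "18-34"}] * 50
    + [{"age_band": "35-54"}] * 40
    + [{"age_band": "55+"}] * 10
)
reference_shares = {"18-34": 0.30, "35-54": 0.40, "55+": 0.30}

gaps = representation_gaps(records, "age_band", reference_shares)
TOLERANCE = 0.10  # assumed internal threshold; the Act sets no numeric limit
underrepresented = [g for g, gap in gaps.items() if gap < -TOLERANCE]
```

In this example the 55+ group falls 20 percentage points short of its reference share, which is the kind of gap the record should name together with the response taken.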

2. Data Governance Practices (Article 10(2))

Beyond data quality, Article 10(2) requires documentation of how data was handled throughout the development process. Your data governance record must cover:

The origin and source of each dataset
How data was collected — surveys, clinical studies, web scraping, operational logs, third-party providers
Data preparation steps: labelling methodology, cleaning procedures, annotation guidelines, inter-annotator agreement rates where applicable
Assumptions made during data formulation — what the data is assumed to represent and the limits of that assumption
Assessment of data availability, quantity, and suitability relative to the task
The train/validation/test split and the rationale for it

This is often the most overlooked part of Article 10. A regulator reviewing your technical file will not just ask "what data did you use?" — they will ask "how do you know this data is appropriate for this purpose?" The governance record is your answer.
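For the inter-annotator agreement rates listed above, Cohen's kappa is one commonly used measure for two annotators. The sketch below is an illustration, not a required method; the labels and annotator data are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item set")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    # Agreement expected by chance, from each annotator's label frequencies.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels from two annotators on ten CV-screening examples.
annotator_1 = ["fit", "fit", "unfit", "fit", "unfit", "fit", "fit", "unfit", "fit", "unfit"]
annotator_2 = ["fit", "fit", "unfit", "unfit", "unfit", "fit", "fit", "fit", "fit", "unfit"]

kappa = cohens_kappa(annotator_1, annotator_2)
```

Here the annotators agree on 8 of 10 items, but half of that agreement is expected by chance, so kappa comes out at roughly 0.58. Recording the metric and the sample it was computed on gives the governance record something concrete to point to.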

3. Bias Examination (Article 10(2)(f) and (g))

Article 10(2)(f) and (g) are the provisions that catch most organisations off guard. They require that data be examined for possible biases that could affect health and safety, harm fundamental rights, or lead to discrimination prohibited under Union law, and that appropriate measures be taken to detect, prevent, and mitigate the biases identified.

A bias examination must:

Identify which protected characteristics are relevant to your specific use case. A credit scoring model has different bias risks than an HR screening tool or a clinical decision support system. The relevant characteristics might include age, gender, ethnicity, disability status, nationality, or other attributes depending on what the AI decides.
Apply appropriate bias detection methods to training, validation, and test data. Common approaches include:
- Distributional analysis — does the dataset proportionally represent the population the AI will affect?
- Subgroup performance testing — does model accuracy, false positive rate, or false negative rate differ significantly across demographic groups?
- Counterfactual fairness testing — does changing a protected attribute while keeping everything else constant change the model's output?
Document findings — including any biases identified, their magnitude, and potential impact on the affected population.
Record what was done in response — data augmentation, rebalancing, algorithmic fairness constraints, or an explanation of why the identified bias is acceptable given the use case and residual risk.
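The subgroup performance testing step above can be sketched as a per-group false positive and false negative rate calculation. This is a minimal illustration with hypothetical groups and predictions; a real examination would run on the actual test set and choose metrics suited to the use case.

```python
from collections import defaultdict


def subgroup_error_rates(rows):
    """rows: iterable of (group, y_true, y_pred) with binary 0/1 labels."""
    counts = defaultdict(lambda: {"fp": 0, "fn": 0, "neg": 0, "pos": 0})
    for group, y_true, y_pred in rows:
        c = counts[group]
        if y_true == 1:
            c["pos"] += 1
            c["fn"] += int(y_pred == 0)
        else:
            c["neg"] += 1
            c["fp"] += int(y_pred == 1)
    return {
        group: {
            "fpr": c["fp"] / c["neg"] if c["neg"] else None,
            "fnr": c["fn"] / c["pos"] if c["pos"] else None,
        }
        for group, c in counts.items()
    }


# Hypothetical test-set predictions: (group, true label, predicted label).
rows = (
    [("A", 1, 1)] * 45 + [("A", 1, 0)] * 5 + [("A", 0, 0)] * 40 + [("A", 0, 1)] * 10
    + [("B", 1, 1)] * 30 + [("B", 1, 0)] * 20 + [("B", 0, 0)] * 45 + [("B", 0, 1)] * 5
)

rates = subgroup_error_rates(rows)
fnr_gap = abs(rates["A"]["fnr"] - rates["B"]["fnr"])
```

In this example group B is wrongly rejected four times as often as group A (a false negative rate of 0.4 against 0.1). The gap and its likely impact on the affected group are what the findings section should record.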

Bias examination is not bias elimination. The EU AI Act does not require zero bias — it requires documented analysis and proportionate response. An organisation that identifies bias, documents it, takes reasonable mitigation steps, and records the residual risk is in a far stronger compliance position than one that asserts "our model is not biased" with no evidence.
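Evidence of this kind can come from a counterfactual test: change only the protected attribute and count how often the decision changes. The sketch below assumes a hypothetical `toy_model` standing in for a real classifier, and it only flips the attribute itself, so it will not catch bias carried by proxy variables.

```python
def counterfactual_flip_rate(model, records, attribute, alternatives):
    """Share of records whose decision changes when only `attribute` changes."""
    flipped = 0
    for record in records:
        baseline = model(record)
        for value in alternatives:
            if value == record[attribute]:
                continue
            variant = {**record, attribute: value}
            if model(variant) != baseline:
                flipped += 1
                break  # count each record at most once
    return flipped / len(records)


def toy_model(record):
    # Hypothetical stand-in for a real classifier; deliberately biased.
    threshold = 600 if record["gender"] == "male" else 650
    return record["score"] >= threshold


records = [
    {"score": 620, "gender": "female"},
    {"score": 700, "gender": "female"},
    {"score": 610, "gender": "male"},
    {"score": 500, "gender": "male"},
]

flip_rate = counterfactual_flip_rate(toy_model, records, "gender", ["male", "female"])
```

Half of these records receive a different decision when only gender changes. A documented flip rate, followed by the mitigation applied and the rate after mitigation, is the kind of record that supports a proportionate-response argument.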

4. Special Categories of Personal Data (Article 10(5))

Providers may, exceptionally and only where strictly necessary, process special categories of personal data (health data, racial or ethnic origin, biometric data, etc.) specifically to detect and correct bias. This is a narrow exemption with strict conditions:

Processing must be explicitly limited to bias detection and correction purposes
Appropriate technical and organisational safeguards must be implemented
Data must be deleted once the bias has been corrected or its retention period ends, whichever comes first
The legal basis under GDPR must separately permit the processing

Note: the EU Digital Omnibus proposes extending this provision to all AI systems and models (not just high-risk), but it is not yet law. The current Article 10(5) exemption applies to high-risk AI systems only.

How Article 10 Connects to Your Other Obligations

Article	How Article 10 Feeds Into It
Article 9 — Risk Management System	Bias findings from Article 10 become identified risks in the Article 9 risk register. Mitigation measures and residual bias risk must be documented in the risk management record. Testing under Article 9(6) to 9(8) should cover the demographic groups assessed in the Article 10 bias examination.
Article 11 — Technical Documentation	Annex IV of the EU AI Act specifies what the Article 11 technical file must contain. It requires a description of the training data sets, including their provenance, scope, and main characteristics, how the data was obtained and selected, and the labelling and cleaning procedures used, as well as the metrics used to measure potentially discriminatory impacts. Your Article 10 record is a required component of the technical file.
Article 13 — Transparency	Instructions for use must inform deployers of known limitations — including data coverage limitations or known bias characteristics that could affect performance in specific contexts or demographic groups.
GDPR	GDPR governs the lawfulness of personal data processing in training. Article 10 of the AI Act adds AI-specific quality requirements on top — not instead of — GDPR obligations. You need both a GDPR legal basis and Article 10 quality documentation for any personal data in training.

What Your Article 10 Documentation Should Include

Your Article 10 compliance record — which forms part of the Article 11 technical file — should be structured around four sections:

Data Description

Dataset name, source, and collection date range
Volume, format, and language(s)
Intended use and why this data is appropriate for the intended purpose
How data was obtained (consent, public domain, operational logs, purchased, etc.)
Geographic coverage and jurisdiction-specific considerations
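A data description is easier to tie to a specific model version if each dataset is recorded with a content fingerprint. The sketch below hashes a canonical encoding of the rows; the row format, version labels, and the idea of flagging stale documentation this way are illustrative assumptions, not a prescribed mechanism.

```python
import hashlib
import json


def dataset_fingerprint(rows):
    """SHA-256 over a canonical JSON encoding of the rows, order-independent."""
    canonical = sorted(json.dumps(row, sort_keys=True) for row in rows)
    digest = hashlib.sha256()
    for line in canonical:
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


# Hypothetical data for two model versions; v2 adds one retraining example.
rows_v1 = [{"id": 1, "label": "fit"}, {"id": 2, "label": "unfit"}]
rows_v2 = rows_v1 + [{"id": 3, "label": "fit"}]

data_versions = {
    "model-1.0.0": dataset_fingerprint(rows_v1),
    "model-1.1.0": dataset_fingerprint(rows_v2),
}
documentation_is_stale = data_versions["model-1.0.0"] != data_versions["model-1.1.0"]
```

If the fingerprint recorded in the data description no longer matches the data used for the current model version, the description needs updating before it goes into the technical file.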

Data Preparation Record

Cleaning procedures applied and why
Labelling methodology and who did the labelling (human annotators, automated, hybrid)
Inter-annotator agreement metrics where applicable
How missing, corrupted, or outlier data was handled
Train/validation/test split percentages and rationale
Any data augmentation applied
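A split is easier to document and defend if it is reproducible. One possible approach, sketched below, assigns each record to a split by hashing a stable record ID. The ID format, the `salt` value, and the 80/10/10 percentages are assumptions chosen for illustration.

```python
import hashlib


def assign_split(record_id, val_pct=10, test_pct=10, salt="v1"):
    """Map a record ID to 'train', 'validation' or 'test' deterministically."""
    digest = hashlib.sha256(f"{salt}:{record_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in 0..99
    if bucket < test_pct:
        return "test"
    if bucket < test_pct + val_pct:
        return "validation"
    return "train"


# Hypothetical record IDs; in practice these would be stable dataset keys.
record_ids = [f"applicant-{i}" for i in range(1000)]
splits = {rid: assign_split(rid) for rid in record_ids}

# Entry for the data preparation record, stating the method and its rationale.
split_record = {
    "method": "sha256(salt:record_id) mod 100",
    "salt": "v1",
    "percentages": {"train": 80, "validation": 10, "test": 10},
    "rationale": "ID-based assignment keeps a record in one split across retraining runs",
}
```

Because assignment depends only on the ID and the salt, a record never moves from the test set into the training set when data is added later, which is one rationale a reviewer can check.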

Representativeness Assessment

The population on which the AI will be used
How the dataset represents that population
Known gaps, underrepresented groups, or out-of-distribution scenarios
Steps taken to address representativeness gaps (additional data collection, synthetic data, oversampling)
Remaining limitations and how they are disclosed to deployers

Bias Examination Record

Protected characteristics examined and why they are relevant to this use case
Bias detection methods used
Findings: biases identified, their statistical magnitude, and the affected groups
Mitigation measures applied (data rebalancing, algorithmic constraints, etc.)
Residual bias characterisation — what bias remains after mitigation and why it is acceptable or how it is managed
Link to Article 9 risk register entries for bias-related risks
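The fields above map naturally onto a structured record that can be exported into the technical file. The following is one possible shape, not a mandated schema: every class name, field name, and value is a hypothetical placeholder.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class BiasFinding:
    finding_id: str
    characteristic: str
    method: str
    metric: str
    value: float
    affected_groups: list
    mitigation: str
    residual_note: str
    risk_register_ids: list = field(default_factory=list)


@dataclass
class BiasExaminationRecord:
    system_name: str
    model_version: str
    findings: list = field(default_factory=list)

    def to_json(self):
        # asdict() converts nested dataclasses inside lists as well.
        return json.dumps(asdict(self), indent=2)


# All names and values below are hypothetical placeholders.
record = BiasExaminationRecord(
    system_name="cv-screening",
    model_version="2.3.0",
    findings=[
        BiasFinding(
            finding_id="BF-001",
            characteristic="age",
            method="subgroup performance testing",
            metric="false negative rate gap",
            value=0.12,
            affected_groups=["55+"],
            mitigation="oversampled 55+ applicants in training data",
            residual_note="gap reduced to 0.04 after retraining",
            risk_register_ids=["RISK-017"],
        )
    ],
)

serialised = record.to_json()
```

Keeping the risk register IDs inside each finding makes the link to Article 9 explicit in the record itself rather than leaving it to a separate cross-reference.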

Five Common Article 10 Mistakes

1. Treating data documentation as a one-time exercise. Article 10 applies to the data used at each model version. If you retrain with new data, update the documentation. Regulators examining a system that has had multiple training iterations will expect version-controlled data records.

2. Describing data without examining it. A dataset inventory is not Article 10 compliance. The regulation requires documented evidence of examination — what you looked for, what you found, what you did about it. Asserting "data is representative" without evidence is an Article 10 gap.

3. Assuming GDPR compliance covers Article 10. GDPR determines whether you can process personal data at all. Article 10 sets quality and governance standards for that data. A dataset can be GDPR-compliant and still fail Article 10 if it is insufficiently representative or unexamined for bias.

4. Fine-tuning without new Article 10 documentation. When you fine-tune a foundation model on proprietary data, that fine-tuning data is subject to Article 10. The base model provider's GPAI transparency documentation covers their pre-training — your fine-tuning data is your responsibility.

5. Disconnecting Article 10 from Article 9. The most common audit failure is having a bias examination that identified risks but a risk register with no corresponding entries. Article 10 findings must flow into Article 9. Separate, unlinked documents create compliance gaps that are immediately visible to a trained reviewer.
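The gap described in the fifth mistake can be caught with a simple consistency check before an audit. This sketch assumes the bias findings and the risk register are available as lists of dictionaries; the field names and IDs are hypothetical.

```python
def unlinked_findings(findings, risk_register):
    """Return IDs of bias findings with no matching risk register entry."""
    known_risks = {entry["risk_id"] for entry in risk_register}
    missing = []
    for finding in findings:
        linked = set(finding.get("risk_register_ids", []))
        # Flag findings with no link, or with a link to a risk that does not exist.
        if not linked or not linked <= known_risks:
            missing.append(finding["finding_id"])
    return missing


# Hypothetical extracts from a bias examination record and a risk register.
findings = [
    {"finding_id": "BF-001", "risk_register_ids": ["RISK-017"]},
    {"finding_id": "BF-002", "risk_register_ids": []},
    {"finding_id": "BF-003", "risk_register_ids": ["RISK-099"]},
]
risk_register = [{"risk_id": "RISK-017", "source": "Article 10 bias examination"}]

gaps = unlinked_findings(findings, risk_register)
```

Here BF-002 has no link at all and BF-003 points to a risk that is not in the register, so both are flagged. Running a check like this whenever either document changes keeps the two records from drifting apart.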

Aurora Trust generates Article 10 data governance documentation as part of the compliance document pack — including the data description, representativeness assessment, and bias examination record, structured to feed directly into the Article 11 technical file. Starting at €49/month.
