What must be included in a GPAI model's technical documentation?

Technical documentation must describe the model architecture, training methodology, data preprocessing, evaluation benchmarks, known limitations, foreseeable risks, and measures taken to address identified risks. It must be detailed enough for downstream providers to conduct their own compliance assessments.

Does the training data summary require disclosing specific datasets?

No. The summary must be sufficiently detailed to be meaningful but does not require disclosure of proprietary datasets or commercially sensitive information. It typically covers general data categories, approximate volume, languages, temporal coverage, and known limitations.

What does the copyright compliance obligation require in practice?

GPAI providers must have processes to identify and honour opt-out signals from rightsholders under Article 4(3) of Directive 2019/790. This includes monitoring robots.txt files, respecting machine-readable rights reservations, and maintaining systems to process opt-out requests.

Must GPAI providers share confidential information with downstream providers?

Providers must share information needed for downstream compliance, but trade secrets can be protected through non-disclosure agreements. The downstream provider must receive enough information to meet their own regulatory obligations under the AI Act.

Who oversees GPAI transparency compliance?

The AI Office oversees GPAI compliance, including transparency obligations. It develops the training data summary template, facilitates codes of practice, and can request additional information or evaluate the adequacy of technical documentation.

Quick answer

GPAI providers must meet three core transparency obligations under Article 53: maintain technical documentation of the model, publish a sufficiently detailed summary of training data, and implement a copyright compliance policy. These requirements ensure downstream providers and the public have meaningful information about GPAI model characteristics.

Updated June 2026 · MmowW AI Compliance

EU AI Act GPAI Transparency: Documentation and Disclosure Requirements

The Foundation of GPAI Transparency

Transparency is the central regulatory tool the EU AI Act (Regulation (EU) 2024/1689) applies to general-purpose AI models. Unlike the high-risk AI system framework, which imposes extensive conformity assessment and risk management requirements, the GPAI framework relies primarily on information disclosure to manage risks and enable accountability.

Article 53 establishes three distinct transparency obligations for GPAI model providers. Each obligation serves a different purpose in the regulatory ecosystem. Technical documentation enables supervisory oversight and downstream compliance. The training data summary provides public accountability. The copyright policy protects the rights of content creators whose works may have been used in model training.

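These obligations form the baseline for GPAI model providers, regardless of whether the model is classified as presenting systemic risk. Systemic risk models face additional obligations under Article 55. One exception applies: under Article 53(2), models released under a free and open-source licence are exempt from the technical documentation and downstream information duties in Article 53(1)(a) and (b), unless they present systemic risk. The copyright policy and training data summary obligations still apply to those models.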

Technical Documentation Requirements

Article 53(1)(a) requires GPAI model providers to draw up and keep up to date the technical documentation of the model, including its training and testing process and the results of its evaluation. The documentation must contain at least the information set out in Annex XI and must be available to the AI Office and national competent authorities on request. In practice it also underpins the information that downstream providers need to understand the capabilities and limitations of the model.

The technical documentation must cover several areas. It must describe the model architecture, training methodology, data preprocessing steps, and evaluation benchmarks. It must identify known limitations, foreseeable risks, and conditions under which the model may produce unreliable outputs. It must also describe the measures taken to address identified risks during the development process.
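One way to keep track of these areas is a small internal audit script. The sketch below is illustrative only: the section names in `REQUIRED_SECTIONS` are hypothetical labels derived from the paragraph above, not identifiers taken from the Act or its annexes, and the function `audit_documentation` is an assumed helper. It flags sections that are missing and sections that were last reviewed before the most recent model change, which is one possible reading of the duty to keep documentation up to date.

```python
from datetime import date

# Hypothetical section labels based on the areas listed in this article.
# They are NOT official identifiers from the AI Act or its annexes.
REQUIRED_SECTIONS = (
    "model_architecture",
    "training_methodology",
    "data_preprocessing",
    "evaluation_benchmarks",
    "known_limitations",
    "foreseeable_risks",
    "unreliable_output_conditions",
    "risk_mitigation_measures",
)


def audit_documentation(sections: dict[str, date], last_model_change: date) -> dict[str, list[str]]:
    """Report required sections that are missing or older than the last model change.

    `sections` maps a section label to the date it was last reviewed.
    """
    missing = [name for name in REQUIRED_SECTIONS if name not in sections]
    stale = [
        name
        for name in REQUIRED_SECTIONS
        if name in sections and sections[name] < last_model_change
    ]
    return {"missing": missing, "stale": stale}


# Example: five sections exist, one of them reviewed before the last model change.
example_sections = {
    "model_architecture": date(2025, 3, 1),
    "training_methodology": date(2025, 3, 1),
    "data_preprocessing": date(2024, 11, 15),
    "evaluation_benchmarks": date(2025, 3, 1),
    "known_limitations": date(2025, 3, 1),
}
report = audit_documentation(example_sections, last_model_change=date(2025, 2, 1))
print(report)
```

A check like this only confirms that each section exists and has been reviewed recently. Whether the content is sufficiently detailed remains a judgement for the provider and, ultimately, the AI Office.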

The level of detail required is calibrated to the needs of downstream providers. A downstream provider who integrates a GPAI model into a high-risk AI system under Annex III must be able to use the technical documentation to conduct their own conformity assessment. This means the documentation must be detailed enough to support a risk assessment of the combined system, not just the GPAI model in isolation.

The AI Office has the authority to request access to technical documentation and to evaluate its adequacy. Providers who fail to maintain sufficient documentation may face enforcement action, including orders to update the documentation or, in serious cases, restrictions on making the model available within the EU market.

Training Data Summary

Article 53(1)(d) requires GPAI model providers to draw up and make publicly available a sufficiently detailed summary about the content used for training the GPAI model. This summary must be prepared according to a template provided by the AI Office.

The training data summary serves a dual purpose. First, it provides transparency to the public about the types of data used to train the model. Second, it enables rightsholders to assess whether their content may have been used in training, which connects directly to the copyright compliance obligation.

The requirement for a sufficiently detailed summary creates a balancing act. The summary must be detailed enough to be meaningful for its intended purposes, but it does not require disclosure of proprietary datasets, specific data sources, or commercially sensitive information about the training pipeline. The AI Office template is designed to standardise the level of detail across providers.

Key elements typically expected in the training data summary include the general categories of data used (such as web text, books, code repositories, or image collections), the approximate volume of training data, the languages represented, the temporal coverage of the data, and any known biases or limitations in the data composition.
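As an illustration, a provider could hold these elements in a simple structured record before transferring them into the official template. The sketch below is an assumption for illustration only: the class `TrainingDataSummary`, its field names, and the helper `missing_elements` are hypothetical and do not reproduce the AI Office template.

```python
from dataclasses import dataclass, field, fields


@dataclass
class TrainingDataSummary:
    """Hypothetical internal record of the elements listed above.

    Field names are illustrative and do not mirror the AI Office template.
    """
    data_categories: list[str] = field(default_factory=list)   # e.g. web text, books, code
    approximate_volume: str = ""                                # free-text estimate
    languages: list[str] = field(default_factory=list)
    temporal_coverage: str = ""                                 # e.g. a date range
    known_limitations: list[str] = field(default_factory=list)  # biases, gaps


def missing_elements(summary: TrainingDataSummary) -> list[str]:
    """Return the names of fields that are still empty, in declaration order."""
    return [f.name for f in fields(summary) if not getattr(summary, f.name)]


# Example: a draft that so far records only categories and languages.
draft = TrainingDataSummary(
    data_categories=["web text", "code repositories"],
    languages=["en", "de"],
)
print(missing_elements(draft))
```

Keeping the record structured makes it easier to spot gaps early and to keep the published summary consistent with the copyright policy.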

The summary must be made publicly available, meaning it must be accessible without requiring registration, payment, or special access. This is a stronger transparency requirement than making information available only to downstream providers or regulators.

Copyright Policy Compliance

Article 53(1)(c) requires GPAI model providers to put in place a policy to comply with Union copyright law, in particular to identify and comply with reservations of rights expressed by rightsholders pursuant to Article 4(3) of Directive (EU) 2019/790.

Article 4(3) of the Copyright in the Digital Single Market Directive allows rightsholders to reserve their rights against text and data mining in an appropriate manner, such as machine-readable means for content made available online. When a rightsholder has expressly reserved their rights, the text and data mining exception under Article 4 does not apply, and the use of that content for AI training requires separate authorisation.

In practice, GPAI model providers must have processes in place to identify opt-out signals from rightsholders across the data sources they use for training. This may include monitoring robots.txt files, respecting machine-readable rights reservations in metadata, and maintaining systems to process and honour opt-out requests.
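A robots.txt check is the simplest of these signals to automate. The following is a minimal sketch using Python's standard `urllib.robotparser` module. The crawler name `ExampleTrainingBot`, the sample robots.txt content, and the helper functions are assumptions made for illustration. A robots.txt rule is only one possible form of reservation, so a check like this would be one component of a copyright policy and not a complete implementation of it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical crawler name; a real provider would use its own published user agent.
USER_AGENT = "ExampleTrainingBot"

# Sample robots.txt content (illustrative): the site blocks the training crawler
# but allows every other user agent.
SAMPLE_ROBOTS_TXT = """\
User-agent: ExampleTrainingBot
Disallow: /

User-agent: *
Allow: /
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt content permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def filter_training_urls(urls: list[str], robots_txt: str, user_agent: str) -> list[str]:
    """Keep only the URLs that the robots.txt content allows for this user agent."""
    return [u for u in urls if is_crawl_allowed(robots_txt, user_agent, u)]


candidate_urls = [
    "https://example.com/articles/1",
    "https://example.com/articles/2",
]
allowed = filter_training_urls(candidate_urls, SAMPLE_ROBOTS_TXT, USER_AGENT)
print(allowed)
```

In a production pipeline the robots.txt content would be fetched per host and re-checked periodically, and the decision for each URL would be logged so that the provider can later show that the stated policy was actually applied.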

The copyright policy must be documented and, in practice, should be coordinated with the training data summary. A provider who claims to respect copyright reservations must be able to demonstrate that their data collection and training processes actually implement the stated policy.

Information for Downstream Providers

Beyond public disclosure, GPAI model providers must make certain information available specifically to downstream providers who integrate the model into their own AI systems. Article 53(1)(b) requires providers to make available to downstream providers of AI systems who intend to integrate the GPAI model into their system the information and documentation needed for the downstream provider to comply with its obligations under the Act.

This obligation creates a supply chain transparency requirement. When a downstream provider deploys a GPAI model in a high-risk application, that provider must conduct a conformity assessment, implement a risk management system, and maintain technical documentation for their AI system. They cannot fulfil these obligations without adequate information from the GPAI model provider about how the underlying model works.

The information provided to downstream providers may be more detailed than what is publicly available. Trade secrets and confidential business information can be protected through appropriate non-disclosure agreements, but the downstream provider must receive enough information to meet their own regulatory obligations.

This creates an incentive for GPAI model providers to develop standardised information packages for downstream providers, covering model capabilities, known limitations, recommended use conditions, and identified risks. Some providers are already developing model cards and system cards that serve this function.

Interaction with the AI Office

The AI Office plays a central role in the GPAI transparency framework. It is responsible for developing the template for training data summaries, facilitating the development of codes of practice, and supervising compliance with GPAI obligations.

GPAI model providers must be prepared to engage with the AI Office on transparency matters. The Office may request additional information beyond what is publicly disclosed, may evaluate the adequacy of technical documentation, and may require updates to documentation that it finds insufficient.

The AI Office also serves as a coordination point between GPAI model providers and national competent authorities. When a downstream AI system is subject to enforcement action by a national authority, the AI Office may need to assess whether the GPAI model provider has met its transparency obligations as part of the overall compliance evaluation.

Organisations should establish a compliance function or designate a responsible person for GPAI transparency obligations. This function should oversee the preparation and maintenance of technical documentation, manage the training data summary publication, implement and monitor the copyright policy, and serve as the primary contact point for AI Office communications.

Proactive engagement with the transparency framework, rather than reactive compliance, positions organisations to influence the development of codes of practice and to build trust with regulators, downstream providers, and the public.


This article is for informational purposes only and does not constitute legal advice. Regulatory requirements change frequently — verify current rules with official sources. Built by Sawai Gyoseishoshi Office, Hiroshima, Japan.