INFORMATION TECHNOLOGY
AI and GDPR: the CNIL publishes new recommendations to support responsible innovation.
The GDPR enables the development of innovative and responsible AI in Europe. The new CNIL recommendations illustrate this by providing concrete solutions to inform people whose data is used and to facilitate the exercise of their rights.
Adapting the principles of the GDPR to the specificities of AI.
Some artificial intelligence models are anonymous and therefore not subject to the GDPR. However, other models, such as a large language model (LLM), may contain personal data. The European Data Protection Board (EDPB) has recently provided relevant criteria on the application of the GDPR to AI models.
When applying the GDPR, individuals' data must be protected both within training data sets and within models that may have stored such data, as well as when the model is used via prompts. Although the fundamental principles of data protection remain applicable, they must be adapted to the specific context of AI.
The CNIL has long established that:
- the purpose limitation principle can be applied flexibly to general-purpose AI systems: an operator who is not able to define all potential applications at the training stage can instead describe the type of system being developed and illustrate its main potential functionalities.
- the principle of data minimisation does not preclude the use of large training data sets. However, data should generally be selected and cleaned to optimise the training of algorithms, while avoiding unnecessary processing of personal data.
- the retention of training data can be extended if justified and if the data set is subject to appropriate security measures. This is particularly important for databases that require significant scientific and financial investment, which sometimes become recognised standards within the research community.
- in many cases, the reuse of databases, including those available online, is possible, provided that the data has not been collected unlawfully and that its reuse is in line with the original purpose of collection.
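The data minimisation point above can be illustrated with a minimal, hypothetical sketch: a pre-training cleaning step that redacts obvious personal-data patterns (e-mail addresses, phone numbers) from a text corpus before training. The patterns and function names are illustrative assumptions for this article, not part of the CNIL recommendations, and a real pipeline would need far more robust detection.

```python
import re

# Illustrative patterns only: production PII detection also needs
# named entities, postal addresses, national identifiers, etc.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
PHONE_RE = re.compile(r"\+?\d[\d .-]{7,}\d")

def redact_personal_data(text: str) -> str:
    """Replace obvious personal-data patterns with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text

def clean_corpus(documents: list[str]) -> list[str]:
    """Apply redaction to every document before it enters training."""
    return [redact_personal_data(doc) for doc in documents]

corpus = ["Contact Alice at alice@example.com or +33 1 23 45 67 89."]
cleaned = clean_corpus(corpus)
# cleaned == ["Contact Alice at [EMAIL] or [PHONE]."]
```

Selecting and scrubbing data this way lets large corpora be used while avoiding unnecessary processing of personal data, in the spirit of the minimisation principle described above.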
The CNIL has published two new Recommendations to promote the responsible use of AI, while ensuring respect for the protection of personal data. These recommendations confirm that the GDPR requirements are sufficiently balanced to address the specific challenges of AI. They provide concrete and proportionate solutions to inform individuals and facilitate the exercise of their rights.
When personal data is used to train an AI model and can potentially be stored by it, the individuals concerned must be informed.
The way this information is provided can be adapted based on the risks to individuals and operational constraints. Under the GDPR, in some cases, especially when AI models are based on third-party data sources and the provider cannot directly contact individuals, organisations can limit themselves to general information (e.g., published on their website). When multiple sources are used, as is common with generic AI models, broad disclosure indicating the categories of sources or listing some key sources is generally sufficient.
As for the Recommendation on the exercise of rights, European legislation grants individuals the rights to access, rectify, object to the processing of, and erase their personal data.
However, exercising these rights can be particularly challenging in the context of AI models, both because of the difficulty of identifying an individual's data within the model and because of the technical difficulty of modifying the model itself. The CNIL therefore urges developers of artificial intelligence to incorporate privacy protection from the design stage, to pay particular attention to personal data within training data sets, to strive to anonymise models whenever this does not compromise their intended purpose, and to develop innovative solutions to prevent AI models from disclosing confidential personal data.
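One such solution, sketched here as a purely hypothetical example (the pattern names and function are assumptions, not a CNIL-prescribed mechanism), is an output-side guardrail that masks suspected personal data in generated text before it reaches the user and records which pattern fired for later data-protection review.

```python
import re

# Hypothetical guardrail patterns; illustrative, not exhaustive.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "iban": re.compile(r"\b[A-Z]{2}\d{2}(?: ?\w{4}){3,7}\b"),
}

def screen_output(generated: str) -> tuple[str, list[str]]:
    """Mask suspected personal data in a model's output.

    Returns the masked text and the names of the patterns that fired,
    which can be logged for a data-protection review.
    """
    findings = []
    for name, pattern in PII_PATTERNS.items():
        if pattern.search(generated):
            findings.append(name)
            generated = pattern.sub(f"[{name.upper()} REDACTED]", generated)
    return generated, findings
```

A filter of this kind does not remove personal data from the model's parameters, but it can reduce the risk of disclosure while more durable techniques, such as anonymising the model itself, mature.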
In some cases, cost, technical impossibility or practical difficulties may justify refusing a request to exercise these rights. Where the right must be guaranteed, however, the CNIL will take into account the reasonable solutions available to the model creator and may allow flexible deadlines for responding to requests. The CNIL also emphasises that scientific research in this area is evolving rapidly and urges AI stakeholders to keep abreast of the latest advances so as to ensure the best possible protection of individual rights.