Identifying and Eliminating CSAM in Generative ML Training Data and Models
Generative machine learning models have been well documented as capable of producing explicit adult content, including child sexual abuse material (CSAM), and of altering benign imagery of a clothed victim to produce nude or explicit content. In this study, we examine the LAION-5B dataset, parts of which were used to train the popular Stable Diffusion series of models, to measure the degree to which CSAM itself may have played a role in the training of models built on this dataset. We use a combination of PhotoDNA perceptual hash matching, cryptographic hash matching, k-nearest-neighbors queries, and ML classifiers.
This methodology detected many hundreds of instances of known CSAM in the training set, as well as many new candidates that were subsequently verified by outside parties. We also provide mitigation recommendations for those who need to maintain copies of this training set, for building future training sets, for altering existing models, and for hosting models trained on LAION-5B.
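The abstract describes the matching pipeline only at a high level. As an illustration of the cryptographic-match step, the sketch below screens dataset entries against an externally supplied list of known-bad hashes; the `known_md5s` set, the `scan_dataset` helper, and the choice of MD5 are assumptions for illustration, not the report's actual implementation (PhotoDNA itself is proprietary and available only to vetted organizations).

```python
# Illustrative sketch only: screen dataset entries against an externally
# supplied list of known-CSAM hashes (e.g., provided by NCMEC or a vetted
# hash-sharing program). No hash values are included here; MD5 stands in
# for the cryptographic-match step described in the abstract.
import hashlib
from typing import Iterable, List, Set, Tuple

def md5_hex(image_bytes: bytes) -> str:
    """Cryptographic hash of the raw image bytes (exact matches only)."""
    return hashlib.md5(image_bytes).hexdigest()

def scan_dataset(images: Iterable[Tuple[str, bytes]],
                 known_md5s: Set[str]) -> List[str]:
    """Return entry IDs whose content exactly matches a known-bad hash.

    Cryptographic matching misses re-encoded or resized copies, which is
    why the report pairs it with perceptual hashing (PhotoDNA), k-nearest-
    neighbors queries, and ML classifiers.
    """
    flagged = []
    for entry_id, image_bytes in images:
        if md5_hex(image_bytes) in known_md5s:
            flagged.append(entry_id)
    return flagged
```

Exact-hash screening is deliberately conservative: it produces no false positives against the supplied list, but any follow-up on near-duplicates requires the perceptual and embedding-based methods named in the abstract.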
Publication date: December 20, 2023
- Use and reproduction
- User agrees that, where applicable, content will not be used to identify or to otherwise infringe the privacy or confidentiality rights of individuals. Content distributed via the Stanford Digital Repository may be subject to additional license and use restrictions applied by the depositor.
- This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license (CC BY-NC-ND 4.0).
Stanford Internet Observatory, Freeman Spogli Institute for International Studies