Identifying and Eliminating CSAM in Generative ML Training Data and Models
Abstract/Contents
- Abstract
Generative machine learning models have been well documented as being able to produce explicit adult content, including child sexual abuse material (CSAM), and to alter benign imagery of a clothed victim to produce nude or explicit content. In this study, we examine the LAION-5B dataset, parts of which were used to train the popular Stable Diffusion series of models, to measure the degree to which CSAM itself may have played a role in the training of models built on this dataset. We use a combination of PhotoDNA perceptual hash matching, cryptographic hash matching, k-nearest-neighbors queries, and ML classifiers.
This methodology detected many hundreds of instances of known CSAM in the training set, as well as many new candidates that were subsequently verified by outside parties. We also provide recommendations for mitigating this issue for those who need to maintain copies of this training set, build future training sets, alter existing models, or host models trained on LAION-5B.
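PhotoDNA is a proprietary perceptual-hashing service and is not shown here, but the cryptographic hash matching step the abstract mentions can be illustrated in outline: compute a digest of each candidate image file and test it for membership in a set of known hashes. The function names and the hash set below are hypothetical, a minimal sketch rather than the report's actual pipeline.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 hex digest of a file, streaming in chunks
    so large images never need to be held fully in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()


def match_known_hashes(paths, known_hashes):
    """Return the paths whose digests appear in the known-hash set.

    `known_hashes` stands in for an externally supplied list of
    digests of known material (a hypothetical placeholder here);
    exact-match digests only catch byte-identical files, which is
    why the report pairs this with perceptual hashing.
    """
    return [p for p in paths if sha256_of(p) in known_hashes]
```

Because cryptographic digests change completely under any re-encoding or resizing, this step only flags exact duplicates; perceptual hashes such as PhotoDNA are needed to catch visually similar variants.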
Description
| Field | Value |
|---|---|
| Type of resource | Text |
| Date modified | December 21, 2023; December 22, 2023; December 23, 2023; December 26, 2023; April 23, 2024 |
| Publication date | December 20, 2023 |
Publication date | December 20, 2023 |
Creators/Contributors
| Role | Name | ORCID |
|---|---|---|
| Author | Thiel, David | https://orcid.org/0000-0002-0947-5921 (unverified) |
| Research team head | Hancock, Jeffrey | https://orcid.org/0000-0001-5367-2677 (unverified) |
Subjects
| Field | Value |
|---|---|
| Subject | Machine learning |
| Subject | Child abuse |
| Genre | Text |
| Genre | Report |
| Genre | Technical report |
Bibliographic information
Access conditions
- Use and reproduction
- User agrees that, where applicable, content will not be used to identify or to otherwise infringe the privacy or confidentiality rights of individuals. Content distributed via the Stanford Digital Repository may be subject to additional license and use restrictions applied by the depositor.
- License
- This work is licensed under a Creative Commons Attribution Non Commercial No Derivatives 4.0 International license (CC BY-NC-ND).
Preferred citation
- Preferred citation
- Thiel, D. (2023). Identifying and Eliminating CSAM in Generative ML Training Data and Models. Stanford Digital Repository. Available at https://purl.stanford.edu/kh752sm9123. https://doi.org/10.25740/kh752sm9123.
Collection
Stanford Internet Observatory, Freeman Spogli Institute for International Studies
Contact information
- Contact
- dthiel@stanford.edu