Identifying and Eliminating CSAM in Generative ML Training Data and Models

Abstract

Generative machine learning models have been well documented as being able to produce explicit adult content, including child sexual abuse material (CSAM), as well as to alter benign imagery of a clothed victim to produce nude or explicit content. In this study, we examine the LAION-5B dataset—parts of which were used to train the popular Stable Diffusion series of models—to attempt to measure to what degree CSAM itself may have played a role in the training process of models trained on this dataset. We use a combination of PhotoDNA perceptual hash matching, cryptographic hash matching, k-nearest-neighbors queries, and ML classifiers.
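
PhotoDNA itself is proprietary and not publicly implementable, but the perceptual-matching idea it represents can be illustrated with a generic difference hash ("dHash"): reduce an image to a small grayscale grid, encode adjacent-pixel brightness comparisons as bits, and compare fingerprints by Hamming distance so that near-duplicates still match. The sketch below is a stand-in for illustration only, assuming the grid has already been resized; it is not the methodology's actual hash function.

```python
def dhash(gray_rows):
    """Difference hash over a pre-resized grayscale grid.

    gray_rows: list of pixel rows, each one value wider than the hash
    width (e.g. 8 rows of 9 values yields a 64-bit fingerprint).
    Each bit records whether a pixel is brighter than its right neighbor.
    """
    bits = 0
    for row in gray_rows:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two fingerprints; small values
    indicate perceptually similar images."""
    return bin(a ^ b).count("1")
```

A match is typically declared when the Hamming distance falls below a small threshold, which is what lets perceptual hashing catch re-encoded or lightly altered copies that cryptographic hashing would miss.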

This methodology detected many hundreds of instances of known CSAM in the training set, as well as many new candidates that were subsequently verified by outside parties. We also provide recommendations for mitigating this issue for those who need to maintain copies of this training set, for building future training sets, for altering existing models, and for hosting models trained on LAION-5B.
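
The cryptographic-hash component of the methodology amounts to set membership: compute an exact digest of each candidate image's bytes and test it against a list of hashes of known material. The sketch below assumes MD5 digests and an in-memory hash set; the set contents here are the well-known digest of a test sentence, used purely as a placeholder value, and the function names are illustrative rather than taken from the report.

```python
import hashlib

# Placeholder hash set: this is the MD5 of a standard test sentence,
# standing in for a list of digests of known material.
KNOWN_HASHES = {
    "9e107d9d372bb6826bd81d3542a419d6",
}


def md5_hex(data: bytes) -> str:
    """Return the MD5 hex digest of raw image bytes."""
    return hashlib.md5(data).hexdigest()


def flag_known_content(images: dict) -> list:
    """Return the IDs of images whose exact digest appears in the
    known-hash set. Catches only byte-identical copies; any re-encoding
    defeats it, which is why perceptual hashing is used alongside it."""
    return [img_id for img_id, data in images.items()
            if md5_hex(data) in KNOWN_HASHES]
```

Because cryptographic matching is exact, it produces no false positives against the hash list, but it must be paired with perceptual and classifier-based methods to catch resized or re-compressed variants.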

Description

Type of resource text
Date modified December 21, 2023; December 22, 2023; December 23, 2023; December 26, 2023; April 23, 2024
Publication date December 20, 2023

Creators/Contributors

Author Thiel, David (ORCID: https://orcid.org/0000-0002-0947-5921, unverified)
Research team head Hancock, Jeffrey (ORCID: https://orcid.org/0000-0001-5367-2677, unverified)

Subjects

Subject Machine learning
Subject Child abuse
Genre Text
Genre Report
Genre Technical report

Bibliographic information

Access conditions

Use and reproduction
User agrees that, where applicable, content will not be used to identify or to otherwise infringe the privacy or confidentiality rights of individuals. Content distributed via the Stanford Digital Repository may be subject to additional license and use restrictions applied by the depositor.
License
This work is licensed under a Creative Commons Attribution Non Commercial No Derivatives 4.0 International license (CC BY-NC-ND).

Preferred citation

Thiel, D. (2023). Identifying and Eliminating CSAM in Generative ML Training Data and Models. Stanford Digital Repository. Available at https://purl.stanford.edu/kh752sm9123. https://doi.org/10.25740/kh752sm9123.

Collection

Stanford Internet Observatory, Freeman Spogli Institute for International Studies
