Identifying and Eliminating CSAM in Generative ML Training Data and Models

Thiel, David

doi:10.25740/kh752sm9123

Abstract/Contents

Abstract

Generative machine learning models have been well documented as being able to produce explicit adult content, including child sexual abuse material (CSAM), as well as to alter benign imagery of a clothed victim to produce nude or explicit content. In this study, we examine the LAION-5B dataset, parts of which were used to train the popular Stable Diffusion series of models, to measure the degree to which CSAM itself may have played a role in the training process of models trained on this dataset. We use a combination of PhotoDNA perceptual hash matching, cryptographic hash matching, k-nearest neighbors queries, and ML classifiers.

This methodology detected many hundreds of instances of known CSAM in the training set, as well as many new candidates that were subsequently verified by outside parties. We also provide recommendations for mitigating this issue, covering those who need to maintain copies of this training set, the building of future training sets, the alteration of existing models, and the hosting of models trained on LAION-5B.

Description

Type of resource	text
Date modified	December 21, 2023; December 22, 2023; December 23, 2023; December 26, 2023; April 23, 2024
Publication date	December 20, 2023

Creators/Contributors

Author	Thiel, David	https://orcid.org/0000-0002-0947-5921 (unverified)
Research team head	Hancock, Jeffrey	https://orcid.org/0000-0001-5367-2677 (unverified)

Subjects

Subject	Machine learning
Subject	Child abuse
Genre	Text
Genre	Report
Genre	Technical report

Bibliographic information

DOI	https://doi.org/10.25740/kh752sm9123
Location	https://purl.stanford.edu/kh752sm9123

Access conditions

Use and reproduction: User agrees that, where applicable, content will not be used to identify or to otherwise infringe the privacy or confidentiality rights of individuals. Content distributed via the Stanford Digital Repository may be subject to additional license and use restrictions applied by the depositor.
License: This work is licensed under a Creative Commons Attribution Non Commercial No Derivatives 4.0 International license (CC BY-NC-ND).

Preferred citation

Preferred citation: Thiel, D. (2023). Identifying and Eliminating CSAM in Generative ML Training Data and Models. Stanford Digital Repository. Available at https://purl.stanford.edu/kh752sm9123. https://doi.org/10.25740/kh752sm9123.

Collection

Stanford Internet Observatory, Freeman Spogli Institute for International Studies

Contact information

Contact: dthiel@stanford.edu
