Scaling human feedback

Kwon, Minae

Abstract/Contents

Abstract: Human-generated data has been pivotal for significant advancements in artificial intelligence (AI). As AI models scale and are applied to a wider range of tasks, the demand for more and increasingly specialized human data will grow. However, current methods of acquiring human feedback, such as learning from demonstrations or preferences, and designing objective functions or prompts, are becoming unsustainable due to their high cost and the extensive effort or domain knowledge they require from users. We address this challenge by developing algorithms that reduce the cost and effort of providing human feedback. We leverage foundation models to aid users in offering feedback. Users initially define their objectives (through language or a small dataset), and foundation models expand these into more detailed feedback. A key contribution is an algorithm, based on a large language model, that allows users to cheaply define their objectives and train a reinforcement learning agent without needing to develop a complex reward function or provide extensive data. For situations where initial objectives are poorly defined or biased, we introduce an algorithm that efficiently queries humans for more information, reducing the number of queries needed. Finally, we propose an information-gathering algorithm that eliminates the requirement for human intervention altogether, streamlining the feedback process. By making it cheaper for users to give feedback, either during training or when queried for more information, we hope to make learning from human feedback more scalable.

Description

Type of resource	text
Form	electronic resource; remote; computer; online resource
Extent	1 online resource.
Place	California
Place	[Stanford, California]
Publisher	[Stanford University]
Copyright date	2023; ©2023
Publication date	2023
Issuance	monographic
Language	English

Creators/Contributors

Author	Kwon, Minae
Degree supervisor	Sadigh, Dorsa
Thesis advisor	Sadigh, Dorsa
Thesis advisor	Goodman, Noah (Noah D.)
Thesis advisor	Yang, Diyi
Degree committee member	Goodman, Noah (Noah D.)
Degree committee member	Yang, Diyi
Associated with	Stanford University, School of Engineering
Associated with	Stanford University, Computer Science Department

Subjects

Genre	Theses
Genre	Text

Bibliographic information

Statement of responsibility	Minae Kwon.
Note	Submitted to the Computer Science Department.
Thesis	Thesis Ph.D. Stanford University 2023.
Location	https://purl.stanford.edu/sy876pv8068
