Guided randomized search over programs for synthesis and program optimization

Placeholder Show Content

Abstract/Contents

Abstract
The ability to automatically reason about programs and extract useful information from them is very important and has received a lot of attention from both the academic community as well as practitioners in industry. Scaling such program analyses to real system is a significant challenge, as real systems tend to be very large, very complex, and often at least part of the system is not available for analysis. A common solution to this problem is to manually write models for the parts of the system that are not analyzable. However, writing these models is both challenging and time consuming. Instead, we propose the use of guided randomized search to find models automatically, and we show how this idea can be applied in three diverse contexts. First, we show how we can use guided randomized search to automatically find models for opaque code, a common problem in program analysis. Opaque code is code that is executable but whose source code is unavailable or difficult to process. We present a technique to first observe the opaque code by collecting partial program traces and then automatically synthesize a model. We demonstrate our method by learning models for a collection of array-manipulating routines. Second, we tackle automatically learning a formal specification for the x86-64 instruction set. Many software analysis and verification tools depend, either explicitly or implicitly, on correct modeling of the semantics of x86-64 instructions. However, formal semantics for the x86-64 ISA are difficult to obtain and often written manually with great effort. Instead, we show how to automatically synthesize formal semantics for 1,795 instruction variants of x86-64. Crucial to our success is a new technique, stratified synthesis, that allows us to scale to longer programs. We evaluate the specification we learned and find that it contains no errors, unlike all manually written specifications we compare against. Third, we consider the problem of program optimization on recent CPU architectures. These modern architectures are incredibly complex and make it difficult to statically determine the performance of a program. Using guided randomized search with a new cost function we are able to outperform the previous state-of-the-art on several metrics, sometimes by a wide margin.

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2018; ©2018
Publication date 2018; 2018
Issuance monographic
Language English

Creators/Contributors

Author Heule, Stefan
Degree supervisor Aiken, Alexander
Thesis advisor Aiken, Alexander
Thesis advisor Mitchell, John C
Thesis advisor Sridharan, Manu
Degree committee member Mitchell, John C
Degree committee member Sridharan, Manu
Associated with Stanford University, Computer Science Department.

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Stefan Heule.
Note Submitted to the Department of Computer Science.
Thesis Thesis Ph.D. Stanford University 2018.
Location electronic resource

Access conditions

Copyright
© 2018 by Stefan Heule
License
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).

Also listed in

Loading usage metrics...