Improving cloud efficiency with online learning


Abstract/Contents

Abstract
Cloud computing has emerged as the dominant platform for computing due to its ease of use and scalability advantages. To maintain reliable and high-performance services in a cost-effective manner, cloud providers need to efficiently administer a large number of management tasks. Traditional heuristic-based static management policies often fail to adapt to the varying demands of diverse cloud workloads. In contrast, machine learning (ML) assisted policies can learn the patterns of different workloads and use predictions to guide their management decisions. This dissertation focuses on how to effectively and safely use online learning techniques to improve the management of cloud platforms. We first present SmartHarvest, a system that improves server CPU utilization by dynamically adjusting the number of CPU cores assigned to Virtual Machines (VMs) of different priorities. It uses an online cost-sensitive multi-class classification algorithm to predict the core demand of primary VMs and safely harvests the predicted unused cores to run a secondary best-effort VM. To make it easier to incorporate online learning into various types of server node management agents, we then introduce SOL, a Safe On-node Learning framework for developing and operating learning-based node agents. SOL simplifies agent implementation and enables developers to focus on the design of agent-specific components (e.g., power management or memory migration). SOL further ensures that agents implemented on top of it are robust to the range of abnormal or failure conditions that can occur in production, so they can be safely deployed alongside customer workloads in the cloud. We demonstrate the use of SOL by implementing three node agents that leverage online learning to improve management of CPU cores, node power, and memory placement.

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2022; ©2022
Publication date 2022; 2022
Issuance monographic
Language English

Creators/Contributors

Author Wang, Yawen
Degree supervisor Kozyrakis, Christoforos, 1974-
Thesis advisor Kozyrakis, Christoforos, 1974-
Thesis advisor Rosenblum, Mendel
Thesis advisor Zaharia, Matei
Degree committee member Rosenblum, Mendel
Degree committee member Zaharia, Matei
Associated with Stanford University, Department of Electrical Engineering

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Yawen Wang.
Note Submitted to the Department of Electrical Engineering.
Thesis Thesis Ph.D. Stanford University 2022.
Location https://purl.stanford.edu/vm846qg7763

Access conditions

Copyright
© 2022 by Yawen Wang
License
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).

