Improving resource efficiency in cloud computing
- Cloud computing is at a critical juncture. An increasing amount of computation is now hosted in private and public clouds. At the same time, datacenter resource efficiency, i.e., the effective utility we extract from system resources, has remained notoriously low, with utilization rarely exceeding 20-30%. Low utilization, coupled with the lack of scaling in hardware due to technology limitations, poses serious scalability roadblocks for cloud computing. At a high level, two main reasons hinder efficient scalability in datacenters. First, the reservation-based interface through which resources are currently allocated is fundamentally flawed. Users must determine how many resources a new application requires to meet its quality of service (QoS) constraints. Unfortunately, this is extremely difficult for users, who tend to overprovision their reservations, resulting in mostly-allocated but lightly-utilized systems. Second, underutilization is aggravated by performance unpredictability, the result of heterogeneity in hardware platforms, interference between applications contending for shared resources, and spikes in input load. Unpredictability results in further resource overprovisioning by users. The focus of this dissertation is to enable efficient, scalable, and performance-aware datacenters with tens to hundreds of thousands of machines by improving cluster management. To this end, we present contributions that address both the system-user interface and the complexity of resource management at scale. These techniques are directly applicable to current systems, with modest design alterations. We first present a new declarative interface between users and the cluster manager that centers around performance instead of resource reservations. This enables users to focus on the high-level performance objectives an application must meet, as opposed to the specifics of how these objectives should be achieved using low-level resources. 
On the system side, we make two fundamental contributions. First, we design a practical system that leverages data mining to quickly understand the resource requirements of incoming applications in an online manner. We establish that resource management at this scale cannot be solved with the traditional trial-and-error approach of conventional architecture and system design. We show that we can instead apply data mining principles that leverage the knowledge the system accumulates over time from incoming applications, significantly benefiting both performance and efficiency. We first use this approach in Paragon to tackle the platform heterogeneity and workload interference challenges in datacenter management. The cluster manager relies on collaborative filtering to identify the most suitable hardware platform for a new, unknown application, as well as its sensitivity to interference in various shared resources. We then extend a similar approach to address the larger problem of resource assignment and resource allocation with Quasar. To ensure minimal management overheads, we decompose the problem into four dimensions: platform heterogeneity, application interference, resource scale-up, and resource scale-out. This enables the majority of applications to meet their QoS targets while operating at 70% utilization on a cluster with several hundred servers. In contrast, a reservation-based system rarely exceeds 15-20% utilization, with worse per-application performance. Our second contribution pertains to designing scalable scheduling techniques that use the information from Paragon and Quasar to perform efficient and QoS-aware resource allocations. We develop Tarcil, a scalable scheduler that reconciles the high quality of sophisticated centralized schedulers with the low latency of distributed sampling-based systems. Tarcil relies on a simple analytical framework to sample resources in a way that provides statistical guarantees on a job meeting its QoS constraints. 
It incurs a few milliseconds of scheduling overhead, making it appropriate for highly loaded clusters that serve both short- and long-running applications. Finally, we design HCloud, a resource provisioning system for public cloud providers. HCloud leverages the information on the resource preferences of applications to determine the type (e.g., reserved versus on-demand) and size of required instances. The system guarantees high application performance while securing significant cost savings.
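The idea of sampling resources with a statistical guarantee, mentioned above for Tarcil, can be illustrated with a short calculation. The sketch below is not taken from the dissertation and does not claim to reproduce Tarcil's analytical framework. It assumes the simplest possible model: candidates are drawn independently and uniformly, and a known fraction of all resources is good enough for the job. The function names `success_probability` and `min_sample_size` are hypothetical.

```python
import math


def success_probability(sample_size: int, good_fraction: float) -> float:
    """Probability that at least one of `sample_size` independent, uniform
    samples lands on a resource that is good enough for the job.

    Assumption (not from the source): each sample is good with probability
    `good_fraction`, independently of the others.
    """
    return 1.0 - (1.0 - good_fraction) ** sample_size


def min_sample_size(good_fraction: float, target_probability: float) -> int:
    """Smallest number of samples whose success probability reaches the target.

    Solves 1 - (1 - q)^n >= p for n, i.e. n >= log(1 - p) / log(1 - q).
    """
    if not (0.0 < good_fraction < 1.0 and 0.0 < target_probability < 1.0):
        raise ValueError("both arguments must lie strictly between 0 and 1")
    n = math.ceil(math.log(1.0 - target_probability) / math.log(1.0 - good_fraction))
    return max(n, 1)


if __name__ == "__main__":
    # Hypothetical numbers: 10% of resources are suitable, 95% confidence wanted.
    n = min_sample_size(good_fraction=0.10, target_probability=0.95)
    print(n, success_probability(n, 0.10))
```

Under these assumptions, a job for which 10% of the resources are suitable needs 29 samples to find at least one suitable resource with 95% probability. The required sample size depends on the fraction of suitable resources and the target confidence, not on the cluster size, which is one reason sampling-based schedulers can keep decision latency low on large clusters.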
|Type of resource
|electronic; electronic resource; remote
|1 online resource.
|Stanford University, Department of Electrical Engineering.
|Kozyrakis, Christoforos, 1974-
|Ousterhout, John K
|Statement of responsibility
|Submitted to the Department of Electrical Engineering.
|Thesis (Ph.D.)--Stanford University, 2015.
- © 2015 by Christina Delimitrou
- This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).