Future scaling of datacenter power-efficiency


Abstract/Contents

Abstract
Datacenters are critical assets for today's Internet giants (Google, Facebook, Microsoft, Amazon, etc.). They host extraordinary amounts of data, serve requests for millions of users, generate untold profits for their operators, and have enabled new modes of communication, socialization, and organization for society at-large. Their steady growth in scale and capability has allowed datacenter operators to continually expand the reach and benefit of their services. The motivation for the work presented in this dissertation stems from a simple premise: future scaling of datacenter capability depends upon improvements to server power-efficiency. That is, without improvements to power-efficiency, datacenter operators will soon face a limit to the utility and capability of existing facilities that might result in (1) a sudden boom in datacenter construction, (2) a rapid increase in the cost to operate existing datacenters, or (3) stagnation in the growth of datacenter capability. This limit is akin to the "Power Wall" that the CPU industry has been grappling with for nearly a decade now. Like the CPU "Power Wall, " the problem is not that we can't build datacenters with more capability in the future per se. Rather, it will become increasingly difficult to do this economically; even now, the cost to provision power and cooling infrastructure is a substantial fraction of datacenter "Total Cost of Ownership." The root cause of this problem is the recent failure of Dennard scaling for semiconductor technologies smaller than 90 nanometers. Even though Moore's Law continues to march on at a steady pace, granting us exponential growth in transistor count in new processors, we can no longer make full use of that transistor count without also increasing the power density of those new processors. Thus, in order for datacenter operators to sustain the rate of growth in capability that they have come to expect, they must either provision new power and cooling infrastructure to support future servers (at exceptional cost), or find other ways to improve the power-efficiency of datacenters that do not depend on semiconductor technology scaling. Indeed, the initial onset of this problem led to rapid improvements to the efficiency of power delivery and cooling within datacenters, reducing non-server power consumption by an order of magnitude. Unfortunately, those improvements were essentially one-time benefits and have now been exhausted. In this dissertation, we show that most of the future opportunity to improve datacenter power-efficiency lies in improving the power-efficiency of the servers themselves, as most of the inefficiency in the rest of a datacenter has largely been eliminated. Then, we explore four compelling opportunities to improve server power-efficiency: two hardware proposals that explicitly reduce the power consumption of servers, and two software proposals that improve the power-efficiency of servers operating as a cluster. First, we present Multicore DIMM (MCDIMM), a modification to the architecture of traditional DDRx main memory modules optimized for energy-efficiency. MCDIMM modules divide the wide, 64-bit rank interface presented by ordinary DIMMs into smaller rank subsets. By accessing rank subsets individually, fewer DRAM chips are activated per column access (i.e. cache-line refill), which greatly reduces dynamic energy consumption. 
Additionally, we describe an energy-efficient implementation of error-correction codes for MCDIMMs, as well as "chipkill" reliability, which tolerates the failure of entire DRAM devices. For ordinary server configurations and across a wide range of benchmarks, we estimate more than 20% average savings in memory dynamic power consumption, though the impact on total system power consumption is more modest. We also describe additional, unexpected performance and static-power consumption benefits from rank subsetting.

Second, we propose an architecture for per-core power gating (PCPG) of multicore processors, where the power supply for individual CPU cores can be cut entirely. We propose that servers running at low to moderate utilization, as is common in datacenters, could operate with some of their cores gated off. Gating the power to a core eliminates its static power consumption, but requires flushing its caches and precludes using the core to execute workloads until power is restored. In our proposal, we improve the utility of PCPG by coordinating power gating actions with the operating system, migrating workloads off of soon-to-be-gated cores onto active cores. This is in contrast to contemporary industry implementations of PCPG that gate cores reactively. We control the gating of cores with a dynamic power manager that continually monitors CPU utilization. Our OS-integrated approach maximizes the opportunities to utilize PCPG relative to OS-agnostic approaches, and protects applications from incurring the latency of waking a sleeping core. We show that PCPG is beneficial for datacenter workloads, and that it can reduce CPU power consumption by up to 40% for underutilized systems with minimal impact on performance.

The preceding hardware proposals seek to improve the power-efficiency of individual servers directly. The improvements are modest, however, as there are many factors that contribute to the inefficiency of servers (e.g., cooling fans, spinning disks, power regulator inefficiency). More to the point, techniques like PCPG only address the power-inefficiency of underutilized CPUs, and do little to address the inefficiency of the rest of the components within a server when it is at low utilization. To address this shortcoming, this dissertation then takes a different tack and holistically assesses how utilization across clusters of servers can be manipulated to improve power-efficiency.

First, we describe how contemporary distributed storage systems, such as the Hadoop Distributed File System (HDFS), expect the perpetual availability of the vast majority of servers in a cluster. This artificial expectation prevents the use of low-power modes in servers; we cannot trivially turn servers off or put them into a standby mode without the storage system assuming the server has failed. Consequently, even if such a cluster is grossly underutilized, we cannot disable servers in order to reduce its aggregate power consumption. Thus, these clusters tend to be tragically power-inefficient at low utilization. We propose a simple set of modifications to HDFS to rectify this problem, and show that these storage systems can be built to be power-proportional. We find that running Hadoop clusters in fractional configurations can save between 9% and 50% of energy consumption, and that there is a trade-off between performance and energy consumption. Finally, we set out to determine why datacenter operators chronically underutilize servers which host latency-sensitive workloads.
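Before turning to that study, here is a minimal sketch of the kind of utilization-driven, OS-integrated gating policy that the PCPG proposal above describes. The thresholds and the gate/ungate/migrate primitives are hypothetical stand-ins for OS and platform mechanisms, not the dissertation's implementation.

    # Hypothetical utilization-driven per-core power gating policy (illustration only).
    GATE_BELOW = 0.30    # gate a core when average utilization of active cores falls below this
    UNGATE_ABOVE = 0.70  # wake a gated core when it rises above this

    class PcpgManager:
        def __init__(self, num_cores):
            self.active = set(range(num_cores))
            self.gated = set()

        def poll(self, per_core_utilization):
            # Average utilization over the cores that are still powered on.
            util = sum(per_core_utilization[c] for c in self.active) / len(self.active)
            if util < GATE_BELOW and len(self.active) > 1:
                core = max(self.active)
                self.migrate_threads_off(core)  # the OS moves work to the remaining cores first
                self.gate(core)                 # then power is cut: caches flushed, static power eliminated
            elif util > UNGATE_ABOVE and self.gated:
                self.ungate(min(self.gated))    # restore power before scheduling new work there

        # In a real system these would call into the OS scheduler and platform firmware.
        def migrate_threads_off(self, core):
            pass

        def gate(self, core):
            self.active.discard(core)
            self.gated.add(core)

        def ungate(self, core):
            self.gated.discard(core)
            self.active.add(core)

    manager = PcpgManager(num_cores=4)
    manager.poll({0: 0.10, 1: 0.05, 2: 0.20, 3: 0.15})  # low utilization: core 3 is gated
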
Using memcached as a canonical latency-sensitive workload, we demonstrate that latency-sensitive workloads suffer substantial degradation in quality-of-service (QoS) when co-located with other datacenter workloads. This encourages operators to be cautious when provisioning or co-locating services across large clusters, and this ultimately manifests as the low server utilization we see ubiquitously in datacenters. However, we find that these QoS problems typically manifest in a limited number of ways: as increases in queuing delay, scheduling delay, or load imbalance of the latency-sensitive workload. We evaluate several techniques, including interference-aware provisioning and replacing Linux's CPU scheduler with a scheduler previously proposed in the literature, to ameliorate QoS problems when co-locating memcached with other workloads. We ultimately show that good QoS for latency-sensitive applications can indeed be maintained while still running these servers at high utilization. Judicious application of these techniques can greatly improve server power-efficiency, and raise a datacenter's effective throughput per TCO dollar by up to 53%.

All told, we have found that there exists considerable opportunity to improve the power-efficiency of datacenters despite the failure of Dennard scaling. The techniques presented in this dissertation are largely orthogonal, and may be combined. Through careful focus on server power-efficiency, we can stave off stagnation in the growth of online services or an explosion of datacenter construction, at least for a time.
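As one illustration of interference-aware provisioning, the sketch below admits batch co-runners onto a memcached-like server only while the predicted tail latency stays within a service-level objective. The workload names, interference factors, and SLO are assumptions invented for the example, not measurements or results from the dissertation.

    # Hypothetical interference-aware co-location policy (illustration only).
    BASELINE_P95_US = 200.0   # p95 latency of the latency-sensitive service running alone
    SLO_P95_US = 500.0        # latency target that must not be exceeded

    # Made-up multiplicative p95 inflation measured offline for each candidate co-runner.
    interference = {
        "batch-analytics": 1.4,
        "video-transcode": 1.9,
        "log-compression": 1.2,
    }

    def admit_corunners(candidates):
        """Greedily co-locate tasks while the predicted p95 latency stays within the SLO."""
        predicted_p95 = BASELINE_P95_US
        admitted = []
        for task in sorted(candidates, key=lambda t: interference[t]):  # least interfering first
            if predicted_p95 * interference[task] <= SLO_P95_US:
                predicted_p95 *= interference[task]
                admitted.append(task)
        return admitted, predicted_p95

    tasks, p95 = admit_corunners(list(interference))
    print(f"Admitted co-runners: {tasks}; predicted p95 ~= {p95:.0f} us (SLO {SLO_P95_US:.0f} us)")
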

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2014
Issuance monographic
Language English

Creators/Contributors

Associated with Leverich, Jacob Barton
Associated with Stanford University, Department of Computer Science.
Primary advisor Kozyrakis, Christoforos, 1974-
Thesis advisor Olukotun, Oyekunle Ayinde
Thesis advisor Rosenblum, Mendel

Subjects

Genre Theses

Bibliographic information

Statement of responsibility Jacob Barton Leverich.
Note Submitted to the Department of Computer Science.
Thesis (Ph.D.)--Stanford University, 2014.
Location electronic resource

Access conditions

Copyright
© 2014 by Jacob Barton Leverich
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
