Packet Transport Mechanisms for Data Center Networks
Mon., Feb 10, 2014
3:00 - 4:00 PM
521 Cory Hall (Hogan Room)
Title: Packet Transport Mechanisms for Data Center Networks

Abstract: In recent years, we have witnessed a fundamental shift in the computing landscape as much of the world's application workloads move to massive "cloud" data centers. These data centers enable the services we rely on daily, such as web search, social networking, and Internet commerce. They also require huge investments (hundreds of millions of dollars) to deploy and operate. This has spurred significant interest, in both industry and the research community, in innovation for data centers and, in particular, for data center networks.
A crucial feature of a data center network is its transport mechanism: the method by which data is transferred from one server to another at the highest possible rate and with the lowest latency. In this talk, I will present measurements from a 6000-server production cluster that reveal shortcomings of today's state-of-the-art Transmission Control Protocol (TCP) in data centers. The shortcomings are rooted in the demands TCP places upon the limited buffer space in commodity data center switches. I will then discuss Data Center TCP (DCTCP), a new transport mechanism we have designed for data centers that is now shipping with Windows Server 2012. DCTCP uses a simple modification of the TCP congestion control algorithm to achieve full throughput while requiring ~10x less buffering in the switches compared to TCP. I will present a control-theoretic analysis of DCTCP's control loop to prove its stability and precisely characterize its rate of convergence. Finally, I will describe HULL, an architecture that builds on DCTCP to deliver near-zero fabric latency: only propagation and switching latency, no queueing.
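For readers unfamiliar with DCTCP, the "simple modification" mentioned above can be sketched roughly as follows. This is a minimal illustration based on the published description of DCTCP, not code from the talk: the sender keeps a running estimate (alpha) of the fraction of packets that switches marked with ECN, and shrinks its window in proportion to that estimate rather than always halving it. The function names, the gain g = 1/16, and the minimum window are assumptions chosen for illustration.

```python
def update_alpha(alpha, marked, total, g=1 / 16):
    """Update the running estimate of the marked-packet fraction.

    alpha  : previous estimate, between 0 and 1
    marked : ECN-marked packets seen in the last window of data
    total  : all packets acknowledged in that window
    g      : smoothing gain (1/16 is an assumed, illustrative value)
    """
    if total <= 0:
        raise ValueError("total must be positive")
    fraction = marked / total
    # Exponentially weighted moving average of the marked fraction.
    return (1 - g) * alpha + g * fraction


def reduce_cwnd(cwnd, alpha, min_cwnd=1.0):
    """Shrink the congestion window in proportion to alpha.

    alpha near 0 (little congestion) barely changes the window;
    alpha equal to 1 (every packet marked) halves it, as standard TCP does.
    """
    return max(min_cwnd, cwnd * (1 - alpha / 2))


if __name__ == "__main__":
    alpha = 0.0
    cwnd = 100.0
    # Hypothetical window in which 5 of 10 packets were marked.
    alpha = update_alpha(alpha, marked=5, total=10)
    cwnd = reduce_cwnd(cwnd, alpha)
    print(alpha, cwnd)
```

Because the reduction scales with the extent of congestion instead of reacting to its mere presence, the sender can keep switch queues short without giving up throughput, which is the property the abstract attributes to DCTCP.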

Bio: Mohammad Alizadeh is a Researcher at Insieme Networks (recently acquired by Cisco Systems). He received his Ph.D. in Electrical Engineering from Stanford University, where he was advised by Balaji Prabhakar. Before that, he completed his undergraduate degree in Electrical Engineering at Sharif University of Technology. His research interests are broadly in network systems and algorithms, data center networking, and cloud computing. His dissertation work focused on designing high-performance data center transport mechanisms. His research has garnered significant industry interest: the DCTCP algorithm has been implemented in Windows Server 2012; the QCN congestion control algorithm has been standardized as the IEEE 802.1Qau standard; and most recently, the CONGA adaptive load balancing mechanism has been implemented in Cisco's new flagship Application Centric Infrastructure products. Mohammad is a recipient of a Stanford Electrical Engineering Departmental Fellowship, the Caroline and Fabian Pease Stanford Graduate Fellowship, and the Numerical Technologies Inc. Prize and Fellowship.
UC Berkeley Networking
Varun Jog and Ka Kit Lam
Last Modification Date: Sunday, January 26, 2014