CS 494: Cloud Data Center Systems

Reading List

Comment: Papers from recent top-tier conferences (e.g., OSDI ’18, SOSP ’17, SIGCOMM ’18, NSDI ’18, ATC ’18, SIGMOD ’18, VLDB ’18, EuroSys ’18, ISCA ’18, ASPLOS ’18) that are not on the reading list may also be acceptable with permission from the instructor.

Overview and Architecture
Architecture: Compute+Overall	The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, L.A. Barroso, U. Hölzle, Synthesis Lectures on Computer Architecture, 2009. Chapters 1 and 2.
Architecture: Networks	Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google’s Datacenter Network, Singh et al., SIGCOMM 2015.
Architecture: Network Routing	VL2: A Scalable and Flexible Data Center Network, Greenberg et al., SIGCOMM 2009.
Architecture: Storage	The Hadoop Distributed File System, Shvachko et al, MSST, 2010.

Storage Systems
Storage: GFS	The Google File System, Ghemawat et al, SOSP, 2003.
Storage: FDS	Flat Datacenter Storage. Nightingale et al, OSDI, 2012.
Storage: EC-Cache	EC-Cache: Load-balanced, Low-latency Cluster Caching with Online Erasure Coding. Rashmi et al, OSDI, 2016.
Storage: f4	f4: Facebook’s Warm BLOB Storage System. Muralidhar et al, OSDI, 2014.
Storage: Bigtable	Bigtable: A Distributed Storage System for Structured Data. Chang et al, OSDI, 2006.
Storage: Dynamo	Dynamo: Amazon’s Highly Available Key-value Store. DeCandia et al, SOSP, 2007.
Storage: Spanner	Spanner: Google’s Globally-Distributed Database. Corbett et al, OSDI, 2012.
Storage: PhotoCache	An Analysis of Facebook Photo Caching. Huang et al, SOSP, 2013.
Storage: MemcachedFacebook	Scaling Memcache at Facebook. Nishtala et al, NSDI, 2013.
Storage: Chubby	The Chubby lock service for loosely-coupled distributed systems. Mike Burrows, OSDI, 2006.

Execution Engines, Resource Negotiators, Schedulers
Execution: MR	MapReduce: Simplified Data Processing on Large Clusters, Dean and Ghemawat, OSDI, 2004.
Execution: Dryad	Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks. Isard et al, EuroSys, 2007.
Execution: CIEL	CIEL: a universal execution engine for distributed data-flow computing. Murray et al, NSDI, 2011.
Execution: Stragglers	Reining in the Outliers in Map-Reduce Clusters using Mantri, Ananthanarayanan et al, OSDI, 2010.
Execution: DryadLINQ	DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language. Yu et al, OSDI, 2008.
Execution: Volcano	Encapsulation of parallelism in the Volcano query processing system. Goetz Graefe, SIGMOD, 1990.
Execution: Caching	PACMan: Coordinated Memory Caching for Parallel Jobs, Ananthanarayanan et al, NSDI, 2012.
ResourceNeg: YARN	Apache Hadoop YARN: Yet Another Resource Negotiator, Vavilapalli et al, SoCC, 2013.
ResourceNeg: Borg	Large-scale cluster management at Google with Borg. Verma et al, EuroSys, 2015.
ResourceNeg: Mesos	Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center, Hindman et al, NSDI, 2011.
ResourceNeg: DRF	Dominant Resource Fairness: Fair Allocation of Multiple Resource Types, Ghodsi et al, NSDI, 2011.
Scheduling: Packing	(Carbyne:) Altruistic Scheduling in Multi-Resource Clusters. Grandl et al, OSDI, 2016.
Scheduling: Packing	(Tetris:) Multi-Resource Packing for Cluster Schedulers, Grandl et al, SIGCOMM, 2014.
Scheduling: Packing	Quincy: Fair Scheduling for Distributed Computing Clusters. Isard et al, SOSP, 2009.
Scheduling: Re-Planning	Dynamic Query Re-Planning using QOOP. Mahajan et al, OSDI, 2018.
Scheduling: Threads	Arachne: Core-Aware Thread Management. Qin et al, OSDI, 2018.
Scheduling: Cache	RobinHood: Tail Latency Aware Caching – Dynamic Reallocation from Cache-Rich to Cache-Poor. Berger et al, OSDI, 2018.
Execution: Spark	Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing, Zaharia et al, NSDI, 2012.
Execution: Tez	Apache Tez: A Unifying Framework for Modeling and Building Data Processing Applications, Saha et al, SIGMOD, 2015.
Execution: F2	Fast and Flexible Analytics with F2, Grandl et al, 2017.
Execution: Flare	Flare: Optimizing Apache Spark with Native Compilation for Scale-Up Architectures and Medium-Size Data. Essertel et al, OSDI, 2018.
Execution: Transactions	Obladi: Oblivious Serializable Transactions in the Cloud. Crooks et al, OSDI, 2018.
Execution: LoadBalancing1	Ananta: Cloud Scale Load Balancing. Patel et al, SIGCOMM, 2013.
Execution: LoadBalancing2	Duet: Cloud Scale Load Balancing with Hardware and Software. Gandhi et al, SIGCOMM, 2014.
Execution: LoadBalancing3	SilkRoad: Making Stateful Layer-4 Load Balancing Fast and Cheap Using Switching ASICs, Miao et al, SIGCOMM, 2017.

Applications: Batch Analytics and SQL Frameworks
SQL: SparkSQL	Spark SQL: Relational Data Processing in Spark, Armbrust et al, SIGMOD, 2015.
SQL: Hive	Major technical advancements in Apache Hive, Huai et al, SIGMOD, 2014.
SQL: Impala	Impala: A Modern, Open-Source SQL Engine for Hadoop. Kornacker et al, CIDR, 2015.
SQL: Dremel	Dremel: Interactive Analysis of Web-Scale Datasets. Melnik et al, VLDB, 2010.
SQL: Trill	Trill: A High-Performance Incremental Query Processor for Diverse Analytics. Chandramouli et al, VLDB, 2014.
SQL: SIMD	Rethinking SIMD Vectorization for In-Memory Databases. Polychroniou et al, SIGMOD, 2015.
SQL: Joins	Multi-Core, Main-Memory Joins: Sort vs. Hash Revisited. Balkesen et al, VLDB, 2013.
GeoDistributed: Clarinet	Clarinet: WAN-Aware Optimization for Analytics Queries, Viswanathan et al, OSDI, 2016.
GeoDistributed: Geode	Global Analytics in the Face of Bandwidth and Regulatory Constraints, Vulimiri et al, NSDI, 2015.
AdHoc: TAG	TAG: a Tiny AGgregation Service for Ad-Hoc Sensor Networks. Madden et al, OSDI, 2002.

Applications: Stream Analytics
Streaming: Storm	Storm @Twitter, Toshniwal et al, SIGMOD, 2014.
Streaming: Heron	Twitter Heron: Stream Processing at Scale, Kulkarni et al, SIGMOD, 2015.
Streaming: FacebookStreaming	Realtime Data Processing at Facebook. Chen et al, SIGMOD, 2016.
Streaming: SparkStreaming	Discretized Streams: Fault-Tolerant Streaming Computation at Scale, Zaharia et al, SOSP, 2013.
Streaming: Flink	Apache Flink: Stream and Batch Processing in a Single Engine, Carbone et al, Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, 2015.
Streaming: Drizzle	Drizzle: Fast and Adaptable Stream Processing at Scale. Venkataraman et al, SOSP, 2017.
Streaming: Chi	Chi: A Scalable and Programmable Control Plane for Distributed Stream Processing Systems. Mai et al, PVLDB, 2018.
Streaming: Gloss	Gloss: Seamless Live Reconfiguration and Reoptimization of Stream Programs. Rajadurai et al, ASPLOS, 2018.
QMS: Kafka	Kafka: A Distributed Messaging System for Log Processing, Kreps et al, NetDB Workshop, 2011. Also read this comparison of widely used Queuing Messaging Processing Systems.
Streaming: rStreams	StreamScope: Continuous Reliable Distributed Processing of Big Data Streams, Lin et al, NSDI, 2016.
Streaming: Dataflow	The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing. Akidau et al, VLDB, 2015.
Streaming: Aurora	Aurora: a new model and architecture for data stream management. Abadi et al, VLDB, 2003.
Streaming/Execution: Naiad	Naiad: A Timely Dataflow System, Murray et al, SOSP, 2013.
Streaming: Scaling	Three steps is all you need: fast, accurate, automatic scaling decisions for distributed streaming dataflows. Kalavri et al, OSDI, 2018.

Applications: Graph Processing
GraphProc: Pregel	Pregel: A System for Large-Scale Graph Processing, Malewicz et al, SIGMOD, 2010.
GraphProc: TAO	TAO: Facebook’s Distributed Data Store for the Social Graph. Bronson et al, USENIX ATC, 2013.
GraphProc: PowerGraph	PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs, Gonzalez et al, OSDI, 2012.
GraphProc: GraphX	GraphX: Graph Processing in a Distributed Dataflow Framework, Gonzalez et al, OSDI, 2014.
GraphProc: Arabesque	Arabesque: A System for Distributed Graph Mining. Teixeira et al, SOSP, 2015.
GraphProc: RDF	Fast and Concurrent RDF Queries with RDMA-based Distributed Graph Exploration. Shi et al, OSDI, 2016.
GraphProc: ASAP	ASAP: Fast, Approximate Pattern Mining at Scale. Iyer et al, OSDI, 2018.
GraphProc: Grappa	Grappa: A Latency-Tolerant Runtime for Large-Scale Irregular Applications. Nelson et al, USENIX ATC, 2015.
GraphProc: Facebook	One Trillion Edges: Graph Processing at Facebook-Scale. Ching et al, VLDB, 2015.

Applications: Web, Search, Social
Search: Google	The anatomy of a large-scale hypertextual Web search engine. Brin and Page, Computer Networks and ISDN Systems, 1998.
Social: FacebookAnalytics	Data warehousing and analytics infrastructure at Facebook. Thusoo et al, SIGMOD, 2010.
Social: FacebookPhoto	Finding a needle in Haystack: Facebook’s photo storage. Beaver et al, OSDI, 2010.
Social: Unicorn	Unicorn: A System for Searching the Social Graph. Curtiss et al, VLDB, 2013.

Performance Isolation
PerfIso: Heracles	Heracles: Improving Resource Efficiency at Scale. Lo et al, ISCA, 2015.
PerfIso: Quasar	Quasar: resource-efficient and QoS-aware cluster management. Delimitrou and Kozyrakis, ASPLOS, 2014.
PerfIso: ZygOS	ZygOS: Achieving Low Tail Latency for Microsecond-scale Networked Tasks. Prekas et al, SOSP, 2017.
PerfIso: PerfIso	PerfIso: Performance Isolation for Commercial Latency-Sensitive Services. Iorgulescu et al, USENIX ATC, 2018.

Monitoring, Debugging
Monitor: Pivot	Pivot Tracing: Dynamic Causal Monitoring for Distributed Systems. Mace et al, SOSP, 2015.
Monitor: Analytics	Making Sense of Performance in Data Analytics Frameworks. Ousterhout et al, NSDI, 2015.
Monitor: COZ	COZ: Finding Code that Counts with Causal Profiling. Curtsinger and Berger, USENIX ATC, 2016.
Monitor: Scuba	Scuba: Diving into Data at Facebook. Abraham et al, VLDB, 2013.
Monitor: Imbalance	Characterizing Load Imbalance in Real-World Networked Caches. Huang et al, HotNets, 2014.
Monitor: End-to-End	The Mystery Machine: End-to-end Performance Analysis of Large-scale Internet Services. Chow et al, OSDI, 2014.
Monitor: Consistency	Existential Consistency: Measuring and Understanding Consistency at Facebook. Lu et al, SOSP, 2015.

Video Analytics
Video: Chameleon	Chameleon: Scalable Adaptation of Video Analytics. Jiang et al, SIGCOMM, 2018.
Video: Focus	Focus: Querying Large Video Datasets with Low Latency and Low Cost. Hsieh et al, OSDI, 2018.
Video: NoScope	NoScope: Optimizing Neural Network Queries over Video at Scale. Kang et al, VLDB, 2017.
Video: TinyThreads	Encoding, Fast and Slow: Low-Latency Video Processing Using Thousands of Tiny Threads. Fouladi et al, NSDI, 2017.
Video: SVE	SVE: Distributed Video Processing at Facebook Scale. Huang et al, SOSP, 2017.

Potpourri: Runtime, New Hardware Models, Serverless, and Approximation
Runtime: Weld	Weld: A Common Runtime for High Performance Data Analytics, Palkar et al, CIDR, 2017.
DeepLearning: DeepXplore	DeepXplore: Automated Whitebox Testing of Deep Learning Systems, Pei et al, SOSP, 2017.
Serverless: PyWren	Occupy the Cloud: Distributed Computing for the 99%, Jonas et al, SoCC, 2017.
Serverless: OpenLambda	Serverless Computation with OpenLambda. Hendrickson et al, HotCloud, 2016.
Serverless: Pocket	Pocket: Elastic Ephemeral Storage for Serverless Analytics. Klimovic et al, OSDI, 2018.
Serverless: Platforms	Peeking Behind the Curtains of Serverless Platforms, Wang et al, USENIX ATC, 2018.
Serverless: SOCK	SOCK: Rapid Task Provisioning with Serverless-Optimized Containers, Oakes et al, USENIX ATC, 2018.
Approx: BlinkDB	BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data, Agarwal et al, EuroSys, 2013.
Approx: BlinkML	BlinkML: Efficient Maximum Likelihood Estimation with Probabilistic Guarantees. Park et al, SIGMOD, 2019.
Approx: Quickr	Quickr: Lazily Approximating Complex AdHoc Queries in BigData Clusters. Kandula et al, SIGMOD, 2016.
RDMA: FaRM	FaRM: Fast Remote Memory. Dragojevic et al, NSDI, 2014.
RDMA: FaRM2	No compromises: distributed transactions with consistency, availability, and performance. Dragojevic et al, SOSP, 2015.
RDMA: FaSST	FaSST: Fast, Scalable and Simple Distributed Transactions with Two-Sided (RDMA) Datagram RPCs. Kalia et al, OSDI, 2016.
RDMA: eRPC	Datacenter RPCs can be General and Fast. Kalia et al, NSDI, 2019.
RDMA: Locks	Distributed Lock Management with RDMA: Decentralization without Starvation. Yoon et al, SIGMOD, 2018.
RDMA: Infiniswap	Efficient Memory Disaggregation with Infiniswap. Gu et al, NSDI, 2017.
RDMA: Databases	Accelerating Relational Databases by Leveraging Remote Memory and RDMA. Li et al, SIGMOD, 2016.
RDMA: FastNetworks	Remote Memory in the Age of Fast Networks. Aguilera et al, SoCC, 2017.
ML: TPU	In-Datacenter Performance Analysis of a Tensor Processing Unit. Jouppi et al, ISCA, 2017.
Hardware: Catapult	A Reconfigurable Fabric for Accelerating Large-Scale Datacenter Services. Putnam et al, ISCA, 2014.
Hardware: Strata	Strata: A Cross Media File System. Kwon et al, SOSP, 2017.
Offload: Floem	Floem: A Programming System for NIC-Accelerated Network Applications. Phothilimthana et al, OSDI, 2018.
Offload: iPipe	iPipe: A Framework for Building Datacenter Applications Using In-networking Processors. Liu et al, 2018.
Offload: Access	Direct Universal Access: Making Data Center Resources Available to FPGA. Shu et al, NSDI, 2019.
Misc: OneSize	“One Size Fits All”: An Idea Whose Time Has Come and Gone. Stonebraker and Çetintemel, ICDE, 2005.

Applications: Machine Learning
ML: ParamServ	Scaling Distributed Machine Learning with the Parameter Server, Li et al, OSDI, 2014.
ML: STRADS	STRADS: A Distributed Framework for Scheduled Model Parallel Machine Learning, Kim et al, EuroSys, 2016.
ML: TensorFlow	TensorFlow: A System for Large-Scale Machine Learning, Abadi et al, OSDI, 2016.
ML: RDBMS	Towards a Unified Architecture for in-RDBMS Analytics. Feng et al, SIGMOD, 2012.
ML: DimmWitted	DimmWitted: A Study of Main-Memory Statistical Analytics. Zhang and Re, VLDB, 2014.
DeepLearning: MS	Project Adam: Building an Efficient and Scalable Deep Learning Training System, Chilimbi et al, OSDI, 2014.
ML: KeystoneML	KeystoneML: Optimizing Pipelines for Large-Scale Advanced Analytics, Sparks et al, ICDE, 2017.
ML: Clipper	Clipper: A Low-Latency Online Prediction Serving System, Crankshaw et al, NSDI, 2017.
ML: SLAQ	SLAQ: Quality-Driven Scheduling for Distributed Machine Learning, Zhang et al, SoCC, 2017.
ML: TVM	TVM: An Automated End-to-End Optimizing Compiler for Deep Learning. Chen et al, OSDI, 2018.
ML: Janus	Janus: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs. Jeong et al, 2018.
ML: Tiresias	Tiresias: A GPU Cluster Manager for Distributed Deep Learning. Gu et al, NSDI, 2019.
ML: Optimus	Optimus: An Efficient Dynamic Resource Scheduler for Deep Learning Clusters. Peng et al, EuroSys, 2018.
ML: Gandiva	Gandiva: Introspective Cluster Scheduling for Deep Learning. Xiao et al, OSDI, 2018.
GraphProc/ML: GraphLab	Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud, Low et al, VLDB, 2012.
ML: TuX2	TuX2: Distributed Graph Computation for Machine Learning. TODO
ML: Gaia	Gaia: Geo-Distributed Machine Learning Approaching LAN Speeds. TODO
ML: Ray	Ray: A Distributed Framework for Emerging AI Applications. Moritz et al, OSDI, 2018.
ML: MXNet	MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems. Chen et al, Neural Information Processing Systems, Workshop on Machine Learning Systems, 2015.
ML: PipeDream	PipeDream: Fast and Efficient Pipeline Parallel DNN Training. Harlap et al, SysML, 2018.
ML: DeepCPU	DeepCPU: Serving RNN-based Deep Learning Models 10x Faster. Zhang et al, USENIX ATC, 2018.
ML: PRETZEL	PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems. Lee et al, OSDI, 2018.
ML: Facebook	Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective, Hazelwood et al, HPCA, 2018.