Key | Paper | Student Email
Storage: f4 | f4: Facebook’s Warm BLOB Storage System | dporte7
ResourceNeg: YARN | Apache Hadoop YARN: Yet Another Resource Negotiator. Vavilapalli et al., SoCC, 2013. | kshind2
SQL: Hive | Major technical advancements in Apache Hive. Huai et al., SIGMOD, 2014. | bjain6
SQL: Impala | Impala: A Modern, Open-Source SQL Engine for Hadoop. Kornacker et al., CIDR, 2015. | psingh56
Streaming: SparkStreaming | Discretized Streams: Fault-Tolerant Streaming Computation at Scale. Zaharia et al., SOSP, 2013. | avenka35
QMS: Kafka | Kafka: a Distributed Messaging System for Log Processing. Kreps et al., NetDB Workshop, 2011. Also read this comparison of widely used Queuing Messaging Processing Systems. | wtoher2
Streaming: Dataflow | The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing. Akidau et al., VLDB, 2015. | spani2
Streaming: Aurora | Aurora: a new model and architecture for data stream management. Abadi et al., VLDB, 2003. | sjamal7
Social: FacebookAnalytics | Data warehousing and analytics infrastructure at Facebook. Thusoo et al., SIGMOD, 2010. | lchen79
Video: Chameleon | Chameleon: Scalable Adaptation of Video Analytics. Jiang et al., SIGCOMM, 2018. | srawat5
RDMA: eRPC | Datacenter RPCs can be General and Fast. Kalia et al., NSDI, 2019. | sgadho2
ML: TPU | In-Datacenter Performance Analysis of a Tensor Processing Unit. Jouppi et al., ISCA, 2017. | cmonta9

Overview

Every student is required to present one paper to the class. One to two students will present each class. Plan for at most 20 minutes of uninterrupted speaking time, but expect questions and interruptions throughout.

Your presentation should be based on the following template. The presentation slides must be sent to the instructor team via Piazza before the paper is presented in class (by 12:29pm).

Each student presentation should include:

  • Problem: What is the paper trying to solve? How real is the problem?
  • Key idea: What is the main idea, approach, and/or insight? This part of the presentation should use examples whenever appropriate. Specifically, the presentation should cover the technical details well enough that the audience can understand the key points without carefully reading the paper.
  • Novelty and Comparison: What is different from previous work, and why? How does this paper relate to the main paper (and other related work)?
  • Critique: Is there anything you would change in the solution?

Presentation Grading

The student presentations will be graded similarly to the reviews. In more detail, what I’m looking for is:

  • Does the presentation include all sections (Problem, Key idea, Novelty and Comparison, Critique)?
  • Are all assertions backed up? (E.g., “X is a bad idea” is not acceptable, but “X is a bad idea because Y” is acceptable.)
  • Did the student understand the material? Are there factual flaws in the presentation? For example, if the paper defines a term, does the student use it appropriately? As another example, if students state that a paper is relevant because modern operating systems do things the same way, is that true?
  • Did the student consider whether the evaluation is sufficient? Does it show that the work doesn’t harm regular programs, even if it works well for some programs? Do they evaluate all the goals for the system?
  • Is the presentation well timed?

Assigning grades:

  • If the presentation does an excellent job on all four considerations and provides genuinely insightful comments about the problem, contributions (going beyond what the paper claims are contributions), evaluation, and points of confusion, it should receive a check plus.
  • If two or more of the four criteria are not met, the presentation should receive a check minus.
  • Otherwise, it should receive a check. A check plus is worth 1 point, a check is 3/4 point, a check minus is 1/2 point, and not giving a presentation is worth zero points.