Key | Papers | Student Email |
---|---|---|
Storage: f4 | f4: Facebook’s Warm BLOB Storage System. Muralidhar et al., OSDI, 2014. | dporte7 |
ResourceNeg: YARN | Apache Hadoop YARN: Yet Another Resource Negotiator. Vavilapalli et al., SOCC, 2013. | kshind2 |
SQL: Hive | Major technical advancements in Apache Hive. Huai et al., SIGMOD, 2014. | bjain6 |
SQL: Impala | Impala: A Modern, Open-Source SQL Engine for Hadoop. Kornacker et al., CIDR, 2015. | psingh56 |
Streaming: SparkStreaming | Discretized Streams: Fault-Tolerant Streaming Computation at Scale. Zaharia et al., SOSP, 2013. | avenka35 |
QMS: Kafka | Kafka: a Distributed Messaging System for Log Processing. Kreps et al., NetDB Workshop, 2011. Also read this comparison of widely used Queuing Messaging Processing Systems. | wtoher2 |
Streaming: Dataflow | The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing. Akidau et al., VLDB, 2015. | spani2 |
Streaming: Aurora | Aurora: a new model and architecture for data stream management. Abadi et al., VLDB, 2003. | sjamal7 |
Social: FacebookAnalytics | Data warehousing and analytics infrastructure at Facebook. Thusoo et al., SIGMOD, 2010. | lchen79 |
Video: Chameleon | Chameleon: Scalable Adaptation of Video Analytics. Jiang et al., SIGCOMM, 2018. | srawat5 |
RDMA: eRPC | Datacenter RPCs can be General and Fast. Kalia et al., NSDI, 2019. | sgadho2 |
ML: TPU | In-Datacenter Performance Analysis of a Tensor Processing Unit. Jouppi et al., ISCA, 2017. | cmonta9 |
Overview
Every student is required to present one paper to the class. One to two students will present each class. Presentations should last at most 20 minutes, not counting interruptions; presenters should expect questions throughout.
Your presentation should be based on the following template. The presentation slides must be sent to the instructor team via Piazza before the class in which the paper is presented (by 12:29pm).
Each student presentation should include:
- Problem: What is the paper trying to solve? How real is the problem?
- Key idea: What is the main idea, approach, and/or insight? This part of the presentation should use examples whenever appropriate. Specifically, it should cover the technical details well enough that the audience can understand the key points without carefully reading the paper.
- Novelty and Comparison: What is different from previous work, and why? How does this paper relate to the main paper (and other related work)?
- Critique: Is there anything you would change in the solution?
Presentation Grading
The student presentations will be graded similarly to the reviews. In more detail, what I’m looking for is:
- Does the presentation include all sections (Problem, Key idea, Novelty and Comparison, Critique)?
- Are all assertions backed up? For example, “X is a bad idea” is not acceptable, but “X is a bad idea because Y” is acceptable.
- Did the student understand the material? Are there factual flaws in the presentation? For example, if the paper defines a term, does the student use it appropriately? As another example, if students state that a paper is relevant because modern operating systems do things the same way, is that true?
- Did the student consider whether the evaluation is sufficient? Does it show that the work doesn’t harm regular programs, even if it works well for some programs? Do they evaluate all the goals for the system?
- Is the presentation well timed?
Assigning grades:
- If the presentation does an excellent job on all four considerations and provides genuinely insightful comments about the problem, contributions (going beyond what the paper claims are contributions), evaluation, and points of confusion, it should receive a check plus.
- If two or more of the four criteria are not met, the presentation should receive a check minus.
- Otherwise, it should receive a check. A check plus is worth 1 point, a check is 3/4 point, a check minus is 1/2 point, and not giving a presentation is worth zero points.
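The grade assignment above can be sketched as a small function. This is only an illustration, not the instructor's actual grading procedure: the function name `presentation_grade`, the `POINTS` table, and the boolean inputs are hypothetical, and it assumes that doing "an excellent job" on a criterion can be reduced to a met/unmet flag.

```python
from fractions import Fraction

# Point values as stated in the grading rules.
POINTS = {
    "check plus": Fraction(1),
    "check": Fraction(3, 4),
    "check minus": Fraction(1, 2),
    "none": Fraction(0),  # no presentation given
}


def presentation_grade(criteria_met, insightful, presented=True):
    """Return the grade label for one presentation.

    criteria_met: one boolean per grading criterion (assumed met/unmet).
    insightful:   whether the comments were judged genuinely insightful.
    presented:    False if the student gave no presentation.
    """
    if not presented:
        return "none"
    unmet = sum(1 for met in criteria_met if not met)
    if unmet >= 2:
        return "check minus"
    if unmet == 0 and insightful:
        return "check plus"
    return "check"
```

For example, `POINTS[presentation_grade([True, True, True, False], insightful=True)]` evaluates to `Fraction(3, 4)`: a single unmet criterion rules out a check plus but does not trigger a check minus.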