Overview

This class explores technologies, techniques, and designs for cloud data center networking, using real production networks at cloud providers like Google, Microsoft, and Amazon as examples. Topics include multipath topologies and routing, load balancing, network virtualization, fault-tolerance, performance isolation, network acceleration (e.g., RDMA), in-network computing, explicit congestion control, and protocol-independent programmable networking hardware. Ultimately, the goal is to foster an understanding of the many different aspects of data center networking in a way that is both comprehensive and current.

By the end of this course, you will have a good understanding of the main elements that work together to form a data center network. This includes the topologies and techniques that are used to scale data center networks to hundreds of thousands of servers and the services and accelerators that are used to scale the performance of the distributed applications that are run in data centers.

Evaluation will include paper reviews and discussion, two homeworks, and a final project.

Prerequisites

This class covers both advanced networking concepts and low-level network programming. As such, a thorough understanding of the operating systems concepts covered in CS 361 and the networking concepts covered in CS 450 is highly recommended. If you took CS 361 or CS 450 but struggled with either the concepts or the homework assignments, you may want to postpone taking this class until you have gained more experience with low-level programming and the POSIX API.

Peer Instruction

This course will be taught using Peer Instruction, a teaching model that places stronger emphasis on classroom discussion and student interaction.

Evaluation

Grades are curved based on an aggregate course score. The course grade weighting is:

Task                   % of total grade
Paper Reviews          25
Class Participation    15*
Homeworks              35
Project                25

PAPER REVIEWS

At every class meeting, we will discuss at least one research paper. By 12:15pm on the day of class, everyone must submit a paper review to the Course Discussion Website (Piazza).

CLASS PARTICIPATION

Participation is an incredibly important facet of this course. The baseline Class Participation grade will be based on both participation in classroom discussion questions and participation in the discussion of paper reviews on Piazza. (*) Your class participation grade can grow to a maximum of 25 through exceptional participation. Additional points are a bonus reserved for substantial contributions, entirely at the instructor’s discretion. Exceptional participation includes early reports of errors in assignments, helpful discussion on Piazza, contribution of helpful code to the common good of the class (e.g., test cases and/or testing scripts), and thoughtful discussions during lecture.

HOMEWORKS

Homeworks will consist of approximately 2-3 programming projects, each lasting between one and two weeks. Be sure to consult the online handout or the course discussion website if you have any questions.

Extra credit will not be awarded for early turn-ins. Zero credit will be given in any of the following cases:

  • No assignment submitted.
  • An assignment submitted after the due date, without notifying the TA beforehand.
  • An assignment submitted after the due date, after you’ve used your two late submissions.
  • An assignment submitted more than one week after the original due date.

Extra credit will be given in the following cases at the professor’s discretion:

  • Documentation Contributions: An ideal documentation contribution would be a markdown file (e.g., README.md) that gives a step-by-step guide for setting up and running any programs and experiments.
  • Infrastructure Contributions: These include Vagrant images, Ansible playbooks, CloudLab experiments and images, and various other configuration and installation scripts. Ideally, such contributions are also well documented.

HOMEWORK LATE POLICY

All assignments are due by the published due date. You have a total of 3 slip days that you may use without penalty. To use them, you must tell the professor via the course discussion website how many slip days you are using before the assignment is originally due, and you must also clearly state the number of slip days used in the assignment write-up.

FINAL PROJECT

Students will be required to form groups of 3-4 and to work together to complete a final project for this course. Final projects will be picked from a list compiled by the professor. The principal goal of this project is to create some new measurement or experiment related to the topic of data center networks and applications. To this end, every project is expected to generate at least one figure.

The project will be graded on the following aspects:

  • Write-up: The motivation, methodology, and results of every project must be detailed in a project write-up, formatted as either a PDF or a Markdown website (e.g., Jekyll). In addition to describing the project, this write-up should also discuss the implications of the project.
  • Intellectual Contributions: Each project should ideally have some key intellectual contribution. Examples include new algorithms and novel measurements.
  • Infrastructure: A project should generate infrastructure artifacts. These artifacts include source code, experimental scripts, and CloudLab experiments and images.
  • Documentation: A project should be documented such that any other student in this class could be expected to repeat the experiments.

As with the homeworks, extra credit will be awarded for exceptional documentation contributions and infrastructure contributions.

ACADEMIC INTEGRITY

Consulting with your classmates on assignments is encouraged, except where noted. However, turn-ins are individual, and copying code from your classmates or other sources is considered plagiarism. For example, given the question “how did you do X?”, a great response would be “I used function Y, with W as the second argument. I tried Z first, but it doesn’t work.” An inappropriate response would be “here is my code, look for yourself.” You should never look at someone else’s code, or show someone else your code.

To avoid suspicion of plagiarism, you must specify your sources together with all turned-in materials. List the classmates you discussed your homework with and the webpages from which you got inspiration or copied (short) code snippets. Plagiarism and cheating, such as copying the work of others or paying others to do your work, are prohibited and will be reported. We will be running MOSS, an automated plagiarism detection tool, on all submissions.

Cheating has consequences on two levels: for your grade and at the university level. Within the class, even a first instance of cheating on a programming assignment or problem set will result in failing the class.

I also report all academic integrity violations to the dean of students. If it is your first violation, the dean of students allows you to resolve the case informally: the student agrees that my description of what happened is accurate, and the only institutional repercussion is a note in your internal UIC files (i.e., the dean of students can see that this happened, but no professors or other people can, and it does not appear on your transcript). If you have a prior violation in any of your classes, the case goes to a formal hearing, and the dean of students decides on the institutional consequences. After multiple academic integrity violations, students may be suspended or expelled. In all cases, the student has the option to go through a formal hearing if they believe they did not actually violate the academic integrity policy. If the dean of students agrees that they did not, I restore their original grade and the matter is resolved.

TOPICS COVERED (tentative)

  • Multipath Topologies and Routing
  • Virtual Networking
  • Data Center Load Balancing
  • Fault-Tolerance
  • Performance Isolation
  • Network Acceleration and In-Network Computing
  • Atomic Multicast and Network Accelerated Consensus
  • Explicit Congestion Control and Packet Scheduling
  • Protocol Independent Programmable Networking Hardware
  • Network Function Virtualization

After these topics have been covered, the remaining weeks will be filled with papers selected by students from the Reading List.