Syllabus
Class Schedule and Materials
| January 7 (W) | Introduction |
|---|---|
| No material assigned | |
| January 12 (M) | History: Database systems |
| Topic: System internals relevant for data management systems. Read: Readings in Database Systems, Chapter 1: Anatomy of a database system (Sections 1-3). Optional reading: Page tables; How Linux represents the virtual address space. | |
| January 14 (W) | History: Database systems |
| Topic: System internals relevant for data management systems (continued). Read: Readings in Database Systems, Chapter 1: Anatomy of a database system (Sections 1-3). | |
| January 19 (M) | History: Database systems |
| Topic: Database Transactions. Read: Readings in Database Systems, Chapter 1: Anatomy of a database system (Sections 4-7). Optional reading: Serializability; Isolation in database systems. Instructor notes: Isolation in database systems. | |
| January 21 (W) | History: Database systems |
| Topic: Database Transactions (continued). Read: Readings in Database Systems, Chapter 1: Anatomy of a database system (Sections 4-7). | |
| January 26 (M) | Small Data vs. Big Data |
| Topic: To index or not to index? Read: Properties of Indexes (Ch. 8.3-8.4, Ramakrishnan and Gehrke); Tree-structured indices (Ch. 9, Ramakrishnan and Gehrke); Why MySQL could be slow with large tables? (skip sections on Buffers and Joins). Optional material: B+ trees; DBMS Indexing: The Basic Concept. Instructor notes: Indexing and B+ trees. | |
| January 28 (W) | An excursion into Distributed Systems |
| Topic: Paxos. Optional material: John Ousterhout's Paxos video. Instructor notes: Paxos notes. | |
| February 2 (M) | History: Database systems |
| Topic: Logging and Recovery in Database systems. Read: Ramakrishnan and Gehrke, DBMS, 2nd edition, Ch. 20 (Logging). Instructor notes: ARIES Logging and Recovery. | |
| February 4 (W) | History: Database systems |
| Topic: Logging and Recovery in Database systems (continued). Read: Ramakrishnan and Gehrke, DBMS, 2nd edition, Ch. 20 (Logging). Instructor notes: Redo vs. Undo. | |
| February 9 (M) | Reading week |
| NO CLASS | |
| February 11 (W) | Reading week |
| NO CLASS | |
| February 16 (M) | File Systems for Big Data |
| Read: The Google File System. Instructor notes: GFS. | |
| February 18 (W) | File Systems for Big Data |
| Read: The Google File System | |
| February 23 (M) | In-memory analytics |
| Read: Spark. Instructor notes: Spark. | |
| February 25 (W) | No class |
| NO CLASS | |
| March 2 (M) | Delay scheduling and Amazon Dynamo |
| Read: Delay Scheduling: A Simple Technique for Achieving Locality and Fairness in Cluster Scheduling (skip Sections 3.5.1, 5.2-5.4); Dynamo: Amazon's highly available key-value store. Instructor notes: Delay Scheduling; Amazon Dynamo. | |
| March 4 (W) | Delay scheduling and Amazon Dynamo |
| Lecture: Overview of technologies behind Dynamo DB. Slides: Vector Clocks; Routing in DHTs. | |
| March 9 (M) | Amazon Dynamo DB and Google Spanner |
| Discussion of the Dynamo DB paper. No reading assigned. | |
| March 11 (W) | Amazon Dynamo DB and Google Spanner |
| Read: Spanner: Google's Globally Distributed Database. Watch: Video presentation at OSDI '12 (courtesy of USENIX). | |
| March 16 (M) | Real-time Data Analytics: the RAD stack |
| Read: Building a Data Pipeline That Handles Billions of Events in Real-Time; Druid: A Real-Time Analytics Data Store. Instructor Notes. | |
| March 18 (W) | Real-time Data Analytics: the RAD stack |
| Read: Storm @Twitter. Instructor Notes. | |
| March 23 (M) | Real-time Data Analytics: the RAD stack |
| Read: Kafka: a Distributed Messaging System for Log Processing. Instructor Notes. | |
| March 25 (W) | Real-time Data Analytics: the RAD stack |
| Read: Nothing. Other: Two-phase commit | |
| March 30 (M) | Real-time Data Analytics: the RAD stack |
| Read: ZooKeeper: Wait-free coordination for Internet-scale systems. Instructor Notes. | |
| April 1 (W) | Presentations |
| Groups presenting: Jonathan Shijie, Yao, Siyong Mehrnosh, Yunduz | |
| April 6 (M) | No Class |
| Easter Monday | |
| April 8 (W) | Presentations |
| Groups presenting: Shailesh, Mohmed, Weipu; Louis, Frank, Ziqi; Honto, Linda, Ankit | |
| April 13 (M) | Presentations |
| Groups presenting: Yongyi, Rao, Yu; Joydeep, Manekta, Poonam; Amirali, Subir, Naveen; Ramtin, Conor, Mohsen; Shruthi, Margaret, Pratik; Ziyang, Wenqi, Xudong | |
Reading list:
Replication Consistency semantics
Replication: Consistency Models
Discretized streams
Large neural networks
Apache YARN
Dryad
DryadLINQ
The Chubby lock service for loosely-coupled distributed systems
Scaling memcached at Facebook
Avoiding coordination in database systems
A Comparison of Fractal Trees to Log-Structured Merge (LSM) Trees
Cassandra
Tachyon
Memcached
Paxos Made Live
ZippyDB?
Updated Wed July 08 2015, 18:28 by fedorova.