Syllabus

Class Schedule and Materials

January 7 (W) Introduction
No material assigned
January 12 (M) History: Database systems
Topic:
System internals relevant for data management systems.
Read:
Readings in Database Systems, Chapter 1: Anatomy of a database system (Sections 1-3)
Optional reading:
Page tables
How Linux represents the virtual address space
January 14 (W) History: Database systems
Topic:
System internals relevant for data management systems (Continued).
Read:
Readings in Database Systems, Chapter 1: Anatomy of a database system (Sections 1-3)
January 19 (M) History: Database systems
Topic:
Database Transactions.
Read:
Readings in Database Systems, Chapter 1: Anatomy of a database system (Sections 4-7)
Optional reading:
Serializability
Isolation in database systems
Instructor notes:
Isolation in database systems
January 21 (W) History: Database systems
Topic:
Database Transactions (Continued).
Read:
Readings in Database Systems, Chapter 1: Anatomy of a database system (Sections 4-7)
January 26 (M) Small Data vs. Big Data
Topic:
To index or not to index?
Read:
Properties of Indexes (Ch. 8.3-8.4 Ramakrishnan and Gehrke)
Tree-structured indices (Ch. 9 Ramakrishnan and Gehrke)
Why MySQL could be slow with large tables? (skip sections on Buffers and Joins)
Optional material:
B+ trees
DBMS Indexing: The Basic Concept
Instructor notes:
Indexing and B+ trees
January 28 (W) An excursion into Distributed Systems
Topic:
Paxos
Optional material:
John Ousterhout's Paxos video
Instructor's notes:
Paxos notes
February 2 (M) History: Database systems
Topic:
Logging and Recovery in Database systems
Read:
Ramakrishnan and Gehrke, DBMS, 2nd edition -- Ch.20 (Logging)
Instructor notes:
ARIES Logging and Recovery
February 4 (W) History: Database systems
Topic:
Logging and Recovery in Database systems (Continued)
Read:
Ramakrishnan and Gehrke, DBMS, 2nd edition -- Ch.20 (Logging)
Instructor notes:
Redo vs. Undo
February 9 (M) Reading week
NO CLASS
February 11 (W) Reading week
NO CLASS
February 16 (M) File Systems for Big Data
Read:
The Google File System
Instructor notes:
GFS
February 18 (W) File Systems for Big Data
Read:
The Google File System
February 23 (M) In-memory analytics
Read:
Spark
Instructor notes:
Spark
February 25 (W) No class
NO CLASS
March 2 (M) Delay scheduling and Amazon Dynamo
Read:
Delay Scheduling: A Simple Technique for Achieving Locality and Fairness in Cluster Scheduling (skip Sections 3.5.1, 5.2-5.4)
Dynamo: Amazon's highly available key-value store
Instructor notes:
Delay Scheduling
Amazon Dynamo
March 4 (W) Delay scheduling and Amazon Dynamo
Lecture: Overview of technologies behind Dynamo DB
Slides:
Vector Clocks
Routing in DHTs
March 9 (M) Amazon Dynamo DB and Google Spanner
Discussion of the Dynamo paper. No reading assigned.
March 11 (W) Amazon Dynamo DB and Google Spanner
Read:
Spanner: Google's Globally Distributed Database
Watch:
Video presentation at OSDI'12 (courtesy of USENIX)
March 16 (M) Real-time Data Analytics: the RAD stack
Read:
Building a Data Pipeline That Handles Billions of Events in Real-Time
Druid: A Real-Time Analytics Data Store
Instructor Notes
March 18 (W) Real-time Data Analytics: the RAD stack
Read:
Storm @Twitter
Instructor Notes
March 23 (M) Real-time Data Analytics: the RAD stack
Read:
Kafka: a Distributed Messaging System for Log Processing
Instructor Notes
March 25 (W) Real-time Data Analytics: the RAD stack
Read:
Nothing.
Other:
Two-phase commit
March 30 (M) Real-time Data Analytics: the RAD stack
Read:
ZooKeeper: Wait-free coordination for Internet-scale systems
Instructor Notes
April 1 (W) Presentations
Groups presenting:
Jonathan
Shijie, Yao, Siyong
Mehrnosh, Yunduz
April 6 (M) No Class
Easter Monday
April 8 (W) Presentations
Groups presenting:
Shailesh, Mohmed, Weipu
Louis, Frank, Ziqi
Honto, Linda, Ankit
April 13 (M) Presentations
Groups presenting:
Yongyi, Rao, Yu
Joydeep, Manekta, Poonam
Amirali, Subir, Naveen
Ramtin, Conor, Mohsen
Shruthi, Margaret, Pratik
Ziyang, Wenqi, Xudong

Reading list:

Replication Consistency semantics
Replication: Consistency Models
Discretized streams
Large neural networks
Apache YARN
Dryad
DryadLINQ
The Chubby lock service for loosely-coupled distributed systems
Scaling Memcache at Facebook
Avoiding coordination in database systems
A Comparison of Fractal Trees to Log-Structured Merge (LSM) Trees
Cassandra
Tachyon
Memcached
Paxos Made Live
ZippyDB?

Updated Wed July 08 2015, 18:28 by fedorova.