CIS 6930.5: Federated Distributed Systems

Fall 2006

Professor: Adriana Iamnitchi (Anda)
Semester: Fall 2006
Time and Venue: MW: 3:30pm-4:45pm in NES104
Office Hours: Wednesdays 1:00-3:00pm and by appointment
Office: ENB 334


announcements
syllabus
course format
schedule
projects


Announcements

09/23/2006:
There will be no class meeting on Wednesday, October 4, due to Anda's travel. No reviews are due; use the time to work on projects.
09/06/2006:
There will be no class on Monday, September 11, due to Anda's travel. Reviews are due that day at midnight (a 12-hour extension).
08/28/2006:
Welcome! Please join the class discussion board on H2O. If you attend the class and haven't received an invite with all details, please email me.

Syllabus

Federated distributed systems are collections of Internet-connected autonomous computing nodes spread across administrative domains. Participation in these federated systems allows access to potentially unique or large sets of resources such as data, storage space, computing power, or services. Examples of federated systems include computational grids, peer-to-peer networks, and wide-area testbeds such as PlanetLab.

This course is a tour through various research topics in federated distributed systems. We will explore solutions and learn design principles for building large network-based computational systems. Our readings and discussions will help us identify research problems and understand methods and general approaches to design, implement and evaluate distributed systems. Topics include resource management (discovery, allocation), data management (replication, location), security, fault-tolerance, system characterization, and overlay construction. Our discussions will often be grounded in the context of deployed distributed systems such as Grids and peer-to-peer networks.

The course involves discussions of four papers a week and a final project.

Grading is based on paper reviews and contributions to the class discussions (45%) and the final project (55%).

Reading materials: Most of the papers are available on the Internet. There is no required textbook for this class.

Prerequisites: This is a graduate-level class. Undergraduate students are welcome with the instructor's consent (email to anda at cse dot usf dot edu).


Course format

The course is structured to provide (a) an in-depth understanding of current topics in large-scale distributed systems research; (b) experience with reviewing and presenting advanced technical material; and (c) practice in writing papers. The class workload has a participation component and a final project.

Participation

In each class we discuss two research papers. Read the papers before class (be an efficient reader!) and write a review for each paper that includes the following:

  1. State the main contribution of the paper.
  2. Critique the main contribution.
    1. Rate the significance of the paper on a five-point scale: 5 (breakthrough), 4 (significant contribution), 3 (modest contribution), 2 (incremental contribution), 1 (no contribution or negative contribution). Explain your rating in a sentence or two.
    2. Rate how convincing the methodology is. You may consider some of the following questions (use what is relevant): do the claims and conclusions follow from the experiments? Are the assumptions realistic? Are the experiments well designed? Are there different experiments that would be more convincing? Are there other alternatives the authors should have considered? (And, of course, is the paper free of methodological errors?)
    3. What is the most important limitation of the approach?
  3. What are the three strongest and/or most interesting ideas in the paper?
  4. What are the three most striking weaknesses in the paper?
  5. Name three questions that you would like to ask the authors.
  6. Detail an interesting extension to the work not mentioned in the future work section.
  7. Optional comments on the paper that you’d like to see discussed in class.

Reviews must be submitted by noon before class to the relevant Rotisserie Discussion on H2O. Papers are discussed in class. Discussions will be led by one or more students and may include a brief (5-minute) presentation of the paper. Discussion leaders do not need to submit reviews, but they need to:

Final Project

The final project is an opportunity for hands-on research in distributed systems. It involves a literature survey, programming, running experiments or analytical modeling, analyzing results, and writing a 10-page report. A list of project ideas is posted, but students are highly encouraged to propose topics of their own interest. Teams of two students are highly recommended. Please see me if you want to form a 3-student team.

Milestones (tentative dates):


Schedule

Recent Talks by Ian Foster
The Anatomy of the Grid: Enabling Scalable Virtual Organizations, Foster, Kesselman, Tuecke.  IJSA, 2001.
On Death, Taxes, and the Convergence of Peer-to-Peer and Grid Computing. Ian Foster and Adriana Iamnitchi, IPTPS'03 [pdf] [ps]
 Scooped, Again, Jonathan Ledlie, Jeff Shneidman, Margo Seltzer, and John Huth. IPTPS'03 (pdf, ps, html)
DATE
TOPICS AND ARTICLES
EXTRA PAPERS (optional unless you're doing a project in this area)
DISCUSSION LEADERS
8/28
Introduction to the class, goals, and structure. [ppt]

Anda
8/30 Dive-in:
  1. Automated Worm Fingerprinting, Sumeet Singh, Cristian Estan, George Varghese, and Stefan Savage, OSDI 2004.
  2. Planet Scale Software Updates

Anda
9/4
Labor Day.
9/6
Real systems (1): BitTorrent
  1. Exploiting BitTorrent For Fun (But Not Profit). Nikitas Liogkas, Robert Nelson, Eddie Kohler, Lixia Zhang.
    Talk slides: PPT
  2. Understanding BitTorrent: An Experimental Perspective [html]
  1. A Case for Efficient Execution of Data-Intense Applications with BitTorrent on Computational Desktop Grids, Bei, Fedak, Cappello
Jeff K.
9/11
Real Systems (2): Peer-to-Peer
  1. A Survey and Comparison of Peer-to-Peer Overlay Network Schemes, Lua et al.
  2. Is P2P dying or just hiding? Karagiannis et al. [pdf]
Gnutella Protocol Specification
9/13
Real systems (3): Skype
  1. An Experimental Study of the Skype Peer-to-Peer VoIP System. Saikat Guha (Cornell), Neil Daswani, Ravi Jain.
    Talk slides: PDF
  2. Quantifying Skype User Satisfaction

Nicolas
9/14
Project proposals due. [12pt font, 1 page]
9/18
In-class discussion of project proposals.
No reviews required. Good reading:
  1. You and Your Research, R. W. Hamming [pdf] [html]
  2. Technology and Courage, I. Sutherland [pdf]

Anda.
9/20
Basics of Distributed Systems (1): Time and Synchronization
  1. Time, clocks and the ordering of events in a distributed system, Leslie Lamport, 1978 (Also, refer to this page for overview and interesting historical background: PODC Influential Paper Award 2000)
  2. Self-stabilizing systems in spite of distributed control, Communications of the ACM, 1974, 17(11):643-644, Edsger W. Dijkstra (Also, refer to this page for overview and interesting historical background: PODC Influential Paper Award 2002)

Anda.
9/25
Basics of Distributed Systems (2): Consensus
  1. The Byzantine Generals Problem, L. Lamport et al., TOPLAS 1982
  2. Impossibility of Distributed Consensus with One Faulty Process, Fischer, Lynch, Paterson, 1984

Anda.
9/27
Basics of the Internet: Design Principles and Topology
  1. End-to-end Arguments in System Design, J. Saltzer, D. Reed, and D. Clark, ACM Transactions on Computer Systems, Vol. 2, No. 4, pp. 195-206, 1984. [pdf]
  2. On Power-Law Relationships of the Internet Topology, Faloutsos, Faloutsos, and Faloutsos, SIGCOMM 1999 [pdf]

Jeff C.
10/2
Real Systems (4): Google and Ganglia
  1. The Google File System, Ghemawat et al. [pdf]
  2. The Ganglia Distributed Monitoring System: Design, Implementation, and Experience. Massie, Chun, and Culler. Parallel Computing, Vol. 30, Issue 7, July 2004. [PS]
Kevin
10/4
No class due to Anda's travel. Time to work on final projects.


10/9
Real headaches: Spammers
  1. Understanding the Network-level Behavior of Spammers
  2. Distributed Quota Enforcement for Spam Control, Michael Walfish, J.D. Zamfirescu, Hari Balakrishnan, David Karger, and Scott Shenker

Alex
10/10
Literature surveys due [12pt font, 3 pages]
10/11
Security
  1. SybilGuard: Defending against Sybil Attacks via Social Networks
  2. DDoS Defense by Offense

Mayur
10/16
Top 500 Supercomputers and their applications (lecture)
Earth Simulator
Overview of the Blue Gene/L System Architecture
Anda
10/18
Designing and Evaluating Parallel Programs (lecture)
Designing and Building Parallel Programs, by Ian Foster
Reevaluating Amdahl's Law
Anda
10/23
PlanetLab (1):
  1. The Design Principles of PlanetLab
  2. PlanetLab Architecture: An Overview

Ayodele
10/25
PlanetLab (2):
  1. A Case for Informed Service Placement on PlanetLab
and Hands-on Demo by Ayodele and Mayur
 
Mayur and Ayodele
10/30 Grid Computing: What Is It Really About? (lecture)
Anda
11/1
Evaluating Decentralized Systems:
  1. Fallacies in Evaluating Decentralized Systems
    Andreas Haeberlen, Alan Mislove, Ansley Post (Rice University / Max Planck Institute for Software Systems), Peter Druschel (Max Planck Institute for Software Systems).
    Talk slides: PPT
  2. Using PlanetLab for Network Research

Anda
11/6
Play Time: Massively Multiplayer Online Games
  1. Agents-based Modeling for a Peer-to-Peer MMOG Architecture

Also, in preparation for this class, read the following:
http://www.raphkoster.com/gaming/mudtimeline.shtml
http://en.wikipedia.org/wiki/MMORPG
Enabling Massively Multi-Player Online Gaming Applications on a P2P Architecture

A Propagation of Virtual Space Information Using a P2P Architecture for Massively Multiplayer Online Games
Jeff and Jeff
11/8
Web Services

Lydia
11/13 Invited Lecture: Ian Taylor:
From Web services to P2P and Grids


11/15
No class due to Supercomputing conference in Tampa.

11/17
Midterm project reports due [12pt font, 5 pages]. Hard deadline.
11/20
Midterm project status reports: in-class 10-minute presentations and discussion
11/22
Akamai:
  1. ACMS: The Akamai Configuration Management System
  2. Drafting Behind Akamai (Travelocity-Based Detouring)

 Adam
11/27
All we know about botnets [a totally different game]
Sunitha, Rod
11/29
You, the Reviewer. [a new game]
Tim.
12/4
Final project presentations (1).
12/6
Final project presentations (2).
12/15
Final project reports due. [12pt font, 10 pages]

In the final schedule (voted in by last year's class participants):

  1. Why Markets Could (But Don't Currently) Solve Resource Allocation Problems in Systems, Jeffrey Shneidman et al, HOTOS 2005
  2. The impact of DHT routing on resilience and proximity [pdf]

Paper Candidates: these papers are considered for the final schedule

  1. Shark: Scaling File Servers via Cooperative Caching
    Siddhartha Annapureddy, Michael J. Freedman, and David Mazières
    To appear in 2nd USENIX/ACM Symposium on Networked Systems Design and Implementation
    (NSDI '05) Boston, MA, May 2005.
    [ ps ] [ ps.gz ] [ pdf ]   Slides: [ pdf ]
  2. Minimizing Churn in Distributed Systems
  3. Beyond Bloom Filters: From Approximate Membership Checks to Approximate State Machines

  4. Tycoon: an Implementation of a Distributed, Market-based Resource Allocation System
  5. RE: Reliable Email, Scott Garriss, Carnegie Mellon University; Michael Kaminsky, Intel Research Pittsburgh; Michael J. Freedman, New York University and Stanford University; Brad Karp, University College London; David Mazières, Stanford University; Haifeng Yu, Intel Research Pittsburgh and Carnegie Mellon University
  6. PRESTO: Feedback-driven Data Management in Sensor Networks, Ming Li, Deepak Ganesan, and Prashant Shenoy, University of Massachusetts, Amherst
  7. Practical Data-Centric Storage, Cheng Tien Ee, University of California, Berkeley; Sylvia Ratnasamy, Intel Research Berkeley; Scott Shenker, ICSI and University of California, Berkeley
  8. Virtualization Aware File Systems: Getting Beyond the Limitations of Virtual Disks, Ben Pfaff, Tal Garfinkel, and Mendel Rosenblum, Stanford University
  9. Olive: Distributed Point-in-Time Branching Storage for Real Systems, Marcos K. Aguilera, Susan Spence, and Alistair Veitch, Hewlett-Packard Laboratories, Palo Alto
  10. OverCite: A Distributed, Cooperative CiteSeer, Jeremy Stribling, MIT Computer Science and Artificial Intelligence Laboratory; Jinyang Li, New York University and MIT Computer Science and Artificial Intelligence Laboratory via University of California, Berkeley; Isaac G. Councill, Pennsylvania State University; M. Frans Kaashoek and Robert Morris, MIT Computer Science and Artificial Intelligence Laboratory
  11. Colyseus: A Distributed Architecture for Online Multiplayer Games, Ashwin Bharambe, Jeffrey Pang, and Srinivasan Seshan, Carnegie Mellon University
  12. Pip: Detecting the Unexpected in Distributed Systems, Patrick Reynolds, Duke University; Charles Killian, University of California, San Diego; Janet L. Wiener, Jeffrey C. Mogul, and Mehul A. Shah, Hewlett-Packard Laboratories, Palo Alto; Amin Vahdat, University of California, San Diego
  13. Exploiting Availability Prediction in Distributed Systems, James W. Mickens and Brian D. Noble, University of Michigan
  14. Efficient Replica Maintenance for Distributed Storage Systems, Byung-Gon Chun, University of California, Berkeley; Frank Dabek, MIT Computer Science and Artificial Intelligence Laboratory; Andreas Haeberlen, Rice University/MPI-SWS; Emil Sit, MIT Computer Science and Artificial Intelligence Laboratory; Hakim Weatherspoon, University of California, Berkeley; M. Frans Kaashoek, MIT Computer Science and Artificial Intelligence Laboratory; John Kubiatowicz, University of California, Berkeley; Robert Morris, MIT Computer Science and Artificial Intelligence Laboratory
  15. Understanding Pollution Dynamics in P2P File Sharing.
    Uichin Lee (UCLA), Min Choi (KAIST), Junghoo Cho, M. Y. Sanadidi, Mario Gerla (UCLA).
  16. Proactive replication for data durability.
    Emil Sit (MIT), Andreas Haeberlen (Rice / MPI-SWS), Frank Dabek (MIT), Byung-Gon Chun, Hakim Weatherspoon (UC Berkeley), Robert Morris, M. Frans Kaashoek (MIT), John Kubiatowicz (UC Berkeley).
  17. F2F: Reliable Storage in Open Networks.
    Jinyang Li, Frank Dabek (MIT).
  18. On Object Maintenance in Peer-to-Peer Systems.
    Kiran Tati, Geoffrey M. Voelker (UCSD).
    Talk slides: PPT
  19. Tribler: A Social-Based Peer-to-Peer System.
    J. Pouwelse, P. Garbacki, J. Wang (Delft University of Technology), A. Bakker (Vrije Universiteit), J.Yang, A.Iosup, D.Epema, M.Reinders (Delft University of Technology), M. van Steen (Vrije Universiteit), H.Sips (Delft University of Technology).
  20. Fair File Swarming with FOX.
    Dave Levin, Rob Sherwood, Bobby Bhattacharjee (University of Maryland, College Park).
  21. Group Therapy for Systems: Using link attestations to manage failures
    Michael J. Freedman (NYU), Ion Stoica (UC Berkeley), David Mazières (Stanford), Scott Shenker (ICSI/UC Berkeley).


Projects

Some ideas will be suggested in class. You're strongly encouraged to propose your own project ideas. Be innovative and aim high!


Adriana Iamnitchi (anda at cse usf edu)