Instructor: Doug Lea
Class: 9:35 T-Th
Office/Lab hours. Normally T-W-Th 11-12:30, but send mail at any time with questions or to set up online meeting
Course home page:
Prereqs: CSC241 (CSC221 also strongly encouraged).


The design, implementation, and analysis of data structures and algorithms for data stores, data streams, graphs, and related domains, along with their use in interactive networked applications.



Upon completion of this course, students will demonstrate ability to:


Notes and links for most topics are available in the course notes. However, you may consider buying the following book that will be referred to as a basic source on most topics, and that you might want to keep around as a good reference: Introduction to Algorithms by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Cliff Stein. MIT Press / McGraw-Hill.

See also files from in-class coding: BitSet64, HT, DigitalTree, Persistent HT

You will also be expected to work through several online tutorials and related readings.


Subject to minor change:
Two exams (30%)
Exam questions cover the design and theory of operation of data stores, graphs, and related data structures and algorithms, as well as comparison of different algorithms with respect to algorithmic efficiency and appropriateness for solving given problems. See Practice exam.
Exercises (10%)
Very short in-class or take-home exercises extending examples covered in class. About one per week. Ungraded but discussed in follow-up classes.
3 programming assignments (60%)
Projects entail specialized versions of data stores and graphs in support of interactive network-based applications. You may work with a partner, but must also submit a brief writeup (or internal documentation) explaining collaboration strategy. Programs may not be submitted unless they successfully run according to specification. Two percent credit is taken off per day late.

You must demo within 2 days of submitting. As a substitute for live demos, you may accompany your program source and screenshots or remote screenshare of successful operation with a brief (approx one-page) design and experience account of your approach and how it changed over time. (This may be embedded as main program documentation.) If you maintain commit logs or time logs every time you work on project, a lightly edited collection of these should suffice. I may send follow-up questions before accepting.

  1. Assignment 1
  2. Assignment 2
  3. Assignment 3


As detailed in course notes, here are the targeted topics for lectures, not including classes for review, exams, or coverage of project-specific applications for data analytics. Topics don't always fit exactly into class times, so these don't always exactly correspond to dates. Some or all of the final three topics will be presented if time is available.
  1. Overview; Interfaces vs implementations; applications
  2. Bitsets: relation to discrete math structures, big-O analysis
  3. Hashing: probabilistic analysis, chained tables
  4. Hashing: hash functions, non-chained tables
  5. Balanced trees: average vs worst case analysis; 2-3-4 trees
  6. Red-black trees; as 2-3-4 emulation; graphical and code views
  7. Traversal, range queries, and deletion for (balanced) trees
  8. Secondary storage: latencies, buffering, file systems, data representation
  9. Data representations for persistence; marshalling, Java APIs
  10. B-tree and hash based persistent data stores
  11. Overview other data structures for data stores: tries, prefix, skiplists
  12. Graph terminology, properties, representations, search
  13. Graph traversal: depth-first, breadh-first, paths
  14. Spanning trees and cycles; union-find algorithms
  15. Priority queues and their role in weighted graph problems; Kruskal MST
  16. Weighted graph path and tree algorithms: Prim/Dijkstra
  17. Network Flow: Edmonds/Karp; overview of others and applications
  18. Dynamic programming; application to all-pairs paths
  19. NP complete problems; definitions, examples
  20. Algorithm design techniques: divide and conquer, greedy, etc
  21. Analysis for general techniques; overview of master theorem
  22. Probabilistic and approximate algorithms
  23. Compression: Huffman, LZW, overview of others
  24. Strings: exact matching; overview of other problems and algorithms
  25. Storage management and garbage collection

Campus references

The CS and College policies and resources pages provide more information about:

Doug Lea