Assignment 3
- Loader: From at least 4 different starting Wikipedia sites,
recursively collect a total of at least 400 Wikipedia sites by
following links (ignore Wikipedia navigation links). Store the
edges along with similarity-based distance metrics persistently
(possibly just in a Serialized file).
- Application: Write a program (either GUI or web-based) that
recreates the graph built by the Loader, and reports the number of
disjoint sets (based on one of the roots) as a connectivity check.
Allow a user to select any two sites, and display the shortest
(with respect to weights) path between them. The path can be
indicated by a series of sites, at each step indicating the links
not taken; but you are encouraged to also graphically display
paths.
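
One possible shape for the Application's core logic is sketched below in Java. It is a minimal sketch, not a required design: the class name WikiGraph and its methods are hypothetical, links are assumed to be treated as undirected for the connectivity check, and weights are assumed to be non-negative so that Dijkstra's algorithm applies. Breadth-first search is used here to count disjoint sets; a union-find structure would work equally well.

```java
import java.util.*;

// Hypothetical in-memory graph; names are illustrative, not prescribed.
class WikiGraph {
    // page title -> (linked page title -> similarity-based distance)
    private final Map<String, Map<String, Double>> adj = new HashMap<>();

    // Assumption: each link is stored in both directions (undirected).
    void addEdge(String from, String to, double weight) {
        adj.computeIfAbsent(from, k -> new HashMap<>()).put(to, weight);
        adj.computeIfAbsent(to, k -> new HashMap<>()).put(from, weight);
    }

    // Connectivity check: count disjoint sets with breadth-first search.
    int countComponents() {
        Set<String> seen = new HashSet<>();
        int components = 0;
        for (String start : adj.keySet()) {
            if (!seen.add(start)) continue;      // already reached earlier
            components++;
            Deque<String> queue = new ArrayDeque<>();
            queue.add(start);
            while (!queue.isEmpty()) {
                String u = queue.poll();
                for (String v : adj.get(u).keySet()) {
                    if (seen.add(v)) queue.add(v);
                }
            }
        }
        return components;
    }

    // Dijkstra's algorithm; returns an empty list if dst is unreachable.
    List<String> shortestPath(String src, String dst) {
        Map<String, Double> dist = new HashMap<>();
        Map<String, String> prev = new HashMap<>();
        PriorityQueue<Map.Entry<String, Double>> pq =
            new PriorityQueue<>((a, b) -> Double.compare(a.getValue(), b.getValue()));
        dist.put(src, 0.0);
        pq.add(new AbstractMap.SimpleEntry<>(src, 0.0));
        while (!pq.isEmpty()) {
            Map.Entry<String, Double> top = pq.poll();
            String u = top.getKey();
            // Skip queue entries made stale by a later, shorter distance.
            if (top.getValue() > dist.getOrDefault(u, Double.POSITIVE_INFINITY)) continue;
            if (u.equals(dst)) break;
            for (Map.Entry<String, Double> e : adj.getOrDefault(u, Map.of()).entrySet()) {
                double alt = top.getValue() + e.getValue();
                if (alt < dist.getOrDefault(e.getKey(), Double.POSITIVE_INFINITY)) {
                    dist.put(e.getKey(), alt);
                    prev.put(e.getKey(), u);
                    pq.add(new AbstractMap.SimpleEntry<>(e.getKey(), alt));
                }
            }
        }
        if (!dist.containsKey(dst)) return List.of();
        LinkedList<String> path = new LinkedList<>();
        for (String at = dst; at != null; at = prev.get(at)) path.addFirst(at);
        return path;
    }

    // Links leaving 'page' other than the one the path follows next.
    Set<String> linksNotTaken(String page, String next) {
        Set<String> others = new TreeSet<>(adj.getOrDefault(page, Map.of()).keySet());
        others.remove(next);
        return others;
    }
}
```

To report a path, one might walk the list returned by shortestPath and, for each consecutive pair of pages, print the page followed by the result of linksNotTaken for that pair.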
Doug Lea