This extends Assignment 1 using persistent data structures
and additional similarity metrics. It requires two programs.
- Create B-tree-based (for primary keys) and/or
hash-based (possibly for frequencies) persistent data
structures for the data and properties used in similarity
analyses, extending or changing those in Assignment 1 if
applicable; and load them with (all) data.
- Optionally, pre-categorize keys into 5 to 10 clusters
using k-means, k-medoids, or a similar method. (You can
instead perform categorization in the application.)
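
As a rough illustration of the hash-based option in the first item, the sketch below keeps term frequencies in a fixed-size file of slots with linear probing, so counts survive between runs. It is only one possible shape, not a required design: the names FrequencyFile, SLOT, NUM_SLOTS, and KEY_BYTES are hypothetical, the capacity is fixed (no resizing or deletion), and keys are assumed to fit in 32 bytes of UTF-8.

```python
import os
import struct
import tempfile
import zlib

KEY_BYTES = 32                            # assumed maximum key length in bytes
SLOT = struct.Struct(f"<B{KEY_BYTES}sI")  # used flag, padded key, count
NUM_SLOTS = 64                            # hypothetical fixed capacity


class FrequencyFile:
    """File-backed hash table of key -> count, using linear probing."""

    def __init__(self, path):
        is_new = not os.path.exists(path) or os.path.getsize(path) == 0
        self.f = open(path, "w+b" if is_new else "r+b")
        if is_new:
            # Pre-allocate every slot as empty (all zero bytes).
            self.f.write(bytes(SLOT.size * NUM_SLOTS))
            self.f.flush()

    def _probe(self, key):
        """Return (slot index, stored count or None if key is absent)."""
        raw = key.encode("utf-8")
        if len(raw) > KEY_BYTES:
            raise ValueError("key too long for a slot")
        # crc32 is stable across runs; the built-in hash() of str is not.
        start = zlib.crc32(raw) % NUM_SLOTS
        for step in range(NUM_SLOTS):
            i = (start + step) % NUM_SLOTS
            self.f.seek(i * SLOT.size)
            used, stored, count = SLOT.unpack(self.f.read(SLOT.size))
            if not used:
                return i, None
            if stored.rstrip(b"\0") == raw:
                return i, count
        raise RuntimeError("table is full")

    def add(self, key, amount=1):
        """Increase the stored count for key, creating the slot if needed."""
        i, count = self._probe(key)
        self.f.seek(i * SLOT.size)
        self.f.write(SLOT.pack(1, key.encode("utf-8"), (count or 0) + amount))
        self.f.flush()

    def get(self, key):
        """Return the stored count, or 0 if the key was never added."""
        return self._probe(key)[1] or 0

    def close(self):
        self.f.close()


path = os.path.join(tempfile.mkdtemp(), "freq.dat")
table = FrequencyFile(path)
for word in ["tree", "hash", "tree"]:
    table.add(word)
table.close()

reopened = FrequencyFile(path)  # counts are read back from disk
print(reopened.get("tree"), reopened.get("hash"), reopened.get("missing"))  # 2 1 0
reopened.close()
```

A B-tree for primary keys would follow the same pattern of fixed-size blocks addressed by file offset, with each block holding a node instead of a single slot.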
Extend Assignment 1 to display a category (cluster) and the
most similar key from the above data structures.
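
The categorization and lookup steps might look roughly like the sketch below. It makes several assumptions that the assignment does not state: each key maps to a sparse term-frequency dict, clustering is plain k-means with squared Euclidean distance and the first k items as initial centroids, and "most similar" means highest cosine similarity. The in-memory dict pages is a hypothetical stand-in for data read from the persistent structures, and k=2 is used only because the toy data is tiny (the assignment asks for 5 to 10 clusters).

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two sparse vectors (dicts)."""
    return sum((a.get(t, 0) - b.get(t, 0)) ** 2 for t in set(a) | set(b))


def cosine(a, b):
    """Cosine similarity between two sparse vectors (dicts)."""
    dot = sum(v * b.get(t, 0) for t, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = {}
    for vec in vectors:
        for term, value in vec.items():
            total[term] = total.get(term, 0.0) + value
    return {term: value / len(vectors) for term, value in total.items()}


def kmeans(data, k, max_rounds=20):
    """Return {name: cluster index}; stops early once labels stop changing."""
    names = list(data)
    centroids = [dict(data[name]) for name in names[:k]]  # assumed init
    labels = {}
    for _ in range(max_rounds):
        new_labels = {
            name: min(range(k), key=lambda c: sq_dist(data[name], centroids[c]))
            for name in names
        }
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [data[name] for name in names if labels[name] == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean(members)
    return labels


def most_similar(data, query):
    """Name of the other entry with the highest cosine similarity to query."""
    others = (name for name in data if name != query)
    return max(others, key=lambda name: cosine(data[query], data[name]))


# Hypothetical toy data standing in for the persisted frequency tables.
pages = {
    "cats": {"cat": 4, "pet": 2},
    "rust": {"code": 5, "compile": 3},
    "dogs": {"dog": 3, "pet": 3},
    "java": {"code": 4, "class": 2},
}
labels = kmeans(pages, k=2)
print(labels["cats"], most_similar(pages, "cats"))  # 0 dogs
```

If clustering is done ahead of time by the loader, the labels can be stored alongside each key so the application only needs a lookup.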