CISC220 F2023 Lab10
Lab #10
1. Hash tables
Starter code for an abstract class String_HT and two derived classes Chain_String_HT and Probe_String_HT is provided here. Several dictionaries ranging in size from 100 to ~84K English words, as well as text files for testing spellchecking, are also supplied along with the code.
The executable is called as follows: hashcheck <command filename>. Each command file initializes a hash table from a dictionary on the first line:
- CHAIN filename: Count the words in filename, compute an appropriately sized hash table, call the constructor of the Chain_String_HT class, which uses separate chaining for collision resolution, and insert every word into that hash table. Chains are implemented with the STL vector class. The only output is the dictionary hash table's WORD_COUNT TABLE_SIZE followed by a newline.
- PROBE filename: Same as the previous command, but call the constructor of the Probe_String_HT class, which uses quadratic probing for collision resolution. This scheme should use lazy deletion and follow the probing sequence f(1) = +1, f(2) = -4, f(3) = +9, f(4) = -16, and so on.
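The alternating-sign probing sequence above can be sketched as follows. This is only an illustration: the names probe_offset and probe_slot are hypothetical and not part of the starter code, and the sketch assumes that probe 0 is the home slot and that offsets wrap around modulo the table size.

```cpp
#include <cassert>
#include <cstddef>

// Signed offset of the i-th probe: f(1)=+1, f(2)=-4, f(3)=+9, f(4)=-16, ...
// Odd i yields +i^2 and even i yields -i^2. Treating f(0)=0 as the home
// slot is an assumption of this sketch.
long long probe_offset(long long i) {
    long long sq = i * i;
    return (i % 2 == 1) ? sq : -sq;
}

// Slot visited on the i-th probe, wrapped into [0, table_size).
// Wrapping modulo table_size is assumed; the handout does not spell it out.
std::size_t probe_slot(std::size_t home, long long i, std::size_t table_size) {
    long long m = static_cast<long long>(table_size);
    long long raw = (static_cast<long long>(home) + probe_offset(i)) % m;
    if (raw < 0) raw += m;  // in C++, % can return a negative remainder
    return static_cast<std::size_t>(raw);
}
```

The explicit correction for negative remainders matters here because every even-numbered probe steps backward from the home slot.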
The initialization line is followed by 0 or more 1-argument "commands" (one per line) which will trigger calls to member functions of the hash table object:
- INSERT word: Insert word into the hash table following the appropriate collision scheme. Update the word count and table size variables as necessary. If the word is already in the table, don't insert a duplicate. If the new load factor would be >= MAX_LOAD, then re-hash by calling expand_and_rehash(). Once again, the only output (after the insertion and bookkeeping) is WORD_COUNT TABLE_SIZE
- REMOVE word: Remove word from the hash table if it is there. Again, the only output (after the removal and bookkeeping) is WORD_COUNT TABLE_SIZE
- FIND word: If word is found in the hash table, do nothing and print nothing. If word is NOT found, print WORD NUM_COLLISIONS
- SPELLCHECK filename: Look up every word in the file filename and store each word that is NOT found, along with its associated number of collisions, in the badwords STL map. Then print the size of badwords on its own line and iterate through badwords, printing every WORD NUM_COLLISIONS pair
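The SPELLCHECK bookkeeping described above might look roughly like the sketch below. The Lookup callable is a hypothetical stand-in for whatever find routine the hash table exposes, and collect_badwords and print_badwords are illustrative names, not starter-code functions. The sketch also assumes words are whitespace-separated, that a repeated misspelling keeps the collision count of its most recent lookup, and that each pair is printed on its own line.

```cpp
#include <cassert>
#include <functional>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical lookup: returns {found, collisions seen during the search}.
using Lookup = std::function<std::pair<bool, int>(const std::string&)>;

// Read whitespace-separated words and record every word that is NOT found,
// together with its collision count, in a std::map (kept in sorted order).
std::map<std::string, int> collect_badwords(std::istream& in, const Lookup& lookup) {
    std::map<std::string, int> badwords;
    std::string word;
    while (in >> word) {
        std::pair<bool, int> result = lookup(word);
        if (!result.first) badwords[word] = result.second;
    }
    return badwords;
}

// Print the number of bad words on its own line, then one
// WORD NUM_COLLISIONS pair per line (the per-line layout is assumed).
void print_badwords(const std::map<std::string, int>& badwords, std::ostream& out) {
    out << badwords.size() << '\n';
    for (const auto& entry : badwords)
        out << entry.first << ' ' << entry.second << '\n';
}
```

Because std::map stores unique keys, a misspelling that occurs several times in the file is reported once.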
2. Programming tasks
These are core String_HT functions that are required but not directly tested. Since they don't generate output but have side effects on the data structures above, every print function should either call one of these or use the modified data structure(s).
- calculate_neighbor_words(string &)
- DFS_traversal(string &)
- BFS_traversal(string &)
These String_HT functions will be directly tested and correspond one-to-one with the commands listed above:
- [0.5 points] print_num_neighbors(string &)
- [0.5 points] print_neighbors(string &)
- [1 point] DFS_print_connected(string &, string &)
- [1 point] DFS_print_num_connected(string &)
- [0.5 points] BFS_print_path_length(string &, string &)
- [1 point] BFS_print_path(string &, string &)
- [0.5 points] BFS_print_longest_path()
You may use AI on any part of this lab, but no human partner.
3. Submission
Submit 2 files to Gradescope: (1) your README and (2) your modified main.cpp. The README should contain your name, complete declarations of AI use, notes on any limitations or issues with your code, and your interpretation of any ambiguities in the assignment instructions. main.cpp should also contain your name and per-function comments on AI usage.
