CISC220 F2021 Lab9

From class_wiki
Revision as of 16:15, 10 November 2021 by Cer (talk | contribs)
Jump to: navigation, search

Lab #9

1. File compression with Huffman coding

As discussed in class on Nov. 9, tries and Huffman coding can be used to compress files/messages by analyzing character frequencies and choosing codes accordingly.

Most of a Huffman class is provided in starter code here. It implements a TrieNode class that stores parent and child links as well as the character being encoded and its freqency. The "forest of tries" discussed in lecture on 11/9 is implemented with a priority queue of TrieNode pointers (each representing the root of a subtrie), ordered on the combined frequency of each subtrie.

Several major functions are unfinished. In order to separate them from the finished ones, they are located in main.cpp -- this is where you will be writing code.

The Huffman memberfunctions to finish in main.cpp:

  • [1 point] merge_two_least_frequent_subtries()
  • [1 point] compute_all_codes_from_trie(TrieNode *T)
  • [1 point] calculate_huffman_file_size()

2. Testing

  • [1 point] short_doi.txt requires x bytes in raw form. How many bytes does it take when compressed using your completed Huffman class?

3. Submission

  • Make a PDF file <Your Name>_Lab9_README.pdf with your answers to the testing questions above
  • Rename your code directory <Your Last Name>_Lab9 and create a single tar/zip/rar file out of it named <Your Last Name>_Lab9.tar (or .zip or .rar, etc.).
  • Submit it in Canvas by midnight at the end of Tuesday, November 16