Difference between revisions of "CISC220 F2021 Lab9"

From class_wiki
Jump to: navigation, search
Line 5: Line 5:
 
As discussed in class on Nov. 9, tries and Huffman coding can be used to compress files/messages by analyzing character frequencies and choosing codes accordingly.
 
As discussed in class on Nov. 9, tries and Huffman coding can be used to compress files/messages by analyzing character frequencies and choosing codes accordingly.
  
Most of a Huffman class is provided in starter code [http://nameless.cis.udel.edu/class_data/220_f2010/code/huffman.tar here].  It implements a ''TrieNode'' class that stores parent and child links as well as the character being encoded and its freqency.
+
Most of a Huffman class is provided in starter code [http://nameless.cis.udel.edu/class_data/cisc220/huffman.tar here].  It implements a ''TrieNode'' class that stores parent and child links as well as the character being encoded and its freqency.
 
The "forest of tries" discussed in lecture on 11/9 is implemented with a priority queue of TrieNode pointers (each representing the root of a subtrie), ordered on the combined frequency of each subtrie.
 
The "forest of tries" discussed in lecture on 11/9 is implemented with a priority queue of TrieNode pointers (each representing the root of a subtrie), ordered on the combined frequency of each subtrie.
  

Revision as of 16:16, 10 November 2021

Lab #9

1. File compression with Huffman coding

As discussed in class on Nov. 9, tries and Huffman coding can be used to compress files/messages by analyzing character frequencies and choosing codes accordingly.

Most of a Huffman class is provided in starter code here. It implements a TrieNode class that stores parent and child links as well as the character being encoded and its freqency. The "forest of tries" discussed in lecture on 11/9 is implemented with a priority queue of TrieNode pointers (each representing the root of a subtrie), ordered on the combined frequency of each subtrie.

Several major functions are unfinished. In order to separate them from the finished ones, they are located in main.cpp -- this is where you will be writing code.

The Huffman memberfunctions to finish in main.cpp:

  • [1 point] merge_two_least_frequent_subtries()
  • [1 point] compute_all_codes_from_trie(TrieNode *T)
  • [1 point] calculate_huffman_file_size()

2. Testing

  • [1 point] short_doi.txt requires x bytes in raw form. How many bytes does it take when compressed using your completed Huffman class?

3. Submission

  • Make a PDF file <Your Name>_Lab9_README.pdf with your answers to the testing questions above
  • Rename your code directory <Your Last Name>_Lab9 and create a single tar/zip/rar file out of it named <Your Last Name>_Lab9.tar (or .zip or .rar, etc.).
  • Submit it in Canvas by midnight at the end of Tuesday, November 16