Difference between revisions of "CISC220 F2023 Lab5"
(→1. Binary Search Tree (BST) implementation [4 points]) |
(→1. Binary Search Tree (BST) implementation [4 points]) |
||
(18 intermediate revisions by the same user not shown) | |||
Line 4: | Line 4: | ||
* <tt>insert(key)</tt>. Put key into the BST. When a key is inserted more than once, there should not be multiple BSTNodes created, but rather the <tt>number</tt> member variable of the original BSTNode should be incremented | * <tt>insert(key)</tt>. Put key into the BST. When a key is inserted more than once, there should not be multiple BSTNodes created, but rather the <tt>number</tt> member variable of the original BSTNode should be incremented | ||
− | * <tt>remove(key)</tt>: Delete key from the BST if it exists | + | * <tt>remove(key)</tt>: Delete key from the BST if it exists. To make the autograder happy, when removing a node with two children, always choose its "replacement" from the LEFT subtree |
* <tt>int frequency(key)</tt>: How many of a given key are in the BST? This is the <tt>number</tt> variable referred to above | * <tt>int frequency(key)</tt>: How many of a given key are in the BST? This is the <tt>number</tt> variable referred to above | ||
* <tt>key findMin()</tt>: What is the smallest key in the BST? | * <tt>key findMin()</tt>: What is the smallest key in the BST? | ||
* <tt>key findMax()</tt>: What is the largest key in the BST? | * <tt>key findMax()</tt>: What is the largest key in the BST? | ||
− | * <tt>int computeHeight()</tt>: What is the height of the BST tree? | + | * <tt>int computeHeight()</tt>: What is the height of the BST tree? '''You can use AI tools to help write this function if you wish''' |
For comparison and debugging purposes, a naive implementation called '''BSTree_Slow''' which uses an unordered [http://www.cplusplus.com/reference/vector/vector/ '''STL vector'''] of pointers to BSTNodes (the ''left'', ''right'', and ''parent'' pointer variables of each node are NULL and unused) is provided. This version does not attempt to keep things ''organized'', so it is rather inefficient. Note that this '''NOT''' the same as representing a binary tree's structure with an array as in Lab #4. Also, height is undefined for this implementation so <tt>computeHeight()</tt> returns -1. | For comparison and debugging purposes, a naive implementation called '''BSTree_Slow''' which uses an unordered [http://www.cplusplus.com/reference/vector/vector/ '''STL vector'''] of pointers to BSTNodes (the ''left'', ''right'', and ''parent'' pointer variables of each node are NULL and unused) is provided. This version does not attempt to keep things ''organized'', so it is rather inefficient. Note that this '''NOT''' the same as representing a binary tree's structure with an array as in Lab #4. Also, height is undefined for this implementation so <tt>computeHeight()</tt> returns -1. | ||
− | <tt>main.cpp</tt> (linked above) contains driver code to demonstrate (aka test) the functionality of the BSTree_X class -- you can choose whether the '''Fast''' or '''Slow''' version is running by commenting/uncommenting the appropriate class declaration. The driver works as follows: two input filenames are expected as arguments to the program. The first input file contains words which are automatically added to the tree with <tt>insert</tt>. The second file contains one or more of the following commands: | + | <tt>main.cpp</tt> (linked above) contains driver code to demonstrate (aka test) the functionality of the BSTree_X class -- you can choose whether the '''Fast''' or '''Slow''' version is running by commenting/uncommenting the appropriate class declaration. The driver works as follows: two input filenames are expected as arguments to the program. The first input file contains words which are automatically added to the tree with <tt>insert()</tt>. The second file contains one or more of the following commands: |
− | * SIZE: | + | * SIZE: Print the result of the tree <tt>size()</tt> member function |
− | * HEIGHT: | + | * HEIGHT: Print the result of the tree <tt>size</tt> function |
− | * MIN: | + | * MIN: Print the result of the tree <tt>size</tt> function |
− | * MAX: | + | * MAX: Print the result of the tree <tt>size</tt> function |
− | * FREQUENCY arg: | + | * FREQUENCY arg: Print the result of calling <tt>frequency(arg)</tt> |
− | * REMOVE arg: | + | * REMOVE arg: Call <tt>remove(arg)</tt>, with the special interpretation of arg == "MAX" as an instruction to <tt>remove(findMax())</tt>, and similarly for arg == "MIN". Otherwise arg is just interpreted as a literal string |
You should not have to change how any of the driver code works in <tt>main.cpp</tt>, except to choose which version of the BST you are running and possibly insert print statements for debugging. Just be aware that in Gradescope <tt>main.cpp</tt> will set the BST version to '''Fast'''. | You should not have to change how any of the driver code works in <tt>main.cpp</tt>, except to choose which version of the BST you are running and possibly insert print statements for debugging. Just be aware that in Gradescope <tt>main.cpp</tt> will set the BST version to '''Fast'''. | ||
+ | |||
+ | '''You may work on this lab with a partner''' | ||
===2. BST timing comparison [1 point] === | ===2. BST timing comparison [1 point] === | ||
− | + | In addition to the normal parts of your README file, add a small section with a comparison of '''BSTree_Fast''' vs. '''BSTree_Slow''' with respect to ''efficiency.'' Specifically, devise, carry out, and report the results of several timing tests with (a) input files varying in size from small to large, and (b) command files containing multiple <tt>frequency</tt>, <tt>remove</tt>, and/or <tt>findMin()/findMax()</tt> instructions. To perform precise timing on a Linux, Mac, or Windows machine, there is useful advice [http://www.songho.ca/misc/timer/timer.html here]. You should discuss what specific tests you performed, show with a table or graph the observed relationship between input file size and time to perform the chosen operations, and make some kind of conclusion about patterns or trends that you see. | |
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | + | To carry out timing and hard-code specific "studies," you may be changing <tt>main.cpp</tt>. Please ''quote'' the relevant changes that you made to the provided <tt>main.cpp</tt> inside your README. | |
− | |||
− | |||
− | |||
− | |||
− | |||
===3. Submission=== | ===3. Submission=== | ||
− | + | As usual, submit two files through Gradescope: (1) <tt>README</tt> (or README.PDF -- actual nice formatting and/or figures for the timing comparison would be awesome!) and (2) your modified <tt>bst_fast_and_slow.hh</tt>. The README should contain your name and your partner's name, any AI tool citations/explanations, notes on any limitations or issues with your code, and your interpretation of any ambiguities in the assignment instructions. The code file should also have your name(s) and AI citations per the syllabus. | |
− | |||
− | |||
− | |||
− |
Latest revision as of 19:06, 29 September 2023
1. Binary Search Tree (BST) implementation [4 points]
Your task for this lab is to fill in the key member functions of a templatized binary search tree class BSTree_Fast (starter code and some test files are linked here). The class is defined in bst_fast_and_slow.hh -- this is where you should make all of your additions. Specifically, you need to complete the following functions using a traditional pointer-based linked structure as detailed in Chap. 6.2 of the textbook:
- insert(key). Put key into the BST. When a key is inserted more than once, there should not be multiple BSTNodes created, but rather the number member variable of the original BSTNode should be incremented
- remove(key): Delete key from the BST if it exists. To make the autograder happy, when removing a node with two children, always choose its "replacement" from the LEFT subtree
- int frequency(key): How many of a given key are in the BST? This is the number variable referred to above
- key findMin(): What is the smallest key in the BST?
- key findMax(): What is the largest key in the BST?
- int computeHeight(): What is the height of the BST tree? You can use AI tools to help write this function if you wish
For comparison and debugging purposes, a naive implementation called BSTree_Slow which uses an unordered STL vector of pointers to BSTNodes (the left, right, and parent pointer variables of each node are NULL and unused) is provided. This version does not attempt to keep things organized, so it is rather inefficient. Note that this NOT the same as representing a binary tree's structure with an array as in Lab #4. Also, height is undefined for this implementation so computeHeight() returns -1.
main.cpp (linked above) contains driver code to demonstrate (aka test) the functionality of the BSTree_X class -- you can choose whether the Fast or Slow version is running by commenting/uncommenting the appropriate class declaration. The driver works as follows: two input filenames are expected as arguments to the program. The first input file contains words which are automatically added to the tree with insert(). The second file contains one or more of the following commands:
- SIZE: Print the result of the tree size() member function
- HEIGHT: Print the result of the tree size function
- MIN: Print the result of the tree size function
- MAX: Print the result of the tree size function
- FREQUENCY arg: Print the result of calling frequency(arg)
- REMOVE arg: Call remove(arg), with the special interpretation of arg == "MAX" as an instruction to remove(findMax()), and similarly for arg == "MIN". Otherwise arg is just interpreted as a literal string
You should not have to change how any of the driver code works in main.cpp, except to choose which version of the BST you are running and possibly insert print statements for debugging. Just be aware that in Gradescope main.cpp will set the BST version to Fast.
You may work on this lab with a partner
2. BST timing comparison [1 point]
In addition to the normal parts of your README file, add a small section with a comparison of BSTree_Fast vs. BSTree_Slow with respect to efficiency. Specifically, devise, carry out, and report the results of several timing tests with (a) input files varying in size from small to large, and (b) command files containing multiple frequency, remove, and/or findMin()/findMax() instructions. To perform precise timing on a Linux, Mac, or Windows machine, there is useful advice here. You should discuss what specific tests you performed, show with a table or graph the observed relationship between input file size and time to perform the chosen operations, and make some kind of conclusion about patterns or trends that you see.
To carry out timing and hard-code specific "studies," you may be changing main.cpp. Please quote the relevant changes that you made to the provided main.cpp inside your README.
3. Submission
As usual, submit two files through Gradescope: (1) README (or README.PDF -- actual nice formatting and/or figures for the timing comparison would be awesome!) and (2) your modified bst_fast_and_slow.hh. The README should contain your name and your partner's name, any AI tool citations/explanations, notes on any limitations or issues with your code, and your interpretation of any ambiguities in the assignment instructions. The code file should also have your name(s) and AI citations per the syllabus.