Difference between revisions of "CISC220 F2021 Lab5"

From class_wiki
Jump to: navigation, search
(Created page with "==Lab #2== The ''Drozdek'' references below are to the textbook. It shouldn't matter whether you have the 3rd or 4th edition. The exercises don't make it clear, but "comp...")
 
(Lab #2)
 
Line 1: Line 1:
==Lab #2==
+
===1. Binary Search Tree (BST) implementation===
  
The ''Drozdek'' references below are to the textbookIt shouldn't matter whether you have the 3rd or 4th edition.
+
Write two versions of a templatized binary search tree class '''BSTree_Fast''' and '''BSTree_Slow''' (details below).  For each version you should define the following ''FOUR'' member functions: ''insert()'', ''contains()'', ''findMin()'', and ''findMax()'' [2 points for ''Fast'' version, 1 point for ''Slow''].  Both classes will use the templatized class '''BSTNode''' to store an inserted elementWhen an element (aka ''key'') is inserted more than once, there should not be multiple BSTNodes created, but rather the ''number'' member variable of the original BSTNode should be incremented.
  
The exercises don't make it clear, but "computational complexity" means "big O complexity". Please write your f function (number of instructions, counting ''assignments only'') as well as your g function.
+
Starter code is provided [http://nameless.cis.udel.edu/class_data/220_f2014/code/bst_fast_and_slow.tar here]
  
===1. Written problems===
+
* '''BSTree_Fast'''
 +
** Implement the BST using a traditional pointer-based linked structure as detailed in Chap. 6.2 of the textbook.
 +
** Write a member function ''print()'' which does the following [0.5 points]:
 +
*** Prints a count of how many unique keys there are
 +
*** Prints how many keys total there are
 +
*** Prints the ''min'' and ''max'' keys
 +
* '''BSTree_Slow'''
 +
** Implement the BST using an unordered [http://www.cplusplus.com/reference/vector/vector/ '''STL vector'''] of pointers to BSTNodes (the ''left'' and ''right'' pointer variables of each node will be NULL and unused).  Do not attempt to keep things ''organized'': ''contains()'' should be a simple loop through the vector, ''insert()'' should just be ''push_back()'' if this is the first occurrence of the key, and so on.  Note that this '''NOT''' the same as representing a binary tree's structure with an array as described in lecture and Chap. 6.2 of the textbook. 
 +
** Write a member function ''print()'' which does the same as above [0.5 points]
  
* '''(0.75 points)''' Drozdek exercise 2.11.7 about changing the longest subarray algorithm.  Please explain your reasoning.
+
===2. BST testing and comparison===  
** '''Note: There is a typo in the modified outer loop code.  "n==i" should read "n - i" '''
 
* '''(0.75 points)''' Drozdek exercise 2.11.8 about the ''k''th smallest integer
 
* '''(2 points)''' Drozdek exercise 2.11.10 with four loops.  Show the steps of your calculations.
 
* '''(1.5 points)''' Depending on the "shape" of the tree, ''insert()'' for a binary search tree might be as good as ''O(log n)'' (where ''n'' is the number of nodes in the tree) or it might be as bad as ''O(n)''.  Explain in detail what tree characteristics lead to both of these outcomes.
 
  
===2. Submission===
+
Write a driver function which creates and fills a '''separate''' '''BSTree_Fast''' and '''BSTree_Slow''' of C++ ''string''s '''for each''' of the following files by reading from them word-by-word and calling insert() repeatedly. Note that all of these files have been "cleaned" (punctuation removed, all lower-case).
  
* Make a '''PDF file''' <Your Name>_Lab5_README.pdf with your answers to the written exercises
+
* [http://nameless.cis.udel.edu/class_data/220_f2014/data/cleaned_doi.txt DOI] (about 1K total words)
* No need to compress or archive the PDF, since that's all you're turning in.
+
* [http://nameless.cis.udel.edu/class_data/220_f2014/data/cleaned_bts.txt BTS] (~14K words)
* Submit it in Canvas by ''midnight at the end of Thursday, October 9''
+
* [http://nameless.cis.udel.edu/class_data/220_f2014/data/cleaned_greatexp.txt GE] (~185K words)   
 +
 
 +
In your README PDF, answer the following questions ''for each file'' using the information from print() or your own custom functions.  ''Your BSTree_Fast and BSTree_Slow should give the same answers for the first three questions'' [1 point for all questions answered]
 +
 
 +
* How many unique words are there?
 +
* How many different words total are there? 
 +
* What are the alphabetically ''first'' and ''last'' words?
 +
 
 +
''How long'' does it takes for '''BSTree_Fast''' vs. '''BSTree_Slow''' to carry out the following ''FOUR'' operations for each of the three input files [1 point total]?  (To get precise timing under Unix, use the ''gettimeofday()'' function.  There is a code snippet under the Unix, Linux and Mac section [http://www.songho.ca/misc/timer/timer.html here] that should help.)
 +
* Create the BST
 +
* Search for each of these words with contains():
 +
** love
 +
** death
 +
** tyranny
 +
 
 +
===3. Submission===
 +
 
 +
* Include a '''PDF file''' <Your Name>_README.pdf which contains:
 +
** Answers to the questions above.  Do NOT include the full listing from your print() functions.  However, a call to print() for both versions of the BST should be in your driver code.
 +
** The timing measurements requested, in a nicely-formatted table, as well as comments on why you think the timings vary for the different (1) BST implementations, (2) operations being performed, and (3) files from which the BSTs were created. 
 +
* Rename your code directory <Your Last Name>_Lab5 and create a single tar/zip/rar file out of it named <Your Last Name>_Project1.tar (or .zip or .rar, etc.).  
 +
* Submit it in Canvas by ''midnight at the end of Tuesday, October 5''

Latest revision as of 14:33, 29 September 2021

1. Binary Search Tree (BST) implementation

Write two versions of a templatized binary search tree class BSTree_Fast and BSTree_Slow (details below). For each version you should define the following FOUR member functions: insert(), contains(), findMin(), and findMax() [2 points for Fast version, 1 point for Slow]. Both classes will use the templatized class BSTNode to store an inserted element. When an element (aka key) is inserted more than once, there should not be multiple BSTNodes created, but rather the number member variable of the original BSTNode should be incremented.

Starter code is provided here

  • BSTree_Fast
    • Implement the BST using a traditional pointer-based linked structure as detailed in Chap. 6.2 of the textbook.
    • Write a member function print() which does the following [0.5 points]:
      • Prints a count of how many unique keys there are
      • Prints how many keys total there are
      • Prints the min and max keys
  • BSTree_Slow
    • Implement the BST using an unordered STL vector of pointers to BSTNodes (the left and right pointer variables of each node will be NULL and unused). Do not attempt to keep things organized: contains() should be a simple loop through the vector, insert() should just be push_back() if this is the first occurrence of the key, and so on. Note that this NOT the same as representing a binary tree's structure with an array as described in lecture and Chap. 6.2 of the textbook.
    • Write a member function print() which does the same as above [0.5 points]

2. BST testing and comparison

Write a driver function which creates and fills a separate BSTree_Fast and BSTree_Slow of C++ strings for each of the following files by reading from them word-by-word and calling insert() repeatedly. Note that all of these files have been "cleaned" (punctuation removed, all lower-case).

  • DOI (about 1K total words)
  • BTS (~14K words)
  • GE (~185K words)

In your README PDF, answer the following questions for each file using the information from print() or your own custom functions. Your BSTree_Fast and BSTree_Slow should give the same answers for the first three questions [1 point for all questions answered]

  • How many unique words are there?
  • How many different words total are there?
  • What are the alphabetically first and last words?

How long does it takes for BSTree_Fast vs. BSTree_Slow to carry out the following FOUR operations for each of the three input files [1 point total]? (To get precise timing under Unix, use the gettimeofday() function. There is a code snippet under the Unix, Linux and Mac section here that should help.)

  • Create the BST
  • Search for each of these words with contains():
    • love
    • death
    • tyranny

3. Submission

  • Include a PDF file <Your Name>_README.pdf which contains:
    • Answers to the questions above. Do NOT include the full listing from your print() functions. However, a call to print() for both versions of the BST should be in your driver code.
    • The timing measurements requested, in a nicely-formatted table, as well as comments on why you think the timings vary for the different (1) BST implementations, (2) operations being performed, and (3) files from which the BSTs were created.
  • Rename your code directory <Your Last Name>_Lab5 and create a single tar/zip/rar file out of it named <Your Last Name>_Project1.tar (or .zip or .rar, etc.).
  • Submit it in Canvas by midnight at the end of Tuesday, October 5