CISC181 F2017 Lab5

Latest revision as of 06:59, 10 October 2017

Preliminaries

  • Make a new project with n = 5 (following these instructions)
  • Name your main class "Lab5" (when creating a new module in the instructions above, in the Java class name field)
  • Modify Lab5.java by adding your name and section number in a comment before the Lab5 class body.

Instructions

In this lab you will analyze text files and get experience working with strings and characters by calculating statistics about the words in a file that you read. Your program will repeatedly prompt the user in the console to enter the name of a text file (relative to a base directory of your choosing) or 'q' to quit. If the user does not want to quit, open the file with FileInputStream and read it with a Scanner, using this regular expression as your delimiter/word separator (see the slides for Oct. 11, https://docs.google.com/presentation/d/11XdFO8KrfDq8i3WYCB3T5bFSqeKn1rIGW0re0Ok7-hI/edit?usp=sharing):

"[\\s.!?,;:\\-()_\"]+"

Be careful when copying and pasting this into Android Studio. I have seen extra backslashes inserted automatically for several students, so make sure your delimiter string matches what you see here.
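
As a rough illustration of how the delimiter behaves, the sketch below reads words from any InputStream with a Scanner. The class name DelimiterDemo and the method readWords are hypothetical names chosen for this example, and the input comes from an in-memory string rather than a file so the sketch runs on its own. In the lab itself, the stream would be a FileInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper class for illustration; not part of the required lab files.
class DelimiterDemo {
    // The delimiter from the lab text, written as a Java string literal.
    static final String DELIMITER = "[\\s.!?,;:\\-()_\"]+";

    // Reads every word from the stream, splitting on DELIMITER.
    // Closing the Scanner also closes the underlying stream.
    static List<String> readWords(InputStream in) {
        List<String> words = new ArrayList<>();
        try (Scanner scanner = new Scanner(in, "UTF-8")) {
            scanner.useDelimiter(DELIMITER);
            while (scanner.hasNext()) {
                words.add(scanner.next());
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // An in-memory stream stands in for a FileInputStream here.
        InputStream in = new ByteArrayInputStream(
                "Four score, and seven years ago...".getBytes(StandardCharsets.UTF_8));
        System.out.println(readWords(in));
    }
}
```

Because the pattern ends in +, a run of several separators (such as a comma followed by a space) counts as a single delimiter, so no empty words are produced between them.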

After reading every word in the file, print the following information, each on its own line:

  1. Number of words
  2. Longest overall word. Note that if there are multiple words which "tie", the expected behavior is to output the first one found
  3. Longest capitalized word (first character is a capital letter)
  4. Word with most consonants. Treat 'y' as a consonant
  5. Alphabetically first word with 4 or more letters (treating upper-case and lower-case the same). Do not count words that start with a non-alphabetic character
  6. Alphabetically last word with 4 or more letters (treating upper-case and lower-case the same)
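
One possible way to compute the statistics in the list above is sketched here. The class and method names (WordStatHelpers, longest, and so on) are hypothetical, and the sketch makes several assumptions that the lab text does not settle: "4 or more letters" is read as a length of at least 4 characters, ties for most consonants go to the first word found (the text only states this for the longest word), and only ASCII letters are counted as consonants. It works on a list of words for brevity; the same comparisons could be made one word at a time while reading the file.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class for illustration; names are not from the lab text.
class WordStatHelpers {
    // Counts ASCII letters other than a, e, i, o, u, so 'y' is a consonant.
    static int countConsonants(String word) {
        int count = 0;
        for (char c : word.toLowerCase().toCharArray()) {
            if (c >= 'a' && c <= 'z' && "aeiou".indexOf(c) < 0) {
                count++;
            }
        }
        return count;
    }

    // Longest word; the strict '>' keeps the first word found on a tie.
    static String longest(List<String> words) {
        String best = "";
        for (String w : words) {
            if (w.length() > best.length()) best = w;
        }
        return best;
    }

    // Longest word whose first character is an upper-case letter.
    static String longestCapitalized(List<String> words) {
        String best = "";
        for (String w : words) {
            if (!w.isEmpty() && Character.isUpperCase(w.charAt(0))
                    && w.length() > best.length()) {
                best = w;
            }
        }
        return best;
    }

    // Word with the most consonants (assumption: first one found wins a tie).
    static String mostConsonants(List<String> words) {
        String best = "";
        int bestCount = -1;
        for (String w : words) {
            int n = countConsonants(w);
            if (n > bestCount) {
                best = w;
                bestCount = n;
            }
        }
        return best;
    }

    // Alphabetically first word of length >= 4 that starts with a letter,
    // compared case-insensitively. Returns null if no word qualifies.
    static String alphabeticallyFirst(List<String> words) {
        String best = null;
        for (String w : words) {
            if (w.length() >= 4 && Character.isLetter(w.charAt(0))
                    && (best == null || w.compareToIgnoreCase(best) < 0)) {
                best = w;
            }
        }
        return best;
    }

    // Alphabetically last word of length >= 4, compared case-insensitively.
    static String alphabeticallyLast(List<String> words) {
        String best = null;
        for (String w : words) {
            if (w.length() >= 4 && (best == null || w.compareToIgnoreCase(best) > 0)) {
                best = w;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList(
                "The", "quick", "Brown", "fox", "rhythms", "jumped", "Over");
        System.out.println(words.size());
        System.out.println(longest(words));
        System.out.println(longestCapitalized(words));
        System.out.println(mostConsonants(words));
        System.out.println(alphabeticallyFirst(words));
        System.out.println(alphabeticallyLast(words));
    }
}
```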

After printing this information, make sure to close the file, then prompt the user again until they want to quit.

All of this should be in a public class WordStats. WordStats should have a constructor which takes the base directory string as its sole parameter and handles the console input and looping itself. Note: FileInputStream's constructor wants the full-path name of the file, but the user should only have to enter the simple file name. So the WordStats constructor parameter is the full path to the directory that the files live in; concatenate it with the file name typed by the user to get the path you pass to FileInputStream.
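
The overall shape of such a class might look like the sketch below. The names WordStatsSketch, Lab5Sketch, and fullPath are hypothetical and the prompt wording is made up. The sketch prints only the word count, and it joins the directory and file name with new File(baseDir, fileName) instead of raw string concatenation, which is a swapped-in choice that avoids worrying about a missing path separator. The try-with-resources statement closes the file before the next prompt.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Scanner;

// Hypothetical stand-in for the WordStats class; only the word count is printed.
class WordStatsSketch {
    static final String DELIMITER = "[\\s.!?,;:\\-()_\"]+";

    // Builds the full path from the base directory and the simple file name.
    static String fullPath(String baseDir, String fileName) {
        return new File(baseDir, fileName).getPath();
    }

    // The constructor takes the base directory and runs the prompt loop itself.
    WordStatsSketch(String baseDir) {
        Scanner console = new Scanner(System.in);
        while (true) {
            System.out.print("Enter a file name, or q to quit: ");
            if (!console.hasNextLine()) {
                break; // console input ended
            }
            String name = console.nextLine().trim();
            if (name.equals("q")) {
                break;
            }
            // try-with-resources closes the Scanner and the file automatically.
            try (FileInputStream in = new FileInputStream(fullPath(baseDir, name));
                 Scanner words = new Scanner(in)) {
                words.useDelimiter(DELIMITER);
                int count = 0;
                while (words.hasNext()) {
                    words.next();
                    count++;
                }
                System.out.println(count);
            } catch (IOException e) {
                System.out.println("Could not read " + name + ": " + e.getMessage());
            }
        }
    }
}

// Hypothetical stand-in for the Lab5 main class.
class Lab5Sketch {
    public static void main(String[] args) {
        // Assumption: the base directory comes from the first argument, else ".".
        new WordStatsSketch(args.length > 0 ? args[0] : ".");
    }
}
```

Catching IOException here means a mistyped file name produces a message and a new prompt rather than ending the program, which is one reasonable behavior but not one the lab text requires.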

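Please instantiate your WordStats class by creating an object in main(). You should test it with the following FOUR files:

  • getty.txt (http://nameless.cis.udel.edu/class_data/181_s2017/getty.txt)
  • doi.txt
  • bts.txt
  • greatexp.txt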

Here is *some* of the expected output for the above files:

  • getty.txt: 268 words, longest word = "proposition"
  • doi.txt: 1325 words, longest word = "undistinguished"
  • bts.txt: 14574 words, longest word = "obstreperousness"
  • greatexp.txt: 186685 words, longest word = "architectooralooral"

Submission

Submit the following files on Sakai, each containing your name in a comment:

  • Lab5.java
  • WordStats.java