Name:
Answer Key


Write scripts for the following questions, and upload them to the course website. You do not need to upload output files.

Begin each script with a comment that includes your name, the date, and the question number.

Here are some data files that can be downloaded and used for testing purposes. Download these files and save them in your working directory.


  1. Name this script "q:lines".

    [5 pts] Write a Python3 script that opens and reads a disk file. Your script must prompt for and receive a file name. It must then open and read each line of the file.

    [10 pts] Your script must calculate and report the number of lines in the file, and the average length of the lines. Report the average length to two decimal places.

    [5 pts] Also open a disk file for writing, named "output.txt". Write the file name and the report to it.

    Here are example outputs for some of the data files listed above:

    $ python3 ./q:lines.py
    What file? limerick-05.txt
    5 lines
    average line length = 28.40
    $
    $ python3 ./q:lines.py
    What file? sp0.txt
    48 lines
    average line length = 64.77
    $
    $ python3 ./q:lines.py
    What file? naval-ship-names.txt
    128 lines
    average line length = 65.52
    $
    $ python3 ./q:lines.py
    What file? happy.txt
    52 lines
    average line length = 33.17
    $
    $ python3 q:lines.py
    What file? earth-quarter.txt
    2826 lines
    average line length = 46.90
    $
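
One possible approach to the line statistics is sketched below. The names `line_stats`, `format_report`, and `main` are illustrative choices, not part of the assignment. The sketch also assumes that the newline ending each line is not counted toward its length; the example outputs above do not say which convention produced them.

```python
def line_stats(lines):
    """Return (number of lines, average line length) for an iterable of lines."""
    # Assumption: the trailing newline is not part of the line's length.
    lengths = [len(line.rstrip("\n")) for line in lines]
    count = len(lengths)
    average = sum(lengths) / count if count else 0.0
    return count, average


def format_report(count, average):
    """Build the two-line report, with the average shown to two decimal places."""
    return f"{count} lines\naverage line length = {average:.2f}\n"


def main():
    """Prompt for a file name, print the report, and copy it to output.txt."""
    filename = input("What file? ")
    with open(filename) as infile:
        # A file object yields one line per iteration.
        count, average = line_stats(infile)
    report = format_report(count, average)
    print(report, end="")
    with open("output.txt", "w") as outfile:
        outfile.write(filename + "\n")
        outfile.write(report)
```

To run this as a script, call `main()` on the last line of the file.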
    
  2. Name this script "q:words".

    [10 pts] Write a Python3 script that expects to be run from the command line, with one argument. The argument is a filename. You must open and read the contents of the file. (Hint: It's simpler to read the entire file at once, instead of line-by-line.)

    [10 pts] Your script must report the longest word and its length; the average word length; and the shortest word and its length. (If there is more than one word of a given length, any of them is good enough.)

    [5 pts] Use at least one list comprehension to do the calculations.

    Here are example outputs for some of the data files listed above:
    $ python3 q:words.py
    Usage: q:words.py <datafile>
    $
    $ python3 q:words.py limerick-05.txt
      25 words
       10 up-to-date
     4.68 average length
        2 to
    $
    $ python3 q:words.py happy.txt
     278 words
       14 refrigeration.
     5.00 average length
        1 a
    $
    $ python3 q:words.py sp0.txt
     538 words
       14 north-easterly
     4.64 average length
        1 a
    $
    $ python3 q:words.py naval-ship-names.txt
    1430 words
       15 perpendiculars,
     4.86 average length
        1 t
    $
    $ python3 q:words.py earth-quarter.txt
    23091 words
       39 http://www.gutenberg.org/5/8/9/1/58912/
     4.68 average length
        1 a
    $
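
The word statistics could be computed along the following lines. This is a sketch, not the required solution: the function names are hypothetical, a "word" is assumed to be any run of non-whitespace characters (so attached punctuation counts, as in the sample outputs), and the column widths are inferred from the examples above.

```python
def word_stats(text):
    """Return (word count, longest word, average word length, shortest word)."""
    words = text.split()                      # split on any run of whitespace
    if not words:
        return 0, "", 0.0, ""
    lengths = [len(word) for word in words]   # list comprehension for the lengths
    longest = max(words, key=len)             # first word of maximal length
    shortest = min(words, key=len)            # first word of minimal length
    average = sum(lengths) / len(words)
    return len(words), longest, average, shortest


def format_report(count, longest, average, shortest):
    """Lay the four report lines out like the sample runs (widths are assumed)."""
    return (f"{count:4d} words\n"
            f"{len(longest):5d} {longest}\n"
            f"{average:5.2f} average length\n"
            f"{len(shortest):5d} {shortest}\n")


def main(argv):
    """Run with argv = [script name, data file]; return an exit status."""
    if len(argv) != 2:
        print(f"Usage: {argv[0]} <datafile>")
        return 1
    with open(argv[1]) as infile:
        text = infile.read()                  # read the whole file at once
    print(format_report(*word_stats(text)), end="")
    return 0
```

A script built on this would `import sys` and finish with `sys.exit(main(sys.argv))`.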
    
  3. Name this script "q:worddistrib".

    [5 pts] Write a Python3 script that expects to be run from the command line, with one argument. The argument is a filename. You must open and read the contents of the file.

    [5 pts] Your script must build a dictionary of distinct words (as keys) and the number of times each occurs.

    [5 pts] For each word/key, build a list of the positions (indexes) at which the word appears within the file.

    [5 pts] Words should be converted to lowercase (or uppercase, your choice) — this is needed so that "the", "The", and "THE" are seen as the same word, rather than three different words.

    [10 pts] Report the number of distinct words, the average number of occurrences of each word, and the word that appears most frequently along with how many times it appears. If more than one word occurs that many times, report all of them.

    Here are example outputs for some of the data files listed above:
    $ python3 q:worddistrib.py
    Usage: q:worddistrib.py <datafile>
    $
    $ python3 q:worddistrib.py limerick-05.txt
    most frequent: 2
        he
        his
        of
        the
        to
        wizard
    $
    $ python3 q:worddistrib.py happy.txt
    most frequent: 18
        fun
        happy
    $
    $ python3 q:worddistrib.py sp0.txt
    most frequent: 62
        spam,
    $
    $ python3 q:worddistrib.py naval-ship-names.txt
    most frequent: 87
        the
    $
    $ python3 q:worddistrib.py earth-quarter.txt
    most frequent: 1331
        the
    $
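
A minimal sketch of the dictionary-building step follows. It assumes that a word's "position" means its index in the sequence of whitespace-separated words (a character offset would be another reasonable reading), and the function names are illustrative. Storing the list of positions as each key's value means the occurrence count is simply the length of that list.

```python
def word_positions(text):
    """Map each lowercase word to the list of word indexes where it occurs."""
    positions = {}
    for index, word in enumerate(text.lower().split()):
        # setdefault creates the empty list the first time a word is seen.
        positions.setdefault(word, []).append(index)
    return positions


def summarize(positions):
    """Return (distinct words, average occurrences, top count, top words)."""
    if not positions:
        return 0, 0.0, 0, []
    distinct = len(positions)
    total = sum(len(where) for where in positions.values())
    top = max(len(where) for where in positions.values())
    # Report every word that reaches the top count, in alphabetical order.
    top_words = sorted(word for word, where in positions.items()
                       if len(where) == top)
    return distinct, total / distinct, top, top_words
```

For example, `summarize(word_positions("The cat and the hat and THE bat"))` treats "The", "the", and "THE" as one word that occurs three times.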
    
  4. Name this script "q:url".

    [10 pts] Write a Python3 script that retrieves and reads a URL (webpage). The URL must be obtained from the command line; also obtain an output-filename from the command line. Example URLs are:

    • "http://bloomu.edu/computer-science/"
    • "http://whitehouse.gov"

    [10 pts] Write the contents of the URL to a text file. The output filename comes from the command line, as in the examples below.

    [5 pts] For each file, count the number of occurrences of the character "<" and of the character ">", and display the counts.

    An example run might look like this (counts and file sizes may vary over time):
    $ python3 q:url.py
    Usage: q:url.py <URL> <output-filename>
    $
    $ python3 ./q:url.py  https://montcs.bloomu.edu montcs.html
      265 occurrences of <
      265 occurrences of >
    $
    $ python3 ./q:url.py  http://whitehouse.gov  wh.html
     1259 occurrences of <
     1253 occurrences of >
    $ ls -l *.html
    -rw-r--r-- 1 bobmon bobmon   6726 Mar 15 09:12 montcs.html
    -rw-r--r-- 1 bobmon bobmon 116665 Mar 15 09:12 wh.html
    $
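
The retrieval and counting steps might be sketched as below, using `urllib.request` from the standard library. The function names are hypothetical, and the sketch assumes the page can be decoded as text, falling back to UTF-8 when the server does not declare a character set.

```python
import urllib.request


def fetch(url):
    """Retrieve a URL and return its body as text."""
    with urllib.request.urlopen(url) as response:
        # Assumption: use the declared charset, else fall back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def count_brackets(text):
    """Return the number of '<' and of '>' characters in the text."""
    return text.count("<"), text.count(">")


def save_and_count(url, filename):
    """Write the page to filename and return its (less-than, greater-than) counts."""
    text = fetch(url)
    with open(filename, "w", encoding="utf-8") as outfile:
        outfile.write(text)
    return count_brackets(text)


def format_counts(less, greater):
    """Format the two counts like the sample run (field width is assumed)."""
    return f"{less:5d} occurrences of <\n{greater:5d} occurrences of >\n"
```

The two counts can differ, as in the whitehouse.gov run above, because either character may also appear outside of tags, for example inside embedded scripts.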
    


  5. Name this script "q:floats".

    Write a Python3 script that expects to be run from the command line, with one argument. The argument is a file name; the file can be assumed to be in the same directory as the script itself.

    [10 pts] The files contain a series of floating-point numbers, one per line. Your script must read all the numbers and build a list of them. (Remember to convert them from text strings to actual numbers!) Then calculate and report some statistics on the list.

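    The required outputs are the number of items, the average (arithmetic mean), the geometric mean, and the harmonic mean. Your script must define separate functions to calculate these values. For a list of n numbers x_1, x_2, ..., x_n, the standard definitions are:

      average (arithmetic mean) = (x_1 + x_2 + ... + x_n) / n
      geometric mean = (x_1 * x_2 * ... * x_n) ** (1/n)
      harmonic mean = n / (1/x_1 + 1/x_2 + ... + 1/x_n)

    Each of these functions will need to loop over the list to total up the sum or product.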

    Example runs are shown here:

    $ python3 q:floats.py small.data
    data: small.data
      n: 10
    average: -6.471000
    geomean: 36.919416
    harmean: -53.035068
    $ python3 q:floats.py big.data
    data: big.data
      n: 10000
    average: 0.599920
    geomean: inf
    harmean: 37.305815
    $ python3 q:floats.py fracs.data
    data: fracs.data
      n: 100
    average: 0.744130
    geomean: 0.729085
    harmean: 0.714036
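
The three means could be written as separate loop-based functions, as in the sketch below. It uses the standard textbook definitions and hypothetical function names, and it assumes a non-empty list with no zero values for the harmonic mean. Accumulating the product as a float overflows to `inf` for long lists of large values, which is consistent with the `geomean: inf` line in the big.data run above.

```python
def read_floats(lines):
    """Convert an iterable of text lines to a list of floats, skipping blank lines."""
    return [float(line) for line in lines if line.strip()]


def average(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    total = 0.0
    for value in values:
        total += value
    return total / len(values)


def geomean(values):
    """Geometric mean: the n-th root of the product of the n values."""
    product = 1.0
    for value in values:
        product *= value          # float overflow here yields inf, not an error
    # Assumption: the product is not negative; a negative product raised to a
    # fractional power would give a complex number in Python 3.
    return product ** (1.0 / len(values))


def harmean(values):
    """Harmonic mean: n divided by the sum of the reciprocals."""
    total = 0.0
    for value in values:
        total += 1.0 / value      # assumes no value is exactly zero
    return len(values) / total
```

For the list `[1.0, 2.0, 4.0]` these give 7/3 for the average, 2 for the geometric mean, and 3/1.75 for the harmonic mean, up to floating-point rounding.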