Statistical Analysis of Rainfall

Special Note: You should read the Statistical Analysis and Geometry section in the Water Management module before coming to the lab.

Each group will analyze rainfall data collected from a particular location in the US Virgin Islands. Your lab instructor will assign to your group the specific location. The data can be accessed through the links given below. The instructions below outline how to analyze the data using Corel Quattro Pro software. If you are familiar with and prefer Excel, it is also available.

Getting Started

If you are not presently reading this lab exercise from an Internet browser, you will need to do so in order to access the data. This lab is called "Rainfall Statistics Lab" and can be found from the Schedule link on the SCI 301 Website.

Open the Corel Quattro software. On most of the lab computers, this can be done by clicking on the calculator icon in the menu bar at the bottom of the screen. Once the program is running, you should have a standard spreadsheet before you. Save this sheet under a file name that unambiguously identifies the location of the data and your group members. (For example, if Deron and Kelolo are analyzing Wintberg data, Wint-DeKe would make a nice name.) You may save it directly onto the hard drive, but it is recommended that you also save it to your personal floppy disk.

Once your instructor specifies the data set for your group, you can click on the appropriate link to bring the data set into the browser window. The data before you will be in 3 column frames, one for the year, one for the month, and one for the rainfall.

Sorting and Cleaning the Data

First, you will need to do some data housecleaning. In the rainfall column will be at least a few rows with the entry 9999... These entries represents months in which no data was available and must be removed. In addition, every 13th row will be marked ANN in the month column that represents the total rainfall for the previous 12 entries. These must also be removed.

All three columns will now be re-ordered from least rainfall to greatest. Consequently, all the 9999 data will be at the end. The ANN data will probably also be at the end. But you will need to check all the data to be sure. All of the rows to be discarded that are together can be cleared at once by dragging the mouse down the far-left column with the row numbers. Under the Edit menu, select Clear Row(s).

(Note: There is an alternative method for selecting columns for the "Sort" option, that works for most other dialog box entries as well. Click the small box to the right of any field. The dialog box will contract into a small rectangle above the spreadsheet. Drag the mouse over all the data you wish to place in that field, then open the dialog box again. The proper entries should appear in the field. Your instructor can help you with this method.)

At this point, you may wish to save the file again as a safety precaution. Do so under a different name, such as "Wint-DeKe2". You do not want to overwrite the original file containing the unsorted data.

Descriptive Statistics

Frequency Distributions

The first frequency histogram you will make displays the number of months that have rainfall within specified ranges or "bins". Although you may eventually wish to change the bin size, start with bins of width 1". The idea is to count the number of months that have 0-1 inches of rain, 1-2 inches of rain, etc. To do this with the Quattro Pro software, you must first create a new column that specifies the upper limit of each bin. Each cell entry represents the upper limit of that bin, and previous cell is automatically taken as the lower limit.

You are now ready to plot a histogram of the data. Note that there are many ways to plot this data. If you are not familiar with Quattro Pro, the following instructions should get you through the process.

Printing the histogram is straighforward. It is recommended that you first select "Print Preview" under the File menu to be sure that the histogram is finalized and no changes or additions are needed. If you are satisfied, then print a copy for each member of your group.

Monthly Averages

In many regions of the Caribbean there are wet months and dry months. A plot of average rainfall vs the month can verify whether this is true for your data set.

Spreadsheet Water Management

In this section you will use the power of spreadsheets to investigate the effects of water usage, rainfall, roof size and cistern size on water management. Each month a family uses water, collects and stores water. You have data for monthly rainfall, but you will also need to make some assumptions for the other parameters. You will be provided with typical values for the first pass through the analysis. Following that, you will be able to make your own analysis with estimates based upon your own family or personal experience.

  1. The UVI Extension Office estimates water usage at about 25 gallons per day per person. For this example, we will assume a family of four. The monthly usage, denoted "Usage", will then be 25 x 4 x 30, or

    Usage ~ 3000 gallons/month.

  2. A typical 3-bedroom 2-bath house might have a roof area of approximately 40 ft by 40 ft = 1600 ft2. The volume of water collected in any month is the roof area times the "depth" of rainfall for that month. Noting that the rainfall data is given in inches, the total volume collected, in units of cubic feet, would be 1600 x (monthly rainfall /12). Since the usage is in gallons, we will use the conversion 1 ft3 = 7.48 gallons. The number of gallons collected in a month, denoted "Collected", would be 1600 x (monthly rainfall / 12) x 7.48, or

    Collected ~ 1000 x (Rainfall in inches) gallons/month

    For a month with 3" of rainfall, this would be 3,000 gallons, exactly equal to the usage.

  3. The Virgin Islands building code dictates that the cistern size should be at least 10 gallons for every square foot of roof area. For our example, that would be 16,000 gallons.

  4. And finally, there is the starting amount of water in the cistern, denoted "Start". In practice this could be any value from 0 to 16,000 gallons. You will try several values, but begin with a half-full cistern,

    Start ~ 8,000 gallons

Using the above estimates, you can calculate the amount of water in the cistern at the end of each month, denoted "Final". It is the simple formula

Final = Start + Collected - Usage = Start + 1000 x (Rainfall in inches) - 3000

For the each month, "Start" will be the "Final" value from the previous month. Although you could do this by hand, the spreadsheet software will make quick work of it.

Column D should now be filled with the amount of water left in the cistern at the end of each month. Scan down the column and note the values. The cistern can never be less than empty or more than full. Hence, values outside 0 and 16,000 are not valid. To account for these limitations, you will need to add one more level of sophistication to the analysis. (If this is not the case for your data set, the instructor will make a change so you may proceed with the next section.)

Since the cistern will never contain more than 16,000, all calculated values above this should be set equal to 16,000. (Rain water coming into a full cistern simply overflows and is lost!) There is an easy way to do this. The "min" function, @min(a,b,c, ...) takes the minmum value of the numbers inside the parentheses. Here's how to use this function for our purposes.

Report

Be sure to include a copy of your spreadsheet. For each section of this lab, you should at a minimum comment on the following: