Evaluation of Several Metrics of Benthic Macroinvertebrates

This study simulates individuals that are modeled after benthic macroinvertebrate populations from streams in Vermont. Using the EPA biological scoring criteria for assessing the overall condition of streams, we analyzed two metrics (parameters) associated with these criteria. The two metrics, taxa richness and the Community Loss Index (CLI), have "critical" values that were set by the EPA. These "critical" values, although well established, are not justified statistically. To evaluate them, we examined three options: changing the sample size, changing the taxonomic level, and changing the "critical" value itself. We asked the question: how often would two samples of 100 organisms, both taken from the same reference population, be considered different? By simulating two different samples from the same population 100 times, we could obtain an estimate of the percentage of times one would conclude the populations to be different. Generally, the best results occurred at the genus level with a slight modification to the "critical" value and a sample size of 100 for the taxa richness metric. For CLI, the genus level with a sample size of 200 and no "critical" value change produced the best results.

Benthic macroinvertebrate species are differentially sensitive to many biotic and abiotic factors in their environment. Consequently, macroinvertebrate community structure has commonly been used as an indicator of the condition of an aquatic system (Armitage et al., 1983). These organisms offer biologists several advantages in assessing the condition of streams. The primary benefit is that they are good indicators of localized conditions: because many benthic macroinvertebrates have limited migration patterns or a sessile mode of life, they are particularly well suited for assessing site-specific impacts (EPA 1999). The main rationale for consideration is the expectation that these organisms integrate variations in water quality and therefore have the ability to reflect transient, adverse conditions that can be missed by sporadic sampling of chemistry (Isaac and Szal 2000). In addition, macroinvertebrates integrate the effects of short-term environmental variations, mostly because of their complex life cycles of approximately one year or more. Sensitive life stages will respond quickly to stress, but the overall community will tend to respond more slowly (EPA 1989). Another consideration is that sampling is relatively easy, requires only a few people and, more importantly, has a minimal detrimental effect on the resident biota. Also, identification of benthic macroinvertebrates can be achieved with only minimal effort. In fact, an experienced biologist can readily classify macroinvertebrates to the family level and many "intolerant" organisms to lower levels as well. Lastly, benthic macroinvertebrate assemblages are made up of species that constitute a broad range of trophic levels and pollution tolerances, thus providing strong information for interpreting cumulative effects (EPA 1989). With pollution a growing problem globally, there has been a recent surge in the importance of benthic macroinvertebrates.
Detailed observation and careful monitoring of these organisms in their habitat have become extremely important. There has also been an increasing attempt to develop proper methods of sampling macroinvertebrates. The Environmental Protection Agency (EPA) has become the greatest influence in regulating and enforcing methods for maintaining safe water quality. In fact, since 1989 the EPA has produced substantial guidance and documentation on both bioassessment strategies and implementation policy on biological surveys and criteria for water resource programs (EPA 1999). Since the enactment of the Clean Water Act, it has become necessary to implement a methodical means of monitoring streams, which was greatly lacking across the country. It was further recognized that it was crucial to collect, compile, analyze and interpret environmental data rapidly to facilitate management decisions and resultant actions for control and/or mitigation of impairment (EPA 1989).
Thus, the concept of Rapid Bioassessment Protocols (RBP's) was established.
Originally, these RBP's had five main goals to achieve. First, the procedures for biological surveys needed to be cost-effective as well as scientifically valid.
Second, provisions for multiple site investigations in a field season needed to be made. Next, quick turn-around of results for management decisions was also necessary. Also, scientific reports that are easily translated to management and the public had to be written. Finally, the procedures that were to be established needed to be harmless to the environment. After these concepts were established, it was determined that these protocols were to be centered on benthic macroinvertebrates. Since the original RBP's were created, there have been several revisions made to sharpen the precision of testing and sampling of these organisms due to advances in technology.
Actually, there were three different protocols developed for benthic macroinvertebrates, each one with a more intensive evaluation than the previous. These protocols have three components in common: water quality/physical characteristics, habitat assessment, and a biosurvey. Upon gathering observations made in the assessment of habitat, water quality, physical characteristics, and the qualitative biosurvey, an investigator can conclude whether a site is impaired. The data analysis scheme used integrates several community, population, and functional parameters into a single evaluation of biotic integrity (EPA 1989). Each parameter, or metric, depicts a different component of community structure and has a specific range of sensitivity to pollution stress. This integrated approach provides more assurance of a valid assessment because a variety of parameters are evaluated (EPA 1989). There are at least 8 different metrics that are considered to be important in the analysis of the status of a benthic macroinvertebrate community. Of these metrics, two will be evaluated here: taxa (species) richness and the Community Loss Index (CLI).
The oldest and the simplest concept of species diversity is species richness: the number of species in the community (Krebs, 1999). This measure reflects the health of a community through a measurement of the variety of taxa present (EPA 1989). Basically, taxa richness increases as the quality of water increases. Developed by Courtemanche and Davies (1987), the Community Loss Index (CLI) is a measure of dissimilarity, with values increasing as the degree of dissimilarity from a reference station increases (EPA 1989). It can also be used to describe a difference between a reference station and a station of comparison. Even though there are other metrics, these two metrics help to give a well-rounded look at the community, given the data that were available.
(b) Score is a ratio of reference site to study site x 100.
(c) Determination of Functional Feeding Group is independent of taxonomic grouping.
(d) Scoring criteria evaluate actual percent contribution, not percent comparability to the reference station.
(e) Range of values obtained. A comparison to the reference station is incorporated in these indices.
Associated with each of the metrics are "critical" values. Table 1, shown above, comes directly from the EPA and depicts the metrics, the "critical" values, and also the methodology of classifying a stream. However, since there was no documentation, we determined that none of these "critical" values have statistical justification. It was decided to analyze these "critical" values in order to determine, statistically, how good they are. A slight modification was made to all of the "critical" values that did not meet the requirements of the original "critical" value. A notable discrepancy between the standard "critical" value and a modified "critical" value could result in a change of the "critical" value, which now has a statistical foundation.
The basic concept behind these RBP's is a quick and easy method of selecting and identifying 100 organisms from a sample to the lowest taxon possible. The main question that arises is how dependable this type of sample is in designating the health of a benthic community through various indices. More importantly, how often would two samples of 100 organisms, which are both taken from the same reference population, be considered different? The problem of concern is the Type I error, or alpha level, of a decision process. Statistically, a Type I error is rejecting the null hypothesis of equality when in fact the null hypothesis is true (McClave, Dietrich and Sincich, 1997). In this case, every time a sample is considered to be different from another sample from the same population, a Type I error is committed. In general, an acceptable Type I error probability for a decision process is 0.05.
It is generally much easier and less time consuming to identify taxa to the family level than it is to get to the genus or even the species level.
Basically, a trained biologist can identify organisms to the species level, whereas a student in biology can correctly identify organisms to the family level. One way of reducing the cost of benthic surveys could be to shorten the time needed for identification, and to reduce the expertise required, by using a lower level of taxonomic resolution, that is, identifying specimens to higher taxonomic levels such as family or phylum (Olsgard, Somerfield, and Carr, 1997). In order to evaluate this, a sample of 200 organisms identified at the family level was compared with a sample of 100 organisms identified at the genus (species) level, from the same population. Basically, a sample of 200 organisms identified at the family level can be less expensive and less time consuming than a sample of 100 organisms at the genus level. Both of these provide an easier method of identifying macroinvertebrates that can be significantly cheaper and display certain characteristics of stream metrics.
Using these ideas combined with the two metrics stated above, it will be shown how reliable this method of sampling (RBP) is in characterizing the overall health of a benthic community.

II. Methodology
As mentioned above, the RBP's give specific guidelines for each of the metrics. Together, all of the metrics help biologists categorize the overall condition of a stream. For example, a value that is less than 0.80, or 80% of a reference site, for taxa richness would indicate a degradation of that stream.
Naturally, an increase of this value toward 1, or 100%, would indicate a greater similarity of the two sites. The critical value associated with the CLI metric is 0.50. The range of values for this metric runs from zero to "infinity", and it naturally follows that as this value increases toward infinity there is a greater difference between the reference site and the site of comparison. This trend towards "infinity" indicates a degradation of the stream, and therefore values greater than 0.50 would be considered impaired.
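The two decision rules described above can be illustrated with a short sketch. It is not taken from the EPA protocol text or from this study's own calculations. It assumes the commonly cited form of the index, CLI = (d - a)/e, where d is the number of taxa in the reference sample, e the number in the comparison sample, and a the number common to both. The function names, taxon names, and counts are hypothetical.

```python
def taxa_present(counts):
    """Return the set of taxa with at least one organism in a sample."""
    return {taxon for taxon, n in counts.items() if n > 0}


def richness_ratio(reference, comparison):
    """Richness of the comparison sample as a fraction of the reference."""
    return len(taxa_present(comparison)) / len(taxa_present(reference))


def community_loss_index(reference, comparison):
    """Assumed definition: CLI = (d - a) / e (see the text above)."""
    ref = taxa_present(reference)
    comp = taxa_present(comparison)
    d = len(ref)          # taxa in the reference sample
    e = len(comp)         # taxa in the comparison sample
    a = len(ref & comp)   # taxa common to both samples
    return (d - a) / e


def judged_different(reference, comparison,
                     richness_critical=0.80, cli_critical=0.50):
    """Apply the two "critical" values quoted in the text."""
    return {
        "richness": richness_ratio(reference, comparison) < richness_critical,
        "cli": community_loss_index(reference, comparison) > cli_critical,
    }


# Hypothetical counts for two samples of 100 organisms.
reference = {"Baetis": 40, "Hydropsyche": 30, "Chironomus": 20, "Perla": 10}
comparison = {"Baetis": 50, "Hydropsyche": 35, "Chironomus": 15}
flags = judged_different(reference, comparison)
```

In this hypothetical case the comparison sample has 3 of the 4 reference taxa, so the richness ratio is 0.75 (below 0.80) while the CLI is 1/3 (below 0.50). The two metrics can therefore disagree on the same pair of samples.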
In order to analyze these two "critical" values, a reference site and a site of comparison were needed. Specifically, we used Monte Carlo simulation to simulate sampling from each of the Vermont reference sites in our database.
This database consisted of counts of species that were found by biologists at certain streams in Vermont. The species were taxonomically identified to both family and genus level. For the purpose of this paper, the first simulation on a population was deemed the reference site and the second simulation on the same population was considered to be the site of comparison. If two samples of 100 organisms, a reference and a comparison site, were simulated from the same population, how many times would they be determined to be different?
In order to answer this, the two metrics, taxa richness and CLI, will be calculated from each simulated reference site and site of comparison and examined individually.
Data obtained from Vermont streams were the basis for creating population models using Monte Carlo simulations. These Vermont streams were sampled reference sites. All sites were considered unimpaired, thus constituting a range of reference stream populations that we used to make
In addition to recording the estimates of Type I error rates for taxa richness and CLI, we also calculated the "true" richness and evenness for each Vermont sampled community. These two values were picked because they represent the two measurements that make up the concept of species diversity that is widely acknowledged by biologists. Taxa richness was computed by simply counting the number of different species in each of the stream samples. For many decades field ecologists had known that most communities of plants and animals contain a few dominant species and many species that are relatively uncommon. Evenness measures attempt to quantify this unequal representation against a hypothetical community in which all species are equally common (Krebs, 1999). Basically, evenness is a measure of how evenly spread all of the species in a community are. There are several different methods of calculating evenness measures, but the Shannon-Wiener function was most suitable. Strictly speaking, the Shannon-Wiener measure of information content should be used only on random samples drawn from a large community in which the total number of species is known (Pielou 1966).
The Shannon-Wiener function is as follows:

$$H' = -\sum_{i=1}^{S} p_i \log_x p_i$$

where
H' = index of evenness
p_i = proportion of total sample belonging to the ith species
S = number of species

The log_x notation indicates that any base of logarithms can be used, since they are all convertible to one another by a constant multiplier. The Shannon-Wiener measure increases with the number of species in the community and in theory can reach very large values. The theoretical maximum value is log (S) and the minimum value as N approaches S is log where N is the number of individuals in the community (Fager 1972). Since the Shannon-Wiener measure can range from 0 to ∞, it was standardized to a 0 to 1 scale by dividing H' by log (S), where S is the total number of species (taxa richness).
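As an illustration of the standardization just described, the following is a minimal sketch that computes H' and the 0 to 1 evenness value from a list of species counts. The function names and the example counts are hypothetical, and returning 0 for a single-species community (where log (S) = 0) is an assumption made here, not a rule stated in the study.

```python
import math


def shannon_wiener(counts, base=math.e):
    """H' = -sum(p_i * log(p_i)) over species with non-zero counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p, base) for p in proportions)


def evenness(counts, base=math.e):
    """Standardized evenness H' / log(S), on a 0 to 1 scale."""
    s = sum(1 for c in counts if c > 0)   # taxa richness S
    if s < 2:
        # Assumption: with one species log(S) = 0, so report 0.
        return 0.0
    return shannon_wiener(counts, base) / math.log(s, base)


# Hypothetical communities of 100 organisms and 4 species each.
even_community = [25, 25, 25, 25]    # all species equally common
uneven_community = [97, 1, 1, 1]     # one dominant species

e_even = evenness(even_community)
e_uneven = evenness(uneven_community)
```

Because the base of the logarithm cancels in the ratio, the standardized value is the same whichever base is chosen. The evenly spread community scores 1 (up to rounding), and the community dominated by one species scores close to 0.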
The next change that was made consisted of altering the sample size. A sample size of 100 is recommended by the EPA for the RBP protocols. An increase in sample size could improve the accuracy of the sample taken. It was assumed that if the Type I error was less than 0.05 at a sample size of 100, an increase in sample size would also result in a Type I error less than 0.05. If the Type I error was greater than 0.05 at a sample size of 100, there is a better chance that an increase in sample size could reduce the chances of making a Type I error. An increase to a sample size of 200 was made for all of the stream sites that had failed to be below the required 0.05 value. We also examined the possibility of increasing the sample size to 300 in some cases to ensure that we obtained results that were statistically acceptable. Doberstein, Karr, and Conquest (2000) developed an experiment using 100, 200, 300, 500, 700, and 1000 simulated organisms to determine the best sample size for certain metrics.
It is known that identifying species to the genus (species) level can be a rather tedious job. In fact, as stated above, a trained biologist is generally required. Identification of species to the family level is much easier as well as less time consuming. The next part of this research consisted of Monte Carlo simulations done on organisms identified to the family level. A sample of 100 organisms identified at the family level would be much more cost effective than a sample of 100 organisms identified at the genus (species) level. First of all, in order to be able to compare family and genus levels, it is essential that each of the individual stream sites be exactly the same. For example, the organisms identified at the family and genus level must both come from the same site and actually from the same sample. However, the streams from the family level were not all the same as the streams from the genus level, so the streams that were common to both were used for comparisons. All of the information that was gathered for the genus level observations was similarly gathered for the family level. After finishing the 100-organism sample for the family level, we then examined the possibility of obtaining a 200-organism sample at the family level as well as a 300-organism sample.
The final alteration that was made consisted of now changing the "critical" value that was given by the RBP. These two values for taxa richness and CLI were 0.80 and 0.50 respectively. Because there is no known statistically valid reason for these values, it was decided to analyze exactly how accurate these values are and examine to see if a slight variation to them is needed. For this purpose, the 100-organism sample from the genus (species) level was used to analyze this change in "critical" levels.

III. Simulation
The population models used in this study were generated using Monte Carlo simulations. Monte Carlo methods are stochastic techniques, meaning they are based on the use of random numbers to investigate problems (Woller 1996). Using a random number generator from Excel, species were "simulated" and recorded based on the computed proportions from the raw data. For each Vermont stream we knew a vector of species proportions [p1, p2, ... , pk].
Using this as a multinomial probability model, we simulated counts of species.
This simulation is equivalent to sampling organisms with replacement from existing stream samples as done by Doberstein, Karr, and Conquest (2000).
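The simulation scheme can be sketched as follows. This is a minimal illustration in Python rather than the Excel procedure actually used in the study; it assumes the same CLI form (d - a)/e as commonly cited, and the function names, the seed parameter, and the example proportions are all hypothetical.

```python
import random


def draw_taxa(proportions, n, rng):
    """Sample n organisms with replacement from the multinomial model
    and return the set of taxa (by index) that appear in the sample."""
    taxa = list(range(len(proportions)))
    return set(rng.choices(taxa, weights=proportions, k=n))


def estimate_type_i_error(proportions, n=100, reps=100,
                          richness_critical=0.80, cli_critical=0.50,
                          seed=None):
    """Estimate how often two samples from the SAME population are
    judged different by each metric (the Type I error rate)."""
    rng = random.Random(seed)
    richness_flags = 0
    cli_flags = 0
    for _ in range(reps):
        ref = draw_taxa(proportions, n, rng)    # simulated reference site
        comp = draw_taxa(proportions, n, rng)   # simulated comparison site
        if len(comp) / len(ref) < richness_critical:
            richness_flags += 1
        # Assumed CLI form: (taxa in ref - taxa in common) / taxa in comp
        if (len(ref) - len(ref & comp)) / len(comp) > cli_critical:
            cli_flags += 1
    return richness_flags / reps, cli_flags / reps


# Hypothetical stream: four dominant taxa and 25 rare taxa.
props = [0.30, 0.20, 0.15, 0.10] + [0.25 / 25] * 25
alpha_richness, alpha_cli = estimate_type_i_error(props, n=100, reps=100, seed=1)
```

Calling the function again with n=200 or n=300 corresponds to the larger sample sizes examined later. With only 100 repetitions each estimate moves in steps of 0.01, which is why several sites can land exactly on 0.05.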
One completion of simulations on a single stream population consisted of generating either two 100 (samples) by 100 (repetitions) matrices, two 200 (samples) by 100 (repetitions) matrices, or two 300 (samples) by 100 (repetitions) matrices of random samples of organisms. There were a total of 38 stream sites for the genus level and a total of 40 sites for the family level.
All of the streams were used for the 100 sample simulations for both genus and family level. There were a total of 28 streams that were common to both the genus data and the family data. These streams were used in all of the comparisons as mentioned above.

IV. Results
A: Sample size of 100 at the Genus Level
All of the descriptions in this section represent sites in which we simulated a sample size of 100 organisms from the Vermont streams. The data from these streams were all identified to the genus level. Figure 1 depicts a graph of the "true" taxa richness values plotted against the alpha level for richness. As stated above, the alpha level, or Type I error, is the probability of saying the sites are different when in fact the sites are the same. The associated "critical" value for richness is 0.80 as stated above. There are 15 sites out of the 38 different sites that fall above the 0.05 alpha level and 2 that are considered borderline since they are exactly 0.05. At 44.74% (17 out of 38), this is a high number of failures and is generally unacceptable. Upon looking at the graph, there does not seem to be any kind of trend that is associated with either high or low richness values. Figure 2 represents the "true" taxa richness values plotted against the alpha level for CLI. The critical value is 0.50 for CLI.
Observe that there are 7 sites out of 38 sites that fall above the 0.05 alpha level and 2 that have alpha values that are exactly 0.05, for a total of 9 out of 38 (23.68%). This is still a high rate of failure but somewhat better. There does not seem to be any type of trend associated with this graph either. Figure 3 shows the graph of the "true" evenness values against the alpha values for taxa richness. Since the graph is plotted with the same alpha values, we have the same number of failures as in Figure 1. There seems to be some indication that higher evenness values correspond to lower alpha levels and lower evenness values to higher alpha levels. However, eliminating the 3 outliers with low evenness values could eliminate the possibility of any trend. Figure 4 represents the "true" evenness values graphed against the alpha levels for CLI. Because the graph utilizes the same alpha values, there is the same number of failures as in Figure 2. Similar to Figure 3, there seems to be evidence of higher evenness values having lower alpha levels. This trend appears to be somewhat stronger than in Figure 3. If the same 3 outliers are removed, an argument can still be made that there is a trend. Figure 5 reveals the graph of actual richness vs. actual evenness of each site. As we can see, there is no specific richness and evenness combination that would suggest either high or low alpha levels. Figure 6 is the same type of graph as Figure 5 except we are using the CLI "critical" values instead of taxa richness. As in Figure 5, there does not appear to be any high or low richness and evenness values that have alpha levels that are similar.

B: Sample size of 200 at the Genus Level
As stated above, all of the sites in this category were the sites that had alpha levels at or above 0.05. Therefore there are 17 for the richness alpha values and 9 for the CLI alpha values. Figure 7 shows the graph of "true" richness versus alpha level for richness. We can see that now, out of the 17 that failed before, only 2 sites exceeded the alpha level of 0.05 and 2 others were exactly at 0.05. Assuming that all of the sites in the sample size of 100 that were below the 0.05 level would also have been below 0.05 in the sample size of 200, we would now have a total of only 4 out of 38 that were at or exceeded the 0.05 level. This is about 10.5%, which is much more acceptable statistically. Changing the sample size from 100 to 200 thus produces a substantial improvement: the share of sites equal to or above the 0.05 alpha level goes from 44.74% at a sample size of 100 to 10.5% at a sample size of 200. Upon observing Figure 7, we can see that there is generally no trend. However, by removing the point (36, 0.08) one could detect some kind of rising trend from left to right, but this is a very weak trend at best. Figure 8 represents the graph of "true" richness versus the alpha level for CLI. Immediately we can see that there are no values that are above the 0.05 alpha value. This means that none of the sites, including the ones which had alpha values below 0.05 for the sample size of 100, had high alpha values associated with the CLI values. Figure 9 displays the graph of "true" evenness versus the alpha level for richness. As in Figure 7, there are 2 above the 0.05 alpha level and 2 that are exactly 0.05, which is much better than the sample size of 100. The biggest feature of this graph is the somewhat apparent trend occurring. There is a downward slope from left to right. This suggests that the sites with higher failure rates are associated with lower evenness values.
Figure 10 shows the graph of "true" evenness versus the alpha level for CLI. As in Figure 8, there are no alpha values above 0.05; therefore an attempt to discover any kind of trend is unnecessary. Figure 11 depicts the diagram of "true" richness values plotted against the "true" evenness values with the alpha levels for richness as the "third dimension", as in Figure 5. Since there are only four values that either exceeded or were exactly 0.05, it is nearly impossible to definitively state that there is any trend at all. Finally, Figure 12 displays the graph of "true" richness vs. "true" evenness, with the alpha level for CLI as the "third dimension". As in Figure 8 and Figure 10, there are no values that exceed the 0.05 alpha level; therefore it is unnecessary to determine if any trend is apparent.
(2.6%) that had an alpha value at or above 0.05. This is a low percentage rate; however, the sample size is a rather large increase from the original sample size of 100.

D: Sample size of 100 at the Family Level
For all of the graphs in this section, there are a total of 40 sites that were examined, as stated above, and the richness and CLI "critical" values are the same, at 0.80 and 0.50 respectively. Figure 13 represents the graph of "true" richness versus the alpha level for richness. As we can see, there are 30 sites that exceed the 0.05 alpha level and one which is exactly at 0.05. At 77.5% (31 out of 40), this means that about 3 out of every 4 sites failed to be below the alpha level of 0.05, with a good portion of them even exceeding the 0.10 alpha level. There does not appear to be any trend associated with "true" richness values, since all of the data seem to be scattered around the middle. Figure 15 depicts the graph of "true" evenness versus the alpha level for richness. Looking at Figure 15, we observe that again there are a total of 31 sites that exceed or are exactly 0.05, for reasons stated above. There is, however, an interesting observation to be made in Figure 15. For values of "true" evenness up to 0.70, there are no sites that have alpha levels that are less than 0.05. In other words, all of the sites that had "true" evenness values less than 0.70 had an alpha level at or above 0.05. Figure 16 shows the relationship of the "true" evenness versus the alpha level for CLI. As in Figure 14, there are only a total of four sites that are at or above the 0.05 alpha level.
Observing Figure 16, a downward slope can be identified. Upon looking at the values that are less than 0.70 for "true" evenness, we can see that there is only one site that has an alpha level of 0.00 and all of the higher alpha levels occur below this "true" evenness value. In addition, there are 11 sites that have alpha levels of 0.00 and are above 0.70. Figure 17 graphs the "true" richness values versus the "true" evenness values in which symbols are given to a specific range of alpha levels for richness, as in Figure 5. Looking at the sites that had alpha levels less than 0.05, we can observe that all of them have evenness values greater than 0.70; however, there is no relationship between richness and evenness jointly and the alpha levels. Finally, Figure 18 reveals the "true" richness versus the "true" evenness values with symbols representing a specific range of alpha levels for CLI, as in Figure 6. Since there are only four observations that are above the alpha level of 0.05, it is difficult to detect any type of trend associated with high alpha levels.
However, there are two notes to make: 1) All four for CLI. Since there are only 4 for CLI, there are no graphs to represent the data because no observations can be made with so little data. Figure 19 represents the "true" richness versus the alpha level for richness. We can observe that there are 9 sites that have alpha levels greater than 0.05 and 2 that have alpha levels of 0.05 exactly. If we assume that the sites that had alpha levels less than 0.05 at a sample size of 100 also have alpha levels less than 0.05 at a sample size of 200, we have a total of 11 out of 40 (27.5%) sites that "fail" to be below the 0.05 alpha level. Referring to Figure 19 again, it is apparent that there is no specific pattern associated with the richness values. Next, Figure 20 displays the "true" evenness versus alpha levels for richness. As in Figure 19, there are now only 11 "failure" sites. Looking at only the sites with alpha levels greater than 0.05, we observe that there is an apparent downward trend. Figure 21 shows the graph of "true" richness values versus "true" evenness values, as in Figure 5. Other than the fact that low richness values tend to occur with low evenness values and high richness values with high evenness values, there is no observable trend associated with the alpha levels included.
Since there were still a rather large number of sites that had alpha values at or above 0.05 for richness only, we decided to increase the sample size again from 200 to 300. This was expected to leave fewer sites with alpha values at or above 0.05. From Section E we still had 11 out of 40 (27.5%) sites for richness that "qualified" for this section. Figure 22 shows the graph of "true" richness versus the alpha value for richness and Figure 23 displays the graph of "true" evenness versus the alpha values for richness. We can observe from Figure 22 or 23 that there are 3 out of the 11 sites that still have alpha values at or above 0.05.
Taking the results from Sections D and E, this gives us only 3 out of 40 (7.5%) sites that still have alpha values at or above 0.05 at a sample size of 300.

G: Sample size of 100 at the Genus level ("critical" value change)
An alternative to changing the sample size and/or identification level would be to look at and possibly adjust the "critical" levels for richness and CLI. Upon examining the data for taxa richness, it became clear that a less strict "critical" value was needed in order to reduce the chances of rejection. In order to obtain this, the data had to be carefully observed prior to judgment. A lower value would decrease the chances of a Type I error and therefore help to avoid a wrong decision. It was decided that the change to the "critical" value would be from 0.80 to 0.75 for richness. We would reduce the "critical" value for richness by 0.05 until there was an acceptable percentage of alpha values. The change for CLI was from 0.50 to 0.55. The "critical" value for CLI would increase by 0.05 until an acceptable percentage of alpha values was found. Since we are only changing the "critical" value, it is not necessary to discuss trends since the data are not changing at all. Any type of trend discovered before would also hold true even after changing the "critical" value. For the sake of completeness there are graphs for both "true" richness and "true" evenness. Figures 24 and 25 represent the graphs of "true" richness and "true" evenness, respectively, versus alpha level for richness with the "critical" value of 0.75. It is apparent from Figure 24 and Figure Therefore there is significant statistical evidence that changing the rule has a very beneficial influence. Finally, Figures 26 and 27 depict the "true" richness and "true" evenness, respectively, versus the alpha levels for CLI with a "critical" value of 0.55. We can see that there is one site that has a value greater than the 0.05 alpha level and 3 that are equal to 0.05. There were 4 out of 38 (10.5%) sites at or above 0.05 with the change in CLI "critical" value, compared with 9 out of 38 (23.68%) sites from the 100-sample size at the genus level with no change in "critical" value.
From this, we observe that there are better results when the CLI "critical" value is changed from 0.50 to 0.55. Since there were still 4 out of 38 sites at or above 0.05 with the change from 0.50 to 0.55, we increased the "critical" value to 0.60. Figure 28 and Figure 29 represent the graphs of "true" richness and "true" evenness, respectively, versus alpha level for CLI with the "critical" value of 0.60. We can observe from Figure 28 or 29 that there are no sites that exceed the 0.05 alpha value. Since there are no sites that have alpha values above 0.05, the "critical" value of 0.60 was considered the ending point for CLI.
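The stepwise adjustment of the richness "critical" value can be sketched as a simple loop over stored simulation results. This is an illustration only. The 10% target for an "acceptable" share of sites, the lower bound on the "critical" value, the function names, and the example ratios are all hypothetical, since the study does not state a numeric stopping rule.

```python
def share_of_failing_sites(ratios_by_site, critical, alpha_limit=0.05):
    """Fraction of sites whose estimated alpha is at or above alpha_limit.

    ratios_by_site holds, for each site, the simulated richness ratios
    (comparison / reference) from repeated same-population draws."""
    failing = 0
    for ratios in ratios_by_site:
        alpha = sum(r < critical for r in ratios) / len(ratios)
        if alpha >= alpha_limit:
            failing += 1
    return failing / len(ratios_by_site)


def relax_richness_critical(ratios_by_site, start=0.80, step=0.05,
                            acceptable_share=0.10, floor=0.50):
    """Lower the critical value in fixed steps until the share of
    failing sites is acceptable (hypothetical target) or the
    hypothetical floor is reached."""
    k = 0
    while True:
        critical = round(start - k * step, 2)
        share = share_of_failing_sites(ratios_by_site, critical)
        if share <= acceptable_share or critical <= floor:
            return critical, share
        k += 1


# Hypothetical richness ratios for two sites, ten repetitions each.
site_a = [0.78, 0.79, 0.90, 1.00, 0.95, 1.05, 0.85, 0.88, 0.92, 1.10]
site_b = [0.85, 0.90, 1.00, 1.10, 0.95, 0.88, 1.02, 0.97, 0.91, 1.05]
chosen_critical, failing_share = relax_richness_critical([site_a, site_b])
```

In this made-up example site A fails at 0.80 (two of its ten ratios fall below the value) but passes at 0.75, so the loop stops after one step. The CLI adjustment would work the same way with the comparison reversed and the value raised by 0.05 per step.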

H: Sample size of 100 at the Family level ("critical" value change)
We also decided to change the "critical" value for the family level in addition to the change from Section G. The change for richness was the same as in Section G, from 0.80 to 0.75 and decreasing by 0.05 until an acceptable percentage of alpha values is reached. The change for CLI was also the same as in Section G. Figure 30 and Figure 31 show the graphs of "true" richness and "true" evenness, respectively, versus alpha level for richness with the "critical" value of 0.75. We can observe from both Figure 30 and 31 that there are 8 sites that have alpha levels above 0.05 and 2 that have alpha levels at 0.05 exactly. If we compare the results with those from the 100-sample size at the family level without a "critical" value change, we can observe a large decrease in the number of sites that have alpha levels above 0.05. By changing the "critical" value from 0.80 to 0.75, the number drops from 31 out of 40 (77.5%) sites that had alpha levels above 0.05 with no change, to 9 out of 40 (22.5%) sites that had alpha levels above 0.05 with the change from 0.80 to 0.75. There is a distinct difference between the two "critical" values, and the "critical" value of 0.75 gives a decision rule with a much lower Type I error rate than the 0.80 value. The next step was to change the "critical" value for richness from 0.75 to 0.70. This improved the results even further than the change from 0.80 to 0.75. Figures 32 and 33 display the "true" richness and "true" evenness, respectively, versus the alpha values for richness with the "critical" value of 0.70. Observing either Figure 32 or 33, we can see that there are 2 sites that have alpha values above 0.05 and one site that has an alpha value exactly at 0.05. By changing the "critical" value from 0.75 to 0.70 we now have only 3 out of 40 (7.5%) sites that have alpha values at or above 0.05. This is a good percentage without changing the "critical" value too much.
Figures 34 and 35 show the graphs of "true" richness and "true" evenness, respectively, versus the alpha values for CLI with the "critical" value of 0.55. In both figures, there are 2 sites that have alpha levels greater than 0.05 and one that is at 0.05 exactly. Comparing the data from the sample size of 100 at the family level with no "critical" value change against the data from the same sample size and level with the change, we again see a small decrease. The number goes from 4 out of 40 (10%) sites with no "critical" value change to 3 out of 40 (7.5%) sites with the change from 0.50 to 0.55.
There is not much of a drop in the number of sites with alpha levels above 0.05; however, the results are improving. To improve on this figure of 3 out of 40 (7.5%) sites, we increased the "critical" value again, from 0.55 to 0.60. Figure 36 and Figure 37 show the graphs of "true" richness and "true" evenness, respectively, versus the alpha level for CLI with the "critical" value of 0.60. In either of the two figures, we observe that there is one site that had an alpha value greater than 0.05. In Figure 36, 15 sites have the same values as at least one other site. By changing the "critical" value from 0.55 to 0.60, we have decreased the number of sites with alpha values above 0.05. This number went from 3 out of 40 (7.5%) with the "critical" value of 0.55, to 1 out of 40 (2.5%) with the "critical" value of 0.60.
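The corresponding calculation for CLI can be sketched in the same way. This illustration assumes the usual rapid bioassessment definition CLI = (d - a) / e, where d is the number of taxa in the reference sample, e is the number of taxa in the comparison sample, and a is the number of taxa common to both; the formula is not restated in this section, so it is an assumption here. It also assumes that two samples are judged "different" when CLI exceeds the "critical" value. The population and function names are hypothetical.

```python
import random

def community_loss_index(reference, test):
    """CLI = (d - a) / e (assumed definition):
    d = taxa in reference, e = taxa in test, a = taxa common to both."""
    ref_taxa, test_taxa = set(reference), set(test)
    d = len(ref_taxa)
    a = len(ref_taxa & test_taxa)
    e = len(test_taxa)
    return (d - a) / e

def cli_alpha(population, n=100, critical=0.50, trials=100, seed=0):
    """Fraction of trials in which two samples from the SAME population
    are judged different. Decision rule (an assumption): CLI > critical."""
    rng = random.Random(seed)
    flagged = 0
    for _ in range(trials):
        reference = rng.sample(population, n)
        test = rng.sample(population, n)
        if community_loss_index(reference, test) > critical:
            flagged += 1
    return flagged / trials

# Hypothetical reference population: 20 taxa with unequal abundances.
population = [f"taxon_{i}" for i in range(20) for _ in range(5 * 2 ** (i % 6))]

for critical in (0.50, 0.55, 0.60):
    print(critical, cli_alpha(population, critical=critical))
```

Raising the "critical" value makes the rule harder to trigger, so with identical simulated samples the estimated alpha cannot increase as the value moves from 0.50 to 0.60.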
We observe that through all of the comparisons, the genus level yields better results. Now changing to Table 3 for CLI, we hold the "critical" value and the sample size the same and compare the genus level with the family level. By doing this, it is clear that the genus level with a sample size of 200 and "critical" value of 0.50 gives the best result, with 0.0% of sites having alpha levels above 0.05. However, the family level with a sample size of 200 and "critical" value of 0.50 also gives excellent results, with 2.5% of sites having alpha levels above 0.05.

C: The effects of changing the "critical" value
This section deals with changing the "critical" value while holding the sample size and the taxonomic level constant. Again, a change in the right direction decreases the percentage of sites with alpha values greater than 0.05. However, a deviation that is too far from the original "critical" value may be found to be unacceptable. We were looking for a low percentage of alpha values greater than 0.05 while also not straying too far from the established "critical" value. First of all, we look at the richness metric from Table 2. Looking at the genus level with a sample size of 100 and comparing the "critical" value of 0.80 with the "critical" value of 0.75, there is a substantial drop from 44.74% to 5.3%. This was determined to be a good value that also did not deviate far from the original "critical" value of 0.80. Moving on to the family level, we can see that changing the "critical" value from 0.80 to 0.70 yielded 7.5% of sites with alpha values greater than 0.05. This number is statistically good; however, since the difference in "critical" value is already at 0.10, we decided that any further change away from the established value might be considered unacceptable. Changing to Table 3 and CLI, we now have the established "critical" value of 0.50. At the genus level, we observe that the new "critical" value of 0.60 gives 0.0% of alpha values above 0.05. This is good because we did not deviate far from the original "critical" value and obtained 0.0%. At the family level, we obtained 2.5% at the "critical" value of 0.60. For this, we could have changed the "critical" value more, but this could have been too far from the original "critical" value.
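The trade-off described above, a low percentage of sites with alpha values above 0.05 against a small deviation from the established "critical" value, can be expressed as a simple selection rule. The sketch below is illustrative only. The richness percentages are the ones quoted in this section and in Section H, while the 10% target, the function names, and the choice of a strict comparison (sites exactly at 0.05 are not counted unless `inclusive=True`) are assumptions.

```python
def percent_sites_above(alphas, cutoff=0.05, inclusive=False):
    """Percentage of sites whose alpha exceeds the cutoff.
    Whether sites exactly at the cutoff are counted is an assumption,
    so it is exposed as a parameter."""
    if inclusive:
        count = sum(1 for a in alphas if a >= cutoff)
    else:
        count = sum(1 for a in alphas if a > cutoff)
    return 100 * count / len(alphas)

def closest_acceptable(percent_by_critical, established, target_pct):
    """Among the 'critical' values whose percentage meets the target,
    return the one closest to the established value (None if none qualify)."""
    acceptable = [c for c, pct in percent_by_critical.items() if pct <= target_pct]
    if not acceptable:
        return None
    return min(acceptable, key=lambda c: abs(c - established))

# Richness percentages quoted in the text (sample size of 100).
genus_richness = {0.80: 44.74, 0.75: 5.3}
family_richness = {0.80: 77.5, 0.75: 22.5, 0.70: 7.5}

TARGET_PCT = 10  # hypothetical acceptance threshold, not taken from the study

print(closest_acceptable(genus_richness, established=0.80, target_pct=TARGET_PCT))
print(closest_acceptable(family_richness, established=0.80, target_pct=TARGET_PCT))

# Example of the tally itself: 3 of 40 hypothetical sites above 0.05.
example_alphas = [0.08] * 3 + [0.02] * 37
print(percent_sites_above(example_alphas))
```

Under the assumed 10% target, this rule selects 0.75 at the genus level and 0.70 at the family level, matching the values discussed above.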
It is noted that changing the "critical" value is not as easy a task as changing the sample size. These values were established by the EPA, and although these numbers suggest statistically that both of the "critical" values need to be changed, doing so involves much more than falls within the scope of this research.

D: Recommendations
The results of this simulation study suggest that for the richness metric, the most efficient method of obtaining the lowest percentage of sites with alpha values exceeding 0.05 would be to change the "critical" value from 0.80 to 0.75 at the genus level with a sample size of 100. This may not be easy to accomplish, since the EPA has already established these "critical" values. However, it is a good indication that further investigation into changing these values may be needed.