“
suffered greater wetland loss than watersheds with smaller surrounding populations. Most watersheds have suffered no or only very modest losses (less than 3 percent during the decade in question), and few watersheds have suffered more than a 4 percent loss. The distribution is thus heavily skewed, with most watersheds clustered at little wetland loss (that is, on the left), and is clearly not normally distributed. To increase normality, the variable is transformed by twice taking the square root, x^0.25. The transformed variable is then normally distributed: the Kolmogorov-Smirnov statistic is 0.82 (p = .51 > .05). The variable also appears visually normal for each of the population subgroups.

There are four population groups, designed to ensure an adequate number of observations in each. Boxplot analysis of the transformed variable indicates four large and three small outliers (not shown). Examination suggests that these are plausible and representative values, which are therefore retained. Later, however, we will examine the effect of these seven observations on the robustness of statistical results.

Descriptive analysis of the variables is shown in Table 13.1. Generally, large populations tend to have larger average wetland losses, but the standard deviations are large relative to (the difference between) these means, raising considerable doubt as to whether these differences are indeed statistically significant. Also, the untransformed variable shows that the mean wetland loss is less among watersheds with “Medium I” populations than in those with “Small” populations (1.77 versus 2.52). The transformed variable shows the opposite order (1.06 versus 0.97). Further investigation shows this to be the effect of the three small outliers and two large outliers on the calculation of the mean of the untransformed variable in the “Small” group. Variable transformation minimizes this effect. These outliers also increase the standard deviation of the “Small” group.
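As a rough illustration of the transformation and normality check described above, the following Python sketch takes the square root twice and recomputes the reported p-value from the reported Kolmogorov-Smirnov statistic. It rests on two assumptions that the text does not confirm: that the reported 0.82 is the Kolmogorov-Smirnov Z (the largest distance between the empirical and normal distributions, scaled by the square root of the sample size), and that the p-value comes from the asymptotic Kolmogorov distribution. The function names and the sample loss values are hypothetical.

```python
import math


def fourth_root(x):
    """Take the square root twice, which equals x ** 0.25 (x must be >= 0)."""
    return math.sqrt(math.sqrt(x))


def ks_asymptotic_p(z, terms=100):
    """Two-sided asymptotic p-value for a Kolmogorov-Smirnov Z statistic.

    Assumption: z is sqrt(n) times the maximum distance D. The series is
    2 * sum_{k>=1} (-1)^(k-1) * exp(-2 * k^2 * z^2).
    """
    if z <= 0:
        return 1.0
    total = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * z * z)
        for k in range(1, terms + 1)
    )
    # Clamp to [0, 1] to guard against rounding at the extremes.
    return max(0.0, min(1.0, 2.0 * total))


# Hypothetical wetland losses in percent, not data from the text.
losses = [0.0, 0.5, 1.0, 2.5, 16.0]
transformed = [fourth_root(v) for v in losses]
print(transformed)

# The statistic reported in the text, under the assumptions stated above.
print(round(ks_asymptotic_p(0.82), 2))  # 0.51
```

Under these assumptions the statistic of 0.82 reproduces the reported p = .51, so the hypothesis of normality is not rejected at the 5 percent level.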
Using ANOVA, we find that the transformed variable has unequal variances across the four groups (Levene’s statistic = 2.83, p = .041 < .05). Visual inspection, shown in Figure 13.2, indicates that differences are not substantial for observations within the group interquartile ranges, the areas indicated by the boxes. The differences seem mostly caused by observations located in the whiskers of the “Small” group, which include the five outliers mentioned earlier. (The other two outliers remain outliers and are shown.) For now, we conclude that no substantial differences in variances exist, but we later test the robustness of this conclusion with consideration of these observations.

[Table 13.1: Variable Transformation]

We now proceed with the ANOVA analysis. First, Table 13.2 shows that the global F-test statistic is 2.91, p = .038 < .05. Thus, at least one pair of means is significantly different. (The term sum of squares is explained in note 1.)

Getting Started: Try ANOVA on some data of your choice.

Second, which pairs are significantly different? We use the Bonferroni post-hoc test because relatively few comparisons are made (there are only four groups). The computer-generated results (not shown in Table 13.2) indicate that the only significant difference concerns the means of the “Small” and “Large” groups. This difference (1.26 - 0.97 = 0.29 [of transformed values]) is significant at the 5 percent level (p = .028). The Tukey and Scheffé tests lead to the same conclusion (respectively, p = .024 and .044). (It should be noted that post-hoc tests also exist for when equal variances are not assumed. In our example, these tests lead to the same result.) This result is consistent with a visual reexamination of Figure 13.2, which shows that differences between group means are indeed small. The Tukey and
”
Evan M. Berman (Essential Statistics for Public Managers and Policy Analysts)
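
The global F-test discussed in the excerpt compares variation between group means with variation within groups. The Python sketch below computes that ratio from raw observations and counts the pairwise comparisons a post-hoc test must cover when there are four groups. The function names and the sample values are hypothetical and are not taken from the book; the Bonferroni helper shows the usual adjustment (multiplying a raw p-value by the number of comparisons), and it is an assumption that the software used in the text reports p-values this way.

```python
import math


def one_way_anova_f(groups):
    """Return (ss_between, ss_within, df_between, df_within, F)."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: group size times squared distance
    # of each group mean from the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group sum of squares: squared distance of each observation
    # from its own group mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat


def bonferroni_adjust(p_raw, n_comparisons):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_raw * n_comparisons)


# Hypothetical observations for three groups, chosen for easy arithmetic.
sample = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
print(one_way_anova_f(sample))

# Four groups, as in the excerpt, give six pairwise comparisons.
print(math.comb(4, 2))  # 6
```

The F statistic is then compared with the F distribution on the two degrees of freedom to obtain a p-value, a step left out here because it needs a statistics library.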