“
COEFFICIENT The nonparametric alternative, Spearman’s rank correlation coefficient (r, or “rho”), looks at correlation among the ranks of the data rather than among the values. The ranks of data are determined as shown in Table 14.2 (adapted from Table 11.8): Table 14.2 Ranks of Two Variables In Greater Depth … Box 14.1 Crime and Poverty An analyst wants to examine empirically the relationship between crime and income in cities across the United States. The CD that accompanies the workbook Exercising Essential Statistics includes a Community Indicators dataset with assorted indicators of conditions in 98 cities such as Akron, Ohio; Phoenix, Arizona; New Orleans, Louisiana; and Seattle, Washington. The measures include median household income, total population (both from the 2000 U.S. Census), and total violent crimes (FBI, Uniform Crime Reporting, 2004). In the sample, household income ranges from $26,309 (Newark, New Jersey) to $71,765 (San Jose, California), and the median household income is $42,316. Per-capita violent crime ranges from 0.15 percent (Glendale, California) to 2.04 percent (Las Vegas, Nevada), and the median violent crime rate per capita is 0.78 percent. There are four types of violent crimes: murder and nonnegligent manslaughter, forcible rape, robbery, and aggravated assault. A measure of total violent crime per capita is calculated because larger cities are apt to have more crime. The analyst wants to examine whether income is associated with per-capita violent crime. The scatterplot of these two continuous variables shows that a negative relationship appears to be present: The Pearson’s correlation coefficient is –.532 (p < .01), and the Spearman’s correlation coefficient is –.552 (p < .01). The simple regression model shows R2 = .283. The regression model is as follows (t-test statistic in parentheses): The regression line is shown on the scatterplot. Interpreting these results, we see that the R-square value of .283 indicates a moderate relationship between these two variables. Clearly, some cities with modest median household incomes have a high crime rate. However, removing these cities does not greatly alter the findings. Also, an assumption of regression is that the error term is normally distributed, and further examination of the error shows that it is somewhat skewed. The techniques for examining the distribution of the error term are discussed in Chapter 15, but again, addressing this problem does not significantly alter the finding that the two variables are significantly related to each other, and that the relationship is of moderate strength. With this result in hand, further analysis shows, for example, by how much violent crime decreases for each increase in household income. For each increase of $10,000 in average household income, the violent crime rate drops 0.25 percent. For a city experiencing the median 0.78 percent crime rate, this would be a considerable improvement, indeed. Note also that the scatterplot shows considerable variation in the crime rate for cities at or below the median household income, in contrast to those well above it. Policy analysts may well wish to examine conditions that give rise to variation in crime rates among cities with lower incomes. Because Spearman’s rank correlation coefficient examines correlation among the ranks of variables, it can also be used with ordinal-level data.9 For the data in Table 14.2, Spearman’s rank correlation coefficient is .900 (p = .035).10 Spearman’s p-squared coefficient has a “percent variation explained” interpretation, similar
”
”
Evan M. Berman (Essential Statistics for Public Managers and Policy Analysts)