Tukey Quotes

We've searched our database for all the quotes and captions related to Tukey. Here they are! All 13 of them:

Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise.
John W. Tukey
It is better to have an approximate answer to the right question than an exact answer to the wrong question. —JOHN TUKEY,
Josh Kaufman (The Personal MBA: Master the Art of Business)
If a thing is not worth doing, it is not worth doing well.
John W. Tukey
I mumbled something about how it was easy to calculate e to any power using that series (you just substitute the power for x). “Oh yeah?” they said, “Well, then, what’s e to the 3.3?” said some joker—I think it was Tukey. I say, “That’s easy. It’s 27.11.” Tukey knows it isn’t so easy to compute all that in your head. “Hey! How’d you do that?” Another guy says, “You know Feynman, he’s just faking it. It’s not really right.” They go to get a table, and while they’re doing that, I put on a few more figures: “27.1126,” I say. They find it in the table. “It’s right! But how’d you do it!” “I just summed the series.” “Nobody can sum the series that fast. You must just happen to know that one. How about e to the 3?” “Look,” I say. “It’s hard work! Only one a day!” “Hah! It’s a fake!” they say, happily. “All right,” I say, “It’s 20.085.
Richard P. Feynman (Surely You're Joking, Mr. Feynman! Adventures of a Curious Character)
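The series Feynman says he summed is the Taylor expansion e^x = 1 + x + x²/2! + x³/3! + …. As a small illustration (not Feynman's actual mental method, which leaned on known logarithms), a few lines of Python reproduce the figures he quotes:

```python
import math

def exp_series(x, terms=30):
    """Approximate e**x by summing the Taylor series: x**n / n! for n = 0, 1, 2, ..."""
    return sum(x**n / math.factorial(n) for n in range(terms))

print(round(exp_series(3.3), 4))  # 27.1126 -- the figure Feynman gives
print(round(exp_series(3.0), 3))  # 20.086  -- e**3, which he quotes as 20.085
print(math.exp(3.3))              # library value, for comparison
```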
The best thing about being a statistician is that you get to play in everyone's backyard
John Wilder Tukey
The term bit (the contraction, by 40 bits, of “binary digit”) was coined by statistician John W. Tukey shortly after he joined von Neumann’s project in November of 1945.
George Dyson (Turing's Cathedral: The Origins of the Digital Universe)
The word bit is a contraction of binary digit that was coined by the statistician John Tukey in the mid 1940s.
Brian W. Kernighan (D Is for Digital)
The picturing of data allows us to be sensitive not only to the multiple hypotheses that we hold, but to the many more we have not yet thought of, regard as unlikely, or think impossible.
Tukey 1974
Beyond craftsmanship lies invention, and it is here that lean, spare, fast programs are born. Almost always these are the result of strategic breakthrough rather than tactical cleverness. Sometimes the strategic breakthrough will be a new algorithm, such as the Cooley-Tukey Fast Fourier Transform or the substitution of an n log n sort for an n² set of comparisons. Much more often, strategic breakthrough will come from redoing the representation of the data or tables. This is where the heart of a program lies.
Frederick P. Brooks Jr. (The Mythical Man-Month: Essays on Software Engineering)
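For readers curious what Brooks's example looks like in code, here is a minimal sketch of the radix-2 Cooley-Tukey recursion in Python. It illustrates the divide-and-conquer idea only; it is not how production FFT libraries are implemented:

```python
import cmath

def fft(x):
    """Radix-2 Cooley-Tukey FFT (recursive). len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])          # divide: transform the even-indexed samples
    odd = fft(x[1::2])           # ...and the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):      # combine with "twiddle factors"
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

print(fft([0.0, 1.0, 0.0, -1.0]))   # expect roughly [0, -2j, 0, 2j]
```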
Scheffe tests also produce "homogeneous subsets," that is, groups that have statistically identical means. Both the three largest and the three smallest populations have identical means. The Tukey levels of statistical significance are, respectively, .725 and .165 (both > .05). This is shown in Table 13.3.

[Figure 13.2 Group Boxplots] [Table 13.2 ANOVA Table]

Third, is the increase in means linear? This test is an option on many statistical software packages that produces an additional line of output in the ANOVA table, called the "linear term for unweighted sum of squares," with the appropriate F-test. Here, that F-test statistic is 7.85, p = .006 < .01, and so we conclude that the apparent linear increase is indeed significant: wetland loss is linearly associated with the increased surrounding population of watersheds.8 Figure 13.2 does not clearly show this, but the enlarged Y-axis in Figure 13.3 does.

Fourth, are our findings robust? One concern is that the statistical validity is affected by observations that statistically (although not substantively) are outliers. Removing the seven outliers identified earlier does not affect our conclusions. The resulting variable remains normally distributed, and there are no (new) outliers for any group. The resulting variable has equal variances across the groups (Levene's test = 1.03, p = .38 > .05). The global F-test is 3.44 (p = .019 < .05), and the Bonferroni post-hoc test similarly finds that only the differences between the "Small" and "Large" group means are significant (p = .031). The increase remains linear (F = 6.74, p = .011 < .05). Thus, we conclude that the presence of observations with large values does not alter our conclusions.

[Table 13.3 Homogeneous Subsets] [Figure 13.3 Watershed Loss, by Population]

We also test the robustness of conclusions for different variable transformations. The extreme skewness of the untransformed variable allows for only a limited range of root transformations that produce normality. Within this range (power 0.222 through 0.275), the preceding conclusions are replicated fully. Natural log and base-10 log transformations also result in normality and replicate these results, except that the post-hoc tests fail to identify that the means of the "Large" and "Small" groups are significantly different. However, the global F-test is (marginally) significant (F = 2.80, p = .043 < .05), which suggests that this difference is too small to detect with this transformation. A single, independent-samples t-test for this difference is significant (t = 2.47, p = .017 < .05), suggesting that this problem may have been exacerbated by the limited number of observations. In sum, we find converging evidence for our conclusions. As this example also shows, when using statistics, analysts frequently must exercise judgment and justify their decisions.9

Finally, what is the practical significance of this analysis? The wetland loss among watersheds with large surrounding populations is [(3.21 – 2.52)/2.52 =] 27.4 percent greater than among those surrounded by small populations. It is up to managers and elected officials to determine whether a difference of this magnitude warrants intervention in watersheds with large surrounding populations.10
Evan M. Berman (Essential Statistics for Public Managers and Policy Analysts)
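The workflow this excerpt describes (a global F-test followed by Tukey pairwise comparisons) can be reproduced with standard tools. The sketch below uses simulated data: the group means (0.97, 1.06, 1.26) merely echo the transformed values quoted above, and the group sizes and spreads are invented for illustration.

```python
import numpy as np
from scipy.stats import f_oneway
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Simulated stand-in data (not the book's actual dataset)
rng = np.random.default_rng(0)
small  = rng.normal(0.97, 0.30, 30)
medium = rng.normal(1.06, 0.30, 30)
large  = rng.normal(1.26, 0.30, 30)

# Global F-test: is at least one group mean different?
f_stat, p_value = f_oneway(small, medium, large)
print(f"global F = {f_stat:.2f}, p = {p_value:.3f}")

# Tukey HSD post-hoc test: which pairs differ, holding the
# experiment-wide error rate at .05?
values = np.concatenate([small, medium, large])
groups = ["small"] * 30 + ["medium"] * 30 + ["large"] * 30
print(pairwise_tukeyhsd(values, groups, alpha=0.05))
```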
The Scheffe test is the most conservative, the Tukey test is best when many comparisons are made (when there are many groups), and the Bonferroni test is preferred when few comparisons are made. However, these post-hoc tests often support the same conclusions.3 To illustrate, let's say the independent variable has three categories. Then, a post-hoc test will examine hypotheses for whether μ1 = μ2, μ1 = μ3, and μ2 = μ3. In addition, these tests will also examine which categories have means that are not significantly different from each other, hence providing homogeneous subsets. An example of this approach is given later in this chapter. Knowing such subsets can be useful when the independent variable has many categories (for example, classes of employees).

[Figure 13.1 ANOVA: Significant and Insignificant Differences]

Eta-squared (η²) is a measure of association for mixed nominal-interval variables and is appropriate for ANOVA. Its values range from zero to one, and it is interpreted as the percentage of variation explained. It is a directional measure, and computer programs produce two statistics, alternating specification of the dependent variable.

Finally, ANOVA can be used for testing interval-ordinal relationships. We can ask whether the change in means follows a linear pattern that is either increasing or decreasing. For example, assume we want to know whether incomes increase according to the political orientation of respondents, when measured on a seven-point Likert scale that ranges from very liberal to very conservative. If a linear pattern of increase exists, then a linear relationship is said to exist between these variables. Most statistical software packages can test for a variety of progressive relationships.

ANOVA Assumptions

ANOVA assumptions are essentially the same as those of the t-test: (1) the dependent variable is continuous, and the independent variable is ordinal or nominal, (2) the groups have equal variances, (3) observations are independent, and (4) the variable is normally distributed in each of the groups. The assumptions are tested in a similar manner.

Relative to the t-test, ANOVA requires a little more concern regarding the assumptions of normality and homogeneity. First, like the t-test, ANOVA is not robust for the presence of outliers, and analysts examine the presence of outliers for each group. Also, ANOVA appears to be less robust than the t-test for deviations from normality. Second, regarding groups having equal variances, our main concern with homogeneity is that there are no substantial differences in the amount of variance across the groups; the test of homogeneity is a strict test, testing for any departure from equal variances, and in practice, groups may have neither equal variances nor substantial differences in the amount of variances. In these instances, a visual finding of no substantial differences suffices. Other strategies for dealing with heterogeneity are variable transformations and the removal of outliers, which increase variance, especially in small groups. Such outliers are detected by examining boxplots for each group separately. Also, some statistical software packages (such as SPSS) now offer post-hoc tests when equal variances are not assumed.4

A Working Example

The U.S. Environmental Protection Agency (EPA) measured the percentage of wetland loss in watersheds between 1982 and 1992, the most recent period for which data are available (government statistics are sometimes a little old).5 An analyst wants to know whether watersheds with large surrounding populations have
Evan M. Berman (Essential Statistics for Public Managers and Policy Analysts)
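As a rough illustration of the eta-squared measure mentioned in this excerpt, the following sketch computes η² = SS_between / SS_total for a one-way design; the three small groups of values are made up purely to show the calculation.

```python
import numpy as np

def eta_squared(*groups):
    """Eta-squared for a one-way design: between-group sum of squares
    divided by the total sum of squares ("variation explained")."""
    all_values = np.concatenate([np.asarray(g, dtype=float) for g in groups])
    grand_mean = all_values.mean()
    ss_total = ((all_values - grand_mean) ** 2).sum()
    ss_between = sum(len(g) * (np.mean(g) - grand_mean) ** 2 for g in groups)
    return ss_between / ss_total

# Made-up groups for illustration only
g1 = [1.0, 1.2, 0.9, 1.1]
g2 = [1.4, 1.6, 1.5, 1.3]
g3 = [2.0, 2.2, 1.9, 2.1]
print(f"eta-squared = {eta_squared(g1, g2, g3):.2f}")
```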
categorical and the dependent variable is continuous. The logic of this approach is shown graphically in Figure 13.1. The overall group mean is the grand mean (the mean of means). The boxplots represent the scores of observations within each group. (As before, the horizontal lines indicate means, rather than medians.) Recall that variance is a measure of dispersion. In both parts of the figure, w is the within-group variance, and b is the between-group variance. Each graph has three within-group variances and three between-group variances, although only one of each is shown. Note in part A that the between-group variances are larger than the within-group variances, which results in a large F-test statistic using the above formula, making it easier to reject the null hypothesis. Conversely, in part B the within-group variances are larger than the between-group variances, causing a smaller F-test statistic and making it more difficult to reject the null hypothesis.

The hypotheses are written as follows:

H0: No differences between any of the group means exist in the population.
HA: At least one difference between group means exists in the population.

Note how the alternate hypothesis is phrased, because the logical opposite of "no differences between any of the group means" is that at least one pair of means differs. H0 is also called the global F-test because it tests for differences among any means. The formulas for calculating the between-group variances and within-group variances are quite cumbersome for all but the simplest of designs.1 In any event, statistical software calculates the F-test statistic and reports the level at which it is significant.2

When the preceding null hypothesis is rejected, analysts will also want to know which differences are significant. For example, analysts will want to know which pairs of differences in watershed pollution are significant across regions. Although one approach might be to use the t-test to sequentially test each pair of differences, this should not be done. It would not only be a most tedious undertaking but would also inadvertently and adversely affect the level of significance: the chance of finding a significant pair by chance alone increases as more pairs are examined. Specifically, the probability of rejecting the null hypothesis in one of two tests is [1 – 0.95² =] .098, the probability of rejecting it in one of three tests is [1 – 0.95³ =] .143, and so forth. Thus, sequential testing of differences does not reflect the true level of significance for such tests and should not be used.

Post-hoc tests test all possible group differences and yet maintain the true level of significance. Post-hoc tests vary in their methods of calculating test statistics and holding experiment-wide error rates constant. Three popular post-hoc tests are the Tukey, Bonferroni, and Scheffe tests.
Evan M. Berman (Essential Statistics for Public Managers and Policy Analysts)
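The familywise-error arithmetic in this excerpt ([1 – 0.95² =] .098 for two tests, [1 – 0.95³ =] .143 for three) can be checked in a couple of lines, assuming independent tests each run at α = .05:

```python
# With k independent tests at alpha = .05, the chance of at least one
# false rejection is 1 - 0.95**k, as in the excerpt's bracketed figures.
for k in (1, 2, 3, 5, 10):
    print(f"{k:>2} tests: P(at least one false rejection) = {1 - 0.95**k:.3f}")
# 2 tests -> 0.098 and 3 tests -> 0.143, matching the values quoted above
```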
suffered greater wetland loss than watersheds with smaller surrounding populations. Most watersheds have suffered no or only very modest losses (less than 3 percent during the decade in question), and few watersheds have suffered more than a 4 percent loss. The distribution is thus heavily skewed toward watersheds with little wetland losses (that is, to the left) and is clearly not normally distributed.6 To increase normality, the variable is transformed by twice taking the square root, x^0.25. The transformed variable is then normally distributed: the Kolmogorov-Smirnov statistic is 0.82 (p = .51 > .05). The variable also appears visually normal for each of the population subgroups. There are four population groups, designed to ensure an adequate number of observations in each.

Boxplot analysis of the transformed variable indicates four large and three small outliers (not shown). Examination suggests that these are plausible and representative values, which are therefore retained. Later, however, we will examine the effect of these seven observations on the robustness of statistical results.

Descriptive analysis of the variables is shown in Table 13.1. Generally, large populations tend to have larger average wetland losses, but the standard deviations are large relative to (the difference between) these means, raising considerable question as to whether these differences are indeed statistically significant. Also, the untransformed variable shows that the mean wetland loss is less among watersheds with "Medium I" populations than in those with "Small" populations (1.77 versus 2.52). The transformed variable shows the opposite order (1.06 versus 0.97). Further investigation shows this to be the effect of the three small outliers and two large outliers on the calculation of the mean of the untransformed variable in the "Small" group. Variable transformation minimizes this effect. These outliers also increase the standard deviation of the "Small" group.

Using ANOVA, we find that the transformed variable has unequal variances across the four groups (Levene's statistic = 2.83, p = .041 < .05). Visual inspection, shown in Figure 13.2, indicates that differences are not substantial for observations within the group interquartile ranges, the areas indicated by the boxes. The differences seem mostly caused by observations located in the whiskers of the "Small" group, which include the five outliers mentioned earlier. (The other two outliers remain outliers and are shown.) For now, we conclude that no substantial differences in variances exist, but we later test the robustness of this conclusion with consideration of these observations (see Figure 13.2).

[Table 13.1 Variable Transformation]

We now proceed with the ANOVA analysis. First, Table 13.2 shows that the global F-test statistic is 2.91, p = .038 < .05. Thus, at least one pair of means is significantly different. (The term sum of squares is explained in note 1.)

Getting Started: Try ANOVA on some data of your choice.

Second, which pairs are significantly different? We use the Bonferroni post-hoc test because relatively few comparisons are made (there are only four groups). The computer-generated results (not shown in Table 13.2) indicate that the only significant difference concerns the means of the "Small" and "Large" groups. This difference (1.26 - 0.97 = 0.29 [of transformed values]) is significant at the 5 percent level (p = .028). The Tukey and Scheffe tests lead to the same conclusion (respectively, p = .024 and .044). (It should be noted that post-hoc tests also exist for when equal variances are not assumed. In our example, these tests lead to the same result.7) This result is consistent with a visual reexamination of Figure 13.2, which shows that differences between group means are indeed small. The Tukey and
Evan M. Berman (Essential Statistics for Public Managers and Policy Analysts)
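A minimal sketch of the preprocessing steps this excerpt walks through (the fourth-root transformation, a normality check, and Levene's test for equal variances) might look like the following; the skewed data and the four-way grouping are fabricated stand-ins for the EPA wetland-loss variable, used only to show the calls.

```python
import numpy as np
from scipy import stats

# Fabricated right-skewed stand-in for the wetland-loss variable
rng = np.random.default_rng(1)
loss = rng.exponential(scale=2.0, size=200)

# "Twice taking the square root" is a fourth-root (power 0.25) transformation
transformed = loss ** 0.25

# Normality check on the standardized, transformed variable
z = (transformed - transformed.mean()) / transformed.std(ddof=1)
print(stats.kstest(z, "norm"))

# Levene's test for equal variances across (arbitrarily split) groups
groups = np.array_split(transformed, 4)
print(stats.levene(*groups))
```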