Statistical Inference Quotes

We've searched our database for all the quotes and captions related to Statistical Inference. Here they are! All 42 of them:

Statistical inference is really just the marriage of two concepts that we’ve already discussed: data and probability (with a little help from the central limit theorem).
Charles Wheelan (Naked Statistics: Stripping the Dread from the Data)
I have stressed this distinction because it is an important one. It defines the fundamental difference between probability and statistics: the former concerns predictions based on fixed probabilities; the latter concerns the inference of those probabilities based on observed data.
Leonard Mlodinow
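A minimal sketch of the distinction Mlodinow draws, in Python (the coin, its bias, and the confidence interval are illustrative assumptions, not from his book): probability runs forward from a known p to predicted data, while statistics runs backward from observed data to an estimate of p.

```python
import numpy as np

rng = np.random.default_rng(0)

# Probability: the coin's bias is fixed and known; we predict the data.
p_true = 0.6
flips = rng.random(1000) < p_true            # simulate 1,000 flips
print("predicted share of heads:", p_true)
print("simulated share of heads:", flips.mean())

# Statistics: only the data are known; we infer the probability behind them.
p_hat = flips.mean()                          # point estimate of the unknown p
se = np.sqrt(p_hat * (1 - p_hat) / flips.size)
print(f"inferred p: {p_hat:.3f} +/- {1.96 * se:.3f} (approximate 95% CI)")
```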
Who needs theory when you have so much information? But this is categorically the wrong attitude to take toward forecasting, especially in a field like economics where the data is so noisy. Statistical inferences are much stronger when backed up by theory or at least some deeper thinking about their root causes.
Nate Silver (The Signal and the Noise: Why So Many Predictions Fail-but Some Don't)
Fighting for the acceptance of Bayesian networks in AI was a picnic compared with the fight I had to wage for causal diagrams [in the stormy waters of statistics].
Judea Pearl (The Book of Why: The New Science of Cause and Effect)
We now have many statistical software packages. Their power is incredible, but the pioneers of statistical inference would have mixed feelings, for they always insisted that people think before using a routine. In the old days routines took endless hours to apply, so one had to spend a lot of time thinking in order to justify using a routine. Now one enters data and presses a button. One result is that people seem to be cowed into not asking silly questions, such as: What hypothesis are you testing? What distribution is it that you say is not normal? What population are you talking about? Where did this base rate come from? Most important of all: Whose judgments do you use to calibrate scores on your questionnaires? Are those judgments generally agreed to by the qualified experts in the entire community?
Ian Hacking (Rewriting the Soul: Multiple Personality and the Sciences of Memory)
To teach students any psychology they did not know before, you must surprise them. But which surprise will do? Nisbett and Borgida found that when they presented their students with a surprising statistical fact, the students managed to learn nothing at all. But when the students were surprised by individual cases…they immediately made the generalization… Nisbett and Borgida summarize the results in a memorable sentence: ‘Subjects’ unwillingness to deduce the particular from the general was matched only by their willingness to infer the general from the particular.
Daniel Kahneman (Thinking, Fast and Slow)
We may at once admit that any inference from the particular to the general must be attended with some degree of uncertainty, but this is not the same as to admit that such inference cannot be absolutely rigorous, for the nature and degree of the uncertainty may itself be capable of rigorous expression.
Sir Ronald Fisher
How do we learn? Is there a better way? What can we predict? Can we trust what we’ve learned? Rival schools of thought within machine learning have very different answers to these questions. The main ones are five in number, and we’ll devote a chapter to each. Symbolists view learning as the inverse of deduction and take ideas from philosophy, psychology, and logic. Connectionists reverse engineer the brain and are inspired by neuroscience and physics. Evolutionaries simulate evolution on the computer and draw on genetics and evolutionary biology. Bayesians believe learning is a form of probabilistic inference and have their roots in statistics. Analogizers learn by extrapolating from similarity judgments and are influenced by psychology and mathematical optimization.
Pedro Domingos (The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World)
To teach students any psychology they did not know before, you must surprise them. But which surprise will do? Nisbett and Borgida found that when they presented their students with a surprising statistical fact, the students managed to learn nothing at all. But when the students were surprised by individual cases—two nice people who had not helped—they immediately made the generalization and inferred that helping is more difficult than they had thought.
Daniel Kahneman (Thinking, Fast and Slow)
Subjects’ unwillingness to deduce the particular from the general was matched only by their willingness to infer the general from the particular. This is a profoundly important conclusion. People who are taught surprising statistical facts about human behavior may be impressed to the point of telling their friends about what they have heard, but this does not mean that their understanding of the world has really changed. The test of learning psychology is whether your understanding of situations you encounter has changed, not whether you have learned a new fact.
Daniel Kahneman (Thinking, Fast and Slow)
Due to the various pragmatic obstacles, it is rare for a mission-critical analysis to be done in the “fully Bayesian” manner, i.e., without the use of tried-and-true frequentist tools at the various stages. Philosophy and beauty aside, the reliability and efficiency of the underlying computations required by the Bayesian framework are the main practical issues. A central technical issue at the heart of this is that it is much easier to do optimization (reliably and efficiently) in high dimensions than it is to do integration in high dimensions. Thus the workhorse machine learning methods, while there are ongoing efforts to adapt them to Bayesian framework, are almost all rooted in frequentist methods. A work-around is to perform MAP inference, which is optimization based. Most users of Bayesian estimation methods, in practice, are likely to use a mix of Bayesian and frequentist tools. The reverse is also true—frequentist data analysts, even if they stay formally within the frequentist framework, are often influenced by “Bayesian thinking,” referring to “priors” and “posteriors.” The most advisable position is probably to know both paradigms well, in order to make informed judgments about which tools to apply in which situations.
Jake VanderPlas (Statistics, Data Mining, and Machine Learning in Astronomy: A Practical Python Guide for the Analysis of Survey Data (Princeton Series in Modern Observational Astronomy, 1))
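A toy illustration of the optimization-versus-integration point VanderPlas makes (the Gaussian model, prior, and data here are assumptions for the sketch, not from the book): the MAP estimate comes from optimizing the log posterior, while the fully Bayesian posterior mean requires integrating it, done here by brute force on a grid because the problem is one-dimensional.

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(1)
data = rng.normal(loc=2.0, scale=1.0, size=50)     # toy data with known noise scale

def neg_log_posterior(mu):
    # Gaussian prior N(0, 10) on mu, Gaussian likelihood with sigma = 1
    return -(stats.norm.logpdf(mu, 0, 10) + stats.norm.logpdf(data, mu, 1).sum())

# MAP inference: an optimization problem (cheap, scales to high dimensions).
map_estimate = optimize.minimize_scalar(neg_log_posterior).x

# Fully Bayesian posterior mean: an integration problem (here a 1-D grid;
# in high dimensions this integration is the hard part).
grid = np.linspace(-5, 5, 2001)
log_post = np.array([-neg_log_posterior(m) for m in grid])
weights = np.exp(log_post - log_post.max())
posterior_mean = (grid * weights).sum() / weights.sum()

print(f"MAP estimate:   {map_estimate:.3f}")
print(f"posterior mean: {posterior_mean:.3f}")
```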
Modern statistics is built on the idea of models — probability models in particular. [...] The standard approach to any new problem is to identify the sources of variation, to describe those sources by probability distributions and then to use the model thus created to estimate, predict or test hypotheses about the undetermined parts of that model. […] A statistical model involves the identification of those elements of our problem which are subject to uncontrolled variation and a specification of that variation in terms of probability distributions. Therein lies the strength of the statistical approach and the source of many misunderstandings. Paradoxically, misunderstandings arise both from the lack of an adequate model and from over-reliance on a model. […] At one level is the failure to recognise that there are many aspects of a model which cannot be tested empirically. At a higher level is the failure to recognise that any model is, necessarily, an assumption in itself. The model is not the real world itself but a representation of that world as perceived by ourselves. This point is emphasised when, as may easily happen, two or more models make exactly the same predictions about the data. Even worse, two models may make predictions which are so close that no data we are ever likely to have can ever distinguish between them. […] All model-dependent inference is necessarily conditional on the model. This stricture needs, especially, to be borne in mind when using Bayesian methods. Such methods are totally model-dependent and thus all are vulnerable to this criticism. The problem can apparently be circumvented, of course, by embedding the model in a larger model in which any uncertainties are, themselves, expressed in probability distributions. However, in doing this we are embarking on a potentially infinite regress which quickly gets lost in a fog of uncertainty.
David J. Bartholomew (Unobserved Variables: Models and Misunderstandings (SpringerBriefs in Statistics))
The radial patterning of Protestantism allows us to use a county’s proximity to Wittenberg to isolate—in a statistical sense—that part of the variation in Protestantism that we know is due to a county’s proximity to Wittenberg and not to greater literacy or other factors. In a sense, we can think of this as an experiment in which different counties were experimentally assigned different dosages of Protestantism to test for its effects. Distance from Wittenberg allows us to figure out how big that experimental dosage was. Then, we can see if this “assigned” dosage of Protestantism is still associated with greater literacy and more schools. If it is, we can infer from this natural experiment that Protestantism did indeed cause greater literacy.16 The results of this statistical razzle-dazzle are striking. Not only do Prussian counties closer to Wittenberg have higher shares of Protestants, but those additional Protestants are associated with greater literacy and more schools. This indicates that the wave of Protestantism created by the Reformation raised literacy and schooling rates in its wake. Despite Prussia’s having a high average literacy rate in 1871, counties made up entirely of Protestants had literacy rates nearly 20 percentile points higher than those that were all Catholic.18
[Figure P.2: The percentage of Protestants in Prussian counties in 1871.17 The map highlights some German cities, including the epicenter of the Reformation, Wittenberg, and Mainz, the charter town where Johannes Gutenberg produced his eponymous printing press.]
These same patterns can be spotted elsewhere in 19th-century Europe—and today—in missionized regions around the globe. In 19th-century Switzerland, other aftershocks of the Reformation have been detected in a battery of cognitive tests given to Swiss army recruits. Young men from all-Protestant districts were not only 11 percentile points more likely to be “high performers” on reading tests compared to those from all-Catholic districts, but this advantage bled over into their scores in math, history, and writing. These relationships hold even when a district’s population density, fertility, and economic complexity are kept constant. As in Prussia, the closer a community was to one of the two epicenters of the Swiss Reformation—Zurich or Geneva—the more Protestants it had in the 19th century. Notably, proximity to other Swiss cities, such as Bern and Basel, doesn’t reveal this relationship. As is the case in Prussia, this setup allows us to finger Protestantism as driving the spread of greater literacy as well as the smaller improvements in writing and math abilities.
Joseph Henrich (The WEIRDest People in the World: How the West Became Psychologically Peculiar and Particularly Prosperous)
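The "statistical razzle-dazzle" Henrich describes is, in essence, an instrumental-variables (two-stage least squares) design. The sketch below uses entirely simulated counties, distances, and effect sizes (none of Henrich's actual data or numbers) to show the mechanics: distance stands in for the "assigned dosage" of Protestantism, and the second stage recovers the causal effect even though an unobserved factor confounds the naive comparison.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000

# Simulated counties: distance to Wittenberg acts as the instrument.
distance = rng.uniform(0, 1000, n)
confound = rng.normal(size=n)                       # unobserved factor
protestant = 0.9 - 0.0008 * distance + 0.1 * confound + rng.normal(scale=0.05, size=n)
literacy = 0.4 + 0.2 * protestant + 0.15 * confound + rng.normal(scale=0.05, size=n)

# Stage 1: predict the Protestant share from distance alone ("assigned dosage").
X1 = np.column_stack([np.ones(n), distance])
stage1 = np.linalg.lstsq(X1, protestant, rcond=None)[0]
dosage = X1 @ stage1

# Stage 2: regress literacy on the assigned dosage rather than the raw share.
X2 = np.column_stack([np.ones(n), dosage])
stage2 = np.linalg.lstsq(X2, literacy, rcond=None)[0]

# Naive OLS for comparison, which picks up the confound.
X3 = np.column_stack([np.ones(n), protestant])
naive = np.linalg.lstsq(X3, literacy, rcond=None)[0]

print(f"naive OLS effect: {naive[1]:.3f}")
print(f"IV (2SLS) effect: {stage2[1]:.3f}   (true effect set to 0.2)")
```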
examined the cures from cancer that resulted from a visit to Lourdes in France, where people were healed by simple contact with the holy waters, and found out the interesting fact that, of the total cancer patients who visited the place, the cure rate was, if anything, lower than the statistical one for spontaneous remissions. It was lower than the average for those who did not go to Lourdes! Should a statistician infer here that cancer patients’ odds of surviving deteriorates after a visit to Lourdes?
Nassim Nicholas Taleb (Fooled by Randomness: The Hidden Role of Chance in Life and in the Markets (Incerto, #1))
the idea of reverse inference is really not very different from the concept of decoding that was seen in the work of Jim Haxby, and you would be correct: in each case we are using neuroimaging data to try to infer the mental state of an individual. The main difference is that the reverse inference that I ridiculed from the New York Times was based not on a formal statistical model but rather on the researcher’s own judgment. However, it is possible to develop statistical models that can let us quantify exactly how well we can decode what a person is thinking about from fMRI data,
Russell A. Poldrack (The New Mind Readers: What Neuroimaging Can and Cannot Reveal about Our Thoughts)
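Poldrack's point can be made concrete with a small decoding simulation. This is a hypothetical sketch on made-up data (the trial counts, voxel counts, and condition labels are invented, and it uses scikit-learn rather than any neuroimaging-specific tool): a cross-validated classifier gives a quantitative answer to "how well can the mental state be read off the data," instead of leaving the reverse inference to the researcher's judgment.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)

# Made-up "fMRI" data: 200 trials x 50 voxels; the two mental states differ
# slightly in the mean activity of the first 10 voxels.
n_trials, n_voxels = 200, 50
state = rng.integers(0, 2, n_trials)                 # e.g., 0 = faces, 1 = places
X = rng.normal(size=(n_trials, n_voxels))
X[:, :10] += 0.5 * state[:, None]

# Cross-validated decoding accuracy: a formal measure of how well the data
# support an inference about the mental state.
accuracy = cross_val_score(LogisticRegression(max_iter=1000), X, state, cv=5)
print(f"decoding accuracy: {accuracy.mean():.2f} (chance = 0.50)")
```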
Probability is orderly opinion and inference from data is nothing other than the revision of such opinion in the light of relevant new information.
Eliezer S. Yudkowsky
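Read as a recipe, the quote is Bayes' rule: an orderly prior opinion revised by data into a posterior. A minimal Beta-Binomial sketch (the prior and the 14-out-of-20 data are arbitrary choices for illustration):

```python
from scipy import stats

# Opinion before the data: a Beta(2, 2) prior on an unknown proportion.
a_prior, b_prior = 2, 2

# Relevant new information: 14 successes in 20 trials.
successes, trials = 14, 20

# Revised opinion: the conjugate Beta posterior.
a_post = a_prior + successes
b_post = b_prior + (trials - successes)

print("prior mean:     ", a_prior / (a_prior + b_prior))
print("posterior mean: ", round(a_post / (a_post + b_post), 3))
print("95% credible interval:", stats.beta.ppf([0.025, 0.975], a_post, b_post).round(3))
```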
Helmholtz called this concept of vision “unconscious inference,” where inference refers to the idea that the brain conjectures what might be out there, and unconscious reminds us that we have no awareness of the process. We have no access to the rapid and automatic machinery that gathers and estimates the statistics of the world. We’re merely the beneficiaries riding on top of the machinery, enjoying the play of light and shadows.
David Eagleman (Incognito: The Secret Lives of the Brain)
But everything, absolutely all the library work, had also been data. Collectible information that could be assessed and analyzed, that inferences could be made from. Some might argue that information and data, numbers and charts and statistics, aren't concerned with what feels "good" or "bad" (or any number of things in between), but I disagree. All data is tied back to emotions - to some original question, concern, desire, hypothesis that can be traced back to the feelings of a researcher, or a scientist, or whoever formed a hypothesis, asked a question, became interested in measuring something, tried to solve a problem, or cure a virus, and so forth.
Amanda Oliver (Overdue: Reckoning with the Public Library)
Excellence in Statistics: Rigor
Statisticians are specialists in coming to conclusions beyond your data safely—they are your best protection against fooling yourself in an uncertain world. To them, inferring something sloppily is a greater sin than leaving your mind a blank slate, so expect a good statistician to put the brakes on your exuberance. They care deeply about whether the methods applied are right for the problem and they agonize over which inferences are valid from the information at hand. The result? A perspective that helps leaders make important decisions in a risk-controlled manner. In other words, they use data to minimize the chance that you’ll come to an unwise conclusion.
Excellence in Machine Learning: Performance
You might be an applied machine-learning/AI engineer if your response to “I bet you couldn’t build a model that passes testing at 99.99999% accuracy” is “Watch me.” With the coding chops to build both prototypes and production systems that work and the stubborn resilience to fail every hour for several years if that’s what it takes, machine-learning specialists know that they won’t find the perfect solution in a textbook. Instead, they’ll be engaged in a marathon of trial and error. Having great intuition for how long it’ll take them to try each new option is a huge plus and is more valuable than an intimate knowledge of how the algorithms work (though it’s nice to have both). Performance means more than clearing a metric—it also means reliable, scalable, and easy-to-maintain models that perform well in production. Engineering excellence is a must. The result? A system that automates a tricky task well enough to pass your statistician’s strict testing bar and deliver the audacious performance a business leader demands.
Wide Versus Deep
What the previous two roles have in common is that they both provide high-effort solutions to specific problems. If the problems they tackle aren’t worth solving, you end up wasting their time and your money. A frequent lament among business leaders is, “Our data science group is useless.” And the problem usually lies in an absence of analytics expertise. Statisticians and machine-learning engineers are narrow-and-deep workers—the shape of a rabbit hole, incidentally—so it’s really important to point them at problems that deserve the effort. If your experts are carefully solving the wrong problems, your investment in data science will suffer low returns. To ensure that you can make good use of narrow-and-deep experts, you either need to be sure you already have the right problem or you need a wide-and-shallow approach to finding one.
Harvard Business Review (Strategic Analytics: The Insights You Need from Harvard Business Review (HBR Insights Series))
The null hypothesis of normality is that the variable is normally distributed: thus, we do not want to reject the null hypothesis. A problem with statistical tests of normality is that they are very sensitive to small samples and minor deviations from normality. The extreme sensitivity of these tests implies the following: whereas failure to reject the null hypothesis indicates normal distribution of a variable, rejecting the null hypothesis does not indicate that the variable is not normally distributed. It is acceptable to consider variables as being normally distributed when they visually appear to be so, even when the null hypothesis of normality is rejected by normality tests. Of course, variables are preferred that are supported by both visual inspection and normality tests.
In Greater Depth … Box 12.1 Why Normality?
The reasons for the normality assumption are twofold: First, the features of the normal distribution are well-established and are used in many parametric tests for making inferences and hypothesis testing. Second, probability theory suggests that random samples will often be normally distributed, and that the means of these samples can be used as estimates of population means. The latter reason is informed by the central limit theorem, which states that an infinite number of relatively large samples will be normally distributed, regardless of the distribution of the population. An infinite number of samples is also called a sampling distribution. The central limit theorem is usually illustrated as follows. Assume that we know the population distribution, which has only six data elements with the following values: 1, 2, 3, 4, 5, or 6. Next, we write each of these six numbers on a separate sheet of paper, and draw repeated samples of three numbers each (that is, n = 3). We
Evan M. Berman (Essential Statistics for Public Managers and Policy Analysts)
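Berman's chalkboard illustration of the central limit theorem is easy to run. A sketch that samples (with replacement, an assumption made here for simplicity) from the six-element population and looks at the distribution of the sample means:

```python
import numpy as np

rng = np.random.default_rng(4)

# The six-element population from the illustration: 1, 2, 3, 4, 5, 6.
population = np.array([1, 2, 3, 4, 5, 6])

# Draw many samples of n = 3 and record each sample mean.
sample_means = rng.choice(population, size=(100_000, 3)).mean(axis=1)

print("population mean:          ", population.mean())                 # 3.5
print("mean of the sample means: ", sample_means.mean().round(3))
print("std. error of the mean:   ", sample_means.std().round(3))
# The sample means cluster symmetrically around 3.5 in a roughly bell-shaped
# pile, even though the population itself is flat (uniform on 1..6).
```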
Chapter 3), the resulting index variables typically are continuous as well. When variables are continuous, we should not recode them as categorical variables just to use the techniques of the previous chapters. Continuous variables provide valuable information about distances between categories and often have a broader range of values than ordinal variables. Recoding continuous variables as categorical variables is discouraged because it results in a loss of information; we should use tests such as the t-test. Statistics involving continuous variables usually require more test assumptions. Many of these tests are referred to as parametric statistics; this term refers to the fact that they make assumptions about the distribution of data and also that they are used to make inferences about population parameters. Formally, the term parametric means that a test makes assumptions about the distribution of the underlying population. Parametric tests have more test assumptions than nonparametric tests, most typically that the variable is continuous and normally distributed (see Chapter 7). These and other test assumptions are also part of t-tests. This chapter focuses on three common t-tests: for independent samples, for dependent (paired) samples, and the one-sample t-test. For each, we provide examples and discuss test assumptions. This chapter also discusses nonparametric alternatives to t-tests, which analysts will want to consider when t-test assumptions cannot be met for their variables. As a general rule, a bias exists toward using parametric tests because they are more powerful than nonparametric tests. Nonparametric alternatives to parametric tests often transform continuous testing variables into other types of variables, such as rankings, which reduces information about them. Although nonparametric statistics are easier to use because they have fewer assumptions, parametric tests are more likely to find statistical evidence that two variables are associated; their tests often have lower p-values than nonparametric statistics.1
Evan M. Berman (Essential Statistics for Public Managers and Policy Analysts)
second variable, we find that Z = 2.103, p = .035. This value is larger than that obtained by the parametric test, p = .019.21 SUMMARY When analysts need to determine whether two groups have different means of a continuous variable, the t-test is the tool of choice. This situation arises, for example, when analysts compare measurements at two points in time or the responses of two different groups. There are three common t-tests, involving independent samples, dependent (paired) samples, and the one-sample t-test. T-tests are parametric tests, which means that variables in these tests must meet certain assumptions, notably that they are normally distributed. The requirement of normally distributed variables follows from how parametric tests make inferences. Specifically, t-tests have four assumptions: One variable is continuous, and the other variable is dichotomous. The two distributions have equal variances. The observations are independent. The two distributions are normally distributed. The assumption of homogeneous variances does not apply to dependent-samples and one-sample t-tests because both are based on only a single variable for testing significance. When assumptions of normality are not met, variable transformation may be used. The search for alternative ways for dealing with normality problems may lead analysts to consider nonparametric alternatives. The chief advantage of nonparametric tests is that they do not require continuous variables to be normally distributed. The chief disadvantage is that they yield higher levels of statistical significance, making it less likely that the null hypothesis may be rejected. A nonparametric alternative for the independent-samples t-test is the Mann-Whitney test, and the nonparametric alternative for the dependent-samples t-test is the Wilcoxon
Evan M. Berman (Essential Statistics for Public Managers and Policy Analysts)
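A small sketch of the parametric/nonparametric pairing described above, using SciPy on made-up data (the group sizes, means, and spreads are arbitrary): the same two groups are compared with an independent-samples t-test and with its nonparametric alternative, the Mann-Whitney test.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# Two independent groups with a modest difference in means.
group_a = rng.normal(loc=10.0, scale=2.0, size=30)
group_b = rng.normal(loc=11.2, scale=2.0, size=30)

# Parametric: independent-samples t-test (assumes normally distributed variables).
t_stat, t_p = stats.ttest_ind(group_a, group_b)

# Nonparametric alternative: Mann-Whitney test, which works on ranks.
u_stat, u_p = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

print(f"t-test:       t = {t_stat:.3f}, p = {t_p:.4f}")
print(f"Mann-Whitney: U = {u_stat:.1f}, p = {u_p:.4f}")
```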
In early 2014, the Department of Justice and Education issued guidelines pressuring public school districts to adopt racial quotas when disciplining children. The basis for this guidance was studies showing that black children were over three times more likely to face serious punishment--suspension or expulsion--for misbehaving at school. The government concluded that school districts were engaging in massive illegal discrimination against black students. In fact, however, the government had no basis for its conclusion. The Supreme Court has explicitly stated that racial disparities in punishment do not by themselves prove discrimination, as they may just be consistent with the underlying rates of misbehavior by each group. There are no valid statistics (and the government hasn't cited any) from which one can infer that black students and white students would be expected to engage in serious misbehavior in school at the same rate. Unless there is some reason to expect kids to behave completely differently at school than outside of it, the school discipline figures are in line with what one would expect. African-American minors are arrested outside of school for violent crime at a rate approximately 3.5 times their share of the population. Moreover, as former Department of Education attorney Hans Bader notes, the government's own statistics show that white boys were over two times as likely to be suspended as their peers of Asian descent. By the government's logic, this means, absurdly, that school districts must be discriminating against white students and in favor of Asians. As of this writing, Minneapolis education authorities have announced their intention to end the black/white gap in suspensions and expulsions, a plan that struck many observers as announcing the imposition of quotas on school discipline.
David E. Bernstein (Lawless: The Obama Administration's Unprecedented Assault on the Constitution and the Rule of Law)
Philosophers used to think that “testimony” (that is, what other people tell us) was at the bottom of that hierarchy; above were perception, memory, and inference, in descending order of reliability. Yet for reasons that Wittgenstein demonstrated, the ladder is rotten. Because of limitations of time and intellect, we perforce base most of our beliefs on testimony, such as the testimony of scientists concerning cosmological and microscopic phenomena. Many of these beliefs are more reliable than those based on perception, memory, or inference. This is true even though we judge the reliability of testimony largely on the basis of other testimony (I believe that my birth certificate has the date of my birth right in part because of what I have heard about governmental recording of vital statistics and in part because of what my parents told me)—that is, even though much of our knowledge is based on hearsay, much of it double or triple or even more remote hearsay.
Richard A. Posner (Law, Pragmatism, and Democracy)
There are waves in history, as we have seen, including some vast tsunamis. But the idea that those waves are like waves of light and sound is an illusion. In the 1920s, the Soviet economist Nikolai Kondratieff sought to show that there were such patterns in capitalism, inferring from British, French, and German economic statistics the existence of fifty-year cycles of expansion followed by depression.114 For this contribution, which continues to be influential with many investors today, Stalin had Kondratieff arrested, imprisoned, and later shot. Unfortunately, modern research dispels the idea of such regularity in economic life. The economic historian Paul Schmelzing’s meticulous reconstruction of interest rates back to the thirteenth century points instead to a long-run, “supra-secular” decline in nominal rates, driven mostly by the process of capital accumulation, punctuated periodically but randomly by inflationary episodes nearly always associated with wars.115 Yet
Niall Ferguson (Doom: The Politics of Catastrophe)
Against that background, the assumption that discriminatory bias can be automatically inferred when there are differences in socioeconomic outcomes—and that the source of that bias can be determined by where the statistics were collected—seems indefensible. Yet that seemingly invincible fallacy guides much of what is said and done in our educational institutions, in the media and in government policies.
Thomas Sowell (Discrimination and Disparities)
Think back for a moment to the mental imagery used to explain regression analysis in the last chapter. We divide our data sample into different “rooms” in which each observation is identical except for one variable, which then allows us to isolate the effect of that variable while controlling for other potential confounding factors. We may have 692 individuals in our sample who have used both cocaine and heroin. However, we may have only 3 individuals who have used cocaine but not heroin and 2 individuals who have used heroin and not cocaine. Any inference about the independent effect of just one drug or the other is going to be based on these tiny samples.
Charles Wheelan (Naked Statistics: Stripping the Dread from the Data)
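A simulation of Wheelan's point (the outcome variable, the extra "used neither" group, and the effect sizes are invented for the sketch): with 692 people in one cell and only 3 and 2 in the cells that separate the two drugs, the coefficients for the independent effects come back with standard errors so large that the estimates are nearly useless.

```python
import numpy as np

rng = np.random.default_rng(6)

# Cell counts echoing the example: 692 used both drugs, 3 cocaine only,
# 2 heroin only, plus a hypothetical 300 who used neither.
cells = [(1, 1, 692), (1, 0, 3), (0, 1, 2), (0, 0, 300)]
cocaine = np.concatenate([np.full(count, c) for c, h, count in cells])
heroin = np.concatenate([np.full(count, h) for c, h, count in cells])

# A made-up outcome with true independent effects of -5 (cocaine) and -3 (heroin).
y = 70 - 5 * cocaine - 3 * heroin + rng.normal(scale=10, size=cocaine.size)

# Ordinary least squares with standard errors computed by hand.
X = np.column_stack([np.ones(y.size), cocaine, heroin])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
sigma2 = resid @ resid / (y.size - X.shape[1])
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))

for name, b, s in zip(["intercept", "cocaine", "heroin"], beta, se):
    print(f"{name:9s} coef = {b:6.2f}   se = {s:5.2f}")
```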
Second, like most other statistical inference, regression analysis builds only a circumstantial case. An association between two variables is like a fingerprint at the scene of the crime. It points us in the right direction, but it’s rarely enough to convict. (And sometimes a fingerprint at the scene of a crime doesn’t belong to the perpetrator.) Any regression analysis needs a theoretical underpinning: Why are the explanatory variables in the equation? What phenomena from other disciplines can explain the observed results? For instance, why do we think that wearing purple shoes would boost performance on the math portion of the SAT or that eating popcorn can help prevent prostate cancer? The results need to be replicated, or at least consistent with other findings.
Charles Wheelan (Naked Statistics: Stripping the Dread from the Data)
Stochastic and Reactive Effects
Replication may be difficult to achieve if the phenomenon under study is inherently stochastic, that is, if it changes with time. Moreover, the phenomenon may react to the experimental situation, altering its characteristics because of the experiment. These are particularly sticky problems in the behavioral and social sciences, for it is virtually impossible to guarantee that an individual tested once will be exactly the same when tested later. In fact, when dealing with living organisms, we cannot realistically expect strict stability of behavior over time. Researchers have developed various experimental designs that attempt to counteract this problem of large fluctuations in behavior. Replication is equally problematic in medical research, for the effects of a drug as well as the symptoms of a disease change with time, confounding the observed course of the illness. Was the cure accelerated or held back by the introduction of the test drug? Often the answer can only be inferred based on what happens on average to a group of test patients compared to a group of control patients. Even attempts to keep experimenters and test participants completely blind to the experimental manipulations do not always address the stochastic and reactive elements of the phenomena under study. Besides the possibility that an effect may change over time, some phenomena may be inherently statistical; that is, they may exist only as probabilities or tendencies to occur.
Experimenter Effects
In a classic book entitled Pitfalls in Human Research, psychologist Theodore X. Barber discusses ten ways in which behavioral research can go wrong.11 These include such things as the “investigator paradigm effect,” in which the investigator’s conceptual framework biases the way an experiment is conducted and interpreted, and the “experimenter personal attributes effect,” where variables such as age, sex, and friendliness interact with the test participants’ responses. A third pitfall is the “experimenter unintentional expectancy effect”; that is, the experimenter’s prior expectations can influence the outcome of an experiment. Researchers’ expectations and prior beliefs affect how their experiments are conducted, how the data are interpreted, and how other investigators’ research is judged. This topic, discussed in chapter 14, is relevant to understanding the criticisms of psi experiments and how the evidence for psi phenomena has often been misinterpreted.
Dean Radin (The Conscious Universe: The Scientific Truth of Psychic Phenomena)
Once we get the regression results, we would calculate a t-statistic, which is the ratio of the observed coefficient to the standard error for that coefficient.* This t-statistic is then evaluated against whatever t-distribution is appropriate for the size of the data sample (since this is largely what determines the number of degrees of freedom). When the t-statistic is sufficiently large, meaning that our observed coefficient is far from what the null hypothesis would predict, we can reject the null hypothesis at some level of statistical significance. Again, this is the same basic process of statistical inference that we have been employing throughout the book. The fewer the degrees of freedom (and therefore the “fatter” the tails of the relevant t-distribution), the higher the t-statistic will have to be in order for us to reject the null hypothesis at some given level of significance. In the hypothetical regression example described above, if we had four degrees of freedom, we would need a t-statistic of at least 2.13 to reject the null hypothesis at the .05 level (in a one-tailed test). However, if we have 20,000 degrees of freedom (which essentially allows us to use the normal distribution), we would need only a t-statistic of 1.65 to reject the null hypothesis at the .05 level in the same one-tailed test.
Charles Wheelan (Naked Statistics: Stripping the Dread from the Data)
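The two critical values Wheelan cites are easy to verify with SciPy (a quick check, not code from the book):

```python
from scipy import stats

# One-tailed critical values at the .05 level for the two cases in the text.
print(round(stats.t.ppf(0.95, df=4), 3))        # 2.132 with 4 degrees of freedom
print(round(stats.t.ppf(0.95, df=20000), 3))    # 1.645 with 20,000 degrees of freedom
print(round(stats.norm.ppf(0.95), 3))           # 1.645 for the normal distribution
```

The fatter tails of the t-distribution at 4 degrees of freedom are what push the bar up from roughly 1.65 to 2.13.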
Key in on terms like generalization, statistical syllogism, simple induction, argument from analogy, causal relationship, Bayesian inference, inductive inference, algorithmic probability, Kolmogorov complexity.
Kim Stanley Robinson (Aurora)
Life gets a little trickier when we are doing our regression analysis (or other forms of statistical inference) with a small sample of data.
Charles Wheelan (Naked Statistics: Stripping the Dread from the Data)
Statistics cannot prove anything with certainty. Instead, the power of statistical inference derives from observing some pattern or outcome and then using probability to determine the most likely explanation for that outcome.
Charles Wheelan (Naked Statistics: Stripping the Dread from the Data)
On the other hand, a generous capital market is usually associated with the following:
- fear of missing out on profitable opportunities
- reduced risk aversion and skepticism (and, accordingly, reduced due diligence)
- too much money chasing too few deals
- willingness to buy securities in increased quantity
- willingness to buy securities of reduced quality
- high asset prices, low prospective returns, high risk and skimpy risk premiums
It’s clear from this list of elements that excessive generosity in the capital markets stems from a shortage of prudence and thus should give investors one of the clearest red flags. The wide-open capital market arises when the news is good, asset prices are rising, optimism is riding high, and all things seem possible. But it invariably brings the issuance of unsound and overpriced securities, and the incurrence of debt levels that ultimately will result in ruin. The point about the quality of new issue securities in a wide-open capital market deserves particular attention. A decrease in risk aversion and skepticism—and increased focus on making sure opportunities aren’t missed rather than on avoiding losses—makes investors open to a greater quantity of issuance. The same factors make investors willing to buy issues of lower quality. When the credit cycle is in its expansion phase, the statistics on new issuance make clear that investors are buying new issues in greater amounts. But the acceptance of securities of lower quality is a bit more subtle. While there are credit ratings and covenants to look at, it can take effort and inference to understand the significance of these things. In feeding frenzies caused by excess availability of funds, recognizing and resisting this trend seems to be beyond the ability of the majority of market participants. This is one of the many reasons why the aftermath of an overly generous capital market includes losses, economic contraction, and a subsequent unwillingness to lend. The bottom line of all of the above is that generous credit markets usually are associated with elevated asset prices and subsequent losses, while credit crunches produce bargain-basement prices and great profit opportunities. (“Open and Shut”)
Howard Marks (Mastering The Market Cycle: Getting the Odds on Your Side)
Since so many research conclusions depend on essentially mathematical ideas-the principles of statistical and probabilistic inference-and since even the best-trained physicians tend to have only a modest mathematical education, physicians end up taking many of these conclusions on faith.
Norman Levitt (Prometheus Bedeviled: Science and the Contradictions of Contemporary Culture)
this book is part of what could be called a new wave in statistics teaching, in which formal probability theory as a basis for statistical inference does not come in till much later
David Spiegelhalter (The Art of Statistics: How to Learn from Data)
I heard reiteration of the following claim: Complex theories do not work; simple algorithms do. One of the goals of this book is to show that, at least in the problems of statistical inference, this is not true. I would like to demonstrate that in the area of science a good old principle is valid: Nothing is more practical than a good theory.
Vladimir N Vapnik
Pearl combines aspects of structural equations models and path diagrams. In this approach, assumptions underlying causal statements are coded as missing links in the path diagrams. Mathematical methods are then used to infer, from these path diagrams, which causal effects can be inferred from the data, and which cannot. Pearl's work is interesting, and many researchers find his arguments that path diagrams are a natural and convenient way to express assumptions about causal structures appealing. In our own work, perhaps influenced by the type of examples arising in social and medical sciences, we have not found this approach to aid drawing of causal inferences, and we do not discuss it further in this text.
Guido Imbens (Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction)
The influence of the biased selection is not on the believability of H but rather on the capability of the test to have unearthed errors. The error probing capability of the testing procedure is being diminished. If you engage in cherry picking, you are not “sincerely trying,” as Popper puts it, to find flaws with claims, but instead you are finding evidence in favor of a well-fitting hypothesis that you deliberately construct – barred only if your intuitions say it’s unbelievable. The job that was supposed to be accomplished by an account of statistics now has to be performed by you. Yet you are the one most likely to follow your preconceived opinions, biases, and pet theories.
Deborah G Mayo (Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars)
If you report results selectively, it becomes easy to prejudge hypotheses: yes, the data may accord amazingly well with a hypothesis H, but such a method is practically guaranteed to issue so good a fit even if H is false and not warranted by the evidence. If it is predetermined that a way will be found to either obtain or interpret data as evidence for H, then data are not being taken seriously in appraising H. H is essentially immune to having its flaws uncovered by the data. H might be said to have “passed” the test, but it is a test that lacks stringency or severity. Everyone understands that this is bad evidence, or no test at all. I call this the severity requirement.
Deborah G Mayo (Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars)
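Mayo's severity requirement can be dramatized with a short simulation (the candidate counts and sample size are arbitrary choices): when none of 50 candidate predictors truly matters, a procedure that reports only the best-fitting one still "passes" at p < .05 most of the time, so the good fit tells us almost nothing about H.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

n_sims, n_candidates, n_obs = 2000, 50, 40
best_p = np.empty(n_sims)

for i in range(n_sims):
    y = rng.normal(size=n_obs)
    X = rng.normal(size=(n_obs, n_candidates))      # no candidate truly matters
    # Cherry picking: test every candidate, report only the best-fitting one.
    pvals = [stats.pearsonr(X[:, j], y)[1] for j in range(n_candidates)]
    best_p[i] = min(pvals)

print(f"share of runs where the 'best' fit passes at p < .05: {(best_p < 0.05).mean():.2f}")
```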
Recognition of the danger of drawing false inferences from incomplete, though correct information has led scientists to a preference for designed experimentation above mere observation of natural phenomena.
John Mandel (The Statistical Analysis of Experimental Data (Dover Books on Mathematics))
Selection Bias: basically, that your inferences will be biased if you use a non-random sample and pretend that it’s random.
Uri Bram (Thinking Statistically)
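A quick simulation of Bram's warning (the income distribution and response mechanism are invented for the sketch): treat a sample in which high values are more likely to show up as if it were random, and the estimate of the mean is badly biased.

```python
import numpy as np

rng = np.random.default_rng(8)

# A skewed population of incomes (log-normal, long right tail).
population = rng.lognormal(mean=10, sigma=0.75, size=1_000_000)

# A genuinely random sample estimates the population mean well.
random_sample = rng.choice(population, size=1_000)

# A non-random sample: the higher the income, the more likely to respond.
respond = rng.random(population.size) < population / population.max()
biased_sample = rng.choice(population[respond], size=1_000)

print("population mean:   ", round(population.mean()))
print("random-sample mean:", round(random_sample.mean()))
print("biased-sample mean:", round(biased_sample.mean()))
```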