Thus, multiple regression requires two important tasks: (1) specification of the independent variables and (2) testing of the error term. An important difference between simple and multiple regression lies in the interpretation of the regression coefficients (b1, b2, b3, …) in the preceding multiple regression model. Although multiple regression produces the same basic statistics discussed in Chapter 14 (see Table 14.1), each regression coefficient is interpreted as the effect of its variable on the dependent variable, controlled for the effects of all of the other independent variables included in the regression. This phrase is used frequently when explaining multiple regression results. In our example, the regression coefficient b1 shows the effect of x1 on y, controlled for all other variables included in the model. Regression coefficient b2 shows the effect of x2 on y, also controlled for all other variables in the model, including x1. Multiple regression is indeed an important and relatively simple way of taking control variables into account (and much easier than the approach shown in Appendix 10.1).

Key Point: The regression coefficient is the effect on the dependent variable, controlled for all other independent variables in the model.

Note also that the model given here is very different from estimating separate simple regression models for each of the independent variables. The regression coefficients in simple regression do not control for other independent variables, because those variables are not in the model.

The word independent also means that each independent variable should be relatively unaffected by the other independent variables in the model. To ensure that independent variables are indeed independent, it is useful to think of the distinctly different types (or categories) of factors that affect a dependent variable. This was the approach taken in the preceding example.
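The idea that a multiple regression coefficient is "controlled for" the other variables can be made concrete with a small numerical sketch. The data below are hypothetical (invented for illustration, not from the Productivity dataset): y is built exactly as y = 1 + 2·x1 + 3·x2, and x1 and x2 are correlated, so a simple regression of y on x1 alone absorbs part of x2's effect, while the multiple regression recovers each variable's separate effect.

```python
def centered_sum(u, v):
    """Sum of cross-products around the means: S_uv."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))

# Hypothetical data: x2 is correlated with x1 but not identical to it
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y  = [1 + 2*a + 3*b for a, b in zip(x1, x2)]   # true effects: 2 and 3

S11, S22 = centered_sum(x1, x1), centered_sum(x2, x2)
S12 = centered_sum(x1, x2)
S1y, S2y = centered_sum(x1, y), centered_sum(x2, y)

# Two-predictor least-squares coefficients (closed form from the
# normal equations)
denom = S11 * S22 - S12 ** 2
b1 = (S1y * S22 - S2y * S12) / denom   # effect of x1, controlled for x2
b2 = (S2y * S11 - S1y * S12) / denom   # effect of x2, controlled for x1

# Simple regression slope of y on x1 ignores x2 entirely
slope_simple = S1y / S11

print(b1, b2)          # 2.0 3.0 -- the separate (controlled) effects
print(slope_simple)    # about 4.49 -- x2's effect leaks into x1's slope
```

Because the two predictors are correlated, the simple-regression slope (about 4.49) badly overstates the effect of x1; the multiple regression, by controlling for x2, returns the true coefficients 2 and 3.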
There is also a statistical reason for ensuring that independent variables are as independent as possible. When two independent variables are highly correlated with each other (r2 > .60), it sometimes becomes statistically impossible to distinguish the effect of each independent variable on the dependent variable, controlled for the other. The variables are statistically too similar to discern their separate effects. This problem is called multicollinearity and is discussed later in this chapter. It is avoided by choosing independent variables that are not highly correlated with each other.

A WORKING EXAMPLE

Previously (see Chapter 14), the management analyst with the Department of Defense found a statistically significant relationship between teamwork and perceived facility productivity (p < .01). The analyst now wishes to examine whether the impact of teamwork on productivity is robust when controlled for other factors that also affect productivity. This interest is heightened by the low R-square (R2 = 0.074) in Table 14.1, which suggests a weak relationship between teamwork and perceived productivity.

A multiple regression model is specified to include the effects of other factors that affect perceived productivity. Thinking about other categories of variables that could affect productivity, the analyst hypothesizes that productivity is also affected by (1) the extent to which employees have adequate technical knowledge to do their jobs, (2) perceptions of having adequate authority to do one's job well (for example, decision-making flexibility), (3) perceptions that rewards and recognition are distributed fairly (always important for motivation), and (4) the number of sick days taken. Various items from the employee survey are used to measure these concepts (as discussed in the workbook documentation for the Productivity dataset). After including these factors as additional independent variables, the result shown in Table 15.1 is obtained.
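Before estimating a model like the one above, the chapter's rule of thumb (r2 > .60 signals trouble) can be applied by screening every pair of candidate independent variables. The sketch below uses invented values and stand-in variable names (they are not the actual Productivity survey measures); "knowledge" is deliberately constructed to track "teamwork" almost exactly, so the screen flags that pair.

```python
def r_squared(u, v):
    """Squared Pearson correlation (r^2) between two variables."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv * suv / (suu * svv)

# Hypothetical candidate independent variables (made-up illustrative data)
candidates = {
    "teamwork":  [1, 2, 3, 4, 5],
    "authority": [2, 4, 3, 5, 4],
    "knowledge": [2, 4, 6, 8, 11],   # near-duplicate of teamwork
}

# Flag every pair whose r^2 exceeds the chapter's .60 rule of thumb
names = list(candidates)
flagged = [(a, b)
           for i, a in enumerate(names)
           for b in names[i + 1:]
           if r_squared(candidates[a], candidates[b]) > 0.60]

print(flagged)   # the teamwork-knowledge pair is too similar to include both
```

When a pair is flagged, the usual remedies are to drop one of the two variables or to combine them into a single measure, since the regression cannot reliably separate their effects.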
Evan M. Berman (Essential Statistics for Public Managers and Policy Analysts)