General Linear Model
4/16/02

Announcements

Background

The General Linear Model

There are three reasons for covering this material.
• This material provides an introduction to the use of dummy variables.
   o These variables are very useful whenever you have a categorical variable, and are actually more useful in standard multiple regression.
• This material emphasizes the importance of models.
   o It causes us to think about how we want to go about testing models, and the alternative ways that we can look at problems.
• It makes it much easier to talk about the analysis of covariance and related techniques, and to talk about unequal sample sizes and how we want to test them.

This topic starts out as a more difficult way of doing what they already know how to do. But it then goes on to present other stuff in a much simpler way than it could be presented in any other way.

I have tried to remove much of the stuff that doesn't focus on the three reasons that I gave above. I want students to understand the general concepts, and be able to see that they could be applicable in other settings. I am not trying to show people a harder way to run an analysis of variance.

The approach taken here is basically the approach that any statistical package takes, which may help explain some of the subtleties of those packages.

A major subtheme is to show that the analysis of variance, the analysis of covariance, multiple regression, and a whole bunch of other things are just variations on a common theme.

I want students to understand the basic idea of coding (dummy) variables, but the specifics are not important.

Example

I am going to use one of the smoking examples from Spilich that we have seen in other contexts. The data
file (Spilich.sav) contains data on all groups, but we are only going to look at the group that was given a standard recall task, a cognitive task.

Three basic groups.
• Nonsmokers (people who never smoked)
• Delayed smokers (smokers who had not had a cigarette for several hours)
• Active smokers (smokers who smoked during the task)

The dependent variable was the number of errors made during the recall task.

Standard Analysis of Variance:

Descriptive Statistics

Grand mean = 38.778

Plot

Plot with error bars (bars represent 95% CI)

Anova

It is clear that there are significant differences between groups. I will even go ahead and compare the Non-smokers with the combined smoking groups, and then the two smoking groups with each other. This is for comparison purposes later.

I did this with the one-way procedure and standard contrasts.

Here we can see that Non-smokers differ from smokers, but that the two smoking groups do not differ from each other.

The GLM approach

• First we need to code the data to indicate Groups. We already have Groups as 1, 2, and 3, but we are going to do it differently.
   o We have to do it differently because our coding is completely arbitrary. We
     could have coded them as 2, 1, and 3. Any regression against group membership would be entirely dependent on the order in which we coded, and that's a bad thing.
• We will set up dummy variables that tell us whether a subject is in Group 1 or not, and whether he/she is in Group 2 or not.
   o I have called these new variables NonSmoke and Delayed, because they identify those who are in those two groups.
• We don't need to code for Group 3, because if you're not in 1 or 2, you must be in 3.
• The filter variable below just selected the Cognitive task, and ignored the other two tasks.

Task    Group   Errors   distract   filter   NonSmoke   Delayed
2.00    1.00    27.00    126.00     1         1.00        .00
2.00    1.00    34.00    154.00     1         1.00        .00
2.00    1.00    19.00    113.00     1         1.00        .00
omitted
2.00    2.00    48.00    113.00     1          .00       1.00
2.00    2.00    29.00    100.00     1          .00       1.00
2.00    2.00    34.00    114.00     1          .00       1.00
omitted
2.00    3.00    34.00    108.00     1        -1.00      -1.00
2.00    3.00    65.00    191.00     1        -1.00      -1.00
2.00    3.00    55.00    112.00     1        -1.00      -1.00
omitted

(Explain why I used -1 for each dummy variable for people in the last group.)

This makes the intercept come out to be the grand mean, and expresses the results in distance from the grand mean, rather than distance from the mean of some arbitrary group.

This idea is important, because if we aren't careful it is easy to get answers that tell us about deviations from some single group, and that usually isn't what we are after.

Here we come to the first important idea. I have taken a categorical variable with 3 (k) levels and turned it into 2 (k-1) new variables. These two variables carry all the information that the single variable did, and are more useful.

Regression Approach using Dummy Variables

I will now simply predict Errors using NonSmoke and Delayed as my predictor variables. This is a standard multiple regression.

Look first at the Anova test for the regression

F = 4.744, p = .014

This is exactly the same result we got when we ran the traditional Anova.

Explain why this should be.

Look next at the R² value = .184. This is nothing but eta-squared.

Go to the table headed Coefficients

The following (up to the next major heading) is material that
I find important and helpful, but if it adds to information overload, set it aside for now.

Note that the Intercept = 38.778. This is exactly equal to the grand mean of all the groups.
   Intercept equals grand mean.

Note that the slope for NonSmoke = -9.911. This is exactly equal to the difference between the NonSmoke mean and the grand mean.

Note that the slope for Delayed = 1.157. This is the difference between the Delayed mean and the grand mean.
   Slope equals difference between corresponding group mean and grand mean.

Why not have a slope for Active?

It would be redundant. If we know the grand mean and the deviations of the other two groups, we can compute the deviation of the 3rd group. The sum of the deviations from the mean = 0. So, the deviation of the third group is 0 - (-9.911) - 1.156 = 8.755

If I had coded for Active and Delayed, and left out NonSmoke, I would get an intercept of 38.778, slope for Active = 8.755, slope for Delayed = 1.157, and could compute slope for NonSmoke = -9.911. This illustrates that the choice is arbitrary and unimportant.

Testing Contrasts

I forgot to do one additional thing, so I went back and did it. I asked SPSS to compute deviation contrasts when it ran the Anova.

Deviation contrasts are comparisons of each mean with the grand mean. (Again, it doesn't do all three; it leaves out one, which in this case was the last one.)

Output below:

Note that the tests and the probabilities are exactly the same as the tests (and probabilities) on the
regression equation.

Why should this be?

What is all of this about?

I want to show that Anova and Regression are basically the same procedure. The only difference here between this regression and standard multiple regression is the use of dummy variables.

There are a lot of important things here, but their importance doesn't show up until we move to more complex analyses.

GLM and Factorial Anova

Now things get interesting.

First, we will take the same example, but with all three tasks, and create dummy variables for the different tasks as well. (Again, we create dummy variables for only two of the tasks.)

Then we create interaction dummy variables by multiplying our dummies together to create 4 new variables.
   Nonsmoke*Patrec, Nonsmoke*Cognit, Delayed*Patrec, and Delayed*Cognit

The overall Factorial Anova follows:

Regression approach

We will start with the complete multiple regression using all dummy
variables as predictors. Here we are trying to explain variance in errors as a function of everything we know about groups, tasks, and their interactions.

Regression

Comment on SSregression as being equivalent to Model in regular Anova (Explain why 8 df.) 10-2=8

Comment on the error term.

This error term is all of the variance in errors that cannot be explained on the basis of groups, tasks, or their interactions. This is the standard error term in the factorial analysis of variance.

Now students should understand why SPSS presents the Anova summary table the way it does, even if that is a confusing way to have chosen to present it.

Removing the Interaction Terms gives:

The difference in the SSregression is 31744.726 - 29016.074 = 2728.652. This is the SS for the interaction term in the Anova.

Removing the Task Terms (after replacing interaction) gives:

If we subtract this SSregression from the
SSregression in the full model, we get
   31744.726 - 3083.200 = 28661.526
This is the effect of Task.

Lastly, look at the model with dummy variables for Task and Interaction, but no dummy variables for Condition.

Here the difference between the full and reduced models is
   31744.726 - 31390.178 = 354.548
This is the effect of Condition.

Notice that each of these is basically what we called a hierarchical model earlier. The difference between the full model and a reduced model is what the extra variable(s) explain over and above (controlling for) the other variables.

From here I get the following models:

Model                 SSreg        Difference   SSerror      Effect
Full                  31744.726                 13587.200    Error
Main effects          29016.074     2728.652                 Interaction
Task + Interaction    31390.178      354.548                 Condition
Cond + Interaction     3083.200    28661.526                 Task

But we aren't done.

Yes we are for class. I have left the rest
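
Looking back at the single-task example, the coding with 1, 0, and -1 and the regression on NonSmoke and Delayed can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scores are made up (they are not the Spilich data), the groups have equal n, and `solve` and `ols` are hypothetical helper names, not anything SPSS provides.

```python
from statistics import mean

# Hypothetical error scores, NOT the Spilich data; equal n per group (assumption).
groups = {
    "NonSmoke": [27.0, 34.0, 19.0, 20.0],
    "Delayed":  [48.0, 29.0, 34.0, 41.0],
    "Active":   [34.0, 65.0, 55.0, 46.0],
}

# k = 3 groups become k - 1 = 2 coded columns; the last group gets -1 on both.
codes = {"NonSmoke": (1.0, 0.0), "Delayed": (0.0, 1.0), "Active": (-1.0, -1.0)}

X, y = [], []
for name, scores in groups.items():
    for s in scores:
        X.append([1.0, *codes[name]])  # leading 1.0 is the intercept column
        y.append(s)

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [v - f * w for v, w in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(X, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

intercept, b_nonsmoke, b_delayed = ols(X, y)
grand_mean = mean(y)
b_active = 0.0 - b_nonsmoke - b_delayed  # deviations from the grand mean sum to zero

# R-squared from the regression versus eta-squared from the group means.
pred = [intercept + b_nonsmoke * r[1] + b_delayed * r[2] for r in X]
ss_total = sum((v - grand_mean) ** 2 for v in y)
r_squared = sum((p - grand_mean) ** 2 for p in pred) / ss_total
eta_squared = sum(len(s) * (mean(s) - grand_mean) ** 2
                  for s in groups.values()) / ss_total

print("intercept vs grand mean:", intercept, grand_mean)
print("NonSmoke slope vs deviation:", b_nonsmoke, mean(groups["NonSmoke"]) - grand_mean)
print("Delayed slope vs deviation:", b_delayed, mean(groups["Delayed"]) - grand_mean)
print("implied Active deviation:", b_active, mean(groups["Active"]) - grand_mean)
print("R-squared vs eta-squared:", r_squared, eta_squared)
```

With equal group sizes, each printed pair should agree: the intercept with the grand mean, each slope with its group's deviation from the grand mean, and R² with eta-squared.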


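The full-versus-reduced comparisons in the factorial section follow the same recipe, and a rough sketch may help. Again the scores are invented (2 per cell, not the Spilich data), the third task label "Other" is a placeholder, and `solve` and `ss_regression` are hypothetical helpers. Interaction columns are built by multiplying the group and task codes, and each effect is the SSregression of the full model minus that of the model with the effect's columns removed.

```python
from statistics import mean

# Hypothetical scores, 2 per cell, NOT the Spilich data; "Other" is a placeholder label.
scores = {
    "NonSmoke": {"Patrec": [9.0, 8.0],  "Cognit": [27.0, 34.0], "Other": [3.0, 2.0]},
    "Delayed":  {"Patrec": [12.0, 7.0], "Cognit": [48.0, 29.0], "Other": [7.0, 0.0]},
    "Active":   {"Patrec": [8.0, 10.0], "Cognit": [34.0, 65.0], "Other": [15.0, 2.0]},
}
effect = [(1.0, 0.0), (0.0, 1.0), (-1.0, -1.0)]  # codes for levels 1, 2, 3

rows, y = [], []
for gi, by_task in enumerate(scores.values()):
    for ti, cell in enumerate(by_task.values()):
        g1, g2 = effect[gi]
        t1, t2 = effect[ti]
        for s in cell:
            rows.append({
                "cond": [g1, g2],
                "task": [t1, t2],
                # 4 interaction columns: products of group and task codes
                "inter": [g1 * t1, g1 * t2, g2 * t1, g2 * t2],
            })
            y.append(s)

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [v - f * w for v, w in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ss_regression(terms):
    """SSregression for a model with an intercept plus the named sets of columns."""
    X = [[1.0] + [v for t in terms for v in r[t]] for r in rows]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(X, y)) for i in range(k)]
    b = solve(xtx, xty)
    ybar = mean(y)
    return sum((sum(c * x for c, x in zip(b, r)) - ybar) ** 2 for r in X)

ss_full = ss_regression(["cond", "task", "inter"])            # 8 df model
ss_interaction = ss_full - ss_regression(["cond", "task"])    # drop interaction
ss_task = ss_full - ss_regression(["cond", "inter"])          # drop task
ss_condition = ss_full - ss_regression(["task", "inter"])     # drop condition
ss_error = sum((v - mean(y)) ** 2 for v in y) - ss_full

print("full:", ss_full, "error:", ss_error)
print("interaction:", ss_interaction, "task:", ss_task, "condition:", ss_condition)
```

Because this made-up design is balanced, the three differences add up to the full-model SSregression; with unequal cell sizes that would not generally hold, which is part of why the choice of comparison matters.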