An lmer question


Hi folks,
Repost/summary of previous correspondence.  Thanks to Ben Bolker, Spencer Graves and Ken Beath for questions and comments.
My original question and compilation of other emails:

Are there general situations in which we might expect very different
answers from F tests vs. mcmcpvalue with orthogonal contrasts (Helmert)? I have doubts about the MCMC results with the most recent version of lme4 (versions ....-20 on CRAN, and -21 on R-Forge).

I recently tried analyzing a moderately sized (n=~500, no. of subjects = ~200) unbalanced data set where lme and lmer gave similar F ratios.  lmer simulations gave the same F-tests (via simulation) as lme, but the MCMC output (using either Helmert or sum contrasts) depended wildly on terms that were retained in the model. I have simulated the data with the full model and that is available on line.

If you are interested, please run the code below.


(Output appended below.)

I get similar F-test results whether I use lm (and ignore the random
effect of subject), lme, and lmer with an DDF approximation and via simulation (lmer::simulate, of the null hypothesis indicate that F- stats as large (or larger) than my observed F-stats are VERY unlikely,
under the null hypothesis - thanks Ben!).

When I use mcmcpvalue on different fixed effects structures, I get huge changes in empirical P values of
(0.6 to 0.01). In contrast, the F-tests are much more consistent when I change the fixed effect structure.

Is mcmcpvalue much more sensitive to overfitting the model? In
some cases, removing the interactions results in a lower AIC (with ML
fits). The MCMC sample traces look (in my limited experience) withoutpeculiarities, and the densityplots are all quite symmetrical and normal-ish.


In the full model, we have 28 fixed coefs (22 continuous variables or
slope interactions) and about 500 obs of about 200 subjects (2-3 observations per subject). The data are VERY unbalanced. The QQ plots look normal, but highlight the lack of balance (from one to dozens of reps per treatment combo).

OUTPUT:OUTPUT:OUTPUT:OUTPUT:OUTPUT:OUTPUT:OUTPUT:OUTPUT:OUTPUT:OUTPUT:

The following tables are output of factors/covariates from a function I wrote to use with lmer. "P.sum" is an F-test using one hack for denominator degrees of freedom (these agree qualitatively with lme). "P(H|D)" is the empirical P-value generated from Bates' function mcmcpvalue, with my hack for the new structure of the latest mcmcsamp output (NOTE that I checked that these P-values also agree with direct HPDinterval and quantile assessments of individual coefficients).

continuous predictors: ageinj, zses, tsi, tsq, zfad (NOTE: tsq = tsi^2)
categorical predictors: sex, race, grp, id (id=random, subject).

FULL MODEL
> aov.Bayes(modcA, n=1000)$aov.tab
                     Df  Sum.Sq Mean.Sq F.value  P.sum P(H|D)
ageinj                1  127.00  127.00  2.6559 0.1038  0.096
sex                   1  233.32  233.32  4.8793 0.0277  0.008
race                  1  216.49  216.49  4.5273 0.0339  0.372
zses                  1  249.03  249.03  5.2078 0.0229  0.010
tsi                   1  138.05  138.05  2.8870 0.0900  0.005
tsq                   1  393.21  393.21  8.2230 0.0043  0.008
grp                   3  324.45  108.15  2.2617 0.0805  0.648
zfad                  1  988.80  988.80 20.6782 0.0000  0.051
tsi:grp               3  460.46  153.49  3.2098 0.0229  0.364
tsq:grp               3  229.16   76.39  1.5974 0.1892  0.541
grp:zfad              3   89.48   29.83  0.6238 0.5999  0.555
tsi:zfad              1   84.01   84.01  1.7568 0.1857  0.963
tsq:zfad              1   78.01   78.01  1.6314 0.2021  0.633
tsi:grp:zfad          3 1026.88  342.29  7.1582 0.0001  0.812
tsq:grp:zfad          3  123.36   41.12  0.8599 0.4618  0.884
Residuals (Tr(hat)) 469      NA   47.82      NA     NA     NA


FULL MODEL minus one three way interaction changes P(H|D) for the other 3-way.
> aov.Bayes(modcB, n=1000)$aov.tab
                     Df  Sum.Sq Mean.Sq F.value  P.sum P(H|D)
ageinj                1  127.85  127.85   2.670 0.1029  0.094
sex                   1  234.75  234.75   4.902 0.0273  0.008
race                  1  217.92  217.92   4.551 0.0334  0.357
zses                  1  250.62  250.62   5.234 0.0226  0.017
tsi                   1  138.01  138.01   2.882 0.0902  0.007
tsq                   1  393.04  393.04   8.208 0.0044  0.010
grp                   3  326.62  108.87   2.274 0.0793  0.646
zfad                  1  994.61  994.61  20.771 0.0000  0.024
tsi:grp               3  460.45  153.48   3.205 0.0230  0.334
tsq:grp               3  229.34   76.45   1.596 0.1894  0.453
grp:zfad              3   90.07   30.02   0.627 0.5978  0.292
tsi:zfad              1   83.88   83.88   1.752 0.1863  0.777
tsq:zfad              1   77.92   77.92   1.627 0.2027  0.358
tsi:grp:zfad          3 1026.92  342.31   7.148 0.0001  0.015
Residuals (Tr(hat)) 472      NA   47.89      NA     NA     NA
>

NO INTERACTIONS
> aov.Bayes(modcD, n=1000)$aov.tab
                     Df Sum.Sq Mean.Sq F.value  P.sum P(H|D)
ageinj                1  96.66   96.66   1.847 0.1748  0.061
sex                   1 269.07  269.07   5.141 0.0238  0.005
race                  1 214.66  214.66   4.102 0.0434  0.064
zses                  1 257.90  257.90   4.928 0.0269  0.030
tsi                   1 137.91  137.91   2.635 0.1052  0.044
tsq                   1 415.80  415.80   7.945 0.0050  0.069
grp                   3 373.93  124.64   2.382 0.0688  0.001
Residuals (Tr(hat)) 490     NA   52.34      NA     NA     NA

> sessionInfo()
R version 2.7.1 (2008-06-23)
i386-apple-darwin8.10.1

locale:
C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods 
[7] base    

other attached packages:
[1] foreign_0.8-26     lme4_0.999375-21   Matrix_0.999375-10
[4] lattice_0.17-8     Hmisc_3.4-3      

loaded via a namespace (and not attached):
[1] cluster_1.11.11 grid_2.7.1    
>



In the full model, we have 28 fixed coefs (22 continuous variables or
slope interactions) and about 500 obs.

The data are VERY unbalanced.