mecompare
Stata package for comparing marginal effects across models
mecompare (Marginal Effects Compar(e)ison) is a companion package to Mize, Doan, and Long's 2019 Sociological Methodology article "A General Framework for Comparing Predictions and Marginal Effects Across Models."
mecompare automates the calculation of cross-model comparisons of marginal effects. This includes simultaneously estimating the models using SUEST (via gsem) and calculating the marginal effects in both models as well as the cross-model differences.
Download mecompare
To download the mecompare command in Stata:
net install mecompare, from("https://tdmize.github.io/data/mecompare") replace
Help file
To read the help file (also available here):
help mecompare
Additional commands
The mecompare package also includes the melincom command, which can be used after mecompare for custom comparisons of the marginal effects it produces. See details in the help file (also available here):
help melincom
Citation
If you use the mecompare command, please cite the corresponding article:
Mize, Trenton D., Long Doan, and J. Scott Long. 2019. "A General Framework for Comparing Predictions and Marginal Effects Across Models." Sociological Methodology 49(1): 152-189.
Basic syntax
First, fit two individual models and store the estimates using est store. E.g.,
logit dv iv1 iv2, vce(robust)
est store basemod
logit dv iv1 iv2 med1 med2, vce(robust)
est store medmod
Note that mecompare uses the robust variance estimator. Thus, to match its results we strongly recommend you include vce(robust) as an option on the individual models.
Then run mecompare, listing the independent variables whose MEs you want calculated and compared across models; specify binary and nominal variables with factor-variable syntax using the i. prefix. List the two stored models to compare in the models( ) option.
mecompare iv1, models(basemod medmod)
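As a concrete version of the template above, here is a minimal sketch using Stata's built-in auto dataset. The dataset and the roles given to its variables (foreign as the outcome, mpg as the focal predictor, weight as the added covariate) are stand-ins chosen purely for illustration, and the sketch assumes mecompare is already installed.

```stata
* Hypothetical stand-ins: foreign = dv, mpg = iv1, weight = added covariate
sysuse auto, clear

* Base model, stored for later comparison
logit foreign mpg, vce(robust)
est store basemod

* Same model plus one additional covariate
logit foreign mpg weight, vce(robust)
est store medmod

* Compare the marginal effect of mpg across the two stored models
mecompare mpg, models(basemod medmod)
```

Both models use the same 74 observations because none of the three variables has missing values in this dataset.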
Examples
This section shows how mecompare can be used for a variety of cross-model comparisons. These examples are replications of those in Mize, Doan, and Long 2019 plus some additional applications.
All of the code below is also available in this Stata do-file.
Ex 6.1 - Marginal effects to summarize curvilinear relationships and test mediation
This example uses a linear regression model with a nonlinear effect due to a polynomial (income and income^2 are in the model). Here, we test whether adding a mediator (jobsat) reduces the average effect of income, which would suggest mediation.
This example also shows the amount( ) option, which requests a custom amount of change for continuous independent variables.
use "https://tdmize.github.io/data/data/ah4_cme", clear
drop if missing(depsympB, income, inc10, age, woman, race, college, jobsat)
qui reg depsympB c.income##c.income c.age i.woman i.race, vce(robust)
est store basemod
qui reg depsympB c.income##c.income c.age i.woman i.race i.jobsat, vce(robust)
est store medmod
mecompare income, models(basemod medmod) amount(sd)
Model 1 (basemod) is:
regress depsympB c.income##c.income c.age i.woman i.race, vce(robust)
Model 2 (medmod) is:
regress depsympB c.income##c.income c.age i.woman i.race i.jobsat, vce(robust)
Marginal effects and cross-model differences (N_basemod=4307) (N_medmod=4307)
                                 |   ME #   Estimate   Robust SE    P>|z|
---------------------------------+---------------------------------------
income + SD                      |
    basemod                      |      1     -0.816       0.076    0.000
    medmod                       |      2     -0.648       0.074    0.000
    Difference                   |      3     -0.168       0.022    0.000
Example interpretations for effect of income:
In model 1, the ME is -0.816, indicating that a standard deviation increase in income is associated with about 0.8 fewer depressive symptoms. The ME of income is reduced to -0.648 in model 2 once job satisfaction is accounted for. A test of the difference in MEs across models shows that accounting for job satisfaction decreases the average effect of income by 0.168, a significant reduction in effect size (p < .001), suggesting mediation.
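The two separately fitted models are not enough to reproduce the Difference row, because the standard error of a difference depends on the covariance between the two marginal effects. The sketch below illustrates this using the rounded values from the table. The scalar names are hypothetical, and the implied correlation is back-calculated here for illustration; it is not something mecompare reports.

```stata
* Rounded values copied from the mecompare table for Ex 6.1
scalar me1     = -0.816   // basemod ME
scalar se1     =  0.076
scalar me2     = -0.648   // medmod ME
scalar se2     =  0.074
scalar se_diff =  0.022   // SE of the difference as reported by mecompare

scalar diff = me1 - me2

* SE the difference would have if the two MEs were (wrongly) treated as independent
scalar se_naive = sqrt(se1^2 + se2^2)

* Correlation implied by Var(diff) = se1^2 + se2^2 - 2*rho*se1*se2
scalar rho = (se1^2 + se2^2 - se_diff^2) / (2*se1*se2)

display "difference  = " %6.3f scalar(diff)
display "naive SE    = " %6.3f scalar(se_naive)
display "implied rho = " %6.3f scalar(rho)
```

With these rounded inputs the naive SE is about 0.106, nearly five times the reported 0.022, and the implied correlation between the two MEs is about 0.96. Ignoring that covariance would make the test of the difference far too conservative.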
Ex 6.2 - Comparing marginal effects across nested logit models
This example uses a binary logit model and examines how the effect of college changes as variables are added to the model.
use "https://tdmize.github.io/data/data/gss_cme", clear
drop if year < 2000
drop if employed != 1
drop if missing(vhappy, college, wages, occprest, age, married, parent, woman, conserv, reltrad)
qui logit vhappy i.college, vce(robust)
est store mod1
qui logit vhappy i.college i.married i.parent i.woman i.conserv ///
i.reltrad i.year c.age##c.age, vce(robust)
est store mod2
mecompare i.college, models(mod1 mod2)
Model 1 (mod1) is:
logit vhappy i.college, vce(robust)
Model 2 (mod2) is:
logit vhappy i.college i.married i.parent i.woman i.conserv i.reltrad i.year c.age##c.age, vce(robust)
Marginal effects and cross-model differences (N_mod1=9216) (N_mod2=9216)
                                 |   ME #   Estimate   Robust SE    P>|z|
---------------------------------+---------------------------------------
college                          |
  College - No Col D             |
    mod1                         |      1      0.072       0.010    0.000
    mod2                         |      2      0.060       0.011    0.000
    Difference                   |      3      0.012       0.004    0.003
Example interpretation of effect of college:
The AME indicates that, on average, the probability of being very happy is 0.072 higher for individuals with a college degree (p < .001). Model 2 adds demographic control variables, reducing the AME for college to 0.060, which remains significant. A direct test of the difference in the AME from model 1 to model 2 shows that adding the controls significantly decreases the effect of college by 0.012 (cross-model difference p < .01).
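The p-value in the Difference row can be checked by hand. The sketch below assumes it comes from a two-sided z test of the estimate against its standard error, as the P>|z| column header suggests; the scalar names are hypothetical and the inputs are the rounded values from the table.

```stata
* Rounded values copied from the Difference row for Ex 6.2
scalar diff    = 0.012
scalar se_diff = 0.004

scalar z = scalar(diff) / scalar(se_diff)
scalar p = 2 * normal(-abs(scalar(z)))   // two-sided p-value, standard normal

display "z = " %5.2f scalar(z) "   p = " %5.3f scalar(p)
```

The result is z = 3.00 and p = 0.003, matching the table to three decimals.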
Ex 6.3 - Comparing marginal effects using alternative predictors
This example uses two binary logit models with alternative predictors (independent variables): two different measures of sexuality, identity (sexident) and behavior (sexbehav).
use "https://tdmize.github.io/data/data/gss_cme", clear
drop if missing(samesexB, sexident, sexbehav, college, woman, race, age, year)
qui logit samesexB i.sexbehav i.woman i.college c.age i.race i.year, vce(robust)
est store behavior
qui logit samesexB i.sexident i.woman i.college c.age i.race i.year, vce(robust)
est store identity
mecompare i.sexbehav i.sexident, models(behavior identity)
Model 1 (behavior) is:
logit samesexB i.sexbehav i.woman i.college c.age i.race i.year, vce(robust)
Model 2 (identity) is:
logit samesexB i.sexident i.woman i.college c.age i.race i.year, vce(robust)
Marginal effects and cross-model differences (N_behavior=4921) (N_identity=4921)
                                 |   ME #   Estimate   Robust SE    P>|z|
---------------------------------+---------------------------------------
sexbehav                         |
  Bisexual - Heterosexu          |
    behavior                     |      1     -0.097       0.023    0.000
    identity                     |
    Difference                   |
  Gay - Heterosexu               |
    behavior                     |      2     -0.362       0.038    0.000
    identity                     |
    Difference                   |
---------------------------------+---------------------------------------
sexident                         |
  Bisexual - Heterosexu          |
    behavior                     |
    identity                     |      3     -0.274       0.041    0.000
    Difference                   |
  Gay - Heterosexu               |
    behavior                     |
    identity                     |      4     -0.428       0.034    0.000
    Difference                   |
In this case, to compare whether the different specifications lead to different marginal effects we need to use the melincom command to compare the marginal effects in the mecompare table. The marginal effects are labeled by row number in the "ME #" column for reference by melincom.
qui melincom clear
qui melincom 1 - 3, stat(est se p) add rowname("Het - Bi")
qui melincom 2 - 4, stat(est se p) add rowname("Het - Gay")
melincom (2 - 1) - (4 - 3), stat(est se p) add rowname("Bi - Gay")
                |    lincom       se    pvalue
----------------+-----------------------------
Het - Bi        |     0.177    0.043     0.000
Het - Gay       |     0.066    0.041     0.106
Bi - Gay        |    -0.111    0.059     0.058
Example interpretations:
The contrasts between heterosexual and gay/lesbian and between bisexual and gay/lesbian do not differ significantly across the two models. However, there is a significantly larger difference between heterosexual and bisexual individuals when using the identity measure, compared with using the sexual behavior measure.
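The melincom expressions are linear combinations of the rows identified by "ME #". The sketch below reproduces the three point estimates by hand from the rounded table values. The scalar names m1 to m4 are hypothetical. The standard errors cannot be reproduced this way because they depend on the covariances among the marginal effects.

```stata
* MEs copied from the mecompare table for Ex 6.3, indexed by their "ME #"
scalar m1 = -0.097   // sexbehav: bisexual vs. heterosexual
scalar m2 = -0.362   // sexbehav: gay vs. heterosexual
scalar m3 = -0.274   // sexident: bisexual vs. heterosexual
scalar m4 = -0.428   // sexident: gay vs. heterosexual

display "1 - 3             = " %6.3f (scalar(m1) - scalar(m3))
display "2 - 4             = " %6.3f (scalar(m2) - scalar(m4))
display "(2 - 1) - (4 - 3) = " %6.3f ((scalar(m2) - scalar(m1)) - (scalar(m4) - scalar(m3)))
```

These evaluate to 0.177, 0.066, and -0.111, matching the lincom column above.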
Ex 6.4 - Comparing marginal effects across different outcomes
This example compares effects across different outcomes (dependent variables). In this case, two different count outcomes (mental health and physical health) are compared using a negative binomial model. Effects are calculated on the predicted rate.
Here, no independent variables are specified with mecompare so marginal effects and cross-model comparisons are calculated for all variables.
use "https://tdmize.github.io/data/data/gss_cme", clear
drop if missing(mntlhlth, physhlth, woman, married, age, faminc, race, college, parent, reltrad)
qui nbreg mntlhlth i.woman i.married i.parent i.college c.age faminc ///
i.race i.year, vce(robust)
est store mental
qui nbreg physhlth i.woman i.married i.parent i.college c.age faminc ///
i.race i.year, vce(robust)
est store physical
mecompare , models(mental physical) amount(sd)
Model 1 (mental) is:
nbreg mntlhlth i.woman i.married i.parent i.college c.age faminc i.race i.year, vce(robust)
Model 2 (physical) is:
nbreg physhlth i.woman i.married i.parent i.college c.age faminc i.race i.year, vce(robust)
Marginal effects and cross-model differences (N_mental=5062) (N_physical=5062)
                                 |   ME #   Estimate   Robust SE    P>|z|
---------------------------------+---------------------------------------
woman                            |
  Women - Men                    |
    mental                       |      1      0.993       0.208    0.000
    physical                     |      2      0.773       0.174    0.000
    Difference                   |      3      0.220       0.229    0.337
---------------------------------+---------------------------------------
married                          |
  Married - Not Marr             |
    mental                       |      4     -1.010       0.230    0.000
    physical                     |      5     -0.159       0.192    0.408
    Difference                   |      6     -0.851       0.250    0.001
---------------------------------+---------------------------------------
parent                           |
  Parent - No Child              |
    mental                       |      7      0.274       0.248    0.268
    physical                     |      8     -0.269       0.219    0.218
    Difference                   |      9      0.543       0.274    0.048
---------------------------------+---------------------------------------
college                          |
  College - No Col D             |
    mental                       |     10     -0.879       0.229    0.000
    physical                     |     11     -0.542       0.189    0.004
    Difference                   |     12     -0.337       0.254    0.184
---------------------------------+---------------------------------------
age + SD                         |
    mental                       |     13     -0.462       0.108    0.000
    physical                     |     14      0.491       0.118    0.000
    Difference                   |     15     -0.953       0.132    0.000
---------------------------------+---------------------------------------
faminc + SD                      |
    mental                       |     16     -0.445       0.118    0.000
    physical                     |     17     -0.381       0.102    0.000
    Difference                   |     18     -0.064       0.133    0.631
---------------------------------+---------------------------------------
race                             |
  black - white                  |
    mental                       |     19     -1.016       0.258    0.000
    physical                     |     20     -0.531       0.217    0.015
    Difference                   |     21     -0.485       0.279    0.083
  other - white                  |
    mental                       |     22     -0.438       0.356    0.219
    physical                     |     23      0.145       0.343    0.672
    Difference                   |     24     -0.583       0.385    0.130
---------------------------------+---------------------------------------
year                             |
  2006 - 2002                    |
    mental                       |     25     -0.944       0.270    0.000
    physical                     |     26     -0.292       0.225    0.194
    Difference                   |     27     -0.652       0.298    0.029
  2010 - 2002                    |
    mental                       |     28     -0.068       0.318    0.832
    physical                     |     29      0.323       0.261    0.215
    Difference                   |     30     -0.391       0.355    0.271
  2014 - 2002                    |
    mental                       |     31     -0.582       0.302    0.054
    physical                     |     32     -0.229       0.245    0.351
    Difference                   |     33     -0.353       0.324    0.275
Example interpretations:
We find, for example, that women report 0.99 more days of poor mental health, and 0.77 more days of poor physical health, per month, than do men. Although the effect of gender is 0.22 larger for mental health, the difference is not statistically significant.
…being married significantly reduces the days of poor mental health by 1.01, and the effect of marriage on physical health is nonsignificant. A test of the cross-model difference shows that the effect of marriage on mental health is significantly larger than the effect of marriage on physical health (cross-model difference = -.851, p < .01).
Similarly, the effect of age differs significantly across the two outcomes, with aging associated with fewer poor mental health days but more poor physical health days.
Ex 6.5 - Comparing marginal effects across different model types (ordinal vs nominal)
This example compares effects across two different types of models, in this case an ordinal and a nominal model. The models are otherwise identical, with the same outcome and predictor variables.
This example shows how to specify custom marginal effects for continuous independent variables by specifying a start( ) value and the amount( ) of change. In this case, a change from 20 to 30.
use "https://tdmize.github.io/data/data/gss_cme", clear
drop if year < 2010
drop if missing(partyid5, woman, edyrs, age, parent, married, faminc, employed, region4, year)
qui ologit partyid5 c.age##c.age i.woman c.edyrs i.parent i.married ///
i.race c.faminc i.employed i.region4 i.year, vce(robust)
est store ordinal
qui mlogit partyid5 c.age##c.age i.woman c.edyrs i.parent i.married ///
i.race c.faminc i.employed i.region4 i.year, vce(robust)
est store nominal
mecompare age, models(ordinal nominal) start(age=20) amount(10) covariates(atmeans)
Model 1 (ordinal) is:
ologit partyid5 c.age##c.age i.woman c.edyrs i.parent i.married i.race c.faminc i.employed i.region4 i.year, vce(robust)
Model 2 (nominal) is:
mlogit partyid5 c.age##c.age i.woman c.edyrs i.parent i.married i.race c.faminc i.employed i.region4 i.year, vce(robust)
Marginal effects and cross-model differences (N_ordinal=8179) (N_nominal=8179)
                                 |   ME #   Estimate   Robust SE    P>|z|
---------------------------------+---------------------------------------
age + 10                         |
  Strong Dem - ordinal           |      1      0.020       0.004    0.000
  Strong Dem - nominal           |      2      0.032       0.003    0.000
  Strong Dem - Difference        |      3     -0.012       0.003    0.000
  Democrat - ordinal             |      4      0.023       0.005    0.000
  Democrat - nominal             |      5     -0.010       0.011    0.352
  Democrat - Difference          |      6      0.034       0.009    0.000
  Independent - ordinal          |      7     -0.003       0.000    0.000
  Independent - nominal          |      8     -0.000       0.010    0.965
  Independent - Difference       |      9     -0.003       0.010    0.796
  Republican - ordinal           |     10     -0.024       0.005    0.000
  Republican - nominal           |     11     -0.031       0.011    0.003
  Republican - Difference        |     12      0.007       0.009    0.436
  Strong Repub - ordinal         |     13     -0.016       0.004    0.000
  Strong Repub - nominal         |     14      0.010       0.003    0.003
  Strong Repub - Difference      |     15     -0.026       0.004    0.000
Example interpretations:
For someone who is 20 years old, the effect of a 10-year increase in age is significantly different across the ordinal and nominal models for three of the five outcome categories. For example, the effect of aging on the probability of identifying as a strong Democrat is positive in both models, but it is significantly larger in the nominal model. Even more striking, the effects of age on being a strong Republican are in opposite directions across the two models.
Ex 6.6 - Comparing marginal effects across different samples or groups
This example compares effects across separate models fit to different samples: one from 1986 and one from 2016. The models are otherwise identical (i.e., both use a binary logit with the same outcome and the same predictors).
The process is the same for comparing across separate models for different groups (e.g., one model for men and one model for women).
Fit each model with an if qualifier on a single grouping variable to select its sample. Then use the group( ) option on mecompare to identify that grouping variable.
use "https://tdmize.github.io/data/data/gss_cme", clear
drop if missing(helpsickB, polviews, faminc, employed, woman, age, college, married, parent)
qui logit helpsickB i.conserv faminc i.employed i.woman age i.college ///
i.married i.parent i.race if year == 1986, vce(robust)
est store mod1986
qui logit helpsickB i.conserv faminc i.employed i.woman age i.college ///
i.married i.parent i.race if year == 2016, vce(robust)
est store mod2016
mecompare i.conserv, models(mod1986 mod2016) group(year)
Model 1 (mod1986) is:
logit helpsickB i.conserv faminc i.employed i.woman age i.college i.married i.parent i.race if year == 1986, vce(robust)
Model 2 (mod2016) is:
logit helpsickB i.conserv faminc i.employed i.woman age i.college i.married i.parent i.race if year == 2016, vce(robust)
Marginal effects and cross-model differences (N_mod1986=1254) (N_mod2016=1670)
                                 |   ME #   Estimate   Robust SE    P>|z|
---------------------------------+---------------------------------------
conserv                          |
  Conserva - Not Cons            |
    mod1986                      |      1     -0.092       0.030    0.002
    mod2016                      |      2     -0.258       0.025    0.000
    Difference                   |      3      0.166       0.039    0.000
Example interpretation:
Conservatives had significantly lower predicted probabilities of thinking the government is responsible for providing health care in 1986 (ME = -0.092) and in 2016 (ME = -0.258). However, the polarization between conservatives and nonconservatives increased over time, as the effect of being conservative is significantly larger in magnitude in 2016 than in 1986 (cross-model difference of MEs = 0.166; p < .001).
Ex 7 - Comparing marginal effects within a single model
The primary purpose of mecompare is to compare effects across models, but it can also be used to compare effects within a single model. This example tests whether the effect of college or the effect of being married is larger, using a binary logit model.
use "https://tdmize.github.io/data/data/gss_cme", clear
drop if year < 2000
drop if employed != 1
drop if missing(vhappy, college, wages, occprest, age, married, parent, woman, conserv, reltrad)
qui logit vhappy i.college i.married i.parent i.woman i.conserv ///
i.race i.reltrad i.year c.age##c.age c.occprest
est store happymod
mecompare i.college i.married, mod(happymod)
Marginal effects (N_happymod=9216)
                                 |   ME #   Estimate   Robust SE    P>|z|
---------------------------------+---------------------------------------
college                          |
  College - No Col D             |
    happymod                     |      1      0.033       0.012    0.004
---------------------------------+---------------------------------------
married                          |
  Married - Not Marr             |
    happymod                     |      2      0.200       0.010    0.000
The melincom command can then be used to compare the equality of the marginal effects calculated by mecompare.
melincom 2 - 1
                |    lincom    pvalue       ll        ul
----------------+---------------------------------------
1               |     0.167     0.000    0.136     0.198
Example interpretation:
Those with college degrees have a 0.033 higher probability of being very happy than those without a degree. Married individuals have a 0.200 higher probability of being very happy than unmarried individuals. The effect of marital status is significantly larger than the effect of college (difference = 0.167, p < .001).
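For comparison, a similar within-model test can be sketched with Stata's built-in margins and lincom commands. This is not presented as mecompare's internal procedure. The auto dataset and its variables are stand-ins, not the GSS data used above, so the sketch shows only the mechanics, and its standard errors need not match what mecompare and melincom would report.

```stata
* Hypothetical stand-in data: Stata's built-in auto dataset
sysuse auto, clear
logit foreign mpg weight, vce(robust)

* Average marginal effects of both predictors, posted as estimation results
margins, dydx(mpg weight) post

* Test whether the two average marginal effects are equal
lincom _b[mpg] - _b[weight]
```

Because mpg and weight are measured in different units, the comparison itself is not substantively meaningful here. In practice, the effects being compared should be on a comparable scale, as with the two binary predictors in the example above.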