USING PROPENSITY SCORES IN QUASI-EXPERIMENTAL DESIGNS

William M. Holmes

SPSS COMMANDS FOR PROPENSITY USE

 

            Many uses of propensity scores are possible with SPSS commands. The following presents some of these commands; there may be other ways of accomplishing the same result. Some uses of propensity scores are not possible directly with SPSS commands. However, with the R Extender add-on for SPSS, procedures that SPSS cannot perform directly can be executed through R. An overview of the R interface for SPSS, by Felix Thoemmes, can be found at http://arxiv.org/ftp/arxiv/papers/1201/1201.6385.pdf

 

NORMALITY TESTS

            Testing whether a distribution is normal or has some other shape can be done with the one-sample Kolmogorov-Smirnov test within NPTESTS.

            NPTESTS   /ONESAMPLE TEST (confounder1 confounder2)
                KOLMOGOROV_SMIRNOV(NORMAL=SAMPLE EXPONENTIAL=SAMPLE
                POISSON=SAMPLE)
                /MISSING SCOPE=ANALYSIS USERMISSING=EXCLUDE.

IMBALANCE ASSESSMENT PROCEDURES

            Imbalance tests with SPSS can be done with the MEANS program (for ANOVA statistics) or T-TEST.

 

            MEANS TABLES=confounder1, confounder2, confounder3, confounder4
                BY treatment   /CELLS MEAN STDDEV VARIANCE COUNT SUM
                /STATISTICS ANOVA.

            T-TEST   GROUPS=treatment(0 1)   /MISSING=ANALYSIS
                /VARIABLES=confounder1, confounder2, confounder3, confounder4
                /CRITERIA=CI(.95).

 

PROPENSITY ESTIMATION

            Propensity scores can be estimated using a REGRESSION program, LOGISTIC REGRESSION,  GLM, or DISCRIMINANT.

 

            REGRESSION    /MISSING LISTWISE    /STATISTICS COEFF OUTS R ANOVA

                  /CRITERIA=PIN(.05) POUT(.10)  /NOORIGIN   /DEPENDENT treatment

                  /METHOD=ENTER confounder1, confounder2, confounder3, confounder4

                  /SAVE PRED (propen).

 

            LOGISTIC REGRESSION VARIABLES treatment  /METHOD=ENTER confounder1,

                confounder2, confounder3, confounder4  /SAVE=PRED (propen)

                /CRITERIA=PIN(.05) POUT(.10) ITERATE(20) CUT(.5).

 

            GLM

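            DISCRIMINANT    /GROUPS=treatment(0 1)   /VARIABLES=confounder1
                confounder2 confounder3 confounder4   /ANALYSIS ALL
                /SAVE PROBS=propen   /PRIORS EQUAL   /STATISTICS=MEAN STDDEV
                UNIVF COEFF   /CLASSIFY=NONMISSING POOLED.

            PROBS=propen saves one posterior probability per group under the root name propen (propen1 for the first group, propen2 for the second). Use the variable for the treated group as the propensity score.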

 

MATCHING

            Most matching with SPSS has to be done using the R Extender add-on to execute R programs that do matching.  Versions of exact or coarsened exact matching and of greedy matching are two exceptions.

            Exact and Coarsened Exact

            Split the file into treatment and comparison group files, each containing propensity scores trimmed to as many significant digits as desired. Aggregate each file by the trimmed propensity score. Merge the aggregated comparison file into the disaggregated treatment file, using the trimmed propensity scores as the case IDs. Add the cases of the disaggregated comparison file to the treatment file. Create a flag for cases having merge matches, and select the cases meeting the flag. This produces a 1-many match. For a 1-1 match, purge comparison cases having duplicate subject identifiers (the subject IDs, not the propensities used as case IDs).
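
            The same kind of match can be sketched without splitting and merging files. The following is a minimal sketch, not the procedure described above: it assumes treatment is coded 0/1 with no missing values, coarsens propen to two decimals, and uses AGGREGATE with MODE=ADDVARIABLES to count treated and comparison cases sharing each coarsened score. The names pkey, comparison, ntreat, ncomp, and matched are hypothetical.

            * Coarsen the propensity score to two decimals to form the match key.
            COMPUTE pkey=RND(propen*100).
            * Flag comparison cases so both groups can be counted within each key.
            COMPUTE comparison=1-treatment.
            * Add per-key counts of treated and comparison cases to every case.
            AGGREGATE   /OUTFILE=* MODE=ADDVARIABLES   /BREAK=pkey
                /ntreat=SUM(treatment)   /ncomp=SUM(comparison).
            * A case is matched when its key holds at least one case from each group.
            COMPUTE matched=(ntreat GT 0 AND ncomp GT 0).
            FILTER BY matched.

            FILTER keeps the unmatched cases in the file but excludes them from analysis; SELECT IF matched EQ 1 would drop them permanently. Changing the multiplier in the first COMPUTE changes how coarse the match is.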

 

            Nearest Neighbor and Caliper

            Painter (2004) has created an SPSS Macro to do Nearest Neighbor matching. It has been extended by Clark (2012). The Painter macro is available from: http://www.unc.edu/~painter/SPSSsyntax/propen.txt. It requires an input file containing a propensity score named propen and an intervention variable named treatm. You must also specify the number of cases. Instructions are contained in the file propen.txt. The Clark version and instructions are posted at http://faculty.umb.edu/william_holmes/clarkmacro.htm.

 

            1-Many

            This is currently available through R programs via the R Extender Add-on.

 

            Optimized, Full, and Genetic

            Optimized, full, and genetic matching are not currently possible with SPSS commands alone. Optimized matching programs can be executed within SPSS using the R Extender add-on.

 

STRATIFYING

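Stratifying on propensity scores is done in SPSS by recoding the propensity score into 5 groups of equal width. The values of the new variable are the strata. Adjacent ranges share a boundary so that no score falls between them; RECODE assigns a value to the first range that contains it, so a score exactly on a boundary (such as .30) goes to the lower stratum. Adjust the cut points to the observed range of propen.

RECODE propen (.20 THRU .30=1)(.30 THRU .40=2)(.40 THRU .50=3)
    (.50 THRU .60=4)(.60 THRU .70=5) INTO propenstrata.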
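
            An alternative is to form strata with equal numbers of cases rather than equal ranges. The following is a minimal sketch using the RANK command to cut propen into quintiles. It is a different stratification rule from the equal-width RECODE above, and the variable name propenquint is hypothetical.

            RANK VARIABLES=propen (A)   /NTILES(5) INTO propenquint
                /PRINT=NO   /TIES=MEAN.

            Quintile strata avoid empty or very small groups when the propensity scores are concentrated in a narrow part of their range.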

 

REGRESSION AND THE GENERAL LINEAR MODEL

            These procedures can be accomplished in SPSS with the GLM program or the REGRESSION program.

            REGRESSION    /MISSING LISTWISE   /STATISTICS COEFF OUTS R ANOVA

                 /CRITERIA=PIN(.05) POUT(.10)   /NOORIGIN   /DEPENDENT treatment

                 /METHOD=ENTER confounder1 confounder2 confounder3 confounder4

                  /SAVE PRED (propen).

 

            GLM  income BY treatment /EMMEANS TABLES(treatment)
                /PRINT DESCRIPTIVES PARAMETER.

 

TWO-STAGE LEAST SQUARES

            Two-Stage Least Squares may be done either with 2SLS or with WLS procedures.

            2SLS income WITH treatment  /INSTRUMENTS age   /CONSTANT

                 /SAVE PRED RESID.

 

            WLS income WITH treatment   /INSTRUMENTS age   /CONSTANT

      /SAVE  PRED RESID.

 

SAMPLE WEIGHTING

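            Inverse probability weights are computed from the propensity score, here taken to be the predicted probability that treatment equals 1 (as saved by LOGISTIC REGRESSION). Treated cases are weighted by 1/propen and comparison cases by 1/(1-propen).

            COMPUTE ipw=1/propen.
            IF (treatment EQ 0) ipw=1/(1-propen).
            WEIGHT BY ipw.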
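
            With WEIGHT BY in effect, the imbalance assessment commands can be rerun to see whether weighting brought the confounder means of the two groups closer together. The following is a minimal sketch that reuses the MEANS command from the imbalance section and assumes the weight variable ipw is in effect. It compares weighted means only, because SPSS treats WEIGHT values as case frequencies and significance tests would use the sum of the weights as the sample size.

            MEANS TABLES=confounder1, confounder2, confounder3, confounder4
                BY treatment   /CELLS MEAN STDDEV.

            Issue WEIGHT OFF before running procedures that should use the unweighted data.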

 

WEIGHTED LEAST SQUARES

            This may be done either with the WLS procedure or the GLM procedure.

GLM income BY treatment /EMMEANS TABLES(treatment)

    /REGWGT ipw /PRINT DESCRIPTIVES PARAMETER.

 

GENERALIZED LINEAR MODEL

GZLM is done with the GENLIN procedure. The following produces logit predicted propensity scores. With treatment coded 0 and 1, REFERENCE=FIRST makes 0 the reference category, so the saved mean prediction is the probability that treatment equals 1.

            GENLIN treatment (REFERENCE=FIRST) BY confounder1
                confounder2    (ORDER=ASCENDING)   /MODEL  confounder1
                confounder2    INTERCEPT=YES   DISTRIBUTION=BINOMIAL
                LINK=LOGIT    /CRITERIA METHOD=FISHER(1) SCALE=1 COVB=MODEL
                MAXITERATIONS=100  MAXSTEPHALVING=5   PCONVERGE=.001
                (ABSOLUTE) SINGULAR=1E-012 ANALYSISTYPE=3(WALD) CILEVEL=95
                CITYPE=WALD LIKELIHOOD=FULL   /MISSING CLASSMISSING=EXCLUDE
                /PRINT CPS DESCRIPTIVES MODELINFO FIT SUMMARY SOLUTION
                /SAVE MEANPRED (propen) .

MISSING DATA ANALYSIS

            Missing value analysis is done in SPSS with the MVA command. MVA also provides EM estimation of missing values, which can be saved to an output file. Missing data patterns can also be summarized with the MULTIPLE IMPUTATION command.

            MVA VARIABLES=educ_1 boyorgrl_1  agekdbrn_1 finrela_1   /TPATTERN

                PERCENT=1 DESCRIBE=agekdbrn_1    /EM(TOLERANCE=0.001

                CONVERGENCE=0.0001 ITERATIONS=25)   /EM

                (OUTFILE='c:\spssdata\gss\emdata.sav')  .

 

MULTIPLE IMPUTATION confounder1 confounder2 confounder3

    confounder4    /IMPUTE METHOD=NONE

    /MISSINGSUMMARIES VARIABLES (MINPCTMISSING=.001) .

  

IMPUTATION OF MISSING DATA

            Multiple imputation in SPSS is done with the MULTIPLE IMPUTATION command. A new data file is saved containing the imputed values. Running a procedure on this file, split by imputation, calculates results for each imputed data set and pooled results across the imputations.

 

            MULTIPLE IMPUTATION confounder1, confounder2, confounder3, confounder4

                 /IMPUTE METHOD=FCS   /CONSTRAINTS confounder1

    (RND=1 MIN=1)   /CONSTRAINTS confounder1  (MAX=20)

     /OUTFILE IMPUTATIONS = IMPUTEDDATA.
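
            A minimal sketch of an analysis on the imputed file follows. It assumes the imputed data set is available under the name IMPUTEDDATA, that it contains the outcome income, and that it contains the variable Imputation_ that SPSS adds to identify each imputation. The choice of REGRESSION as the analysis is illustrative only.

            DATASET ACTIVATE IMPUTEDDATA.
            SORT CASES BY Imputation_.
            SPLIT FILE LAYERED BY Imputation_.
            REGRESSION   /DEPENDENT income
                /METHOD=ENTER treatment confounder1 confounder2 confounder3 confounder4.
            SPLIT FILE OFF.

            With the file split on Imputation_, procedures that support pooling report estimates for each imputation along with the pooled estimates.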

 


 

 Revised 1/31/2012