title: “SCORE: Fielding-Miller et al. (2020) replication”
subtitle: “Analysis of 5% sample”
author: “Radoslaw Panczak”
date: 23 Nov 2020

Preps

Read data

C:\external\SCORE_Fielding-Miller_covid_R3pV\data

Contains data from merged_covid_usa_prepared_original_sample.dta
  obs:           125                          
 vars:            18                          23 Nov 2020 11:22
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
              storage   display    value
variable name   type    format     label      variable label
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
date            str8    %9s                   
nonurban        byte    %9.0g      nonurban   Non-urban counties flag
county          str33   %33s                  
state           str20   %20s                  
fips            long    %12.0g                
cases           long    %8.0g                 Cases
deaths          int     %8.0g                 Deaths
nonenglish      double  %10.0g                % nonenglish speaking hh
farmwork        float   %9.0g                 % engaged in farmwork
uninsured       float   %9.0g                 % uninsured under 65
poverty         double  %10.0g                % below poverty line
older           double  %10.0g                % aged 65 and above
pop_dens        float   %9.0g                 Pop density
date_case1      int     %td..                 Date of case 1
time_case1      byte    %9.0g                 Days since case 1 in county
sip_effect      int     %td..                 SIP effect start date
date_case100    int     %td..                 Date of case 100
time_case100    byte    %9.0g                 Days between case 100 in county and SIP in state
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Sorted by: 

Missing data

. mdesc

    Variable    │     Missing          Total     Percent Missing
────────────────┼───────────────────────────────────────────────
           date │           0            125           0.00
       nonurban │           0            125           0.00
         county │           0            125           0.00
          state │           0            125           0.00
           fips │           0            125           0.00
          cases │           0            125           0.00
         deaths │           0            125           0.00
     nonenglish │           0            125           0.00
       farmwork │           3            125           2.40
      uninsured │           0            125           0.00
        poverty │           1            125           0.80
          older │           0            125           0.00
       pop_dens │           0            125           0.00
     date_case1 │           0            125           0.00
     time_case1 │           0            125           0.00
     sip_effect │           5            125           4.00
   date_case100 │          86            125          68.80
   time_case100 │           0            125           0.00
────────────────┼───────────────────────────────────────────────

States

. ta state, m

               state │      Freq.     Percent        Cum.
─────────────────────┼───────────────────────────────────
             Arizona │         12        9.60        9.60
          California │          6        4.80       14.40
            Colorado │         52       41.60       56.00
              Kansas │         11        8.80       64.80
            Nebraska │          3        2.40       67.20
          New Mexico │         20       16.00       83.20
            Oklahoma │          3        2.40       85.60
               Texas │          5        4.00       89.60
                Utah │         11        8.80       98.40
             Wyoming │          2        1.60      100.00
─────────────────────┼───────────────────────────────────
               Total │        125      100.00

Urban

. ta nonurban, m

  Non-urban │
   counties │
       flag │      Freq.     Percent        Cum.
────────────┼───────────────────────────────────
      urban │          4        3.20        3.20
  non-urban │        121       96.80      100.00
────────────┼───────────────────────────────────
      Total │        125      100.00

Outcome

. univar deaths, dec(0)
                                        ────────────── Quantiles ──────────────
Variable       n     Mean     S.D.      Min      .25      Mdn      .75      Max
───────────────────────────────────────────────────────────────────────────────
  deaths     125       11       26        0        0        0        4      132
───────────────────────────────────────────────────────────────────────────────

. * hist deaths, width(10)

SAR models

Largely based on Stata’s SP manual.

Creating the Stata-format shapefile

. spshape2dta cb_2018_us_county_20m_prep_sample.shp, replace
  (importing .shp file)
  (importing .dbf file)
  (creating _ID spatial-unit id)
  (creating _CX coordinate)
  (creating _CY coordinate)

  file cb_2018_us_county_20m_prep_sample_shp.dta created
  file cb_2018_us_county_20m_prep_sample.dta     created

. * spshape2dta cb_2018_us_county_20m_prep.shp, replace
.     
. u cb_2018_us_county_20m_prep_sample, clear

. d

Contains data from cb_2018_us_county_20m_prep_sample.dta
  obs:           157                          
 vars:             5                          23 Nov 2020 11:22
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
              storage   display    value
variable name   type    format     label      variable label
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
_ID             int     %12.0g                Spatial-unit ID
_CX             double  %10.0g                x-coordinate of area centroid
_CY             double  %10.0g                y-coordinate of area centroid
fips            long    %9.0f                 fips
state_code      str2    %9s                   state_code
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Sorted by: _ID

. spset fips, modify replace
  (_shp.dta file saved)
  (data in memory saved)
  Sp dataset cb_2018_us_county_20m_prep_sample.dta
                data:  cross sectional
     spatial-unit id:  _ID (equal to fips)
         coordinates:  _CX, _CY (planar)
    linked shapefile:  cb_2018_us_county_20m_prep_sample_shp.dta

. spset, modify coordsys(latlong, miles)
  Sp dataset cb_2018_us_county_20m_prep_sample.dta
                data:  cross sectional
     spatial-unit id:  _ID (equal to fips)
         coordinates:  _CY, _CX (latitude-and-longitude, miles)
    linked shapefile:  cb_2018_us_county_20m_prep_sample_shp.dta


. sa, replace 
file cb_2018_us_county_20m_prep_sample.dta saved

Case selection

In order to test focal hypothesis only nonurban counties are used.

Counties with missing information on farmwork and poverty are also excluded since models will not run in situations in which counties exist in prepared spatial data but are excluded from regression (default Stata behaviour afaik).

. u "merged_covid_usa_prepared_original_sample.dta" , clear

. *u "merged_covid_usa_prepared_original.dta" , clear
. keep if nonurban
(4 observations deleted)

.     
. * stata deletes cases with missing obs
. * then the dataset doesnt match the spatial matrix
. * either fix missings or drop
. drop if mi(farmwork)
(3 observations deleted)

. drop if mi(poverty)
(1 observation deleted)

Merging covid data with the Stata-format shapefile

. merge 1:1 fips using cb_2018_us_county_20m_prep_sample

    Result                           # of obs.
    ─────────────────────────────────────────
    not matched                            40
        from master                         0  (_merge==1)
        from using                         40  (_merge==2)

    matched                               117  (_merge==3)
    ─────────────────────────────────────────

. *merge 1:1 fips using cb_2018_us_county_20m_prep
. assert _merge != 1

. keep if _merge == 3
(40 observations deleted)

. drop _merge

.     
. * grmap deaths

Fitting model with a spatial lag of the dependent variable

Using gs2sls option (instead of ml) to guard against potential heteroskedasticity.

. spmatrix create contiguity W, replace 
  weighting matrix in W contains 3 islands

. 
. spregress deaths nonenglish farmwork uninsured poverty older pop_dens time_case1 time_case100, gs2sls dvarlag(W)
  (117 observations)
  (117 observations (places) used)
  (weighting matrix defines 117 places)

Spatial autoregressive model                    Number of obs     =        117
GS2SLS estimates                                Wald chi2(9)      =     165.72
                                                Prob > chi2       =     0.0000
                                                Pseudo R2         =     0.5895

─────────────┬────────────────────────────────────────────────────────────────
      deaths │      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
─────────────┼────────────────────────────────────────────────────────────────
deaths       │
  nonenglish │   -.098869    .484726    -0.20   0.838    -1.048914    .8511764
    farmwork │   .2569663   .3432206     0.75   0.454    -.4157337    .9296663
   uninsured │   .0136608    .183338     0.07   0.941    -.3456751    .3729968
     poverty │   .1671235   .2698939     0.62   0.536    -.3618588    .6961057
       older │  -.4547352   .2922553    -1.56   0.120    -1.027545    .1180746
    pop_dens │   .0704497   .0117454     6.00   0.000     .0474291    .0934702
  time_case1 │   .5964849   .1571643     3.80   0.000     .2884485    .9045213
time_case100 │  -.3877583   .2244585    -1.73   0.084    -.8276889    .0521724
       _cons │  -10.85542   9.045094    -1.20   0.230    -28.58348    6.872641
─────────────┼────────────────────────────────────────────────────────────────
W            │
      deaths │   .2020525   .1471494     1.37   0.170    -.0863551    .4904601
─────────────┴────────────────────────────────────────────────────────────────
Wald test of spatial terms:          chi2(1) = 1.89       Prob > chi2 = 0.1697

.     
. * estat impact
.     
. est sto m_original_queen

. 

Sensitivity analysis using rook contiguity

. spmatrix create contiguity W, rook replace 
  weighting matrix in W contains 3 islands

. 
. spregress deaths nonenglish farmwork uninsured poverty older pop_dens time_case1 time_case100, gs2sls dvarlag(W)
  (117 observations)
  (117 observations (places) used)
  (weighting matrix defines 117 places)

Spatial autoregressive model                    Number of obs     =        117
GS2SLS estimates                                Wald chi2(9)      =     166.11
                                                Prob > chi2       =     0.0000
                                                Pseudo R2         =     0.5899

─────────────┬────────────────────────────────────────────────────────────────
      deaths │      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
─────────────┼────────────────────────────────────────────────────────────────
deaths       │
  nonenglish │  -.1192228   .4839862    -0.25   0.805    -1.067818    .8293727
    farmwork │   .2588012   .3429208     0.75   0.450    -.4133113    .9309137
   uninsured │   .0157909   .1833139     0.09   0.931    -.3434978    .3750796
     poverty │   .1832614   .2698682     0.68   0.497    -.3456705    .7121933
       older │  -.4681431   .2925293    -1.60   0.110     -1.04149    .1052039
    pop_dens │   .0704416   .0117314     6.00   0.000     .0474485    .0934347
  time_case1 │   .5941377   .1572338     3.78   0.000     .2859651    .9023103
time_case100 │  -.3911039    .224273    -1.74   0.081    -.8306708    .0484631
       _cons │  -10.76116   9.031736    -1.19   0.233    -28.46304    6.940713
─────────────┼────────────────────────────────────────────────────────────────
W            │
      deaths │   .1964226   .1401691     1.40   0.161    -.0783038    .4711489
─────────────┴────────────────────────────────────────────────────────────────
Wald test of spatial terms:          chi2(1) = 1.96       Prob > chi2 = 0.1611

.     
. est sto m_original_rook

.     
. est tab m_original_queen m_original_rook , b(%6.3f) p(%6.3f)

─────────────┬────────────────────
    Variable │ m_ori~n   m_ori~k  
─────────────┼────────────────────
deaths       │
  nonenglish │  -0.099    -0.119  
             │   0.838     0.805  
    farmwork │   0.257     0.259  
             │   0.454     0.450  
   uninsured │   0.014     0.016  
             │   0.941     0.931  
     poverty │   0.167     0.183  
             │   0.536     0.497  
       older │  -0.455    -0.468  
             │   0.120     0.110  
    pop_dens │   0.070     0.070  
             │   0.000     0.000  
  time_case1 │   0.596     0.594  
             │   0.000     0.000  
time_case100 │  -0.388    -0.391  
             │   0.084     0.081  
       _cons │ -10.855   -10.761  
             │   0.230     0.233  
─────────────┼────────────────────
W            │
      deaths │   0.202     0.196  
             │   0.170     0.161  
─────────────┴────────────────────
                       legend: b/p