Mexico temperature notebook

Kodi B. Arfer
Created 10 Sep 2018 • Last modified 2 Jun 2020

How things were

Stages:

  1. Fit mixed-effects models for grid cells with both satellite and ground data
  2. Predict temperature in cells with satellite data but no ground data, using the mixed model(s) fit at stage 1
  3. Predict temperature in cells with no satellite data using a spatial smoother

Rosenfeld et al. (2017) describe the method as applied to Israel.
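The three stages can be sketched on simulated data as follows. This is only a toy illustration, not the model used here or by Rosenfeld et al.: an ordinary linear model with a per-day intercept stands in for the mixed model, inverse-distance weighting stands in for the spatial smoother, and every variable name is made up.

```r
set.seed(1)
# Simulated 6 x 6 grid observed on 10 days.
d = expand.grid(lon = 1:6, lat = 1:6, day = 1:10)
d$truth = 15 + 0.5 * d$lon - 0.3 * d$lat + 2 * sin(d$day)
d$sat = d$truth + rnorm(nrow(d), sd = 0.5)
# Satellite data is missing for about 30% of cell-days.
d$sat[runif(nrow(d)) < 0.3] = NA
# Ground stations sit in 8 of the 36 cells.
station.cells = sample(36, 8)
cell.id = (d$lat - 1) * 6 + d$lon
d$ground = ifelse(cell.id %in% station.cells,
    d$truth + rnorm(nrow(d), sd = 0.2), NA)

# Stage 1: fit on cell-days with both satellite and ground data.
# (lm drops rows with missing values by default.)
fit = lm(ground ~ factor(day) + sat, data = d)

# Stage 2: predict wherever satellite data exists.
has.sat = !is.na(d$sat)
d$pred = NA
d$pred[has.sat] = predict(fit, newdata = d[has.sat, ])

# Stage 3: fill the remaining cell-days by smoothing the
# stage-2 predictions from the same day over space.
for (i in which(!has.sat))
   {nb = which(has.sat & d$day == d$day[i])
    w = 1 / ((d$lon[nb] - d$lon[i])^2 + (d$lat[nb] - d$lat[i])^2)
    d$pred[i] = sum(w * d$pred[nb]) / sum(w)}
```

After stage 3, every cell-day has a prediction, whether or not it had satellite or ground data.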

Time shenanigans

Hu, Brunsell, Monaghan, Barlage, and Wilhelmi (2014): "The overpass times provided by MODIS LST product are in local solar time, which is defined as the MODIS observation time in coordinated universal time (UTC) plus longitude in degrees divided by 15 (Williamson, Hik, Gamon, Kavanaugh, & Koh, 2013)."

MODIS documentation: "Note that the Day_view_time and Night_view_time are in local solar time, which is the UTC time plus grid’s longitude in degrees / 15 degrees (in hours , +24 if local solar time < 0 or - 24 if local solar time >= 24). The data day in the name of all the daily MOD11A1 files is in UTC so the data day in local solar time at each grid may be different from the data day in UTC by one day."
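As a worked example of the conversion described in these quotations, here is a small sketch. The function names are mine, and I assume longitude is in degrees east (so Mexican longitudes are negative) and times are in decimal hours.

```r
# Local solar time in hours, wrapped into [0, 24).
local.solar.time = function(utc.hours, lon.deg)
    (utc.hours + lon.deg / 15) %% 24

# How many days the local solar date differs from the UTC date:
# -1, 0, or 1.
solar.day.offset = function(utc.hours, lon.deg)
    floor((utc.hours + lon.deg / 15) / 24)

# 02:00 UTC at 90 degrees west is 20:00 local solar time
# on the previous day.
local.solar.time(2, -90)
solar.day.offset(2, -90)
```

The modulo operation covers both corrections in the MODIS note: adding 24 when the sum is negative and subtracting 24 when it is 24 or more.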

Meteorological missingness

Are there any days on which one of the ground-station meteorological variables is missing from every station?

sapply(
    subset(select = -date, ground[,
        by = date,
        .SDcols = c(temp.ground.vars, nontemp.ground.vars),
        lapply(.SD, function(v) all(is.na(v)))]),
    any)
  x
ground.temp.lo FALSE
ground.temp.mean FALSE
ground.temp.hi FALSE
wind.speed.mean FALSE

Fortunately, no.

Satellite-temperature missingness

r = rbindlist(lapply(available.years, function(y)
   {d = model.dataset(y, mrow.set = "pred.area")
    cbind(year = y, d[,
        .SDcols = c("satellite.temp.day.imputed", "satellite.temp.night.imputed"),
        lapply(.SD, mean)])}))
setnames(r, c("year", "day", "night"))
rd(d = 2, as.data.frame(r))
  year day night
1 2003 0.30 0.35
2 2004 0.52 0.50
3 2005 0.30 0.30
4 2006 0.31 0.36
5 2007 0.28 0.33
6 2008 0.29 0.30
7 2009 0.26 0.30
8 2010 0.32 0.33
9 2011 0.23 0.27
10 2012 0.31 0.35
11 2013 0.32 0.37
12 2014 0.32 0.35
13 2015 0.34 0.37
14 2016 0.34 0.36
15 2017 0.27 0.28
16 2018 0.29 0.34

Cross-validation results

sr = summarize.cv.results(multi.run.cv(available.years))

Here are RMSE and R2 by year and DV, as well as the proportion of daily Moran's I statistics of the signed error that are significant at α = .05. N and stn denote the number of observations and stations in the megalopolis, for which predictions were made; stations in the larger region, which were only used for training, aren't counted. sd.s and rmse.s are spatially weighted RMSEs, in which each day and each sixteenth of a lon–lat cell are given total weight 1.

library(data.table)
as.data.frame(rd(d = 2, sr$overall))
  year dv N stn sd rmse R2 sd.s rmse.s R2.spatial R2.temporal Moran ps < .05
1 2003 hi 7483 25 3.42 1.03 0.91 3.71 1.41 0.76 0.93 0.03
2 2003 lo 7483 25 3.62 1.66 0.79 4.17 2.65 0.65 0.89 0.03
3 2003 mean 7483 25 3.01 0.82 0.93 3.29 1.21 0.24 0.94 0.05
4 2004 hi 7901 26 2.97 1.10 0.86 3.32 1.39 0.60 0.91 0.05
5 2004 lo 7901 26 3.30 1.56 0.78 3.98 2.51 0.16 0.90 0.02
6 2004 mean 7901 26 2.53 0.86 0.89 2.92 1.18 0.56 0.94 0.05
7 2005 hi 8939 30 3.40 1.31 0.85 3.48 1.57 0.57 0.93 0.03
8 2005 lo 8939 30 3.35 1.56 0.78 3.95 2.44 0.69 0.89 0.04
9 2005 mean 8939 30 2.91 0.91 0.90 3.08 1.15 0.63 0.95 0.02
10 2006 hi 8303 29 3.25 1.36 0.82 3.45 1.69 0.64 0.90 0.02
11 2006 lo 8303 29 3.50 1.54 0.81 4.09 2.26 0.54 0.88 0.02
12 2006 mean 8303 29 2.72 0.97 0.87 3.00 1.22 0.77 0.92 0.05
13 2007 hi 7483 31 3.13 1.26 0.84 3.37 1.60 0.54 0.86 0.15
14 2007 lo 7483 31 3.19 1.53 0.77 3.71 2.14 0.49 0.84 0.11
15 2007 mean 7483 31 2.58 0.95 0.86 2.89 1.23 0.63 0.90 0.22
16 2008 hi 9190 31 3.19 1.26 0.84 3.40 1.50 0.55 0.88 0.34
17 2008 lo 9190 31 3.48 1.39 0.84 3.99 1.95 0.81 0.87 0.45
18 2008 mean 9190 31 2.80 0.90 0.90 3.04 1.12 0.85 0.91 0.57
19 2009 hi 10098 37 3.36 1.32 0.84 3.55 1.70 0.34 0.91 0.41
20 2009 lo 10098 37 3.31 1.44 0.81 4.03 2.13 0.72 0.86 0.51
21 2009 mean 10098 37 2.82 0.98 0.88 3.07 1.21 0.76 0.93 0.63
22 2010 hi 13559 48 5.39 1.43 0.93 7.98 1.87 0.95 0.90 0.16
23 2010 lo 13559 48 4.53 1.62 0.87 5.78 2.16 0.90 0.85 0.44
24 2010 mean 13559 48 4.50 1.11 0.94 6.52 1.48 0.96 0.92 0.29
25 2011 hi 13713 44 5.15 1.56 0.91 7.39 1.98 0.93 0.85 0.06
26 2011 lo 13713 44 4.34 1.63 0.86 5.42 2.03 0.90 0.82 0.14
27 2011 mean 13713 44 4.35 1.15 0.93 6.02 1.44 0.96 0.88 0.15
28 2012 hi 14857 50 4.83 1.45 0.91 6.78 1.83 0.92 0.86 0.16
29 2012 lo 14857 50 4.00 1.68 0.82 5.11 2.10 0.89 0.76 0.36
30 2012 mean 14857 50 4.04 1.10 0.93 5.60 1.43 0.96 0.86 0.37
31 2013 hi 16316 55 5.03 1.68 0.89 6.25 2.12 0.90 0.84 0.12
32 2013 lo 16316 55 4.23 1.81 0.82 5.05 2.20 0.89 0.73 0.42
33 2013 mean 16316 55 4.25 1.19 0.92 5.22 1.44 0.96 0.85 0.37
34 2014 hi 17979 57 4.77 1.73 0.87 6.13 2.15 0.89 0.81 0.14
35 2014 lo 17979 57 4.28 1.66 0.85 5.14 2.01 0.91 0.79 0.41
36 2014 mean 17979 57 4.11 1.14 0.92 5.20 1.34 0.95 0.85 0.42
37 2015 hi 19827 67 4.66 1.70 0.87 6.23 2.21 0.89 0.79 0.22
38 2015 lo 19827 67 4.03 1.71 0.82 5.23 2.02 0.79 0.73 0.46
39 2015 mean 19827 67 3.98 1.16 0.91 5.35 1.38 0.92 0.81 0.47
40 2016 hi 22557 68 4.81 1.69 0.88 6.09 2.14 0.88 0.86 0.19
41 2016 lo 22557 68 4.25 1.92 0.80 5.25 2.11 0.84 0.77 0.57
42 2016 mean 22557 68 4.16 1.30 0.90 5.31 1.41 0.93 0.87 0.35
43 2017 hi 22953 75 4.49 1.74 0.85 5.95 2.09 0.86 0.80 0.23
44 2017 lo 22953 75 4.58 2.00 0.81 5.62 2.38 0.83 0.81 0.52
45 2017 mean 22953 75 4.16 1.39 0.89 5.42 1.55 0.90 0.86 0.35
46 2018 hi 22203 80 4.25 1.75 0.83 5.60 1.97 0.86 0.82 0.32
47 2018 lo 22203 80 4.08 1.93 0.78 5.15 2.13 0.85 0.76 0.59
48 2018 mean 22203 80 3.82 1.40 0.87 5.08 1.42 0.90 0.86 0.48
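To make the spatial weighting concrete, here is a rough sketch of one way to compute such an RMSE. It is not the code behind rmse.s. I assume that a "sixteenth of a lon–lat cell" is one block of a 4 × 4 subdivision of a 1° × 1° cell, and that the weight of each day-and-block combination is split evenly among its observations.

```r
spatial.rmse = function(error, date, lon, lat)
   {# Label each observation with its day and quarter-degree block.
    block = paste(date, floor(lon * 4), floor(lat * 4))
    # Each (day, block) group gets total weight 1, shared equally
    # by the observations in the group.
    w = 1 / ave(error, block, FUN = length)
    sqrt(sum(w * error^2) / sum(w))}

# Two stations share a block and a third sits alone elsewhere.
err = c(2, 2, 0)
spatial.rmse(err,
    date = c("2018-01-01", "2018-01-01", "2018-01-01"),
    lon = c(-99.10, -99.20, -98.20),
    lat = c(19.40, 19.45, 19.00))
sqrt(mean(err^2))  # Unweighted RMSE, for comparison
```

In the example, the two clustered stations together count as much as the isolated one, so dense clusters of stations don't dominate the statistic.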

Here's a plot of the above SDs and RMSEs. Each year gets a line going from the SD (top) to the RMSE (bottom).

ggplot(transform(sr$overall,
        dv = factor(dv, levels = c("hi", "mean", "lo")))) +
    geom_linerange(aes(
        sprintf("%02d", year - 2000),
        ymin = rmse, ymax = sd)) +
    facet_grid(dv ~ ., labeller = label_both) +
    no.gridlines() +
    xlab("Year") +
    scale_y_continuous(expand = expand_scale(), name = "SD and RMSE") +
    coord_cartesian(ylim = c(0, 7))
rmse-by-year.png

RMSE by whether the satellite temperature was imputed (imp.d for day, imp.n for night):

as.data.frame(rd(d = 2, sr$by.imp))
  year dv imp.d imp.n N stn sd rmse sd - rmse
1 2003 hi FALSE FALSE 4298 24 3.32 1.00 2.32
2 2003 hi FALSE TRUE 1253 24 2.84 1.00 1.84
3 2003 hi TRUE FALSE 780 25 3.07 1.10 1.96
4 2003 hi TRUE TRUE 1152 24 3.05 1.12 1.93
5 2003 lo FALSE FALSE 4298 24 3.76 1.86 1.90
6 2003 lo FALSE TRUE 1253 24 3.18 1.29 1.89
7 2003 lo TRUE FALSE 780 25 3.61 1.56 2.05
8 2003 lo TRUE TRUE 1152 24 2.72 1.23 1.49
9 2003 mean FALSE FALSE 4298 24 3.22 0.87 2.35
10 2003 mean FALSE TRUE 1253 24 2.62 0.68 1.94
11 2003 mean TRUE FALSE 780 25 2.91 0.87 2.04
12 2003 mean TRUE TRUE 1152 24 2.39 0.72 1.67
13 2004 hi FALSE FALSE 2408 26 2.71 1.09 1.62
14 2004 hi FALSE TRUE 1482 26 2.52 1.11 1.42
15 2004 hi TRUE FALSE 1717 26 2.71 1.07 1.63
16 2004 hi TRUE TRUE 2294 26 3.23 1.11 2.12
17 2004 lo FALSE FALSE 2408 26 3.63 1.69 1.94
18 2004 lo FALSE TRUE 1482 26 3.10 1.53 1.57
19 2004 lo TRUE FALSE 1717 26 3.01 1.72 1.30
20 2004 lo TRUE TRUE 2294 26 2.66 1.30 1.37
21 2004 mean FALSE FALSE 2408 26 2.79 0.89 1.90
22 2004 mean FALSE TRUE 1482 26 2.42 0.86 1.56
23 2004 mean TRUE FALSE 1717 26 2.22 0.91 1.31
24 2004 mean TRUE TRUE 2294 26 2.48 0.77 1.71
25 2005 hi FALSE FALSE 5397 30 3.36 1.32 2.04
26 2005 hi FALSE TRUE 1104 28 3.09 1.36 1.73
27 2005 hi TRUE FALSE 1093 28 3.08 1.28 1.80
28 2005 hi TRUE TRUE 1345 29 2.89 1.28 1.62
29 2005 lo FALSE FALSE 5397 30 3.56 1.69 1.87
30 2005 lo FALSE TRUE 1104 28 2.84 1.42 1.42
31 2005 lo TRUE FALSE 1093 28 2.74 1.46 1.28
32 2005 lo TRUE TRUE 1345 29 2.32 1.18 1.15
33 2005 mean FALSE FALSE 5397 30 3.18 0.96 2.22
34 2005 mean FALSE TRUE 1104 28 2.70 0.84 1.86
35 2005 mean TRUE FALSE 1093 28 2.35 0.82 1.53
36 2005 mean TRUE TRUE 1345 29 2.13 0.81 1.31
37 2006 hi FALSE FALSE 4855 29 2.95 1.33 1.62
38 2006 hi FALSE TRUE 1269 28 2.76 1.33 1.43
39 2006 hi TRUE FALSE 753 29 3.45 1.55 1.90
40 2006 hi TRUE TRUE 1426 29 3.25 1.41 1.84
41 2006 lo FALSE FALSE 4855 29 3.64 1.65 1.99
42 2006 lo FALSE TRUE 1269 28 2.51 1.40 1.11
43 2006 lo TRUE FALSE 753 29 3.01 1.58 1.43
44 2006 lo TRUE TRUE 1426 29 2.39 1.20 1.20
45 2006 mean FALSE FALSE 4855 29 2.91 0.98 1.93
46 2006 mean FALSE TRUE 1269 28 2.27 0.95 1.32
47 2006 mean TRUE FALSE 753 29 2.51 1.05 1.46
48 2006 mean TRUE TRUE 1426 29 2.42 0.91 1.51
49 2007 hi FALSE FALSE 4582 31 2.68 1.20 1.48
50 2007 hi FALSE TRUE 1014 31 2.78 1.20 1.58
51 2007 hi TRUE FALSE 747 31 3.45 1.44 2.01
52 2007 hi TRUE TRUE 1140 31 3.54 1.41 2.14
53 2007 lo FALSE FALSE 4582 31 3.18 1.65 1.53
54 2007 lo FALSE TRUE 1014 31 2.72 1.23 1.49
55 2007 lo TRUE FALSE 747 31 2.77 1.47 1.30
56 2007 lo TRUE TRUE 1140 31 2.60 1.28 1.32
57 2007 mean FALSE FALSE 4582 31 2.57 0.98 1.59
58 2007 mean FALSE TRUE 1014 31 2.45 0.87 1.58
59 2007 mean TRUE FALSE 747 31 2.54 0.98 1.57
60 2007 mean TRUE TRUE 1140 31 2.61 0.90 1.70
61 2008 hi FALSE FALSE 5729 31 2.83 1.16 1.66
62 2008 hi FALSE TRUE 1220 31 2.82 1.34 1.49
63 2008 hi TRUE FALSE 928 31 3.47 1.44 2.03
64 2008 hi TRUE TRUE 1313 31 3.15 1.45 1.69
65 2008 lo FALSE FALSE 5729 31 3.52 1.48 2.04
66 2008 lo FALSE TRUE 1220 31 2.99 1.27 1.72
67 2008 lo TRUE FALSE 928 31 3.15 1.41 1.74
68 2008 lo TRUE TRUE 1313 31 2.40 1.05 1.35
69 2008 mean FALSE FALSE 5729 31 2.91 0.89 2.02
70 2008 mean FALSE TRUE 1220 31 2.59 0.95 1.64
71 2008 mean TRUE FALSE 928 31 2.69 0.99 1.70
72 2008 mean TRUE TRUE 1313 31 2.27 0.88 1.38
73 2009 hi FALSE FALSE 6342 37 2.99 1.27 1.73
74 2009 hi FALSE TRUE 1535 37 2.73 1.37 1.36
75 2009 hi TRUE FALSE 931 37 3.47 1.47 2.01
76 2009 hi TRUE TRUE 1290 37 3.62 1.44 2.18
77 2009 lo FALSE FALSE 6342 37 3.46 1.53 1.93
78 2009 lo FALSE TRUE 1535 37 2.86 1.33 1.53
79 2009 lo TRUE FALSE 931 37 3.01 1.49 1.52
80 2009 lo TRUE TRUE 1290 37 2.33 1.07 1.26
81 2009 mean FALSE FALSE 6342 37 2.90 0.98 1.92
82 2009 mean FALSE TRUE 1535 37 2.47 0.97 1.50
83 2009 mean TRUE FALSE 931 37 2.79 0.99 1.79
84 2009 mean TRUE TRUE 1290 37 2.50 0.98 1.53
85 2010 hi FALSE FALSE 7734 48 5.09 1.33 3.77
86 2010 hi FALSE TRUE 1708 48 4.28 1.42 2.86
87 2010 hi TRUE FALSE 1459 47 5.60 1.60 4.00
88 2010 hi TRUE TRUE 2658 47 5.64 1.60 4.04
89 2010 lo FALSE FALSE 7734 48 4.48 1.71 2.77
90 2010 lo FALSE TRUE 1708 48 3.45 1.48 1.96
91 2010 lo TRUE FALSE 1459 47 4.21 1.67 2.54
92 2010 lo TRUE TRUE 2658 47 4.13 1.42 2.71
93 2010 mean FALSE FALSE 7734 48 4.51 1.07 3.44
94 2010 mean FALSE TRUE 1708 48 3.67 1.10 2.57
95 2010 mean TRUE FALSE 1459 47 4.62 1.23 3.39
96 2010 mean TRUE TRUE 2658 47 4.56 1.15 3.41
97 2011 hi FALSE FALSE 8982 44 4.89 1.45 3.44
98 2011 hi FALSE TRUE 1842 44 4.54 1.61 2.93
99 2011 hi TRUE FALSE 979 44 6.20 1.76 4.44
100 2011 hi TRUE TRUE 1910 44 5.10 1.89 3.21
101 2011 lo FALSE FALSE 8982 44 4.33 1.70 2.63
102 2011 lo FALSE TRUE 1842 44 3.70 1.47 2.23
103 2011 lo TRUE FALSE 979 44 4.57 1.74 2.83
104 2011 lo TRUE TRUE 1910 44 3.64 1.41 2.24
105 2011 mean FALSE FALSE 8982 44 4.37 1.10 3.27
106 2011 mean FALSE TRUE 1842 44 3.91 1.18 2.73
107 2011 mean TRUE FALSE 979 44 5.10 1.36 3.73
108 2011 mean TRUE TRUE 1910 44 3.96 1.24 2.72
109 2012 hi FALSE FALSE 8078 50 4.42 1.37 3.05
110 2012 hi FALSE TRUE 2303 50 4.13 1.45 2.68
111 2012 hi TRUE FALSE 1878 50 5.34 1.56 3.79
112 2012 hi TRUE TRUE 2598 50 5.02 1.60 3.42
113 2012 lo FALSE FALSE 8078 50 3.95 1.83 2.12
114 2012 lo FALSE TRUE 2303 50 3.65 1.43 2.22
115 2012 lo TRUE FALSE 1878 50 4.04 1.70 2.34
116 2012 lo TRUE TRUE 2598 50 3.69 1.35 2.34
117 2012 mean FALSE FALSE 8078 50 3.97 1.11 2.85
118 2012 mean FALSE TRUE 2303 50 3.72 1.01 2.71
119 2012 mean TRUE FALSE 1878 50 4.37 1.19 3.18
120 2012 mean TRUE TRUE 2598 50 4.05 1.10 2.96
121 2013 hi FALSE FALSE 8589 55 4.49 1.59 2.90
122 2013 hi FALSE TRUE 2750 55 4.36 1.69 2.67
123 2013 hi TRUE FALSE 1819 55 5.57 1.87 3.70
124 2013 hi TRUE TRUE 3158 55 5.19 1.81 3.37
125 2013 lo FALSE FALSE 8589 55 4.23 2.01 2.22
126 2013 lo FALSE TRUE 2750 55 4.03 1.58 2.45
127 2013 lo TRUE FALSE 1819 55 4.43 1.84 2.59
128 2013 lo TRUE TRUE 3158 55 3.89 1.35 2.54
129 2013 mean FALSE FALSE 8589 55 4.13 1.21 2.93
130 2013 mean FALSE TRUE 2750 55 3.92 1.12 2.80
131 2013 mean TRUE FALSE 1819 55 4.78 1.31 3.47
132 2013 mean TRUE TRUE 3158 55 4.16 1.15 3.02
133 2014 hi FALSE FALSE 9791 57 4.54 1.67 2.87
134 2014 hi FALSE TRUE 2524 56 3.98 1.76 2.22
135 2014 hi TRUE FALSE 2241 56 4.96 1.86 3.10
136 2014 hi TRUE TRUE 3423 57 4.86 1.81 3.05
137 2014 lo FALSE FALSE 9791 57 4.32 1.83 2.50
138 2014 lo FALSE TRUE 2524 56 3.63 1.42 2.21
139 2014 lo TRUE FALSE 2241 56 4.27 1.71 2.57
140 2014 lo TRUE TRUE 3423 57 3.78 1.23 2.55
141 2014 mean FALSE FALSE 9791 57 4.17 1.15 3.02
142 2014 mean FALSE TRUE 2524 56 3.52 1.14 2.38
143 2014 mean TRUE FALSE 2241 56 4.28 1.24 3.04
144 2014 mean TRUE TRUE 3423 57 3.96 1.06 2.90
145 2015 hi FALSE FALSE 9818 67 4.11 1.65 2.47
146 2015 hi FALSE TRUE 3439 64 3.86 1.68 2.19
147 2015 hi TRUE FALSE 2955 65 5.01 1.75 3.26
148 2015 hi TRUE TRUE 3615 64 4.99 1.80 3.19
149 2015 lo FALSE FALSE 9818 67 4.02 1.87 2.15
150 2015 lo FALSE TRUE 3439 64 3.53 1.48 2.05
151 2015 lo TRUE FALSE 2955 65 4.21 1.76 2.45
152 2015 lo TRUE TRUE 3615 64 4.01 1.35 2.65
153 2015 mean FALSE FALSE 9818 67 3.80 1.20 2.60
154 2015 mean FALSE TRUE 3439 64 3.43 1.08 2.35
155 2015 mean TRUE FALSE 2955 65 4.36 1.17 3.19
156 2015 mean TRUE TRUE 3615 64 4.18 1.14 3.04
157 2016 hi FALSE FALSE 11743 68 4.47 1.64 2.83
158 2016 hi FALSE TRUE 3506 68 3.81 1.67 2.14
159 2016 hi TRUE FALSE 3123 68 5.17 1.80 3.38
160 2016 hi TRUE TRUE 4185 68 4.98 1.77 3.21
161 2016 lo FALSE FALSE 11743 68 4.32 2.03 2.29
162 2016 lo FALSE TRUE 3506 68 3.64 1.72 1.92
163 2016 lo TRUE FALSE 3123 68 4.34 2.06 2.29
164 2016 lo TRUE TRUE 4185 68 4.10 1.64 2.46
165 2016 mean FALSE FALSE 11743 68 4.10 1.29 2.81
166 2016 mean FALSE TRUE 3506 68 3.46 1.25 2.21
167 2016 mean TRUE FALSE 3123 68 4.45 1.41 3.03
168 2016 mean TRUE TRUE 4185 68 4.25 1.28 2.98
169 2017 hi FALSE FALSE 14421 75 4.25 1.67 2.58
170 2017 hi FALSE TRUE 2770 74 4.14 1.88 2.26
171 2017 hi TRUE FALSE 2505 74 5.34 1.92 3.42
172 2017 hi TRUE TRUE 3257 73 4.36 1.77 2.60
173 2017 lo FALSE FALSE 14421 75 4.55 2.15 2.40
174 2017 lo FALSE TRUE 2770 74 3.91 1.77 2.14
175 2017 lo TRUE FALSE 2505 74 4.62 1.95 2.66
176 2017 lo TRUE TRUE 3257 73 3.48 1.48 2.01
177 2017 mean FALSE FALSE 14421 75 4.13 1.39 2.74
178 2017 mean FALSE TRUE 2770 74 3.81 1.44 2.38
179 2017 mean TRUE FALSE 2505 74 4.77 1.45 3.32
180 2017 mean TRUE TRUE 3257 73 3.73 1.31 2.42
181 2018 hi FALSE FALSE 12357 80 3.96 1.77 2.18
182 2018 hi FALSE TRUE 4179 80 3.56 1.70 1.86
183 2018 hi TRUE FALSE 2369 79 4.39 1.72 2.67
184 2018 hi TRUE TRUE 3298 79 4.58 1.71 2.87
185 2018 lo FALSE FALSE 12357 80 4.15 2.11 2.04
186 2018 lo FALSE TRUE 4179 80 3.29 1.64 1.65
187 2018 lo TRUE FALSE 2369 79 4.29 1.90 2.39
188 2018 lo TRUE TRUE 3298 79 3.93 1.55 2.38
189 2018 mean FALSE FALSE 12357 80 3.77 1.43 2.34
190 2018 mean FALSE TRUE 4179 80 3.22 1.35 1.88
191 2018 mean TRUE FALSE 2369 79 4.15 1.40 2.75
192 2018 mean TRUE TRUE 3298 79 4.01 1.32 2.69

It's inconsistent whether the improvement of the RMSE over the SD (sd - rmse) is greater when both satellite temperatures are imputed than when both are present:

m = merge(
    sr$by.imp[!imp.d & !imp.n],
    sr$by.imp[imp.d & imp.n],
    by = c("year", "dv"))
m[, table(get("sd - rmse.y") > get("sd - rmse.x"))]
  count
FALSE 23
TRUE 25

Here are the RMSEs (and Moran's I p-values for the per-station mean signed error) by meteorological season:

as.data.frame(rd(d = 2, sr$by.season))
  year dv season N stn sd rmse sd - rmse Moran p
1 2003 hi ColdDry 2507 25 3.08 1.02 2.05 0.68
2 2003 hi Rainy 3700 24 3.03 0.97 2.06 0.76
3 2003 hi WarmDry 1276 24 3.17 1.19 1.98 0.45
4 2003 lo ColdDry 2507 25 3.07 1.94 1.13 0.93
5 2003 lo Rainy 3700 24 1.94 1.30 0.65 0.39
6 2003 lo WarmDry 1276 24 3.07 1.98 1.09 0.90
7 2003 mean ColdDry 2507 25 2.54 0.96 1.58 0.88
8 2003 mean Rainy 3700 24 2.19 0.69 1.51 0.39
9 2003 mean WarmDry 1276 24 2.84 0.87 1.97 0.67
10 2004 hi ColdDry 2647 26 3.11 1.13 1.98 0.56
11 2004 hi Rainy 4060 25 2.51 1.10 1.40 0.33
12 2004 hi WarmDry 1194 21 2.82 1.00 1.82 0.28
13 2004 lo ColdDry 2647 26 3.00 1.88 1.13 0.91
14 2004 lo Rainy 4060 25 1.69 1.25 0.45 0.78
15 2004 lo WarmDry 1194 21 2.50 1.74 0.77 0.87
16 2004 mean ColdDry 2647 26 2.40 0.96 1.44 0.90
17 2004 mean Rainy 4060 25 1.74 0.80 0.95 0.48
18 2004 mean WarmDry 1194 21 2.34 0.80 1.54 0.56
19 2005 hi ColdDry 2939 29 2.74 1.25 1.48 0.94
20 2005 hi Rainy 4416 29 3.21 1.34 1.87 0.53
21 2005 hi WarmDry 1584 28 2.74 1.34 1.40 0.75
22 2005 lo ColdDry 2939 29 2.66 1.68 0.99 0.97
23 2005 lo Rainy 4416 29 2.25 1.37 0.88 0.44
24 2005 lo WarmDry 1584 28 2.93 1.81 1.12 0.43
25 2005 mean ColdDry 2939 29 2.18 0.91 1.27 0.70
26 2005 mean Rainy 4416 29 2.36 0.90 1.46 0.24
27 2005 mean WarmDry 1584 28 2.34 0.92 1.42 0.20
28 2006 hi ColdDry 2814 29 3.14 1.35 1.79 1.00
29 2006 hi Rainy 3877 28 2.67 1.38 1.29 0.60
30 2006 hi WarmDry 1612 27 2.69 1.35 1.34 0.58
31 2006 lo ColdDry 2814 29 3.20 1.62 1.58 0.90
32 2006 lo Rainy 3877 28 2.03 1.38 0.65 0.91
33 2006 lo WarmDry 1612 27 2.81 1.73 1.08 0.88
34 2006 mean ColdDry 2814 29 2.58 0.98 1.60 0.05
35 2006 mean Rainy 3877 28 1.83 0.96 0.88 0.68
36 2006 mean WarmDry 1612 27 2.26 0.99 1.28 0.90
37 2007 hi ColdDry 2781 31 2.87 1.35 1.52 0.49
38 2007 hi Rainy 3569 29 3.17 1.20 1.97 0.03
39 2007 hi WarmDry 1133 23 2.70 1.21 1.49 0.34
40 2007 lo ColdDry 2781 31 2.65 1.66 0.99 0.77
41 2007 lo Rainy 3569 29 2.50 1.37 1.13 0.00
42 2007 lo WarmDry 1133 23 2.88 1.65 1.23 0.09
43 2007 mean ColdDry 2781 31 2.23 1.04 1.19 0.05
44 2007 mean Rainy 3569 29 2.32 0.87 1.45 0.00
45 2007 mean WarmDry 1133 23 2.43 0.97 1.45 0.00
46 2008 hi ColdDry 3017 31 2.86 1.13 1.73 0.50
47 2008 hi Rainy 4523 30 3.04 1.32 1.72 0.05
48 2008 hi WarmDry 1650 30 3.16 1.31 1.85 0.78
49 2008 lo ColdDry 3017 31 2.61 1.47 1.14 0.74
50 2008 lo Rainy 4523 30 2.20 1.23 0.97 0.00
51 2008 lo WarmDry 1650 30 2.97 1.63 1.35 0.38
52 2008 mean ColdDry 3017 31 2.42 0.86 1.56 0.00
53 2008 mean Rainy 4523 30 2.10 0.88 1.22 0.00
54 2008 mean WarmDry 1650 30 2.80 1.03 1.76 0.04
55 2009 hi ColdDry 3145 37 3.44 1.36 2.08 0.41
56 2009 hi Rainy 5080 36 2.84 1.30 1.54 0.40
57 2009 hi WarmDry 1873 35 2.64 1.32 1.32 0.23
58 2009 lo ColdDry 3145 37 2.83 1.50 1.33 0.02
59 2009 lo Rainy 5080 36 1.77 1.37 0.40 0.00
60 2009 lo WarmDry 1873 35 2.98 1.55 1.44 0.57
61 2009 mean ColdDry 3145 37 2.43 0.97 1.47 0.06
62 2009 mean Rainy 5080 36 2.01 0.98 1.03 0.02
63 2009 mean WarmDry 1873 35 2.50 1.00 1.51 0.06
64 2010 hi ColdDry 4325 48 5.18 1.47 3.71 0.82
65 2010 hi Rainy 6988 45 5.03 1.38 3.66 0.96
66 2010 hi WarmDry 2246 43 4.95 1.50 3.45 0.84
67 2010 lo ColdDry 4325 48 3.77 1.83 1.94 0.32
68 2010 lo Rainy 6988 45 3.60 1.47 2.12 0.00
69 2010 lo WarmDry 2246 43 3.69 1.64 2.04 0.00
70 2010 mean ColdDry 4325 48 3.96 1.20 2.76 0.07
71 2010 mean Rainy 6988 45 3.94 1.05 2.89 0.41
72 2010 mean WarmDry 2246 43 4.15 1.11 3.05 0.12
73 2011 hi ColdDry 4486 44 4.73 1.39 3.34 0.72
74 2011 hi Rainy 6989 43 5.09 1.67 3.42 0.82
75 2011 hi WarmDry 2238 39 5.16 1.54 3.62 0.71
76 2011 lo ColdDry 4486 44 3.73 1.72 2.01 0.94
77 2011 lo Rainy 6989 43 3.87 1.56 2.31 0.18
78 2011 lo WarmDry 2238 39 4.21 1.66 2.54 0.97
79 2011 mean ColdDry 4486 44 3.92 1.09 2.83 0.77
80 2011 mean Rainy 6989 43 4.11 1.21 2.89 0.26
81 2011 mean WarmDry 2238 39 4.52 1.08 3.43 0.70
82 2012 hi ColdDry 4733 49 4.64 1.48 3.16 0.32
83 2012 hi Rainy 7416 47 4.61 1.45 3.16 0.54
84 2012 hi WarmDry 2708 47 4.57 1.37 3.19 0.75
85 2012 lo ColdDry 4733 49 3.59 1.81 1.78 0.12
86 2012 lo Rainy 7416 47 3.34 1.54 1.79 0.00
87 2012 lo WarmDry 2708 47 3.75 1.80 1.95 0.65
88 2012 mean ColdDry 4733 49 3.79 1.21 2.59 0.02
89 2012 mean Rainy 7416 47 3.62 1.05 2.57 0.04
90 2012 mean WarmDry 2708 47 3.90 1.06 2.84 0.25
91 2013 hi ColdDry 5539 54 4.71 1.66 3.05 0.33
92 2013 hi Rainy 8159 55 4.88 1.68 3.19 0.64
93 2013 hi WarmDry 2618 49 5.67 1.74 3.93 0.96
94 2013 lo ColdDry 5539 54 3.77 1.99 1.78 0.34
95 2013 lo Rainy 8159 55 3.61 1.51 2.10 0.00
96 2013 lo WarmDry 2618 49 4.84 2.22 2.62 0.79
97 2013 mean ColdDry 5539 54 3.89 1.28 2.61 0.02
98 2013 mean Rainy 8159 55 3.93 1.09 2.85 0.08
99 2013 mean WarmDry 2618 49 5.04 1.31 3.73 0.92
100 2014 hi ColdDry 5903 57 4.64 1.65 2.99 0.30
101 2014 hi Rainy 9004 55 4.44 1.77 2.66 1.00
102 2014 hi WarmDry 3072 55 4.86 1.77 3.09 0.37
103 2014 lo ColdDry 5903 57 4.02 1.84 2.17 0.35
104 2014 lo Rainy 9004 55 3.56 1.44 2.12 0.01
105 2014 lo WarmDry 3072 55 4.13 1.87 2.26 0.26
106 2014 mean ColdDry 5903 57 3.99 1.20 2.78 0.02
107 2014 mean Rainy 9004 55 3.63 1.09 2.53 0.40
108 2014 mean WarmDry 3072 55 4.28 1.16 3.13 0.06
109 2015 hi ColdDry 6265 66 4.49 1.71 2.78 0.81
110 2015 hi Rainy 10430 61 4.43 1.65 2.78 0.25
111 2015 hi WarmDry 3132 60 5.28 1.81 3.47 0.71
112 2015 lo ColdDry 6265 66 3.94 1.92 2.02 0.00
113 2015 lo Rainy 10430 61 3.37 1.49 1.89 0.07
114 2015 lo WarmDry 3132 60 4.18 1.91 2.28 0.13
115 2015 mean ColdDry 6265 66 3.88 1.24 2.64 0.00
116 2015 mean Rainy 10430 61 3.57 1.08 2.49 0.09
117 2015 mean WarmDry 3132 60 4.55 1.28 3.28 0.07
118 2016 hi ColdDry 7418 68 4.51 1.61 2.90 0.70
119 2016 hi Rainy 11303 68 4.37 1.73 2.64 0.90
120 2016 hi WarmDry 3836 65 5.56 1.72 3.84 0.71
121 2016 lo ColdDry 7418 68 3.95 2.13 1.82 0.04
122 2016 lo Rainy 11303 68 3.50 1.63 1.87 0.00
123 2016 lo WarmDry 3836 65 4.29 2.25 2.04 0.04
124 2016 mean ColdDry 7418 68 3.75 1.32 2.43 0.15
125 2016 mean Rainy 11303 68 3.69 1.25 2.44 0.48
126 2016 mean WarmDry 3836 65 4.62 1.41 3.21 0.23
127 2017 hi ColdDry 7637 74 4.04 1.58 2.46 0.55
128 2017 hi Rainy 11326 73 4.52 1.80 2.72 0.81
129 2017 hi WarmDry 3990 68 4.72 1.84 2.88 0.84
130 2017 lo ColdDry 7637 74 4.07 2.30 1.77 0.64
131 2017 lo Rainy 11326 73 3.54 1.69 1.85 0.00
132 2017 lo WarmDry 3990 68 4.19 2.20 1.99 0.38
133 2017 mean ColdDry 7637 74 3.76 1.42 2.34 0.23
134 2017 mean Rainy 11326 73 3.78 1.35 2.43 0.24
135 2017 mean WarmDry 3990 68 4.31 1.46 2.85 0.81
136 2018 hi ColdDry 6666 77 3.90 1.72 2.18 0.25
137 2018 hi Rainy 11659 79 3.89 1.69 2.21 0.11
138 2018 hi WarmDry 3878 71 4.21 1.95 2.27 0.81
139 2018 lo ColdDry 6666 77 3.97 2.17 1.79 0.40
140 2018 lo Rainy 11659 79 3.00 1.70 1.30 0.00
141 2018 lo WarmDry 3878 71 3.94 2.11 1.83 0.58
142 2018 mean ColdDry 6666 77 3.55 1.41 2.14 0.18
143 2018 mean Rainy 11659 79 3.22 1.37 1.86 0.15
144 2018 mean WarmDry 3878 71 3.81 1.47 2.34 0.18
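For reference, here is a bare-bones sketch of Moran's I with a permutation p-value. The weighting scheme and the test are my assumptions for illustration (inverse-distance weights and a one-sided permutation test), not necessarily what produced the p-values above, and the function names are made up.

```r
morans.i = function(x, w)
   {z = x - mean(x)
    (length(x) / sum(w)) * sum(w * outer(z, z)) / sum(z^2)}

moran.perm.p = function(x, lon, lat, n.perm = 999)
   {# Inverse-distance weights, with no self-weight.
    # (Assumes no two stations share a location.)
    w = 1 / as.matrix(dist(cbind(lon, lat)))
    diag(w) = 0
    obs = morans.i(x, w)
    # Shuffle the values among the locations to get a null distribution.
    perm = replicate(n.perm, morans.i(sample(x), w))
    # One-sided p-value for positive spatial autocorrelation.
    (1 + sum(perm >= obs)) / (1 + n.perm)}

# Errors that follow a smooth spatial gradient are strongly
# autocorrelated, so the p-value should be small.
set.seed(1)
g = expand.grid(lon = 1:5, lat = 1:5)
moran.perm.p(g$lon + g$lat, g$lon, g$lat)
```

A small p-value means that nearby stations tend to have similar signed errors, which suggests spatial structure that the model has missed.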

And a plot like the previous plot, but broken down by season:

ggplot(transform(sr$by.season,
        dv = factor(dv, levels = c("hi", "mean", "lo")))) +
    geom_linerange(aes(season, ymin = rmse, ymax = sd, color = season)) +
    facet_grid(dv ~ sprintf("%02d", year - 2000)) +
    no.gridlines() +
    scale_y_continuous(expand = expand_scale(), name = "SD and RMSE") +
    coord_cartesian(ylim = c(0, 7)) +
    theme(axis.text.x = element_text(angle = -90))
rmse-by-season.png

RMSE by region:

as.data.frame(rd(d = 2, sr$by.region))
  dv region N stn sd rmse sd - rmse
1 hi Cuautla 200 1 2.66 1.64 1.02
2 hi Cuernavaca 1460 6 4.72 1.94 2.78
3 hi Pachuca 132 1 4.29 3.42 0.87
4 hi Tlaxcala-Apizaco 243 1 2.75 1.67 1.08
5 hi Toluca 1097 5 6.49 2.73 3.76
6 hi Valle de México 16319 56 3.68 1.61 2.07
7 hi Puebla-Tlaxcala 2752 10 3.11 1.82 1.29
8 lo Cuautla 200 1 1.53 1.77 -0.24
9 lo Cuernavaca 1460 6 5.25 2.18 3.07
10 lo Pachuca 132 1 3.48 2.15 1.33
11 lo Tlaxcala-Apizaco 243 1 3.10 2.63 0.47
12 lo Toluca 1097 5 3.56 2.47 1.09
13 lo Valle de México 16319 56 3.61 1.89 1.71
14 lo Puebla-Tlaxcala 2752 10 3.35 1.66 1.68
15 mean Cuautla 200 1 1.65 1.11 0.54
16 mean Cuernavaca 1460 6 5.02 1.44 3.58
17 mean Pachuca 132 1 3.28 1.10 2.18
18 mean Tlaxcala-Apizaco 243 1 2.26 1.13 1.13
19 mean Toluca 1097 5 4.53 1.73 2.79
20 mean Valle de México 16319 56 3.23 1.39 1.84
21 mean Puebla-Tlaxcala 2752 10 2.95 1.32 1.63

These by-region results are only for 2018.

RMSE by station network:

as.data.frame(rd(d = 2, sr$by.network))
  dv network N stn sd rmse sd - rmse
1 hi emas 2631 13 7.59 2.04 5.55
2 hi esimes 616 3 3.46 2.27 1.19
3 hi simat 7714 24 3.13 1.26 1.87
4 hi unam 3189 12 2.82 0.98 1.84
5 hi wunderground 8053 28 3.72 2.18 1.54
6 lo emas 2631 13 6.02 2.27 3.76
7 lo esimes 616 3 3.64 2.48 1.16
8 lo simat 7714 24 3.31 1.76 1.55
9 lo unam 3189 12 2.76 1.04 1.72
10 lo wunderground 8053 28 3.76 2.18 1.59
11 mean emas 2631 13 6.70 1.61 5.10
12 mean esimes 616 3 2.86 1.11 1.74
13 mean simat 7714 24 2.69 1.05 1.64
14 mean unam 3189 12 2.47 0.76 1.71
15 mean wunderground 8053 28 3.26 1.77 1.49

These by-network results are only for 2018.

time.series.plot()
cv-time-series.png
pred.error.plot()
error-density.png

With vs. without training on Wunderground

x = lapply(c(F, T), function(train.wunder)
    summarize.cv.results(multi.run.cv(2003 : 2018,
        train.wunder = train.wunder), test.wunder = F)$overall)
x = cbind(x[[1]][, .(year, dv, N, rmseF = rmse.s)],
    rmseT = x[[2]]$rmse.s)
x[, diff := rmseF - rmseT]
rd(as.data.frame(x))
  year dv N rmseF rmseT diff
1 2003 hi 7483 1.413 1.413 0.000
2 2003 lo 7483 2.654 2.654 0.000
3 2003 mean 7483 1.213 1.213 0.000
4 2004 hi 7901 1.388 1.388 0.000
5 2004 lo 7901 2.505 2.505 0.000
6 2004 mean 7901 1.183 1.183 0.000
7 2005 hi 8939 1.570 1.570 0.000
8 2005 lo 8939 2.444 2.444 0.000
9 2005 mean 8939 1.149 1.149 0.000
10 2006 hi 8303 1.668 1.692 -0.024
11 2006 lo 8303 2.298 2.259 0.040
12 2006 mean 8303 1.219 1.222 -0.003
13 2007 hi 7483 1.614 1.596 0.018
14 2007 lo 7483 2.152 2.140 0.012
15 2007 mean 7483 1.228 1.232 -0.004
16 2008 hi 9190 1.496 1.504 -0.008
17 2008 lo 9190 1.905 1.947 -0.042
18 2008 mean 9190 1.119 1.123 -0.004
19 2009 hi 10098 1.706 1.702 0.003
20 2009 lo 10098 2.095 2.134 -0.039
21 2009 mean 10098 1.185 1.212 -0.027
22 2010 hi 13250 1.782 1.847 -0.064
23 2010 lo 13250 2.147 2.140 0.006
24 2010 mean 13250 1.425 1.452 -0.027
25 2011 hi 13355 1.978 1.990 -0.012
26 2011 lo 13355 2.052 2.018 0.034
27 2011 mean 13355 1.487 1.442 0.045
28 2012 hi 14194 1.797 1.812 -0.016
29 2012 lo 14194 2.062 2.038 0.024
30 2012 mean 14194 1.428 1.421 0.007
31 2013 hi 15263 2.087 2.124 -0.037
32 2013 lo 15263 2.209 2.171 0.038
33 2013 mean 15263 1.439 1.435 0.004
34 2014 hi 16496 2.066 2.091 -0.025
35 2014 lo 16496 2.027 1.990 0.037
36 2014 mean 16496 1.332 1.319 0.013
37 2015 hi 17900 2.025 2.036 -0.011
38 2015 lo 17900 1.936 1.921 0.015
39 2015 mean 17900 1.239 1.250 -0.011
40 2016 hi 19324 2.079 2.094 -0.015
41 2016 lo 19324 2.124 2.124 -0.001
42 2016 mean 19324 1.308 1.366 -0.058
43 2017 hi 17986 2.001 2.023 -0.022
44 2017 lo 17986 2.379 2.395 -0.016
45 2017 mean 17986 1.425 1.489 -0.064
46 2018 hi 14150 1.698 1.793 -0.094
47 2018 lo 14150 2.059 2.100 -0.040
48 2018 mean 14150 1.169 1.318 -0.149

rmseF is the spatial RMSE obtained from a CV that both trains and tests on non-Wunderground stations only. rmseT is similar except that Wunderground stations are also allowed in training. diff is rmseF - rmseT, so a positive diff means that including Wunderground in training improved the RMSE.

round(d = 5, mean(x[year > 2005, diff]))
  value
  -0.01327

New predictions

# d = predict.temps("~/Jdrive/PM/Just_Lab/projects/PROGRESS_physical_activity/data/intermediate/allvar_aug8.rds")
area.map()
area-map.png

Above are the study area, the prediction area (divided into metropolitan areas), and the stations.

mexico.context.map()
mexico-context-map.png

Above is a map of Mexico with the study area highlighted.

temp.quantiles.map(2018L)
map-temp-quantiles.png

Above are the .95 quantiles of the lows and highs, respectively, in 2018.

pop.map("POB65_MAS")
map-population.png

Above is the gridded population density in 2010 for the whole area (counting only people ages 65 and up).

pop.map("POB65_MAS", thresholds.tempC = c(5, 30))
map-extreme-person-days.png

Above are the person-days of exposure to extreme lows or highs, respectively, in 2010. The total exposure, summed across all pixels, is:

  kind total person-days
1 ≤ 5 °C 52,153,954
2 ≥ 30 °C 23,698,969
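The arithmetic behind totals like these can be sketched as follows. I assume here that a pixel's exposure is its population times the number of days on which its predicted low was at most 5 °C (or its predicted high was at least 30 °C). The function and the toy numbers are hypothetical.

```r
# pop: population of each pixel.
# lo, hi: pixel-by-day matrices of daily low and high temperatures (°C).
person.days = function(pop, lo, hi, thresholds.tempC = c(5, 30))
    c(cold = sum(pop * rowSums(lo <= thresholds.tempC[1])),
      hot = sum(pop * rowSums(hi >= thresholds.tempC[2])))

# Two pixels observed for three days.
pop = c(100, 50)
lo = rbind(c(4, 6, 5), c(10, 3, 8))
hi = rbind(c(31, 29, 28), c(25, 28, 35))
person.days(pop, lo, hi)
```

In the toy data, the first pixel has two cold days and one hot day, and the second has one of each, giving 100 × 2 + 50 × 1 = 250 cold person-days and 100 × 1 + 50 × 1 = 150 hot person-days.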

References

Hu, L., Brunsell, N. A., Monaghan, A. J., Barlage, M., & Wilhelmi, O. V. (2014). How can we use MODIS land surface temperature to validate long-term urban model simulations? Journal of Geophysical Research, 119(6), 3185–3201. doi:10.1002/2013JD021101

Rosenfeld, A., Dorman, M., Schwartz, J., Novack, V., Just, A. C., & Kloog, I. (2017). Estimating daily minimum, maximum, and mean near surface air temperature using hybrid satellite models across Israel. Environmental Research, 159, 297–312. doi:10.1016/j.envres.2017.08.017

Williamson, S. N., Hik, D. S., Gamon, J. A., Kavanaugh, J. L., & Koh, S. (2013). Evaluating cloud contamination in clear-sky MODIS Terra daytime land surface temperatures using ground-based meteorology station observations. Journal of Climate, 26(5), 1551–1560. doi:10.1175/JCLI-D-12-00250.1