/*******************************************************************************
Author: John Iselin
Date: 	February 4, 2023
File: 	longrun_snap_iselin.do

PhD Student
Department of Economics
University of Maryland, College Park

Project: 	Is the Social Safety Net a Long-Term Investment?
			Large-Scale Evidence from the Food Stamps Program

This do-file produces the following figures: 

Appendix Figure 2: Childhood use of the FSP in the PSID
	(A) Child Participation Rates in Food Stamps, by Age 
	(B) Years of Childhood on Food Stamps, by Age at First Use
	
Data for Appendix Figure 2 comes from the PSID, downloaded via: 
 - https://simba.isr.umich.edu/data/data.aspx
Also uses data published/disclosed at AER for Hoynes, Schanzenbach & Almond, "Long-Run Impacts of Childhood Access to the Safety Net", 2016. 	
For full set of variables, see summarized data below.
Other do-files called within this do-file: J258924.do and J258924_formats.do (processing additional PSID data for panel A), J260694.do and J260694_formats.do (PSID data for panel B)
	
Appendix Figure 7: Five-Year Childhood Migration Rates
	(A) Five-Year Childhood Migration Rates by Age of Child, 1970 Dec. Census
	(B) Five-Year Childhood Migration Rates by Age of Child, 1980 Dec. Census
	
Data for Appendix Figure 7 comes from the US Decennial Census, 1970 and 1980.  
Downloaded via IPUMS:

Sample: 
- 1970 1% Form 2 Metro Sample
- 1980 5% State Sample  
 
Variables: 
- year 
- datanum  	data set number
- serial  	household serial number
- numprec  	number of person records following
- hhwt	 	household weight
- statefip	state (fips code)
- countyfip	county (fips code)
- gq      	group quarters status
- pernum	person number in sample unit
- perwt 	person weight
- relate	relationship to household head [general version]
- related  	relationship to household head [detailed version]
- sex 	    sex
- age       age
- educ 		educational attainment [general version]
- educd 	educational attainment [detailed version]
- migrate5 	migration status, 5 years [general version]
- migrate5d migration status, 5 years [detailed version]

 

** FILE STRUCTURE

All files should be located within the "dir" folder

The "dir" folder should contain three sub-folders
** code	 	--> 	"dir/code/" 
** data 	--> 	"dir/data/"
** results 	--> 	"dir/results/"

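The code folder should contain this do-file, plus a sub-folder called "logs".

The results folder should be empty. The data folder should contain the input 
files read below: "psid/addlpsid/" and "psid/J260694/" (PSID extract do-files), 
"psid/psid_fs6_CLEANED.dta", and "census/" (IPUMS extract and ipums_do.do). 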

Any questions can be directed to John Iselin at jiselin@umd.edu

*******************************************************************************/


*** Set-Up
capture log close
clear matrix
clear all
set more off

** Name of Project
local name 		"longrun_snap_iselin"

** Set Local with Date Run
local date "`: di %tdCY-N-D daily("$S_DATE", "DMY")'"

** Set Filepaths
*global dir 	"/Users/hh2974/Dropbox/Shared-with-others/Bailey-Hoynes-RS-Walker-LongRun/Draft/RESTUD_Revision/Replication/Census-PSID/"	
global dir 		"/Users/johniselin/Desktop/Census-PSID/"
global code 	"${dir}code/"		// CODE FILEPATH
global data 	"${dir}data/"		// DATA FILEPATH
global results 	"${dir}results/"	// RESULTS FILEPATH
global logs 	"${code}logs/"		// LOG FILE SUB-FILEPATH

** Set working directory 
cd "${dir}"
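
** Optional sketch, not part of the original replication steps: create the 
** "logs" and "results" folders described in the header if they are missing, 
** so that the log using and gr export calls below do not fail on a fresh 
** copy of the folder. capture suppresses the error that mkdir raises when 
** a folder already exists. 
capture mkdir "${code}logs"
capture mkdir "${dir}results"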

** Start log file for appendix figure 2
log using "${logs}`name'_app_fig_2_`date'", replace text

** Appendix Figure 2: Childhood use of the FSP in the PSID
** (A) Child Participation Rates in Food Stamps, by Age 


** Load data + Load additional variables and clean things up
*use "${data}addlpsid/addlpsid.dta", clear

** Change working directory 
cd "${data}psid/addlpsid/"

** Run PSID Code 
do "J258924.do"

** dta with formatting
do "J258924_formats.do"

** Change working directory 
cd "${dir}"

** Summarize data
sum 
des

** Drop individual-file variables from the 1982 wave onward 
** (er30373 is the 1982 interview number; 1981 is kept)
lookfor 1981
rename _all, lower

** Rename gender before dropping
tab er32000, mi
tab er32000, mi nolab
rename er32000 gender //time invariant
			
forvalues x = 30373/34511 {
	qui cap drop er`x'
}

** Rename interview ** 
/**		
er30001         int     %9.0g                 1968 INTERVIEW NUMBER
er30020         int     %9.0g                 1969 INTERVIEW NUMBER
er30043         int     %9.0g                 1970 INTERVIEW NUMBER
er30067         int     %9.0g                 1971 INTERVIEW NUMBER
er30091         int     %9.0g                 1972 INTERVIEW NUMBER
er30117         int     %9.0g                 1973 INTERVIEW NUMBER
er30138         int     %9.0g                 1974 INTERVIEW NUMBER
er30160         int     %9.0g                 1975 INTERVIEW NUMBER
er30188         int     %9.0g                 1976 INTERVIEW NUMBER
er30217         int     %9.0g                 1977 INTERVIEW NUMBER
er30246         int     %9.0g                 1978 INTERVIEW NUMBER
er30283         int     %9.0g                 1979 INTERVIEW NUMBER
er30313         int     %9.0g                 1980 INTERVIEW NUMBER
er30343         int     %9.0g                 1981 INTERVIEW NUMBER
**/

rename er30001 inum68
rename er30020 inum69
rename er30043 inum70
rename er30067 inum71
rename er30091 inum72
rename er30117 inum73
rename er30138 inum74
rename er30160 inum75
rename er30188 inum76
rename er30217 inum77
rename er30246 inum78
rename er30283 inum79
rename er30313 inum80
rename er30343 inum81

**  Store all of these in a local to make renaming other variables easier
local er68 30001
local er69 30020 
local er70 30043 
local er71 30067 
local er72 30091 
local er73 30117 
local er74 30138 
local er75 30160 
local er76 30188 
local er77 30217 
local er78 30246 
local er79 30283 
local er80 30313 
local er81 30343 

** Rename relationship **
/**
er30003         byte    %163.0g    ER30003L   RELATIONSHIP TO HEAD 68
er30022         byte    %230.0g    ER30022L   RELATIONSHIP TO HEAD 69
er30045         byte    %227.0g    ER30045L   RELATIONSHIP TO HEAD 70
er30069         byte    %227.0g    ER30069L   RELATIONSHIP TO HEAD 71
er30093         byte    %227.0g    ER30093L   RELATIONSHIP TO HEAD 72
er30119         byte    %227.0g    ER30119L   RELATIONSHIP TO HEAD 73
er30140         byte    %227.0g    ER30140L   RELATIONSHIP TO HEAD 74
er30162         byte    %227.0g    ER30162L   RELATIONSHIP TO HEAD 75
er30190         byte    %227.0g    ER30190L   RELATIONSHIP TO HEAD 76
er30219         byte    %227.0g    ER30219L   RELATIONSHIP TO HEAD 77
er30248         byte    %227.0g    ER30248L   RELATIONSHIP TO HEAD 78
er30285         byte    %227.0g    ER30285L   RELATIONSHIP TO HEAD 79
er30315         byte    %227.0g    ER30315L   RELATIONSHIP TO HEAD 80
er30345         byte    %227.0g    ER30345L   RELATIONSHIP TO HEAD 81
**/

forvalues y = 68/81 {
	local erplus = `er`y'' + 2
	rename er`erplus' rel`y'
	
}

desc rel*
	
** Rename age ** 
/**
er30004         int     %9.0g                 AGE OF INDIVIDUAL 68
er30023         int     %9.0g                 AGE OF INDIVIDUAL 69
er30046         int     %9.0g                 AGE OF INDIVIDUAL 70
er30070         int     %9.0g                 AGE OF INDIVIDUAL 71
er30094         int     %9.0g                 AGE OF INDIVIDUAL 72
er30120         int     %9.0g                 AGE OF INDIVIDUAL 73
er30141         int     %9.0g                 AGE OF INDIVIDUAL 74
er30163         int     %9.0g                 AGE OF INDIVIDUAL 75
er30191         int     %9.0g                 AGE OF INDIVIDUAL 76
er30220         byte    %9.0g                 AGE OF INDIVIDUAL 77
er30249         int     %9.0g                 AGE OF INDIVIDUAL 78
er30286         int     %9.0g                 AGE OF INDIVIDUAL 79
er30316         int     %9.0g                 AGE OF INDIVIDUAL 80
er30346         int     %9.0g                 AGE OF INDIVIDUAL 81
**/

forvalues y = 68/81 {
	local erplus = `er`y'' + 3
	rename er`erplus' age`y'
	replace age`y' = . if age`y' > 500 //missings
}

desc age*
	
** Set 0s to . for consistency with other file
foreach var of var inum* {
	replace `var' = . if `var' == 0
}
	
tempfile addl
save `addl', replace
	
** Iterate over years
forvalues y = 68/81 {
	
	di "Starting merge loop for 19`y'"
	
	** One year at a time
	use if year == `y' using "${data}psid/psid_fs6_CLEANED.dta", clear 
	
	** Data cleaning 
	duplicates drop
	replace inum`y' = inum if inum != . & inum`y' == .
	drop if inum == .
		
	** Merge with additional data file 	
	merge 1:m inum`y' using `addl'
		
	** Each year as a tempfile
	tempfile psid`y'
	save `psid`y'', replace
	di "saved tempfile as psid`y'"
				
} //year loop
			
				
** Append everything together
use `psid68', clear

forvalues y = 69/81 {
	
	append using `psid`y''
	
} //year loop
				
** Check matches by year
tab year _merge, mi

** Make sure have individual files for everyone in main data
assert _merge >= 3 if year != .
drop if _merge == 2
compress
	
/*additional psid variables include:
Age, 
sex, 
relationship to head – to keep only children (as opposed to heads who are <18)
interview year, 
interview number, 
person number, 
sequence number
*/

g age = .
g rel = .

** Loop over all years
forvalues y = 68/81 {
	
	** Loop over variables 
	foreach var in age rel { 
	
		** Fill age and rel values 
		replace `var' = `var'`y' if year == `y'

		drop `var'`y'
	} //variable
} //year

** Drop those who entered after survey
drop if rel == 0
summ age, d
	
** Added 4/21/2019: Just keep kids of head
tab rel, mi
keep if rel == 3
	
compress

** Clean up year
replace year = 1900 + year
summ year, d

** Indicators for race
g all = 1

** FSP
g fsp = (foodstsave > 0) if foodstsave != .
tab year, su(fsp) //check that this looks ok
 
** Age
** Changed to 0-5 and 6-18 4/25/2019
g ge5 = age > 5 if age != .

** Income - poverty bin
g incpovbin = 1 if  incpov <= .5
replace incpovbin = 2 if  incpov > 0.5 & incpov <= 1
replace incpovbin = 3 if  incpov > 1 & incpov <= 1.5
replace incpovbin = 4 if  incpov > 1.5 & incpov <= 2
replace incpovbin = 5 if  incpov > 2 & incpov < .	// strict < keeps missing incpov out of the top bin
	cap lab def incpovbin 1 "< 50%" 2 "50-100%" 3 "100-150%" 4 "150-200%" 5 "200%+"
	lab values incpovbin incpovbin
	
** Added 4/21/2019: Second spec pooling 1975-1977 and 1978-1980
g incyearpool = 1975 if incyear >= 1975 & incyear <= 1977
replace incyearpool = 1979 if incyear >= 1978 & incyear <= 1980
tab incyearpool, mi

** Figure plotting share of children receiving FS by age of child 
** (by single year of age if possible) 

local min = 0.1 
local max = 0.25 
local yaxis = "ysc(r(`min' `max')) ylab(`min'(0.05)`max')"

** Keep appropriate age / race / pool data 
keep if age <= 18 & all == 1 & incyearpool == 1975


summ incyear, d
local ymin = r(min)
local ymax = r(max)
if `ymin' == `ymax' local year = "`ymin'"
if `ymin' != `ymax' local year = "`ymin'-`ymax'"


** Get shares for total
summ fsp if age < 18 [aw = wgt], d
local all = r(mean)*100
di "`all'"
local all = substr("`all'", 1, strpos("`all'", ".") - 1) + "%"
di "`all'"
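
** Illustration of the string handling above, using a made-up value 
** (the local "demo" is hypothetical and is not used elsewhere): substr() 
** keeps only the characters before the decimal point, so the percentage 
** shown in the figure is truncated rather than rounded. 
local demo = 12.87
local demo = substr("`demo'", 1, strpos("`demo'", ".") - 1) + "%"
di "`demo'"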
			
summ fsp if age <= 5 [aw = wgt], d
local young = r(mean)*100
local young = substr("`young'", 1, strpos("`young'", ".") - 1) + "%"
di "`young'"
			
summ fsp if age > 5 & age < 18 [aw = wgt], d
local old = r(mean)*100
local old = substr("`old'", 1, strpos("`old'", ".") - 1) + "%"
di "`old'"
			
local textpos0 = (`min' + `max')*.66
local textpos1 = (`min' + `max')*.625
local textpos2 = (`min' + `max')*.645
local textpos4 = (`min' + `max')*.605
	
collapse (mean) fsp [pw = wgt], by(age)
	
** Added 4/25/2019: Text box for all
tw scatter fsp age if age <= 18, ///
	ytitle("Percent all receiving Food Stamps, 1975-1977") 	///
	xtitle("Age") 											///
	xsc(r(0 18)) 											///
	xlab(0(3)18) 											///
	`yaxis' 												///
	graphregion(color(white)) 								///
	bgcolor(white) 											///				
	text(`textpos0' 0.5 "{bf:`year'}", 						///
				just(left) place(e) size(small)) 			///
	text(`textpos2' 0.5 "{bf:Participation Rate}", 			///
				just(left) place(e) size(small)) 			///
	text(`textpos4' 0.5	 									///
			"All < 18:" 									///
			"{&le} 5:        " 								///
			"6-17:        ", 								///
			just(left) 										///	
			place(e) size(small) bcolor(none) 				///
			margin(l+1 t+1 b+1) ) 							///
	text(`textpos4' 2.5 									///
			"`all'" 										///
			"`young'" 										///
			"`old'", 										///
			just(center) 									///
			place(e) size(small) bcolor(none) 		 		///
			margin(l+1 t+1 b+1) ) 							///
	text(`textpos1' 0.45 " " " " " " " " " " " " " " , /// empty box 
			just(left) 										///
			width(25) 										///
			place(e) 										///					
			size(small) 									///
			bcolor(none) 									///
			box 											///
			lcolor(black) 									///
			margin(l+1) ) 
			
gr export "${results}`name'_app_fig_2a.pdf", replace
cap gr export "${results}`name'_app_fig_2a.png", replace
				
clear 

** Appendix Figure 2: Childhood use of the FSP in the PSID
** (B) Years of Childhood on Food Stamps, by Age at First Use

** Change working directory 
cd "${data}psid/J260694/"

** Run PSID Code 
do "J260694.do"

** dta with formatting
do "J260694_formats.do"

** Change working directory 
cd "${dir}"

** Summarize 
sum
de

set scheme s2color

** Data Cleaning
rename _all, lower

** Generate unique id
gen id = _n

** Rename gender before dropping
tab er32000, mi
tab er32000, mi nolab
rename er32000 gender //time invariant

** Weights
rename er33637 weight_2001 
rename er33546 weight_1999 
rename er33430 weight_1997
rename er33318 weight_1996  
rename er33275 weight_1995 
rename er33119 weight_1994 
rename er30864 weight_1993 
rename er30803 weight_1992  
rename er30730 weight_1991  
rename er30686 weight_1990  
rename er30641 weight_1989  
rename er30605 weight_1988  
rename er30569 weight_1987  
rename er30534 weight_1986  
rename er30497 weight_1985  
rename er30462 weight_1984  
rename er30428 weight_1983  
rename er30398 weight_1982  
rename er30372 weight_1981  
rename er30342 weight_1980  
rename er30312 weight_1979  
rename er30282 weight_1978  
rename er30245 weight_1977  
rename er30216 weight_1976  
rename er30187 weight_1975  
rename er30159 weight_1974  
rename er30137 weight_1973  
rename er30116 weight_1972  
rename er30090 weight_1971  
rename er30066 weight_1970  
rename er30042 weight_1969  
rename er30019 weight_1968  

drop v439

*Store all of these in a local to make renaming other variables easier
local er68 30001
local er69 30020 
local er70 30043 
local er71 30067 
local er72 30091 
local er73 30117 
local er74 30138 
local er75 30160 
local er76 30188 
local er77 30217 
local er78 30246 
local er79 30283 
local er80 30313 
local er81 30343 
local er82 30373 
local er83 30399 
local er84 30429  
local er85 30463
local er86 30498 
local er87 30535 
local er88 30570 
local er89 30606 
local er90 30642
local er91 30689
local er92 30733
local er93 30806
local er94 33101
local er95 33201
local er96 33301
local er97 33401

** 1968 is special because it is the first year (person number, no sequence number). 
rename er30001 inum_1968
rename er30002 pnum
rename er30003 hrel_1968
rename er30004 age_1968

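** The PSID is biennial after 1997, so there are no 1998 or 2000 interviews. 
** 1999 and 2001 are renamed by hand (left out of the loop below), and the 
** 1998 and 2000 values are filled in from the following wave. 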
rename er33501 inum_1999
rename er33502 snum_1999
rename er33503 hrel_1999
rename er33504 age_1999

gen age_1998 = age_1999 - 1
gen hrel_1998 = hrel_1999
gen inum_1998 = inum_1999
gen snum_1998 = snum_1999

rename er33601 inum_2001
rename er33602 snum_2001
rename er33603 hrel_2001
rename er33604 age_2001

gen age_2000 = age_2001 - 1
gen hrel_2000 = hrel_2001
gen inum_2000 = inum_2001
gen snum_2000 = snum_2001

** Loop for interview number, sequence number, relation to head, and age
forvalues y = 69/97 {
	local interview = `er`y''
	local sequence = `er`y'' + 1
	local relation = `er`y'' + 2
	local age = `er`y'' + 3

	rename er`interview' inum_19`y'
	rename er`sequence' snum_19`y'
	rename er`relation' hrel_19`y'
	rename er`age' age_19`y'
	
}

** Set 0s to . for consistency with other file
foreach var of var inum* {
	replace `var' = . if `var' == 0
	}
	
** Family Data 

* Release number - drop 
drop er30000 v1 v441 v1101 v1801 v2401 v3001 v3401 v3801 v4301 v5201 v5701 		///
	v6301 v6901 v7501 v8201 v8801 v10001 v11101 v12501 v13701 v14801 v16301 	///
	v17701 v19001 v20301 v21601 er2001 er5001 er7001 er10001 er13001 er17001


*  Interview number - extra, drop 
drop v1102 v8802 v8202 v7502 v6902 v6302 v5702 v5202 v4708 v3802 v3402 	///	
	v3002 v2402 v20302 v2 v19002 v1802 v17702 v16302 v14802 v13702 v12502 		///
	v11102 v10002 er5002 er7002 er10002 er13002 er17002 v21602 er2002

	
* Food Stamps
rename v45 fs_save_1968
rename v510 fs_save_1969
rename v1183 fs_save_1970
rename v1884 fs_save_1971
rename v2478 fs_save_1972

* NOTE: Food stamp amt in year X is recorded in interview year X+1
rename v3443 fs_save_1973 
rename v3851 fs_save_1974
rename v4364 fs_save_1975
rename v5277 fs_save_1976 
drop v5774 						/* FS Paid */
rename v5776 fs_bonus_1977
drop v6380 						/* FS Paid */
rename v6382 fs_bonus_1978				
rename v6976 fs_value_1979
rename v7568 fs_value_1980
rename v8260 fs_value_1981
rename v8868 fs_value_1982
rename v10239 fs_value_1983
rename v11379 fs_value_1984
rename v12778 fs_value_1985
rename v13880 fs_value_1986
rename v14895 fs_value_1987
rename v16395 fs_value_1988
rename v17811 fs_value_1989
rename v19111 fs_value_1990
rename v20411 fs_value_1991
rename v21713 fs_value_1992
rename er3059 fs_cat_1993		/* Whether used FS  (no value var) */	
rename er6058 fs_cat_1994  		/* Whether used FS  (no value var) */	
rename er8155 fs_cat_1995		/* Whether used FS  (no value var) */	
rename er11049 fs_cat_1996		/* Whether used FS  (no value var) */		
rename er14241 fs_value_1997 
rename er14256 fs_value_1998
drop er14255					/* Whether used FS  */	
drop er14240 					/* Whether used FS  */				
drop er18370 					/* Whether used FS  */		
drop er18386					/* Whether used FS  */		
rename er18371 fs_value_1999
rename er18387 fs_value_2000


* Reshape 
gen inum68 = inum_1968


reshape long age_ fs_bonus_ fs_save_ fs_value_ hrel_ inum_ snum_ fs_cat_ 		///
	weight_, i(id) j(year)

rename *_ *

drop if hrel == 0

*drop if year >1997

tsset id year

order year id inum pnum snum gender age hrel weight

** Define food stamp dummy

gen fsp = (fs_save > 0 & fs_save < 9999) if fs_save != .
replace fsp = (fs_bonus > 0 & fs_bonus < 9999) if fs_bonus != . & fsp == .
replace fsp = (fs_value > 0 & fs_value < 9999) if fs_value != . & fsp == .
replace fsp = (fs_cat == 1) if fs_cat != . & fsp == .
replace fsp = 0 if fsp == .
tab year [aw=weight ], su(fsp) //check that this looks ok

drop fs_save fs_bonus fs_value fs_cat
keep if year < 2000

** Generate lag and lead food stamp dummies

tsfill

rename fsp fsp_0 
gen fsp_sub1 = L.fsp_0

** fsp_k is participation k years ahead (F1 through F18)
forvalues k = 1/18 {
	gen fsp_`k' = F`k'.fsp_0
}

** Age Calculations - Some of the self-reported ages do not vary across time. 

** Show breakdown in age 
gen age_test = 1 if age == L.age & age != 999 & L.age != 999
replace age_test = 0 if age_test == .

tab age age_test if age < 18 & age > 1 & year <= 1978 & fsp_0 == 1 			///
	[aw = weight] , row 

drop age_test

** Generate calculated age 

gen age_calc = .
sort id year, stable
by id: gen long obsno = _n
replace age_calc = age if obsno == 1 
replace age_calc = 0 if age_calc < 0
replace age_calc = L.age_calc + 1 if obsno == 2 & L.age_calc < 150
replace age_calc = L.age_calc if obsno == 2 & L.age_calc > 150

forvalues y = 3/30 {
	local x = `y' - 1
	replace age_calc = L`x'.age_calc + `x' if obsno == `y' & L`x'.age_calc < 150
	** Carry forward missing-age codes from the first observation 
	replace age_calc = L`x'.age_calc if obsno == `y' & L`x'.age_calc > 150

}

order year id inum68 inum pnum snum gender age age_calc hrel weight

** Create age-based sample restrictions
gen yob_calc = year - age_calc if age_calc < 150

gen in_sample = . 
replace in_sample = 0 if 	///
		age_calc > 150 | 	///
		age_calc == . 		// Ignore those with missing age
		
replace in_sample = 1 if 	///
		yob_calc >= 1950 & 	///
		yob_calc < 1980 & 	///
		in_sample == .
		
replace in_sample = 0 if in_sample == .

** Create a tag for the first year on food stamps. 
by id (year), sort: ///
	gen byte fsp_first = sum(fsp_0 == 1) == 1 & sum(fsp_0[_n - 1] == 1) == 0  
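
** Illustrative check of the first-use tag on made-up data (a sketch: the 
** toy ids, years, and the variable demo_first are hypothetical). The 
** running sum of fsp_0 == 1 equals 1 from the first year of use onward, 
** and the running sum of the previous row's indicator is still 0 only in 
** that first year, so exactly one row per user is tagged. 
preserve
clear
input id year fsp_0
1 1970 0
1 1971 1
1 1972 1
1 1973 0
2 1970 0
2 1971 0
end
by id (year), sort: ///
	gen byte demo_first = sum(fsp_0 == 1) == 1 & sum(fsp_0[_n - 1] == 1) == 0
list, sepby(id)
assert demo_first == (id == 1 & year == 1971)
restore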


** Count years with food stamps in childhood 

gen childhood_remaining = 17 - age_calc
replace childhood_remaining = . if childhood_remaining < 0

gen fsp_count_0 = fsp_0	
gen fsp_youth_count = fsp_count_0 if childhood_remaining == 0

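** Build the list of variables to sum explicitly. The range fsp_0 - fsp_`y' 
** would also include the lag fsp_sub1, which sits between fsp_0 and fsp_1 
** in variable order. The lag is 0 or missing in first-use years, so the 
** figure below should not depend on this choice. 
local fsplist fsp_0

forvalues y = 1/17 {

	local fsplist `fsplist' fsp_`y'
	egen fsp_count_`y' = rowtotal(`fsplist')		
	replace fsp_youth_count = fsp_count_`y' if childhood_remaining == `y'
}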

gen fsp_yth_ct_share = fsp_youth_count / (18-age_calc)

** Tables + Figures of means by age for different periods
local start_year = 1972
local end_year = 1975

graph bar (mean) fsp_youth_count [pweight = weight] if 	///
		age_calc < 18  & 			///
		fsp_first == 1 & 			///
		in_sample == 1 & 			///
		year >= `start_year' & 		///
		year <= `end_year' , 		///
		over(age_calc) 				///
		graphregion(color(white)) 	///
		bgcolor(white)				///						
		ytitle(Years of Remaining Childhood on FSP) ///
		b1title(Child's Age) 			
		
gr export "${results}`name'_app_fig_2b.pdf", replace
cap gr export "${results}`name'_app_fig_2b.png", replace
			
clear			

** End log file for appendix figure 2
capture log close


** Start log file for appendix figure 7
log using "${logs}`name'_app_fig_7_`date'", replace text

** Load Data via IPUMS
cd "${data}census/"
do "ipums_do.do"
cd "${dir}"

** Summarize 
sum 
de

** Drop dataset w/o migration data
*drop if datanum == 3 /* DEPRECATED */

** Generate required vars

** Dummy for child 
gen kid_dummy = 1 if relate == 3 & age < 19
replace kid_dummy = 0 if kid_dummy == .

tab age year if kid_dummy == 1
tab age year if kid_dummy == 1 [aw = hhwt]

label values age .

** Education 
gen education = . 
replace education = 0 if educ == 0								// None or NA
replace education = 1 if educ == 1 | educ == 2					// Elem or JH
replace education = 2 if educ == 3 | educ == 4 | educ == 5		// Some HS
replace education = 3 if educ == 6 								// HS
replace education = 4 if educ == 7 | educ == 8 | educ == 9    	// 1-3 Y College
replace education = 5 if educ == 10 | educ == 11				// 4+ Y College

label define EDUCATION 	0 "None or NA" 							///
						1 "No HS" 								///
						2 "Some HS" 							///
						3 "HS"									///
						4 "1-3 Years College"					///
						5 "4+ Years College"									
							
label values education EDUCATION 
label variable education "Condensed Education" 

** Look at migration
tab migrate5d
tab migrate5d [aw = hhwt]
tab migrate5d year if kid_dummy == 1 [aw = hhwt]

** Tag individual as moving out of the county if any of the following are true 
/*
22		Different house, moved within state, between counties
31		Moved between contiguous states
32		Moved between non-contiguous states
33		Unknown between states
40		Abroad five years ago
*/

gen out_of_county = 1 if 	migrate5d == 22 | 	///
							migrate5d == 31 |	///
							migrate5d == 32 |	///
							migrate5d == 33 | 	///
							migrate5d == 40

replace out_of_county = 0 if out_of_county == .

tab year out_of_county [aw=hhwt ]  
tab year out_of_county if age < 19 [aw=hhwt ]

label define MOVER 0 "Non-Mover" 1 "Mover"

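** Determine marital status: count heads (relate == 1) and spouses 
** (relate == 2) present in the household (1 = single parent, 2 = couple)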

gen temp = 1 if relate == 1 | relate == 2
bysort year serial: egen parent_count = total(temp)
drop temp


** Assign kids their mother's moving indicators, and if there is no mother then
** assign HoH. 

gen parent_out_of_county = out_of_county if 	///
		parent_count == 2 & 					///
		sex == 2 & 								///
		(relate == 1 | relate == 2) 
		
replace parent_out_of_county = out_of_county if ///
		parent_count == 1 & 					///
		relate == 1

bysort year serial: egen hh_mover = total(parent_out_of_county)
drop parent_out_of_county

label values hh_mover MOVER 
label variable hh_mover "5-Year County Migration by Mother (or HoH)" 

** Assign kids a tag for education level of parents

gen head_educ = education if parent_count == 2 & sex == 2 & 					///
							(relate == 1 | relate == 2) 
replace head_educ = education if parent_count == 1 & relate == 1

bysort year serial: egen hh_educ = max(head_educ)
drop head_educ
label values hh_educ EDUCATION
label variable hh_educ "Mother's (or HoH) Education Level"

** Keep in sample 
keep if kid_dummy == 1 

** Keep required variables	
keep year serial hhwt statefip hh_mover hh_educ age

** Collapse data
collapse (count) serial [pw = hhwt], by(year statefip hh_mover hh_educ age) 

** Reshape data 
reshape wide serial, i(year statefip hh_educ age) j(hh_mover)

** Cleaning
rename serial0 non_movers
rename serial1 movers

replace movers = 0 if movers == .
replace non_movers = 0 if non_movers == .

gen total = movers + non_movers
gen share_movers = movers / total
gen share_non_movers = non_movers / total

sum

** Appendix Figure 7A
graph bar (mean) share_movers [fw = total] if 	///
	year == 1970 & hh_educ <= 2, 				///
	graphregion(color(white)) 					///
	bgcolor(white)								///			
	over(age) 									///
	ytitle(Migration Share) 					///
	b1title("Child's Age in 1970") 				//

gr export "${results}`name'_app_fig_7a.pdf", replace
cap gr export "${results}`name'_app_fig_7a.png", replace

** Appendix Figure 7B
graph bar (mean) share_movers [fw = total] if 	///
	year == 1980 & hh_educ <= 2, 				///
	graphregion(color(white)) 					///
	bgcolor(white)								///			
	over(age) 									///
	ytitle(Migration Share) 					///
	b1title("Child's Age in 1980") 				//

gr export "${results}`name'_app_fig_7b.pdf", replace
cap gr export "${results}`name'_app_fig_7b.png", replace
	
clear

** End log file for appendix figure 7
capture log close
