segregation.spatial.BoundarySpatialDissim
class segregation.spatial.BoundarySpatialDissim(data, group_pop_var, total_pop_var, standardize=False)
Calculation of the Boundary Spatial Dissimilarity index.
- Parameters
- data : a geopandas DataFrame with a geometry column.
- group_pop_var : string
The name of the variable in data that contains the population size of the group of interest.
- total_pop_var : string
The name of the variable in data that contains the total population of the unit.
- standardize : boolean, default False
Whether to row-standardize the weights matrix. If True, the values of cij in the formula are row-standardized. For comparison, the seg R package of Hong, Seong-Yun, David O’Sullivan, and Yukio Sadahiro. “Implementing spatial segregation measures in R.” PloS one 9.11 (2014): e113767 does not row-standardize by default; that is, it works directly with border lengths.
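Row standardization can be illustrated with a small sketch. The border-length matrix below is made up for illustration, and `row_standardize` is a hypothetical helper, not a function of the segregation package; it only shows what dividing each row of cij by its row sum does.

```python
def row_standardize(weights):
    """Divide each row by its sum so that every non-empty row sums to 1."""
    standardized = []
    for row in weights:
        total = sum(row)
        # A unit with no neighbours keeps an all-zero row (avoids division by zero).
        standardized.append([value / total if total else 0.0 for value in row])
    return standardized


# Hypothetical shared-border lengths between three units (symmetric, zero diagonal).
border_lengths = [
    [0.0, 2.0, 6.0],
    [2.0, 0.0, 0.0],
    [6.0, 0.0, 0.0],
]

standardized = row_standardize(border_lengths)
for row in standardized:
    print(row)
```

After standardization each row holds a unit's border shares rather than raw lengths, so the matrix is generally no longer symmetric even when the border lengths are.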
Notes
The formula is based on Hong, Seong-Yun, David O’Sullivan, and Yukio Sadahiro. “Implementing spatial segregation measures in R.” PloS one 9.11 (2014): e113767.
Original paper by Wong, David WS. “Spatial indices of segregation.” Urban studies 30.3 (1993): 559-572.
References: [HOSullivanS14] and [Won93].
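To make the idea of a boundary-adjusted dissimilarity concrete, here is a minimal sketch. It assumes the index takes the form of the classic dissimilarity index D minus the border-length-weighted mean absolute difference in group shares between units; that form, the helper names, and the toy data are assumptions for illustration, so the exact normalization should be checked against the cited papers and the library source.

```python
def dissimilarity(group, total):
    """Classic (aspatial) dissimilarity index D."""
    group_sum = sum(group)
    rest_sum = sum(total) - group_sum
    return 0.5 * sum(
        abs(g / group_sum - (t - g) / rest_sum) for g, t in zip(group, total)
    )


def boundary_adjusted_dissimilarity(group, total, border_lengths):
    """D minus the border-weighted mean absolute share difference (assumed form)."""
    # Share of the group in each unit; units with zero population get share 0.
    shares = [g / t if t else 0.0 for g, t in zip(group, total)]
    n = len(shares)
    numerator = sum(
        border_lengths[i][j] * abs(shares[i] - shares[j])
        for i in range(n)
        for j in range(n)
    )
    denominator = sum(sum(row) for row in border_lengths)
    return dissimilarity(group, total) - numerator / denominator


# Hypothetical data: two units that share a border of length 1.
group = [50, 0]
total = [100, 100]
borders = [[0.0, 1.0], [1.0, 0.0]]

d_value = dissimilarity(group, total)
bsd_value = boundary_adjusted_dissimilarity(group, total, borders)
print(d_value, bsd_value)
```

In this sketch the adjustment lowers D more when units with very different group shares have long common borders, which is the intuition behind a boundary-based spatial index.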
Examples
In this example, we calculate the degree of boundary spatial dissimilarity (D) for Riverside County using 2010 census tract data. The group of interest is the non-Hispanic Black population, which is the variable nhblk10 in the dataset.
First, we import the required modules and the function.
>>> import pandas as pd
>>> import geopandas as gpd
>>> import segregation
>>> from segregation.spatial import BoundarySpatialDissim
Second, we read the data:
>>> # This example uses census data; users must provide their own copy of the external database.
>>> # A step-by-step procedure for downloading the data can be found here: https://github.com/spatialucr/geosnap/blob/master/examples/01_getting_started.ipynb
>>> # After downloading LTDB_Std_All_fullcount.zip and extracting the files, the file path might look like the one below.
>>> filepath = '~/data/LTDB_Std_2010_fullcount.csv'
>>> census_2010 = pd.read_csv(filepath, encoding="ISO-8859-1", sep=",")
Then, we filter only for the desired county (in this case, Riverside County):
>>> df = census_2010.loc[census_2010.county == "Riverside County"][['tractid', 'pop10','nhblk10']]
Then, we read the Riverside map data using geopandas (the county id is 06065):
>>> map_url = 'https://raw.githubusercontent.com/renanxcortes/inequality-segregation-supplementary-files/master/Tracts_grouped_by_County/06065.json'
>>> map_gpd = gpd.read_file(map_url)
For the merge to work, the key columns of the census table and the GeoDataFrame must have the same data type. After merging, we keep only the columns that will be used.
>>> map_gpd['INTGEOID10'] = pd.to_numeric(map_gpd["GEOID10"])
>>> gdf_pre = map_gpd.merge(df, left_on='INTGEOID10', right_on='tractid')
>>> gdf = gdf_pre[['geometry', 'pop10', 'nhblk10']]
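The dtype harmonization step can be tried in isolation with a small sketch. The two frames below are toy stand-ins with made-up values; only the column names GEOID10, INTGEOID10, tractid, and pop10 mirror the example above.

```python
import pandas as pd

# Toy stand-in for the map data: GEOID10 is stored as a string with a leading zero.
map_df = pd.DataFrame(
    {"GEOID10": ["06065030101", "06065030103"], "name": ["A", "B"]}
)

# Toy stand-in for the census table: tractid is stored as an integer.
census_df = pd.DataFrame(
    {"tractid": [6065030101, 6065030103], "pop10": [1200, 3400]}
)

# Convert the string key to a numeric one so both key columns share a dtype.
map_df["INTGEOID10"] = pd.to_numeric(map_df["GEOID10"])

merged = map_df.merge(census_df, left_on="INTGEOID10", right_on="tractid")
print(merged[["GEOID10", "tractid", "pop10"]])
```

Because merge performs an inner join by default, tracts without a matching key are dropped silently; comparing len(merged) with the number of rows in the map data is a quick way to confirm that every tract matched.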
Finally, the index is estimated from gdf:
>>> boundary_spatial_dissim_index = BoundarySpatialDissim(gdf, 'nhblk10', 'pop10')
>>> boundary_spatial_dissim_index.statistic
0.28869903953453163
- Attributes
- statistic : float
Boundary Spatial Dissimilarity Index
- core_data : a geopandas DataFrame
A geopandas DataFrame that contains the columns used to perform the estimate.
- __init__(data, group_pop_var, total_pop_var, standardize=False)
Initialize self. See help(type(self)) for accurate signature.
Methods
__init__(data, group_pop_var, total_pop_var) : Initialize self.