segregation.spatial.PerimeterAreaRatioSpatialDissim

class segregation.spatial.PerimeterAreaRatioSpatialDissim(data, group_pop_var, total_pop_var, standardize=True)[source]

Calculation of Perimeter/Area Ratio Spatial Dissimilarity index

Parameters
data : a geopandas DataFrame with a geometry column.
group_pop_var : string

The name of the variable in data that contains the population size of the group of interest.

total_pop_var : string

The name of the variable in data that contains the total population of the unit.

standardize : boolean

Whether to standardize the weights matrices. If True, the values of cij in the formulas are standardized so that their overall sum is 1.
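
As a rough illustration of what this standardization means, the sketch below rescales a small weights matrix so that all of its entries sum to 1. The matrix `cij` and its values are invented for illustration, and the code is only the normalization implied by the description above, not the library's internal implementation.

```python
import numpy as np

# Hypothetical 3x3 matrix of spatial interaction weights between three units.
# The values are invented for illustration only.
cij = np.array([
    [0.0, 2.0, 1.0],
    [2.0, 0.0, 3.0],
    [1.0, 3.0, 0.0],
])

# Standardize so that the overall sum of all entries is 1.
cij_std = cij / cij.sum()

total = cij_std.sum()
print(total)
```

After the rescaling, each entry is the share of the total interaction that falls on that pair of units.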

Notes

Originally based on Wong, David WS. “Spatial indices of segregation.” Urban studies 30.3 (1993): 559-572.

However, Tivadar, Mihai. “OasisR: An R Package to Bring Some Order to the World of Segregation Measurement.” Journal of Statistical Software 89.1 (2019): 1-39, points out that the formula in Wong’s original paper contains an extra division by 2 in the spatial interaction component. This function follows the formula given in the first appendix of Tivadar (2019).

References: [Won93] and [Tiv19].

Examples

In this example, we calculate the degree of perimeter/area ratio spatial dissimilarity (PARD) for Riverside County using 2010 census tract data. The group of interest is non-Hispanic Black people, stored in the variable nhblk10 in the dataset.

Firstly, we need to import the required modules and the respective function.

>>> import pandas as pd
>>> import geopandas as gpd
>>> import segregation
>>> from segregation.spatial import PerimeterAreaRatioSpatialDissim

Secondly, we need to read the data:

>>> # This example uses census data from an external database; the user must provide their own copy.
>>> # A step-by-step procedure for downloading the data can be found here: https://github.com/spatialucr/geosnap/blob/master/examples/01_getting_started.ipynb
>>> # After downloading LTDB_Std_All_fullcount.zip and extracting the files, the filepath might look like the one below.
>>> filepath = '~/data/LTDB_Std_2010_fullcount.csv'
>>> census_2010 = pd.read_csv(filepath, encoding = "ISO-8859-1", sep = ",")

Then, we filter only for the desired county (in this case, Riverside County):

>>> df = census_2010.loc[census_2010.county == "Riverside County", ['tractid', 'pop10', 'nhblk10']]

Then, we read the Riverside map data using geopandas (the county id is 06065):

>>> map_url = 'https://raw.githubusercontent.com/renanxcortes/inequality-segregation-supplementary-files/master/Tracts_grouped_by_County/06065.json'
>>> map_gpd = gpd.read_file(map_url)

The key columns of the dataset and the GeoDataFrame must have the same data type for the merge to work, so we harmonize them first. Afterwards, we keep only the columns that will be used.

>>> map_gpd['INTGEOID10'] = pd.to_numeric(map_gpd["GEOID10"])
>>> gdf_pre = map_gpd.merge(df, left_on = 'INTGEOID10', right_on = 'tractid')
>>> gdf = gdf_pre[['geometry', 'pop10', 'nhblk10']]
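
The reason for the type conversion can be seen on a toy example. The sketch below assumes that the map identifiers are read as strings with a leading zero, while the census identifiers are integers. Both tables and all of their values are invented stand-ins, and the geometry column is left out for brevity.

```python
import pandas as pd

# Toy stand-ins for the real tables; identifiers and counts are invented.
map_df = pd.DataFrame({"GEOID10": ["06065030101", "06065030103"]})  # string keys
df = pd.DataFrame({
    "tractid": [6065030101, 6065030103],  # integer keys
    "pop10": [1000, 2000],
    "nhblk10": [50, 300],
})

# Convert the string key to a numeric one so both sides share a dtype.
map_df["INTGEOID10"] = pd.to_numeric(map_df["GEOID10"])

# With matching key types, every toy tract finds its census row.
merged = map_df.merge(df, left_on="INTGEOID10", right_on="tractid")
print(len(merged))
```

Comparing the number of rows before and after the merge is a cheap way to check that no tracts were lost to mismatched keys.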

The value is estimated below.

>>> perimeter_area_ratio_spatial_dissim_index = PerimeterAreaRatioSpatialDissim(gdf, 'nhblk10', 'pop10')
>>> perimeter_area_ratio_spatial_dissim_index.statistic
0.31260876347432687

Attributes
statistic : float

Perimeter/Area Ratio Spatial Dissimilarity Index

core_data : a geopandas DataFrame

A geopandas DataFrame that contains the columns used to perform the estimate.

__init__(data, group_pop_var, total_pop_var, standardize=True)[source]

Initialize self. See help(type(self)) for accurate signature.

Methods

__init__(data, group_pop_var, total_pop_var, standardize=True)

Initialize self.