segregation.spatial.SpatialProxProf

class segregation.spatial.SpatialProxProf(data, group_pop_var, total_pop_var, m=1000)

Calculation of Spatial Proximity Profile

Parameters
data : a geopandas DataFrame with a geometry column.
group_pop_var : string

The name of the variable in data that contains the population size of the group of interest.

total_pop_var : string

The name of the variable in data that contains the total population of the unit.

m : int

The number of thresholds to use. The default is 1000. A larger m gives a smoother-looking graph and a more precise spatial proximity profile value, but slows the calculation.
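
The precision and speed trade-off for m can be illustrated with a minimal sketch. It assumes, as the description of m suggests, that the profile value is an area under a curve evaluated at m equally spaced thresholds on [0, 1]. The function names and the toy curve are hypothetical stand-ins, and the trapezoidal rule is an illustrative choice, not necessarily what the library does internally.

```python
def area_under_profile(curve, m):
    """Trapezoidal area under `curve` on [0, 1] using m equally spaced thresholds."""
    width = 1 / (m - 1)
    values = [curve(i * width) for i in range(m)]
    return sum((values[i] + values[i + 1]) / 2 * width for i in range(m - 1))


def toy_curve(t):
    # Hypothetical stand-in for a profile curve; its exact area on [0, 1] is 2/3.
    return 1 - t ** 2


# Absolute error of the approximation for increasing numbers of thresholds.
errors = {m: abs(area_under_profile(toy_curve, m) - 2 / 3) for m in (10, 100, 1000)}
for m, err in errors.items():
    print(m, err)
```

The error shrinks as m grows, while the number of curve evaluations grows linearly with m. In the real index each evaluation is presumably far more expensive than in this toy curve, which is why a large m slows the calculation.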

Notes

Based on Hong, Seong-Yun, and Yukio Sadahiro. “Measuring geographic segregation: a graph-based approach.” Journal of Geographical Systems 16.2 (2014): 211-231.

Reference: [HS14].

Examples

In this example, we calculate the spatial proximity profile (SPP) for Riverside County using 2010 census tract data. The group of interest is non-Hispanic Black people, stored in the variable nhblk10 of the dataset.

First, we import the required modules and the SpatialProxProf class.

>>> import pandas as pd
>>> import geopandas as gpd
>>> import segregation
>>> from segregation.spatial import SpatialProxProf

Second, we read the data:

>>> # This example uses census data from an external database; you must provide your own copy.
>>> # A step-by-step procedure for downloading the data can be found here: https://github.com/spatialucr/geosnap/blob/master/examples/01_getting_started.ipynb
>>> # After downloading LTDB_Std_All_fullcount.zip and extracting the files, the filepath might look like the one below.
>>> filepath = '~/data/LTDB_Std_2010_fullcount.csv'
>>> census_2010 = pd.read_csv(filepath, encoding = "ISO-8859-1", sep = ",")

Then, we filter only for the desired county (in this case, Riverside County):

>>> df = census_2010.loc[census_2010.county == "Riverside County"][['tractid', 'pop10','nhblk10']]
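
The same kind of selection can be tried without the LTDB file. The sketch below uses a small hypothetical table (the tract ids and counts are made up for illustration) and writes the row filter and the column selection as a single .loc call, which is one common way to express it in pandas.

```python
import pandas as pd

# Hypothetical toy table standing in for the LTDB tract file.
census = pd.DataFrame({
    "tractid": [6065030101, 6065030103, 6037101110],
    "county": ["Riverside County", "Riverside County", "Los Angeles County"],
    "pop10": [1200, 3400, 5100],
    "nhblk10": [60, 310, 420],
})

# Boolean mask for the rows to keep, then rows and columns selected in one step.
is_riverside = census["county"] == "Riverside County"
df = census.loc[is_riverside, ["tractid", "pop10", "nhblk10"]]
print(df)
```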

Then, we read the Riverside map data using geopandas (the county id is 06065):

>>> map_url = 'https://raw.githubusercontent.com/renanxcortes/inequality-segregation-supplementary-files/master/Tracts_grouped_by_County/06065.json'
>>> map_gpd = gpd.read_file(map_url)

The key columns of the census table and the geopandas map must share a data type for the merge to work, so GEOID10 is converted to numeric first. We then keep only the columns that will be used.

>>> map_gpd['INTGEOID10'] = pd.to_numeric(map_gpd["GEOID10"])
>>> gdf_pre = map_gpd.merge(df, left_on = 'INTGEOID10', right_on = 'tractid')
>>> gdf = gdf_pre[['geometry', 'pop10', 'nhblk10']]

Finally, we compute the index by passing the data and the two column names:

>>> spat_prox_index = SpatialProxProf(gdf, 'nhblk10', 'pop10')
>>> spat_prox_index.statistic
0.11217269612149207
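
The key-harmonization step used before the merge can be checked on toy data. This is a minimal sketch with plain pandas tables and made-up values. It assumes the map stores GEOID10 as zero-padded strings while the census table stores tractid as integers, which is what the use of pd.to_numeric above suggests. The indicator column is an optional extra for spotting tracts that found no match.

```python
import pandas as pd

# Hypothetical map attributes: ids stored as zero-padded strings.
shapes = pd.DataFrame({
    "GEOID10": ["06065030101", "06065030103"],
    "geometry": ["poly_a", "poly_b"],  # placeholders for real geometries
})

# Hypothetical census counts: ids stored as integers.
counts = pd.DataFrame({
    "tractid": [6065030101, 6065030103],
    "pop10": [1200, 3400],
    "nhblk10": [60, 310],
})

# Convert the string key to a number so both key columns share a type.
shapes["INTGEOID10"] = pd.to_numeric(shapes["GEOID10"])

# A left merge with indicator=True records which rows found a partner.
merged = shapes.merge(
    counts, left_on="INTGEOID10", right_on="tractid", how="left", indicator=True
)
unmatched = merged[merged["_merge"] != "both"]
print(len(unmatched))
```

An empty unmatched table means every map unit received its population counts. Any rows left in it would carry missing values into the index calculation.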

You can plot the profile curve with the plot method.

>>> spat_prox_index.plot()

Attributes
statistic : float

Spatial Proximity Profile Index

core_data : a geopandas DataFrame

A geopandas DataFrame that contains the columns used to perform the estimate.

Methods

plot()

Plot the Spatial Proximity Profile
