june.infection_seed.observed_to_cases.Observed2Cases
-
class
june.infection_seed.observed_to_cases.
Observed2Cases
(age_per_area_df: pandas.core.frame.DataFrame, female_fraction_per_area_df: pandas.core.frame.DataFrame, health_index_generator: HealthIndexGenerator = None, symptoms_trajectories: Optional[TrajectoryMaker] = None, n_observed_deaths: Optional[pandas.core.frame.DataFrame] = None, area_super_region_df: Optional[pandas.core.frame.DataFrame] = None, smoothing=False) Class to convert observed deaths over time into the predicted number of latent cases over time, used to seed the infection. It reads population data to compute average death rates for a particular region, timings from the trajectories config file to estimate the median time it takes for an infected person to die in hospital, and the health index to obtain the death rate as a function of age and sex.
- age_per_area_df:
data frame with the age distribution per area, to compute the weighted death rate
- female_fraction_per_area_df:
data frame with the fraction of females per area as a function of age to compute the weighted death rate
- health_index_generator:
generator of the health index to compute death_rate(age,sex)
- symptoms_trajectories:
used to read the trajectory config file and compute the median time it takes to die in hospital
- n_observed_deaths:
time series with the number of observed deaths per region
- area_super_region_df:
df with area, super_area, region mapping
- smoothing:
whether to smooth the observed deaths time series before computing the expected number of cases, so that the estimates become less dependent on spikes in the data
-
__init__
(age_per_area_df: pandas.core.frame.DataFrame, female_fraction_per_area_df: pandas.core.frame.DataFrame, health_index_generator: HealthIndexGenerator = None, symptoms_trajectories: Optional[TrajectoryMaker] = None, n_observed_deaths: Optional[pandas.core.frame.DataFrame] = None, area_super_region_df: Optional[pandas.core.frame.DataFrame] = None, smoothing=False) Class to convert observed deaths over time into the predicted number of latent cases over time, used to seed the infection. It reads population data to compute average death rates for a particular region, timings from the trajectories config file to estimate the median time it takes for an infected person to die in hospital, and the health index to obtain the death rate as a function of age and sex.
- age_per_area_df:
data frame with the age distribution per area, to compute the weighted death rate
- female_fraction_per_area_df:
data frame with the fraction of females per area as a function of age to compute the weighted death rate
- health_index_generator:
generator of the health index to compute death_rate(age,sex)
- symptoms_trajectories:
used to read the trajectory config file and compute the median time it takes to die in hospital
- n_observed_deaths:
time series with the number of observed deaths per region
- area_super_region_df:
df with area, super_area, region mapping
- smoothing:
whether to smooth the observed deaths time series before computing the expected number of cases, so that the estimates become less dependent on spikes in the data
-
_smooth_time_series
(time_series_df: pandas.core.frame.DataFrame) → pandas.core.frame.DataFrame Smooth a time series by applying a one-dimensional Gaussian filter
- time_series_df:
df with time as index
smoothed time series df
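To illustrate what one-dimensional Gaussian smoothing does to a spiky series, here is a minimal sketch in plain Python. It is not the library's implementation: the kernel width `sigma`, the truncation radius, the edge handling, and the sample data are all assumptions made for this example.

```python
import math
from typing import List


def gaussian_kernel(sigma: float, truncate: float = 4.0) -> List[float]:
    """Discrete Gaussian weights, normalised to sum to 1 (assumed truncation)."""
    radius = int(truncate * sigma + 0.5)
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_series(values: List[float], sigma: float = 2.0) -> List[float]:
    """Smooth a 1D series; near the edges the kernel is renormalised."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    smoothed = []
    for i in range(len(values)):
        acc, norm = 0.0, 0.0
        for k, w in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < len(values):  # ignore points outside the series
                acc += w * values[j]
                norm += w
        smoothed.append(acc / norm)
    return smoothed


# Hypothetical daily deaths with a reporting spike on day 3.
daily_deaths = [2, 3, 2, 20, 3, 2, 3]
smoothed_deaths = smooth_series(daily_deaths, sigma=1.0)
```

The spike is spread over neighbouring days, so a single anomalous report contributes less to the inferred number of cases on any one date.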
-
aggregate_age_sex_dfs_by_region
(age_per_area_df: pandas.core.frame.DataFrame, female_fraction_per_area_df: pandas.core.frame.DataFrame) -> (<class 'pandas.core.frame.DataFrame'>, <class 'pandas.core.frame.DataFrame'>) Combines the age per area dataframe and female fraction per area to create two data frames with numbers of females by age per region, and numbers of males by age per region
- age_per_area_df:
data frame with the number of people with a certain age per area
- female_fraction_per_area_df:
fraction of those that are females per area and age bin
- females_per_age_region_df:
number of females as a function of age per region
- males_per_age_region_df:
number of males as a function of age per region
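The split into females and males per region can be sketched as follows. Plain dictionaries stand in for the data frames, and the area names, ages, counts, and the assumption that both inputs share the same age binning are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Tuple

Key = Tuple[str, int]  # (area or region, age)

# Hypothetical inputs: people per (area, age) and female fraction per (area, age).
age_counts: Dict[Key, float] = {("A1", 30): 100, ("A1", 70): 40, ("A2", 30): 50, ("A2", 70): 60}
female_fraction: Dict[Key, float] = {("A1", 30): 0.5, ("A1", 70): 0.6, ("A2", 30): 0.4, ("A2", 70): 0.5}
area_to_region: Dict[str, str] = {"A1": "North", "A2": "North"}


def split_by_sex_per_region(counts, fractions, area_region):
    """Return (females, males), each keyed by (region, age)."""
    females = defaultdict(float)
    males = defaultdict(float)
    for (area, age), n_people in counts.items():
        region = area_region[area]
        n_female = n_people * fractions[(area, age)]
        females[(region, age)] += n_female
        males[(region, age)] += n_people - n_female
    return dict(females), dict(males)


females_per_region, males_per_region = split_by_sex_per_region(
    age_counts, female_fraction, area_to_region
)
```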
-
aggregate_areas_by_region
(df_per_area: pandas.core.frame.DataFrame) → pandas.core.frame.DataFrame Aggregates an area dataframe into a region dataframe
- df_per_area:
data frame indexed by area
-
convert_regional_cases_to_super_area
(n_cases_per_region_df: pandas.core.frame.DataFrame, dates: Union[List[str], Dict[str, List]]) → pandas.core.frame.DataFrame Converts regional cases to cases by super area by weighting each super area within the region according to its population
- n_cases_per_region_df:
data frame with the number of cases by region, indexed by date
- dates:
dates to select (it can be a dictionary with different dates for different regions)
data frame with the number of cases by super area, indexed by date
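A rough sketch of the conversion, including the two accepted shapes of `dates`, might look like this. Dictionaries replace the data frames, and the region names, weights, and how the real method treats missing dates are assumptions.

```python
from typing import Dict, List, Union

# Hypothetical regional cases indexed by date, and population weights per region.
regional_cases = {
    "2020-03-01": {"North": 200, "South": 40},
    "2020-03-02": {"North": 240, "South": 60},
}
weights = {"North": {"S1": 0.75, "S2": 0.25}, "South": {"S3": 1.0}}


def cases_by_super_area(
    cases: Dict[str, Dict[str, float]],
    region_weights: Dict[str, Dict[str, float]],
    dates: Union[List[str], Dict[str, List[str]]],
) -> Dict[str, Dict[str, float]]:
    """Distribute regional cases over super areas in proportion to population."""
    result: Dict[str, Dict[str, float]] = {}
    for region, super_area_weights in region_weights.items():
        # A dict gives region-specific dates; a list applies to every region.
        selected = dates[region] if isinstance(dates, dict) else dates
        for day in selected:
            row = result.setdefault(day, {})
            for super_area, weight in super_area_weights.items():
                row[super_area] = cases[day][region] * weight
    return result


same_dates = cases_by_super_area(regional_cases, weights, ["2020-03-01"])
per_region_dates = cases_by_super_area(
    regional_cases, weights, {"North": ["2020-03-01"], "South": ["2020-03-02"]}
)
```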
-
filter_symptoms_trajectories
(symptoms_trajectories: List[TrajectoryMaker], symptoms_to_keep: Tuple[str] = ('dead_hospital', 'dead_icu')) → List[june.infection.trajectory_maker.TrajectoryMaker] Filter all symptoms trajectories to obtain only the ones that contain the symptoms given in
`symptoms_to_keep`
- symptoms_trajectories:
list of all symptoms trajectories
- symptoms_to_keep:
tuple of strings containing the desired symptoms for which to find trajectories
trajectories containing
`symptoms_to_keep`
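The filtering idea can be sketched with a small stand-in class. `Trajectory` and its `stages` attribute are hypothetical and do not reflect the real `TrajectoryMaker` interface. The stage names other than `dead_hospital` and `dead_icu` are also made up, and keeping a trajectory when it contains any one of the requested symptoms is an assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Trajectory:
    """Hypothetical stand-in: an ordered list of symptom tags."""
    stages: List[str]


def filter_trajectories(
    trajectories: List[Trajectory],
    symptoms_to_keep: Tuple[str, ...] = ("dead_hospital", "dead_icu"),
) -> List[Trajectory]:
    """Keep trajectories that pass through at least one requested symptom."""
    wanted = set(symptoms_to_keep)
    return [t for t in trajectories if wanted.intersection(t.stages)]


all_trajectories = [
    Trajectory(["exposed", "mild", "recovered"]),
    Trajectory(["exposed", "hospitalised", "dead_hospital"]),
    Trajectory(["exposed", "hospitalised", "intensive_care", "dead_icu"]),
]
fatal_trajectories = filter_trajectories(all_trajectories)
```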
-
classmethod
from_file
(health_index_generator, age_per_area_path: str = PosixPath('/home/sadie/JUNE/data/input/demography/age_structure_single_year.csv'), female_fraction_per_area_path: str = PosixPath('/home/sadie/JUNE/data/input/demography/female_ratios_per_age_bin.csv'), trajectories_path: str = PosixPath('/home/sadie/JUNE/configs/defaults/symptoms/trajectories.yaml'), observed_deaths_path: str = PosixPath('/home/sadie/JUNE/data/covid_real_data/n_deaths_region.csv'), area_super_region_path: str = PosixPath('/home/sadie/JUNE/data/input/geography/area_super_area_region.csv'), smoothing=False) → june.infection_seed.observed_to_cases.Observed2Cases Creates class from paths to data
- health_index_generator:
generator of the health index to compute death_rate(age,sex)
- age_per_area_path:
path to data with number of people of a given age by area
- female_fraction_per_area_path:
path to data with fraction of people that are female by area and age bin
- trajectories_path:
path to config file with possible symptoms trajectories and their timings
- observed_deaths_path:
path to time series of observed deaths over time
- area_super_region_path:
path to data on area, super_area, region mapping
- smoothing:
whether to smooth the observed deaths time series before computing the expected number of cases, so that the estimates become less dependent on spikes in the data
Instance of Observed2Cases
-
get_latent_cases_from_observed
(n_observed: int, avg_rates: List) → int Given a number of observed cases, such as observed deaths or observed hospital admissions, this function converts it into number of latent cases necessary to produce such an observation.
- n_observed:
observed number of cases (such as deaths or hospital admissions)
- avg_rates:
average rates to produce the observed cases, such as average death rates or average admission rates. It is a list, since we might want to look at, for instance, the death rate, which is a combination of the dead_home, dead_hospital and dead_icu rates.
Number of latent cases
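One plausible reading of this conversion is that the rates in the list are summed into a single probability and the observed count is divided by it. The sketch below assumes exactly that, plus rounding to the nearest integer; the rate values are invented for illustration.

```python
from typing import List


def latent_cases_from_observed(n_observed: int, avg_rates: List[float]) -> int:
    """Estimate infections needed to produce n_observed outcomes (assumed formula)."""
    total_rate = sum(avg_rates)  # e.g. dead_home + dead_hospital + dead_icu
    if total_rate <= 0:
        raise ValueError("the combined rate must be positive")
    return round(n_observed / total_rate)


# Hypothetical average death rates for one region.
death_rates = [0.006, 0.003, 0.001]
n_latent = latent_cases_from_observed(12, death_rates)
```

With a combined death rate of about 1%, 12 observed deaths correspond to roughly 1200 infections.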
-
get_latent_cases_per_region
(n_observed_df: pandas.core.frame.DataFrame, time_to_get_symptoms: int, avg_rates_per_region: dict) → pandas.core.frame.DataFrame Converts observed cases per region into latent cases per region.
- n_observed_df:
time series of the observed cases
- time_to_get_symptoms:
days it takes from infection to the symptoms of interest (such as time to death)
- avg_rates_per_region:
average probability to get those symptoms per region
- n_cases_per_region_df:
number of latent cases per region time series
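The role of `time_to_get_symptoms` can be illustrated by shifting each observation date back by that many days, on the assumption that people who die on a given day were infected that much earlier. The sketch below uses dictionaries instead of data frames and a single rate per region; both choices, and the numbers, are assumptions.

```python
from datetime import date, timedelta
from typing import Dict


def latent_cases_per_region(
    observed: Dict[date, Dict[str, int]],
    time_to_get_symptoms: int,
    avg_rate_per_region: Dict[str, float],
) -> Dict[date, Dict[str, int]]:
    """Scale observed counts by the regional rate and move them back in time."""
    latent: Dict[date, Dict[str, int]] = {}
    for day, counts in observed.items():
        infection_day = day - timedelta(days=time_to_get_symptoms)
        latent[infection_day] = {
            region: round(n_observed / avg_rate_per_region[region])
            for region, n_observed in counts.items()
        }
    return latent


# Hypothetical observed deaths and regional death rates.
observed_deaths = {date(2020, 3, 20): {"London": 10, "North East": 2}}
rates = {"London": 0.01, "North East": 0.02}
latent_cases = latent_cases_per_region(observed_deaths, 14, rates)
```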
-
get_median_completion_time
(stage: Stage) → float Get median completion time of a stage, from its distribution
- stage:
given stage in trajectory
Median time spent in stage
-
get_regional_latent_cases
() → pandas.core.frame.DataFrame Find regional latent cases from the observed deaths.
data frame with latent cases per region indexed by date
-
get_super_area_population_weights
() → pandas.core.frame.DataFrame Compute the weight in population that a super area has over its whole region, used to convert regional cases to cases by super area by population density
data frame indexed by super area, with weights and region
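As a sketch of the weighting, each super area's weight can be taken as its population divided by the total population of its region. The super area names and populations below are hypothetical, and a nested dictionary stands in for the returned data frame.

```python
from collections import defaultdict
from typing import Dict

# Hypothetical populations and region membership.
super_area_population = {"S1": 3000, "S2": 1000, "S3": 500}
super_area_region = {"S1": "North", "S2": "North", "S3": "South"}


def population_weights(
    population: Dict[str, float], region_of: Dict[str, str]
) -> Dict[str, Dict[str, object]]:
    """Return, per super area, its region and its share of the region's population."""
    region_totals = defaultdict(float)
    for super_area, n_people in population.items():
        region_totals[region_of[super_area]] += n_people
    return {
        super_area: {
            "region": region_of[super_area],
            "weight": n_people / region_totals[region_of[super_area]],
        }
        for super_area, n_people in population.items()
    }


weights_by_super_area = population_weights(super_area_population, super_area_region)
```

By construction the weights within each region sum to one, so distributing regional cases with them conserves the regional total.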
-
get_symptoms_rates_per_age_sex
() → dict Computes the rates of ending up with certain SymptomTag for all ages and sex.
dictionary with rates of symptoms (fate) as a function of age and sex
-
get_time_it_takes_to_symptoms
(symptoms_trajectories: List[TrajectoryMaker], symptoms_tags: List[str]) Compute the median time it takes to get certain symptoms in
`symptoms_tags`
, such as death or hospital admission.
- symptoms_trajectories:
list of symptoms trajectories
- symptoms_tags:
symptoms tags for the symptoms of interest
-
get_weighted_time_to_symptoms
(symptoms_trajectories: List[TrajectoryMaker], avg_rate_for_symptoms: List[float], symptoms_tags: List[str]) → float Get the time it takes to get certain symptoms weighted by population. For instance, when computing the death rate, more people die in hospital than in icu, therefore the median time to die in hospital gets weighted more than the median time to die in icu.
- symptoms_trajectories:
trajectories for symptoms that include the symptoms of interest
- avg_rate_for_symptoms:
list containing the average rate for certain symptoms given in
`symptoms_tags`
. WARNING: the rates must be in the same order as the tags
- symptoms_tags:
tags of the symptoms for which we want to know the median time
Weighted median time to symptoms
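The weighting can be sketched as a rate-weighted average of per-outcome median times. The median times are passed in directly here rather than derived from trajectories, and all numbers are illustrative assumptions.

```python
from typing import List


def weighted_time_to_symptoms(median_times: List[float], avg_rates: List[float]) -> float:
    """Average the median times, weighting each by how common its outcome is."""
    if len(median_times) != len(avg_rates):
        raise ValueError("median_times and avg_rates must have the same length and order")
    total_rate = sum(avg_rates)
    return sum(t * r for t, r in zip(median_times, avg_rates)) / total_rate


# Hypothetical values, ordered as (dead_hospital, dead_icu).
median_days = [18.0, 22.0]
rates_for_symptoms = [0.006, 0.002]
weighted_days = weighted_time_to_symptoms(median_days, rates_for_symptoms)
```

Because the hospital rate is three times the ICU rate in this example, the result of about 19 days sits much closer to the hospital median than to the ICU median.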
-
weight_rates_by_age_sex_per_region
(symptoms_rates_dict: dict, symptoms_tags: List[SymptomTag]) → List[float] Get the weighted average by age and sex of symptoms rates for symptoms in symptoms_tags. For example, to get the weighted average death rate per region, select symptoms_tags = ('dead_hospital', 'dead_icu', 'dead_home')
- symptoms_rates_dict:
dictionary with rates for all the possible final symptoms, indexed by sex and age.
- symptoms_tags:
final symptoms to keep
List of weighted rates for symptoms in
`symptoms_tags`
(ordered in the same way!!)
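A population-weighted average over age and sex can be sketched as below for a single region. The dictionary layout (rates keyed by tag, then by sex and age), the population counts, and the rate values are assumptions made for this example.

```python
from typing import Dict, List, Tuple

Group = Tuple[str, int]  # (sex, age)

# Hypothetical population of one region and outcome rates by sex and age.
population: Dict[Group, float] = {("f", 30): 100, ("m", 30): 100, ("f", 80): 50, ("m", 80): 50}
symptoms_rates: Dict[str, Dict[Group, float]] = {
    "dead_hospital": {("f", 30): 0.001, ("m", 30): 0.002, ("f", 80): 0.05, ("m", 80): 0.07},
    "dead_icu": {("f", 30): 0.0005, ("m", 30): 0.001, ("f", 80): 0.01, ("m", 80): 0.02},
}


def weighted_rates(
    rates: Dict[str, Dict[Group, float]],
    people: Dict[Group, float],
    symptoms_tags: List[str],
) -> List[float]:
    """Return one population-weighted rate per tag, in the order of symptoms_tags."""
    total_people = sum(people.values())
    return [
        sum(rates[tag][group] * n for group, n in people.items()) / total_people
        for tag in symptoms_tags
    ]


avg_rates = weighted_rates(symptoms_rates, population, ["dead_hospital", "dead_icu"])
```

Returning the rates in the same order as the tags matters because downstream steps pair each rate with its tag by position.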