## This readme file was generated on 2025-01-22 by Luisa C. Eggenschwiler

## GENERAL INFORMATION

# Title of Analysis: Perinatal midwifery care demand in a tertiary hospital: A time-series analysis 

# Principal Investigator Information
Name: Michael Simon
ORCID: https://orcid.org/0000-0003-2349-7219
Institution: Institute of Nursing Science, University of Basel
Address: Basel, Switzerland
Email: m.simon@unibas.ch

# Author Information
Name: Luisa C. Eggenschwiler
ORCID: https://orcid.org/0000-0002-1939-2345
Institution: Institute of Nursing Science, University of Basel
Address: Basel, Switzerland
Email: luisa.eggenschwiler@unibas.ch

# Date of data collection: 2019-01-01 - 2022-12-31

# Geographic location of data collection: Tertiary hospital in Switzerland

# Information about funding sources that supported the collection of the data: Swiss National Science Foundation

# Ethical considerations: The relevant ethics committee waived ethical approval because the study falls outside the scope of the Human Research Act (Req-2022-00208).

# Description:
The aim of the related study was to describe shift-level care demand and available staffing resources in a tertiary hospital's maternity department.
The single-centre retrospective longitudinal study investigated a four-year timeframe (2019–2022). 
All registered midwives and nurses working a three-shift pattern in the prenatal unit, labour ward, or postnatal unit were included. 
To determine care demand, we used a novel approach that accounts for both the number of women on each unit and each case’s expected complexity.
Any unmet care demand was calculated in relation to pre-specified nurse-to-patient ratios for each care area by subtracting demand hours from available staff hours per shift.
In total, 17,558 cases were included and 13,149 worked shifts analysed. The match of staffing resources with care demand was different for each analysed unit. 
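
The subtraction described above can be illustrated with a minimal sketch. The data frame, column names and values are hypothetical and are not taken from the study data or the shared R script; demand hours are assumed to have been derived from the pre-specified ratios beforehand.

```r
# Hypothetical shift-level inputs; column names and values are illustrative only.
shifts <- data.frame(
  unit         = c("Unit1", "Unit1", "Unit3"),
  staff_hours  = c(24, 16, 32),  # available staff hours in the shift
  demand_hours = c(20, 18, 32)   # care demand hours, assumed already derived from the ratios
)

# Balance = available staff hours minus demand hours.
shifts$balance <- shifts$staff_hours - shifts$demand_hours

# A negative balance indicates unmet care demand in that shift.
shifts$unmet_hours <- pmax(0, -shifts$balance)
```

In this sketch only the second shift has unmet demand, because its demand hours exceed the available staff hours.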

## SHARING/ACCESS INFORMATION

# Licenses/restrictions placed on the data: CC-BY-NC 4.0

# Links to publications that cite or use the data: https://doi.org/10.1016/j.ijnsa.2025.100299

# Links to other publicly accessible locations of the data: Not publicly accessible

# Links/relationships to ancillary data sets: N/A

# Was data derived from another source? YES
The data were derived from routine hospital data. The clinical site runs a clinical data warehouse in which patient data are already partly structured.
Some data cleaning was still necessary.
Staff data were exported from the clinical data warehouse but were not changed by the data analysts and are therefore identical to a direct export from the rostering system.

Patient data sets:
- Birth [Mother & newborn allocation, information on birth date and time]
- Consent [Information on general consent]
- Diagnose [ICD-10-GM codes]
- Procedure [CHOP codes]
- DRG [diagnosis-related group (DRG) codes]
- Patient [Patient characteristics like age, citizenship, language]
- Movement [Enterprise resource planning (ERP) system information on patient movement]

Nurses / midwives data sets:
- PEP Block [actual worked hours]
- PEP Planung [planned hours]
- PEP Mitarbeiter detail [contract details]

# Recommended citation for this dataset: N/A

## DATA & FILE OVERVIEW

# File List: 
- CB_Mothers_2025-01-22.xlsx
Codebook for data of the demand side (patients / mothers). 
The codebook includes three sheets, one for each unit, as there are minor unit-specific differences.
The corresponding data sets (not shared) are:
Unit1_Mothers.RData (14 columns, 841,530 rows)
Unit2_Mothers.RData (13 columns, 1,122,040 rows) 
Unit2_Mothers_outpatient.RData (10 columns, 1,431 rows)
Unit3_Mothers.RData (14 columns, 1,262,295 rows)

- CB_Nurses_2025-01-22.xlsx
Codebook for data of the supply side (nurses / midwives). 
The codebook includes three sheets, one for each unit, as there are minor unit-specific differences.
The corresponding data sets (not shared) are:
Unit1_Nurses.RData (10 columns, 560,912 rows)
Unit2_Nurses.RData (11 columns, 701,140 rows)
Unit3_Nurses.RData (10 columns, 560,912 rows)

In both codebooks, each Excel sheet has the same name as the data set it describes.
The data sets cannot be shared because of Swiss data protection law.

- Patient_Nurse_Hours_Complexity_2025-01-22.R
R code with which the main analysis was conducted. We provide it together with the codebooks for the main data sets to convey the structure of the data.
Many preparation steps are needed before the data reach this format and structure.
Because routine data differ between hospitals, we decided to provide only the final data structure.
To reproduce the analysis with other routine hospital data, the final data can be structured as described in the codebooks.

## METHODOLOGICAL INFORMATION

The related publication gives further information on the clinical site. The data were not collected specifically for study purposes.
Patient data is collected for different reasons and in different software. 
The electronic health record (EHR) documents all aspects of clinical care, such as medication, diagnoses, procedures and progress.
From the EHR, the medical controller identifies all diagnoses (ICD-10-GM) and procedures (CHOP).
Based on these, a DRG code is generated. This DRG code is then used for insurance claims. 
Therefore, the diagnoses, procedures and DRG codes are checked thoroughly. 
Patient characteristics (date of birth, citizenship, insurance) are stored in the ERP system and are updated and checked at every patient visit to the clinical site.
These data are patient-specific rather than case-specific. The patients themselves have to check this information at each hospital admission.
The ERP system also stores the movement entries, which record which patient was on which unit and when.
Administrative staff, and nurses / midwives who have completed an additional short training, are allowed to add movement entries.
There are clear rules on how admissions, transfers and discharges must be registered.
The admission and discharge entries are later transferred to the Federal Statistical Office.

Staff data is generated through the rostering system (PEP in this clinical site). 
In the rostering system, staff characteristics (date of birth, education, contract information) are saved. 
Furthermore, the planned roster and the actual working time are saved in two separate data sets.
Only leadership staff who have completed a specific training may change the planned roster.
Only the head nurse of each unit, after an additional specific training, may change the actual working time (e.g., add overtime).
The actual working time is then transferred to HR and used for payroll.
Staff receive a monthly overview to check (e.g., whether all overtime has been entered correctly).
Because this affects their pay, we expect the data to be fairly reliable.

# Methods for processing the data: 
All data were extracted from the clinical data warehouse to a clinical site computer running Windows 10 with R and RStudio (versions not available).
In this environment, a new identifier based on random numbers was created for each patient, case and staff identifier (e.g., Patient12345, Case12345, Nurse12345).
The identifiers were replaced in each data set separately, and all identifying information (names, original identifiers) was deleted.
These pseudonymized data sets were then sent by encrypted transfer to the research site's servers, where data curation and analysis were conducted.
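
A minimal sketch of such an identifier replacement is shown below. It is an illustration only, not the code used at the clinical site: the raw identifiers, column names and the five-digit number range are assumptions, and the same lookup key is assumed to be reused for every data set so that the pseudonymized data sets can still be merged.

```r
set.seed(2025)  # only to make this illustration repeatable

# Hypothetical raw data; the same patient appears in two rows.
raw <- data.frame(
  patient_id = c("P-001", "P-002", "P-001"),
  name       = c("A", "B", "A")
)

# One random five-digit number per distinct identifier, drawn without replacement.
ids <- unique(raw$patient_id)
key <- setNames(paste0("Patient", sample(10000:99999, length(ids))), ids)

# Replace the identifier via the lookup key and drop identifying columns.
pseudo <- data.frame(patient_id = unname(key[raw$patient_id]))
```

Because the key maps each original identifier to exactly one new identifier, repeated rows of the same patient keep a common pseudonym.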

Patient data:
Patient data sets were first cleaned separately. Rows labelled “deleted entry” were deleted and plausibility checks were conducted.
Then patient data sets were merged by pseudonymized patient and case identifiers. 
To obtain one row per case, each data set with more than one row per case (diagnoses, procedures, movement) was nested.
This produced a single data set with all patient and case information in one row per case.
Some patients had more than one hospital admission.
Because we were interested in each case, each row represents one case, so one patient may have more than one row.
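
The nesting and merging step could look roughly as follows. This is a sketch using `tidyr::nest()` and `dplyr::left_join()`; the data set names, column names and codes are made up for illustration and do not reflect the actual variables in the codebooks.

```r
library(dplyr)
library(tidyr)

# Hypothetical case table: Patient1 has two admissions (cases).
cases <- tibble(
  patient_id = c("Patient1", "Patient1", "Patient2"),
  case_id    = c("Case1", "Case2", "Case3")
)

# Hypothetical diagnosis table with several rows per case.
diagnoses <- tibble(
  case_id  = c("Case1", "Case1", "Case2", "Case3"),
  icd_code = c("O80", "O70.0", "O82", "O80")
)

# Collapse the diagnoses into one list-column entry per case.
diagnoses_nested <- diagnoses %>%
  nest(diagnoses = icd_code)

# Merge so that each row represents exactly one case.
case_level <- cases %>%
  left_join(diagnoses_nested, by = "case_id")
```

The result has one row per case, and the `diagnoses` list column keeps all codes belonging to that case.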

The movement data set, which records which patient was on which unit and when, was used to prepare the unit-specific number of patients.
For each case, the time between admission, transfer and discharge entries was padded.
This made it possible to summarise the number of patients at each time point (in 15-minute increments) for each unit and date throughout the time frame.
This resulted in the final data sets used for the analysis (Unit1_Mothers.RData, Unit2_Mothers.RData, Unit2_Mothers_outpatient.RData, Unit3_Mothers.RData).
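
As an illustration of the padding and counting described above, the sketch below expands each stay into 15-minute time points and counts the patients per unit and time point. It uses base `seq()` as a stand-in for the padr package named in this readme; all column names and values are hypothetical, and treating the end of a stay as exclusive is an assumption.

```r
library(dplyr)
library(tidyr)

# Hypothetical movement entries: one row per stay of a case on a unit.
movement <- tibble(
  case_id   = c("Case1", "Case1", "Case2"),
  unit      = c("Unit1", "Unit2", "Unit1"),
  from_time = as.POSIXct(c("2019-01-01 08:00", "2019-01-01 09:00", "2019-01-01 08:30"), tz = "UTC"),
  to_time   = as.POSIXct(c("2019-01-01 09:00", "2019-01-01 10:00", "2019-01-01 09:15"), tz = "UTC")
)

# Expand each stay into 15-minute time points (end of stay treated as exclusive).
padded <- movement %>%
  rowwise() %>%
  mutate(time_point = list(seq(from_time, to_time - 15 * 60, by = "15 min"))) %>%
  ungroup() %>%
  unnest(time_point)

# Number of patients per unit and time point.
census <- padded %>%
  count(unit, time_point, name = "n_patients")
```

Expanding every stay in this way is what makes the row counts grow so quickly, which is consistent with preparing each unit separately.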

Because the patient and nurse characteristics differ slightly between units and the padded data would reach more than 5 million data points, each unit was prepared separately.

Staff data:
Staff data sets were cleaned separately. Rows labelled “deleted entry” were deleted and plausibility checks were conducted.
For example, the duration of worked time was checked for outliers (working time >14 hours).
The staff characteristics and the actual worked time (including the unit where the staff worked) were merged by staff identifier to assign a profession to the worked hours.
Then the start and end of working time were pivoted into long format (pivot_longer()) so that the time between start and end could again be padded in 15-minute increments.
With this padded data, the number of staff per profession (group_by()) was summarised by unit, date and time point.
This resulted in the three data sets (Unit1_Nurses.RData, Unit2_Nurses.RData, Unit3_Nurses.RData).
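
The staff-side steps might be sketched as follows. The example follows the described sequence of `pivot_longer()`, padding and `group_by()`, but fills the time points with base `seq()` instead of the padr package; the column names, professions and times are hypothetical, and the end of a shift is assumed to be exclusive.

```r
library(dplyr)
library(tidyr)

# Hypothetical worked shifts; all names and values are illustrative.
worked <- tibble(
  shift_id   = 1:3,
  unit       = "Unit1",
  profession = c("Midwife", "Midwife", "Nurse"),
  start = as.POSIXct(c("2019-01-01 07:00", "2019-01-01 07:30", "2019-01-01 07:00"), tz = "UTC"),
  end   = as.POSIXct(c("2019-01-01 08:00", "2019-01-01 08:00", "2019-01-01 07:30"), tz = "UTC")
)

# Start and end become two rows per shift in a single datetime column.
long <- worked %>%
  pivot_longer(c(start, end), names_to = "boundary", values_to = "time_point")

# Fill in the 15-minute time points between start and end of each shift.
padded <- long %>%
  group_by(shift_id, unit, profession) %>%
  summarise(
    time_point = list(seq(min(time_point), max(time_point) - 15 * 60, by = "15 min")),
    .groups = "drop"
  ) %>%
  unnest(time_point)

# Number of staff per unit, profession and time point.
staffing <- padded %>%
  group_by(unit, profession, time_point) %>%
  summarise(n_staff = n(), .groups = "drop")
```

With one datetime column per shift boundary, the same filling logic can be applied to patients and staff alike.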

# Instrument- or software-specific information needed to interpret the data: R Version 4.2.4, RStudio Version 2023.03.0, Ubuntu Linux 22.04 LTS
Packages: tidyverse, padr, chron

# Describe any quality-assurance procedures performed on the data: 
Plausibility checks were conducted and discussed with the clinical collaborator. 
Distributions and patterns of missing data were assessed. 

# People involved with sample collection, processing, analysis and/or submission: 
Routinely collected hospital data involves clinical frontline staff, unit leaders, physicians and medical controllers entering data. 

## DATA-SPECIFIC INFORMATION FOR:
See codebooks provided
