# WAVES: Feature Extraction from Basal Body Temperature (BBT) Signals of Menstrual Cycles

This repository contains the code accompanying the publication:

> **Identifying Menstrual Metrics as Personal Health Markers: Age Trends and Individual Footprints in Temperature Across 5,674 Cycles**  
> Marie Gombert-Labedens, Alan Taitz, Orsolya Kiss, Fiona C. Baker


The code implements the **WAVES** algorithm, which extracts a comprehensive set of features from basal body temperature (BBT) time series measured across menstrual cycles. It also includes example workflows for feature extraction and downstream analysis.

---

## Repository Structure

- `1.feature_extraction.ipynb`  
  Jupyter notebook demonstrating how to apply the WAVES algorithm to BBT data to extract cycle-level features.

- `2.analysis.ipynb`  
  Jupyter notebook showing downstream analysis of the extracted features (e.g., summary statistics, visualizations, or statistical modeling used in the manuscript).

- `build_df_functions.py`  
  Helper functions for loading and preprocessing raw data files into pandas DataFrames.  
  These include, among others:
  - `build_df_raw_data(...)`: reads BBT data from `.dbf` files using `dbfread` and returns a cleaned pandas DataFrame.

- `waves_functions.py`  
  Core implementation of the **WAVES** feature extraction algorithm for BBT signals.  
  Key functions include, for example:
  - `extract_features_full(signal, t_smooth, sampling_rate=1)`: compute a wide set of time-, frequency-, and wavelet-based features from a BBT time series.
  
  Internally, this module uses:
  - `numpy`, `pandas`
  - `scipy` (FFT, interpolation, signal processing)
  - `pywt` (wavelet transforms)
  - `sklearn` (regression / metrics)
  - `matplotlib` for optional plotting

- `cosinor.py`  
  Functions for cosinor analysis of rhythmic signals, such as:
  - `cosinor_analysis(time_stamps, signal, period=24, ...)`: estimate MESOR, amplitude, and acrophase from BBT time series.

> **Note:** `build_df_functions.py` imports a module named `utils_functions`.  
> If that module is not included in the repository, or is not required for reproducing the published results, adapt the import or provide your own implementation. The core WAVES algorithm lives in `waves_functions.py` and does **not** depend on this helper module.

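To make the cosinor terminology above concrete, here is a minimal sketch of a single-component cosinor fit by linear least squares. It is an illustration only and is not the implementation in `cosinor.py`: the function name `cosinor_fit_sketch`, its return format, the 28-day period, and the synthetic signal are all assumptions made for this example. The model is `y = MESOR + A * cos(2*pi*t/period + phi)`, which becomes linear in its coefficients once the cosine is expanded into cosine and sine terms.

```python
import numpy as np


def cosinor_fit_sketch(t, y, period):
    """Illustrative single-component cosinor fit (hypothetical helper,
    not the repository's `cosinor_analysis`)."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    omega = 2.0 * np.pi / period

    # Linearized model: y = M + beta * cos(omega t) + gamma * sin(omega t)
    design = np.column_stack(
        [np.ones_like(t), np.cos(omega * t), np.sin(omega * t)]
    )
    (mesor, beta, gamma), *_ = np.linalg.lstsq(design, y, rcond=None)

    # Polar form: y = M + A * cos(omega t + phi),
    # where beta = A * cos(phi) and gamma = -A * sin(phi).
    amplitude = np.hypot(beta, gamma)
    acrophase = np.arctan2(-gamma, beta)  # radians
    return {
        "mesor": float(mesor),
        "amplitude": float(amplitude),
        "acrophase": float(acrophase),
    }


# Synthetic, noise-free example: one reading per day over an assumed 28-day cycle.
t_days = np.arange(28)
bbt = 36.5 + 0.2 * np.cos(2 * np.pi * t_days / 28.0 - np.pi / 2)

result = cosinor_fit_sketch(t_days, bbt, period=28.0)
print(result)
```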
---

## Installation

We recommend using Python 3.9+ and a virtual environment (e.g., `venv` or `conda`).

```bash

# Option 1: install from requirements.txt
pip install -r requirements.txt

# Option 2: install the dependencies manually
pip install numpy pandas scipy matplotlib scikit-learn pywavelets dbfread
```
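
After installing, it can be useful to confirm that the dependencies are importable before opening the notebooks. The sketch below is a hypothetical helper, not part of the repository. It checks import names, which differ from the pip package names in two cases: `scikit-learn` is imported as `sklearn`, and `pywavelets` as `pywt`.

```python
import importlib.util

# Import names (not pip names) of the packages listed in the install command.
REQUIRED_MODULES = [
    "numpy",
    "pandas",
    "scipy",
    "matplotlib",
    "sklearn",
    "pywt",
    "dbfread",
]


def find_missing_modules(module_names):
    """Return the names that cannot be located in the current environment."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


missing = find_missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All listed dependencies are importable.")
```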
---

## Quick Start: Computing WAVES Features

Below is a minimal example showing how to compute WAVES features for a single BBT cycle using `waves_functions.py`:

```python
import numpy as np
import pandas as pd
from waves_functions import extract_features_full

# Example: create a synthetic BBT signal for one cycle
# (replace this with your own BBT data)
rng = np.random.default_rng(0)  # fixed seed so the example is reproducible
n_points = 100
t_smooth = np.linspace(0, 1, n_points)  # e.g., normalized cycle time
signal = 36.5 + 0.2 * np.sin(2 * np.pi * t_smooth) + 0.05 * rng.standard_normal(n_points)

# Compute features
features = extract_features_full(signal, t_smooth, sampling_rate=1)

# Convert to a pandas Series or DataFrame for inspection
features_series = pd.Series(features)
print(features_series.head())