yasa.EpochByEpochAgreement
- class yasa.EpochByEpochAgreement(ref_hyps, obs_hyps)
Evaluate agreement between two hypnograms or two collections of hypnograms.
Evaluation includes averaged agreement scores, one-vs-rest agreement scores, agreement scores summarized across all sleep and summarized by sleep stage, and various plotting options to visualize the two hypnograms simultaneously. See examples for more detail.
New in version 0.7.0.
- Parameters
- ref_hyps : iterable of yasa.Hypnogram
  A collection of reference hypnograms (i.e., those considered ground truth).
  Each yasa.Hypnogram in ref_hyps must have the same scorer.
  If a dict, keys are used to generate unique sleep session IDs. If any other iterable (e.g., list or tuple), unique sleep session IDs are generated automatically.
- obs_hyps : iterable of yasa.Hypnogram
  A collection of observed hypnograms (i.e., those to be evaluated).
  Each yasa.Hypnogram in obs_hyps must have the same scorer, and this scorer must differ from the scorer of the hypnograms in ref_hyps.
  If a dict, keys must match those of ref_hyps.

Important: The hypnograms are assumed to be in the same order in ref_hyps and obs_hyps. For example, the third hypnogram in ref_hyps and the third hypnogram in obs_hyps must come from the same sleep session and must differ only in their scorers.

See also: For comparing just two hypnograms, use yasa.Hypnogram.evaluate().
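The session-ID rules above can be sketched in plain Python. This is a minimal illustration and not YASA's implementation: `pair_sessions` is a hypothetical helper, stand-in strings replace real `yasa.Hypnogram` objects, and numbering the generated IDs from 1 is an assumption based on the `sleep_id` values shown in the Examples.

```python
def pair_sessions(ref_hyps, obs_hyps):
    """Pair reference and observed hypnograms by session (hypothetical helper)."""
    if isinstance(ref_hyps, dict):
        # With dicts, keys act as session IDs and must match across both inputs.
        if not isinstance(obs_hyps, dict) or set(ref_hyps) != set(obs_hyps):
            raise ValueError("obs_hyps keys must match those of ref_hyps")
        return {sid: (ref_hyps[sid], obs_hyps[sid]) for sid in ref_hyps}
    # With any other iterable, order defines the pairing and IDs are generated.
    ref_list, obs_list = list(ref_hyps), list(obs_hyps)
    if len(ref_list) != len(obs_list):
        raise ValueError("ref_hyps and obs_hyps must have the same length")
    return {i: pair for i, pair in enumerate(zip(ref_list, obs_list), start=1)}


# Stand-in strings are used here instead of real yasa.Hypnogram objects.
by_key = pair_sessions(
    {"night1": "ref_a", "night2": "ref_b"},
    {"night1": "obs_a", "night2": "obs_b"},
)
by_order = pair_sessions(["ref_a", "ref_b"], ["obs_a", "obs_b"])
```

With dicts, pairing is explicit through the keys. With lists or tuples, pairing depends entirely on position, which is why the ordering requirement above matters.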
Notes
Many steps here are influenced by guidelines proposed in Menghini et al., 2021 [Menghini2021]. See https://sri-human-sleep.github.io/sleep-trackers-performance/AnalyticalPipeline_v1.0.0.html
References
- Menghini2021
Menghini, L., Cellini, N., Goldstone, A., Baker, F. C., & de Zambotti, M. (2021). A standardized framework for testing the performance of sleep-tracking technology: step-by-step guidelines and open-source code. SLEEP, 44(2), zsaa170. https://doi.org/10.1093/sleep/zsaa170
Examples
>>> import yasa
>>> ref_hyps = [yasa.simulate_hypnogram(tib=600, scorer="Human", seed=i) for i in range(10)]
>>> obs_hyps = [h.simulate_similar(scorer="YASA", seed=i) for i, h in enumerate(ref_hyps)]
>>> ebe = yasa.EpochByEpochAgreement(ref_hyps, obs_hyps)
>>> agr = ebe.get_agreement()
>>> agr.head(5).round(2)
          accuracy  balanced_acc  kappa   mcc  precision  recall    f1
sleep_id
1             0.31          0.26   0.07  0.07       0.31    0.31  0.31
2             0.33          0.33   0.14  0.14       0.35    0.33  0.34
3             0.35          0.24   0.06  0.06       0.35    0.35  0.35
4             0.22          0.21   0.01  0.01       0.21    0.22  0.21
5             0.21          0.17  -0.06 -0.06       0.20    0.21  0.21

>>> ebe.get_agreement_bystage().head(12).round(3)
                fbeta  precision  recall  support
stage sleep_id
WAKE  1         0.391      0.371   0.413    189.0
      2         0.299      0.276   0.326    184.0
      3         0.234      0.204   0.275    255.0
      4         0.268      0.285   0.252    321.0
      5         0.228      0.230   0.227    181.0
      6         0.407      0.384   0.433    284.0
      7         0.362      0.296   0.467    287.0
      8         0.298      0.519   0.209    263.0
      9         0.210      0.191   0.233    313.0
      10        0.369      0.420   0.329    362.0
N1    1         0.185      0.185   0.185    124.0
      2         0.121      0.131   0.112    160.0

>>> ebe.get_confusion_matrix(sleep_id=1)
YASA   WAKE  N1   N2  N3  REM
Human
WAKE     78  24   50   3   34
N1       23  23   43  15   20
N2       60  58  183  43  139
N3       30  10   50   5   32
REM      19   9  121  50   78

>>> import matplotlib.pyplot as plt
>>> fig, ax = plt.subplots(figsize=(6, 3), constrained_layout=True)
>>> ebe.plot_hypnograms(sleep_id=10)

>>> fig, ax = plt.subplots(figsize=(6, 3))
>>> ebe.plot_hypnograms(
...     sleep_id=8, ax=ax, obs_kwargs={"color": "red", "lw": 2, "ls": "dotted"}
... )
>>> plt.tight_layout()

>>> session = 8
>>> fig, ax = plt.subplots(figsize=(6.5, 2.5), constrained_layout=True)
>>> style_a = dict(alpha=1, lw=2.5, ls="solid", color="gainsboro", label="Michel")
>>> style_b = dict(alpha=1, lw=2.5, ls="solid", color="cornflowerblue", label="Jouvet")
>>> legend_style = dict(
...     title="Scorer", frameon=False, ncol=2, loc="lower center", bbox_to_anchor=(0.5, 0.9)
... )
>>> ax = ebe.plot_hypnograms(
...     sleep_id=session, ref_kwargs=style_a, obs_kwargs=style_b, legend=legend_style, ax=ax
... )
>>> acc = ebe.get_agreement().multiply(100).at[session, "accuracy"]
>>> ax.text(
...     0.01, 1, f"Accuracy = {acc:.0f}%", ha="left", va="bottom", transform=ax.transAxes
... )

When comparing only 2 hypnograms, use the evaluate() method:

>>> hypno_a = yasa.simulate_hypnogram(tib=90, scorer="RaterA", seed=8)
>>> hypno_b = hypno_a.simulate_similar(scorer="RaterB", seed=9)
>>> ebe = hypno_a.evaluate(hypno_b)
>>> ebe.get_confusion_matrix()
RaterB  WAKE  N1  N2  N3
RaterA
WAKE      71   2  20   8
N1         1   0   9   0
N2        12   4  25   0
N3        24   0   1   3
Methods
- __init__(ref_hyps, obs_hyps)
- get_agreement([sample_weight, scorers]): Return a pandas.DataFrame of weighted (i.e., averaged) agreement scores.
- get_agreement_bystage([beta]): Return a pandas.DataFrame of unweighted (i.e., one-vs-rest) agreement scores.
- get_confusion_matrix([sleep_id, agg_func]): Return a ref_hyp/obs_hyp confusion matrix from either a single session or all sessions concatenated together.
- get_sleep_stats(): Return a pandas.DataFrame of sleep statistics for each hypnogram derived from both reference and observed scorers.
- multi_scorer(df, scorers): Compute multiple agreement scores from a 2-column dataframe (an optional 3rd column may contain sample weights).
- plot_hypnograms([sleep_id, legend, ax, ...]): Plot the two hypnograms of one session overlapping on the same axis.
- summary([by_stage]): Return group-level agreement scores.
Attributes
- data: A pandas.DataFrame including all hypnograms.
- n_sleeps: The number of unique sleep sessions.
- obs_scorer: The name of the observed scorer.
- ref_scorer: The name of the reference scorer.
- get_agreement(sample_weight=None, scorers=None)
Return a pandas.DataFrame of weighted (i.e., averaged) agreement scores.
- Parameters
- self : EpochByEpochAgreement
  An EpochByEpochAgreement instance.
- sample_weight : None or pandas.Series
  Sample weights passed to underlying sklearn.metrics functions where possible. If a pandas.Series, the index must exactly match that of data.
- scorers : None, list, or dictionary
  The scorers to be used for evaluating agreement. If None (default), default scorers are used. If a list, the list must contain strings that represent metrics from the sklearn metrics module (e.g., accuracy, precision). If more customization is desired, a dictionary can be passed with scorer names (str) as keys and custom functions as values. The custom functions should take 3 positional arguments (true values, predicted values, and sample weights).
- Returns
- agreement : pandas.DataFrame
  A DataFrame with agreement metrics as columns and sessions as rows.
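A custom scorer of the kind described for the `scorers` dictionary might look like the sketch below. It only illustrates the three-positional-argument shape (true values, predicted values, sample weights). The function name `weighted_accuracy` and the toy stage labels are assumptions for illustration and not part of YASA.

```python
def weighted_accuracy(true_values, predicted_values, sample_weight):
    """Fraction of (optionally weighted) epochs on which the two scorers agree."""
    if sample_weight is None:
        # Without weights, every epoch counts equally.
        sample_weight = [1.0] * len(true_values)
    agree = sum(
        w for t, p, w in zip(true_values, predicted_values, sample_weight) if t == p
    )
    return agree / sum(sample_weight)


# Scorer names (str) as keys, custom functions as values.
custom_scorers = {"weighted_accuracy": weighted_accuracy}

ref = ["WAKE", "N1", "N2", "N2", "REM"]
obs = ["WAKE", "N2", "N2", "N2", "REM"]
unweighted = custom_scorers["weighted_accuracy"](ref, obs, None)
weighted = custom_scorers["weighted_accuracy"](ref, obs, [1, 3, 1, 1, 1])
```

Here the second epoch is the only disagreement, so giving it a larger weight lowers the score relative to the unweighted case.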
- get_agreement_bystage(beta=1.0)
Return a pandas.DataFrame of unweighted (i.e., one-vs-rest) agreement scores.
- Parameters
- self : EpochByEpochAgreement
  An EpochByEpochAgreement instance.
- beta : float
- Returns
- agreement : pandas.DataFrame
  A DataFrame with agreement metrics as columns and a MultiIndex with session and sleep stage as rows.
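To make "one-vs-rest" concrete, the sketch below computes precision, recall, F-beta, and support for a single stage by treating that stage as the positive class and every other stage as negative. It is a plain-Python illustration under that assumption, not the library's implementation, and `one_vs_rest_scores` is a hypothetical name.

```python
def one_vs_rest_scores(ref, obs, stage, beta=1.0):
    """Precision, recall, F-beta, and support for one stage versus all others."""
    pairs = list(zip(ref, obs))
    tp = sum(r == stage and o == stage for r, o in pairs)  # both scorers say `stage`
    fp = sum(r != stage and o == stage for r, o in pairs)  # only observed says `stage`
    fn = sum(r == stage and o != stage for r, o in pairs)  # only reference says `stage`
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = beta**2 * precision + recall
    fbeta = (1 + beta**2) * precision * recall / denom if denom else 0.0
    # Support is the number of reference epochs scored as `stage`.
    return {"fbeta": fbeta, "precision": precision, "recall": recall, "support": tp + fn}


ref = ["WAKE", "WAKE", "N2", "N2", "N2", "REM"]
obs = ["WAKE", "N2", "N2", "N2", "REM", "REM"]
n2 = one_vs_rest_scores(ref, obs, "N2")
```

The same arithmetic links the class-level Examples: in the session 1 confusion matrix, the WAKE cell is 78, the WAKE row sums to 189, and the WAKE column sums to 210, giving recall 78/189 ≈ 0.413 and precision 78/210 ≈ 0.371, as reported by `get_agreement_bystage()`.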
- get_confusion_matrix(sleep_id=None, agg_func=None, **kwargs)
Return a ref_hyp/obs_hyp confusion matrix from either a single session or all sessions concatenated together.
- Parameters
- self : yasa.EpochByEpochAgreement
  A yasa.EpochByEpochAgreement instance.
- sleep_id : None or a valid sleep ID
  If None (default), cross-tabulation is derived from the entire group dataset. If a valid sleep ID, cross-tabulation is derived using only the reference and observed scored hypnograms from that sleep session.
- agg_func : None or str
  If None (default), the group result is a DataFrame containing the results of every individual session. If not None, the group result is a DataFrame aggregated across sessions, where agg_func is passed as the func parameter to pandas.DataFrame.groupby.agg(). For example, set agg_func="sum" to get a single confusion matrix across all epochs that does not take session into account.
- **kwargs : key, value pairs
  Additional keyword arguments are passed to sklearn.metrics.confusion_matrix().
- Returns
- conf_matr : pandas.DataFrame
  A confusion matrix with stages from the reference scorer as indices and stages from the observed scorer as columns.
Examples
>>> import yasa
>>> ref_hyps = [yasa.simulate_hypnogram(tib=90, scorer="Rater1", seed=i) for i in range(3)]
>>> obs_hyps = [h.simulate_similar(scorer="Rater2", seed=i) for i, h in enumerate(ref_hyps)]
>>> ebe = yasa.EpochByEpochAgreement(ref_hyps, obs_hyps)
>>> ebe.get_confusion_matrix(sleep_id=2)
Rater2  WAKE  N1  N2  N3  REM
Rater1
WAKE       1   2  23   0    0
N1         0   9  13   0    0
N2         0   6  71   0    0
N3         0  13  42   0    0
REM        0   0   0   0    0

>>> ebe.get_confusion_matrix()
Rater2           WAKE  N1  N2  N3  REM
sleep_id Rater1
1        WAKE      30   0   3   0   35
         N1         3   2   7   0    0
         N2        21  12   7   0    4
         N3         0   0   0   0    0
         REM        2   8  29   0   17
2        WAKE       1   2  23   0    0
         N1         0   9  13   0    0
         N2         0   6  71   0    0
         N3         0  13  42   0    0
         REM        0   0   0   0    0
3        WAKE      16   0   7  19   19
         N1         0   7   2   0    5
         N2         0  10  12   7    5
         N3         0   0  16  11    0
         REM        0  15  11  18    0

>>> ebe.get_confusion_matrix(agg_func="sum")
Rater2  WAKE  N1  N2  N3  REM
Rater1
WAKE      47   2  33  19   54
N1         3  18  22   0    5
N2        21  28  90   7    9
N3         0  13  58  11    0
REM        2  23  40  18   17
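The relationship between the per-session matrices and the `agg_func="sum"` result can be sketched without any dependencies. The helpers below (`crosstab`, `sum_matrices`) are hypothetical names for illustration, and the short toy sessions are made up. The sketch only shows that summing per-session cross-tabulations cell by cell yields one pooled matrix that ignores session identity.

```python
from collections import Counter

STAGES = ["WAKE", "N1", "N2", "N3", "REM"]


def crosstab(ref, obs):
    """Count epochs for every (reference stage, observed stage) pair."""
    counts = Counter(zip(ref, obs))
    # Rows follow the reference scorer, columns follow the observed scorer.
    return [[counts[(r, o)] for o in STAGES] for r in STAGES]


def sum_matrices(matrices):
    """Element-wise sum of per-session matrices."""
    return [[sum(cells) for cells in zip(*rows)] for rows in zip(*matrices)]


session_1 = crosstab(["WAKE", "N2", "N2", "REM"], ["WAKE", "N2", "N1", "REM"])
session_2 = crosstab(["WAKE", "WAKE", "N2", "N3"], ["WAKE", "N2", "N2", "N3"])
pooled = sum_matrices([session_1, session_2])
```

In the documented example, the same cell-wise sum can be checked by hand: the WAKE/WAKE cell is 30 + 1 + 16 = 47 across the three sessions.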
- get_sleep_stats()
Return a pandas.DataFrame of sleep statistics for each hypnogram derived from both reference and observed scorers.
- Parameters
- self : yasa.EpochByEpochAgreement
  A yasa.EpochByEpochAgreement instance.
- Returns
- sstats : pandas.DataFrame
  A DataFrame with sleep statistics as columns and two rows for each individual (one for the reference scorer and one for the observed scorer).
- static multi_scorer(df, scorers)
Compute multiple agreement scores from a 2-column dataframe (an optional 3rd column may contain sample weights).
This function offers convenience when calculating multiple agreement scores using pandas.DataFrame.groupby.apply(). Scikit-learn doesn't include a function that returns multiple scores, and the GroupBy implementation of apply in pandas does not accept multiple functions.
- Parameters
- df : pandas.DataFrame
  A DataFrame with 2 columns and length of n_samples. The first column contains reference values and the second column contains observed values. If a third column is present, it must contain sample weights to be passed to underlying sklearn.metrics functions as sample_weight where applicable.
- scorers : dictionary
  The scorers to be used for evaluating agreement. A dictionary with scorer names (str) as keys and functions as values.
- Returns
- scores : dict
  A dictionary with scorer names (str) as keys and scores (float) as values.
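The idea of applying several scorers to the same data and collecting the results in one dictionary can be sketched as follows. This is not the body of `multi_scorer`: it assumes plain tuples in place of DataFrame rows, and `multi_scorer_sketch`, `accuracy`, and `error_rate` are hypothetical names used only for illustration.

```python
def accuracy(true_values, predicted_values, sample_weight):
    """Weighted fraction of matching epochs (unit weights if none are given)."""
    if sample_weight is None:
        sample_weight = [1.0] * len(true_values)
    agree = sum(
        w for t, p, w in zip(true_values, predicted_values, sample_weight) if t == p
    )
    return agree / sum(sample_weight)


def error_rate(true_values, predicted_values, sample_weight):
    """Complement of accuracy."""
    return 1.0 - accuracy(true_values, predicted_values, sample_weight)


def multi_scorer_sketch(rows, scorers):
    """Apply every scorer to the same (reference, observed[, weight]) rows."""
    columns = list(zip(*rows))
    true_values, predicted_values = columns[0], columns[1]
    # An optional third column carries sample weights.
    sample_weight = columns[2] if len(columns) > 2 else None
    return {
        name: func(true_values, predicted_values, sample_weight)
        for name, func in scorers.items()
    }


rows = [("WAKE", "WAKE"), ("N2", "N1"), ("N2", "N2"), ("REM", "REM")]
scores = multi_scorer_sketch(rows, {"accuracy": accuracy, "error_rate": error_rate})
```

Returning one dictionary per call is what makes this pattern convenient inside a group-wise apply, since each group then yields all scores at once.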
- plot_hypnograms(sleep_id=None, legend=True, ax=None, ref_kwargs={}, obs_kwargs={})
Plot the two hypnograms of one session overlapping on the same axis.
- Parameters
- self : yasa.EpochByEpochAgreement
  A yasa.EpochByEpochAgreement instance.
- sleep_id : a valid sleep ID or None
  The sleep session to plot. If multiple sessions are included in the EpochByEpochAgreement instance, a sleep_id must be provided. If only one session is present, None (default) will plot the two hypnograms of the only session.
- legend : bool or dict
  If True (default) or a dictionary, a legend is added. If a dictionary, all key/value pairs are passed as keyword arguments to the matplotlib.pyplot.legend() call.
- ax : matplotlib.axes.Axes or None
  Axis on which to draw the plot, optional.
- ref_kwargs : dict
  Keyword arguments passed to yasa.plot_hypnogram() when plotting the reference hypnogram.
- obs_kwargs : dict
  Keyword arguments passed to yasa.plot_hypnogram() when plotting the observed hypnogram.
- Returns
- ax : matplotlib.axes.Axes
  Matplotlib Axes
Examples
>>> from yasa import simulate_hypnogram
>>> hyp = simulate_hypnogram(scorer="Anthony", seed=19)
>>> ax = hyp.evaluate(hyp.simulate_similar(scorer="Alan", seed=68)).plot_hypnograms()
- summary(by_stage=False, **kwargs)
Return group-level agreement scores.
Default aggregated measures are
- Parameters
- self : EpochByEpochAgreement
  An EpochByEpochAgreement instance.
- by_stage : bool
  If False (default), summary will include agreement scores derived from average-based metrics. If True, the returned summary DataFrame will include agreement scores for each sleep stage, derived from one-vs-rest metrics.
- **kwargs : key, value pairs
  Additional keyword arguments are passed to pandas.DataFrame.groupby.agg(). This can be used to customize the descriptive statistics returned.
- Returns
- summary : pandas.DataFrame
  A pandas.DataFrame summarizing agreement scores across the entire dataset with descriptive statistics.

>>> ebe = yasa.EpochByEpochAgreement(...)
>>> agreement = ebe.get_agreement()
>>> ebe.summary()

This will give a DataFrame where each row is an agreement metric and each column is a descriptive statistic (e.g., mean, standard deviation). To control the descriptive statistics included as columns:

>>> ebe.summary(func=["count", "mean", "sem"])
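The shape of a group-level summary (one row per agreement metric, one column per descriptive statistic) can be sketched with the standard library. The values below are the rounded accuracy and kappa scores of the first five sessions from the class-level Examples, so the result will not match `ebe.summary()` on the full ten-session dataset. The dictionary layout is an assumption standing in for a DataFrame.

```python
from statistics import mean, stdev

# Rounded per-session scores copied from the first five rows of get_agreement().
per_session = {
    "accuracy": [0.31, 0.33, 0.35, 0.22, 0.21],
    "kappa": [0.07, 0.14, 0.06, 0.01, -0.06],
}

# One entry per metric, each holding descriptive statistics across sessions.
# stdev() is the sample standard deviation (n - 1 in the denominator).
summary = {
    metric: {"mean": mean(values), "std": stdev(values)}
    for metric, values in per_session.items()
}
```

Swapping in other statistics (for example a count or a standard error) corresponds to what the `func` keyword controls in the documented call.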
- property data
A pandas.DataFrame including all hypnograms.
- property n_sleeps
The number of unique sleep sessions.
- property obs_scorer
The name of the observed scorer.
- property ref_scorer
The name of the reference scorer.
