Timeseries Functions

This module provides functions for manipulating timeseries.

pyleoclim.Timeseries.binvalues(x, y, bin_size=None, start=None, end=None)

Bin the values of a timeseries.

Args:

x (array): the x-axis series.

y (array): the y-axis series.

bin_size (float): the size of the bins. Default is the average resolution.

start (float): where/when to start binning. Default is the minimum.

end (float): where/when to stop binning. Default is the maximum.

Returns:

binned_values - the binned output

bins - the bins, each reported by its midpoint (e.g., the 100-200 bin is 150)

n - number of data points in each bin

error - the standard error on the mean in each bin
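
The four return values can be illustrated with a minimal plain-Python sketch. It does not call pyleoclim; the helper name `bin_values_sketch`, the required `bin_size` argument, and the handling of a point that sits exactly on `end` are assumptions made for illustration only.

```python
import math
import statistics

def bin_values_sketch(x, y, bin_size, start=None, end=None):
    """Bin y by x into fixed-width bins (hypothetical helper, not the library code)."""
    start = min(x) if start is None else start
    end = max(x) if end is None else end
    n_bins = max(1, math.ceil((end - start) / bin_size))
    groups = [[] for _ in range(n_bins)]
    for xi, yi in zip(x, y):
        if start <= xi <= end:
            # Assumption: a point exactly at `end` is placed in the last bin.
            idx = min(int((xi - start) // bin_size), n_bins - 1)
            groups[idx].append(yi)
    # Bin midpoints, e.g. the 100-200 bin is reported as 150.
    bins = [start + (i + 0.5) * bin_size for i in range(n_bins)]
    binned_values = [statistics.fmean(g) if g else math.nan for g in groups]
    n = [len(g) for g in groups]
    # The standard error of the mean needs at least two points in a bin.
    error = [statistics.stdev(g) / math.sqrt(len(g)) if len(g) > 1 else math.nan
             for g in groups]
    return binned_values, bins, n, error

x = [100, 120, 180, 210, 290, 300]
y = [1.0, 2.0, 3.0, 4.0, 6.0, 8.0]
binned_values, bins, n, error = bin_values_sketch(x, y, bin_size=100)
```

With this input the two bins are centered on 150 and 250 and each holds three points.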

pyleoclim.Timeseries.interp(x, y, interp_step=None, start=None, end=None)

Linear interpolation onto a new x-axis

Args:

x (array): the x-axis.

y (array): the y-axis.

interp_step (float): the interpolation step. Default is the mean resolution.

start (float): where/when to start the interpolation. Default is the minimum.

end (float): where/when to stop the interpolation. Default is the maximum.

Returns:

xi - the interpolated x-axis

interp_values - the interpolated values
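
As a rough illustration of linear interpolation onto an evenly spaced axis, here is a plain-Python sketch. It is not the library implementation: `linear_interp_sketch` is a hypothetical name, the step is a required argument, and the input x-axis is assumed to be strictly increasing with `start` and `end` inside its range.

```python
from bisect import bisect_right

def linear_interp_sketch(x, y, step, start=None, end=None):
    """Linearly interpolate (x, y) onto an evenly spaced axis (hypothetical helper)."""
    start = x[0] if start is None else start
    end = x[-1] if end is None else end
    # Number of grid points from start to end inclusive (small tolerance for rounding).
    n = int((end - start) / step + 1e-9) + 1
    xi = [start + k * step for k in range(n)]
    interp_values = []
    for t in xi:
        # Index of the left neighbour, clamped so that j + 1 is always valid.
        j = min(max(bisect_right(x, t) - 1, 0), len(x) - 2)
        w = (t - x[j]) / (x[j + 1] - x[j])
        interp_values.append(y[j] + w * (y[j + 1] - y[j]))
    return xi, interp_values

xi, interp_values = linear_interp_sketch([0, 1, 3, 6], [0.0, 2.0, 6.0, 0.0], step=2)
```

Here the new axis is 0, 2, 4, 6, and each interpolated value is a weighted average of its two neighbouring samples.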

pyleoclim.Timeseries.onCommonAxis(x1, y1, x2, y2, method='interpolation', step=None, start=None, end=None)

Places two timeseries on a common axis

Args:

x1 (array): x-axis values of the first timeseries.

y1 (array): y-axis values of the first timeseries.

x2 (array): x-axis values of the second timeseries.

y2 (array): y-axis values of the second timeseries.

method (str): which method to use to place the timeseries on the same x-axis. Valid entries: 'interpolation' (default), 'bin'.

step (float): the interpolation step. Default is the mean resolution of the lowest-resolution series.

start (float): where/when to start. Default is the maximum of the minima of the two timeseries.

end (float): where/when to end. Default is the minimum of the maxima of the two timeseries.

Returns:

xi - the interpolated x-axis

interp_values1 - the interpolated y-values for the first timeseries

interp_values2 - the interpolated y-values for the second timeseries
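
The documented defaults for `start`, `end`, and `step` can be worked out by hand. The sketch below is plain Python, not the library code; `common_axis_defaults` is a hypothetical helper, and "mean resolution" is taken to be the average spacing between consecutive x-values.

```python
def common_axis_defaults(x1, x2):
    """Default start, end and step for a shared axis (hypothetical helper)."""
    start = max(min(x1), min(x2))  # maximum of the minima
    end = min(max(x1), max(x2))    # minimum of the maxima

    def mean_resolution(x):
        xs = sorted(x)
        # The mean of consecutive differences equals the range over (n - 1).
        return (xs[-1] - xs[0]) / (len(xs) - 1)

    # The lowest-resolution series is the one with the largest mean spacing.
    step = max(mean_resolution(x1), mean_resolution(x2))
    return start, end, step

start, end, step = common_axis_defaults([0, 10, 20, 30, 40], [5, 10, 15, 20, 25, 50])
```

The two series overlap only between 5 and 40, so the common axis is restricted to that window, and the coarser spacing (10) is used as the step.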

pyleoclim.Timeseries.standardize(x, scale=1, axis=0, ddof=0, eps=0.001)

Centers and normalizes a given time series. Constant or nearly constant time series are not rescaled.

Args:

x (array): vector of (real) numbers as a time series, NaNs allowed.

scale (real): a scale factor used to scale a record to match a given variance.

axis (int or None): axis along which to operate. If None, compute over the whole array.

ddof (int): degrees of freedom correction in the calculation of the standard deviation.

eps (real): a threshold to determine if the standard deviation is too close to zero.

Returns:

z (array): the standardized time series (z-score), Z = (X - E[X])/std(X)*scale, NaNs allowed

mu (real): the mean of the original time series, E[X]

sig (real): the standard deviation of the original time series, std[X]

References:
  1. Tapio Schneider’s MATLAB code: http://www.clidyn.ethz.ch/imputation/standardize.m

  2. The zscore function in SciPy: https://github.com/scipy/scipy/blob/master/scipy/stats/stats.py

@author: fzhu
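
The z-score formula and the `eps` guard can be sketched for the one-dimensional case as follows. This is plain Python rather than the library code; `standardize_sketch` is a hypothetical name, NaNs are assumed to be skipped when computing the mean and standard deviation, and a nearly constant series is assumed to be centered but left unscaled.

```python
import math

def standardize_sketch(x, scale=1, ddof=0, eps=1e-3):
    """Return (z, mu, sig) for a 1-D series (hypothetical helper)."""
    valid = [v for v in x if not math.isnan(v)]
    mu = sum(valid) / len(valid)
    sig = math.sqrt(sum((v - mu) ** 2 for v in valid) / (len(valid) - ddof))
    if sig < eps:
        # Assumption: a (nearly) constant series is only centered, not rescaled.
        z = [v - mu for v in x]
    else:
        # Z = (X - E[X]) / std(X) * scale; NaNs propagate to the output.
        z = [(v - mu) / sig * scale for v in x]
    return z, mu, sig

z, mu, sig = standardize_sketch([2.0, 4.0, float("nan"), 6.0])
z_const, _, sig_const = standardize_sketch([5.0, 5.0, 5.0])
```

The NaN stays in place in `z`, while `mu` and `sig` are computed from the three valid values only.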

pyleoclim.Timeseries.ts2segments(ys, ts, factor=10)

Chop a time series into several segments based on gap detection.

Gap detection follows a simple rule: let dts be the intervals between consecutive time points. If dts[i] is larger than factor * dts[i-1], the change in dts (its gradient) is considered too large; that point is treated as a break point, and the time series is split into two segments there.

Args:

ys (array): a time series, NaNs allowed

ts (array): the time points

factor (float): the factor that adjusts the threshold for gap detection

Returns:

seg_ys (list): a list of several segments with potentially different lengths

seg_ts (list): a list of the time axis of the several segments

n_segs (int): the number of segments

@author: fzhu
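
The gap-detection rule described above can be sketched in a few lines of plain Python. This is an illustration only, not the library code; `split_on_gaps` is a hypothetical name, and the split is assumed to fall between the two time points that bound the oversized interval.

```python
def split_on_gaps(ys, ts, factor=10):
    """Split (ys, ts) wherever a time step exceeds factor times the previous step."""
    dts = [b - a for a, b in zip(ts, ts[1:])]
    # dts[i] is the step from ts[i] to ts[i + 1], so a gap at i starts a new segment at i + 1.
    breaks = [i + 1 for i in range(1, len(dts)) if dts[i] > factor * dts[i - 1]]
    edges = [0] + breaks + [len(ts)]
    seg_ys = [ys[a:b] for a, b in zip(edges, edges[1:])]
    seg_ts = [ts[a:b] for a, b in zip(edges, edges[1:])]
    return seg_ys, seg_ts, len(seg_ts)

ts = [0, 1, 2, 3, 50, 51, 52]
ys = [5, 6, 7, 8, 1, 2, 3]
seg_ys, seg_ts, n_segs = split_on_gaps(ys, ts)
```

The jump from 3 to 50 is 47 times the previous step, which exceeds the default factor of 10, so the series is cut into two segments at that point.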

pyleoclim.Timeseries.clean_ts(ys, ts)

Remove the NaNs from the time series and sort it so that the time axis is ascending

Args:

ys (array): a time series, NaNs allowed

ts (array): the time axis of the time series, NaNs allowed

Returns:

ys (array): the time series without NaNs

ts (array): the time axis of the time series without NaNs
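
A minimal plain-Python sketch of the two steps (drop NaNs, then sort by time) is shown below. The name `clean_sketch` is hypothetical, and a pair is assumed to be dropped when either its value or its time is NaN.

```python
import math

def clean_sketch(ys, ts):
    """Drop pairs containing a NaN, then sort by ascending time (hypothetical helper)."""
    pairs = [(t, y) for t, y in zip(ts, ys)
             if not (math.isnan(t) or math.isnan(y))]
    pairs.sort(key=lambda p: p[0])
    return [y for _, y in pairs], [t for t, _ in pairs]

nan = float("nan")
ys_clean, ts_clean = clean_sketch([3.0, nan, 1.0, 2.0], [30.0, 20.0, 10.0, nan])
```

Only the pairs at times 30 and 10 survive, and they come back in ascending time order.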

pyleoclim.Timeseries.annualize(ys, ts)

Annualize a time series whose time resolution is finer than 1 year

Args:

ys (array): a time series, NaNs allowed

ts (array): the time axis of the time series, NaNs allowed

Returns:

ys_ann (array): the annualized time series

year_int (array): the time axis of the annualized time series
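
The text above does not say how the annual values are computed. One plausible reading, suggested by the name `year_int`, is to average all samples that fall within the same integer year. The sketch below implements that assumption in plain Python; `annualize_sketch` is a hypothetical name and the averaging rule may differ from what the library does.

```python
import math
from collections import defaultdict

def annualize_sketch(ys, ts):
    """Average samples by integer year (assumed rule, hypothetical helper)."""
    by_year = defaultdict(list)
    for t, y in zip(ts, ys):
        if not (math.isnan(t) or math.isnan(y)):
            # Assumption: a sample belongs to the year given by floor(t).
            by_year[math.floor(t)].append(y)
    year_int = sorted(by_year)
    ys_ann = [sum(by_year[yr]) / len(by_year[yr]) for yr in year_int]
    return ys_ann, year_int

ys_ann, year_int = annualize_sketch([1.0, 3.0, 10.0, 20.0],
                                    [2000.0, 2000.5, 2001.25, 2001.75])
```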

pyleoclim.Timeseries.gaussianize(X)

Transforms a (proxy) timeseries to a Gaussian distribution.

Originator: Michael Erb, Univ. of Southern California - April 2017

pyleoclim.Timeseries.gaussianize_single(X_single)

Transforms a single (proxy) timeseries to a Gaussian distribution.

Originator: Michael Erb, Univ. of Southern California - April 2017
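
The entries above do not describe how the transform works. One common way to map a series onto a Gaussian distribution is a rank-based transform: rank the values, convert ranks to probabilities, and take standard-normal quantiles. The sketch below shows that generic idea in plain Python. It is an assumption and not necessarily what `gaussianize` or `gaussianize_single` does; the name `gaussianize_sketch` and the `(rank - 0.5) / n` probability rule are illustrative choices.

```python
from statistics import NormalDist

def gaussianize_sketch(x):
    """Map values to standard-normal quantiles by rank (assumed approach)."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    z = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # (rank - 0.5) / n keeps the probability strictly inside (0, 1).
        # Ties are not averaged in this sketch.
        z[i] = NormalDist().inv_cdf((rank - 0.5) / n)
    return z

x = [1.0, 100.0, 2.0, 3.0, 10.0]
z = gaussianize_sketch(x)
```

The ordering of the values is preserved, but the skewed input (one value of 100 among single digits) becomes symmetric about zero.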

pyleoclim.Timeseries.detrend(y, x=None, method='linear', params=['default', 4, 0, 1])

Detrend a timeseries using one of three methods.

The detrending methods are "linear" (default), "constant", and low-pass Savitzky-Golay filtering.

Args:

y (array): the series to be detrended.

x (array): the time axis for the timeseries. Necessary for use with the Savitzky-Golay filter method, since the series should be evenly spaced.

method (str): the type of detrending. If "linear" (default), the result of a linear least-squares fit to y is subtracted from y. If "constant", only the mean of the data is subtracted. If "savitzy-golay", y is filtered using a Savitzky-Golay filter and the resulting filtered series is subtracted from y.

params (list): the parameters for the Savitzky-Golay filter. The first parameter is the window size (the default is half of the data), the second is the order of the filter (default is 4), and the third is the order of the derivative (default is 0, which means smoothing only).

Returns:

ys (array) - the detrended timeseries.
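
The "linear" and "constant" methods can be written out directly. The sketch below is plain Python, not the library code; `detrend_sketch` is a hypothetical name, the time axis is assumed to be the sample index when `x` is omitted, and the Savitzky-Golay method is deliberately left out.

```python
def detrend_sketch(y, x=None, method="linear"):
    """Remove a least-squares line or the mean from y (hypothetical helper)."""
    n = len(y)
    # Assumption: use the sample index as the time axis when x is not given.
    x = list(range(n)) if x is None else x
    y_mean = sum(y) / n
    if method == "constant":
        return [v - y_mean for v in y]
    if method != "linear":
        raise ValueError("this sketch only covers 'linear' and 'constant'")
    x_mean = sum(x) / n
    # Ordinary least-squares slope and intercept of y against x.
    slope = (sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
             / sum((a - x_mean) ** 2 for a in x))
    intercept = y_mean - slope * x_mean
    return [b - (slope * a + intercept) for a, b in zip(x, y)]

y = [1.0, 3.0, 5.0, 7.0]
ys_linear = detrend_sketch(y)
ys_constant = detrend_sketch(y, method="constant")
```

A perfectly linear input detrends to (numerically) zero with "linear", while "constant" only shifts it to have zero mean.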