Model#
- class mellon.model.BaseEstimator(cov_func_curry=<class 'mellon.cov.Matern52'>, n_landmarks=5000, rank=0.999, method='auto', jitter=1e-06, optimizer='L-BFGS-B', n_iter=100, init_learn_rate=1, landmarks=None, nn_distances=None, d=None, mu=0, ls=None, ls_factor=1, cov_func=None, L=None, initial_value=None)#
Bases:
object
Base class for mellon estimators.
- fit()#
Fit the model.
- fit_predict(x)#
Fit model and make prediction on training data x.
- Parameters:
x (array-like) – Data points.
- Returns:
Predictions.
- Return type:
array-like
- gradient(x, jit=True)#
Computes the gradient of the predict function for each row in x.
- Parameters:
x (array-like) – Data points.
jit (bool) – Use JAX just-in-time compilation. Defaults to True.
- Returns:
gradients - The gradient of the function at each point in x. gradients.shape == x.shape
- Return type:
array-like
- hessian(x, jit=True)#
Computes the Hessian of the predict function for each row in x.
- Parameters:
x (array-like) – Data points.
jit (bool) – Use JAX just-in-time compilation. Defaults to True.
- Returns:
hessians - The Hessian matrix of the function at each point in x. hessians.shape == x.shape + x.shape[1:]
- Return type:
array-like
- hessian_log_determinant(x, jit=True)#
Computes the logarithm of the determinant of the Hessian of the predict function for each row in x.
- Parameters:
x (array-like) – Data points.
jit (bool) – Use JAX just-in-time compilation. Defaults to True.
- Returns:
signs, log_determinants - The sign of the determinant at each point in x and the logarithm of its absolute value. signs.shape == log_determinants.shape == (x.shape[0],)
- Return type:
array-like, array-like
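The shape conventions for gradient, hessian, and hessian_log_determinant can be illustrated with a minimal NumPy sketch. It does not call mellon: the toy function f(x) = sum_j a_j * x_j**2 and the helpers toy_gradient and toy_hessian are hypothetical stand-ins for a fitted predict function, chosen because their derivatives are known in closed form.

```python
import numpy as np

# Hypothetical stand-in for a fitted predict function: f(x) = sum_j a_j * x_j**2.
a = np.array([1.0, -2.0, 0.5])

def toy_gradient(x):
    # d f / d x_j = 2 * a_j * x_j, one row per input point.
    return 2.0 * a * x

def toy_hessian(x):
    # The Hessian of this toy function is diag(2 * a) at every point.
    n, d = x.shape
    return np.broadcast_to(np.diag(2.0 * a), (n, d, d)).copy()

x = np.arange(12, dtype=float).reshape(4, 3)  # 4 points in 3 dimensions

gradients = toy_gradient(x)   # same shape as x
hessians = toy_hessian(x)     # one d x d matrix per point

# slogdet returns the sign of the determinant and log(|det|) for each matrix.
signs, log_determinants = np.linalg.slogdet(hessians)

print(gradients.shape, hessians.shape, signs.shape, log_determinants.shape)
```

Returning the sign and the log of the absolute value separately keeps the result finite and well defined when the determinant is negative or very small in magnitude.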
- predict(x)#
Make prediction for new data x.
- Parameters:
x (array-like) – Data points.
- Returns:
Predictions.
- Return type:
array-like
- prepare_inference(x)#
Set all attributes in preparation for fitting. It is not necessary to call this function before calling fit.
- class mellon.model.DensityEstimator(cov_func_curry=<class 'mellon.cov.Matern52'>, n_landmarks=5000, rank=0.999, method='auto', jitter=1e-06, optimizer='L-BFGS-B', n_iter=100, init_learn_rate=1, landmarks=None, nn_distances=None, d=None, mu=None, ls=None, ls_factor=1, cov_func=None, L=None, initial_value=None, jit=False)#
Bases:
BaseEstimator
A non-parametric density estimator. DensityEstimator performs Bayesian inference with a Gaussian process prior and Nearest Neighbor likelihood. All intermediate computations are cached as instance variables, so the user can view intermediate results and save computation time by passing precomputed values as arguments to a new model.
- Parameters:
cov_func_curry (function or type) – The generator of the Gaussian process covariance function. Must be a curry that takes one length scale argument and returns a covariance function of the form k(x, y) \(\rightarrow\) float. Defaults to the type Matern52.
n_landmarks (int) – The number of landmark points. If less than 1 or greater than or equal to the number of training points, does not compute or use inducing points. Defaults to 5000.
rank (int or float) – The rank of the approximate covariance matrix. If rank is an int, an \(n \times\) rank matrix \(L\) is computed such that \(L L^\top \approx K\), the exact \(n \times n\) covariance matrix. If rank is a float 0.0 \(\le\) rank \(\le\) 1.0, the rank/size of \(L\) is selected such that the included eigenvalues of the covariance between landmark points account for the specified percentage of the sum of eigenvalues. Defaults to 0.999.
method (str) – Explicitly specifies whether rank is to be interpreted as a fixed number of eigenvectors or a percent of eigenvalues to include in the low rank approximation. Supports ‘fixed’, ‘percent’, or ‘auto’. If ‘auto’, interprets rank as a fixed number of eigenvectors if it is an int and interprets rank as a percent of eigenvalues if it is a float. Provided for explicitness and to clarify the ambiguous case of 1 vs 1.0. Defaults to ‘auto’.
jitter (float) – A small amount to add to the diagonal of the covariance matrix to bound eigenvalues numerically away from 0, ensuring numerical stability. Defaults to 1e-6.
optimizer (str) – Select optimizer ‘L-BFGS-B’ or stochastic optimizer ‘adam’ for the maximum a posteriori density estimation. Defaults to ‘L-BFGS-B’.
n_iter (int) – The number of optimization iterations. Defaults to 100.
init_learn_rate (float) – The initial learn rate. Defaults to 1.
landmarks (array-like or None) – The points to quantize the data for the approximate covariance. If None, landmarks are set as k-means centroids with k=n_landmarks. Ignored if n_landmarks is greater than or equal to the number of training points. Defaults to None.
nn_distances (array-like or None) – The nearest neighbor distances at each data point. If None, computes the nearest neighbor distances automatically, with a KDTree if the dimensionality of the data is less than 20, or a BallTree otherwise. Defaults to None.
d (int or None) – The local dimensionality of the data, i.e., the dimensionality of the embedded manifold. If None, sets d to the size of axis 1 of the training data points. Defaults to None.
mu (float or None) – The mean of the Gaussian process. If None, sets mu to the 1st percentile of \(mle(nn\text{_}distances, d) - 10\), where \(mle = \log(\text{gamma}(d/2 + 1)) - (d/2) \cdot \log(\pi) - d \cdot \log(nn\text{_}distances)\). Defaults to None.
ls (float or None) – The length scale of the Gaussian process covariance function. If None, sets ls to the geometric mean of the nearest neighbor distances times a constant. If cov_func is supplied explicitly, ls has no effect. Defaults to None.
cov_func (function or None) – The Gaussian process covariance function of the form k(x, y) \(\rightarrow\) float. If None, automatically generates the covariance function cov_func = cov_func_curry(ls). Defaults to None.
L (array-like or None) – A matrix such that \(L L^\top \approx K\), where \(K\) is the covariance matrix. If None, automatically computes L. Defaults to None.
initial_value (array-like or None) – The initial guess for optimization. If None, finds \(z\) that minimizes \(||Lz + mu - mle|| + ||z||\), where \(mle = \log(\text{gamma}(d/2 + 1)) - (d/2) \cdot \log(\pi) - d \cdot \log(nn\text{_}distances)\), where \(d\) is the dimensionality of the data. Defaults to None.
jit (bool) – Use JAX just-in-time compilation for the loss and its gradient during optimization. Defaults to False.
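The mle expression that appears in the descriptions of mu and initial_value can be written out directly. The following is a minimal sketch, not mellon's implementation: the helper names mle_log_density and default_mu are hypothetical, and np.percentile with its default interpolation is an assumption about how the 1st percentile is taken.

```python
import math
import numpy as np

def mle_log_density(nn_distances, d):
    # mle = log(gamma(d/2 + 1)) - (d/2) * log(pi) - d * log(nn_distances)
    nn_distances = np.asarray(nn_distances, dtype=float)
    return (
        math.lgamma(d / 2 + 1)
        - (d / 2) * math.log(math.pi)
        - d * np.log(nn_distances)
    )

def default_mu(nn_distances, d):
    # 1st percentile of (mle - 10), as described for mu=None.
    return np.percentile(mle_log_density(nn_distances, d) - 10, 1)

nn_distances = np.array([0.5, 1.0, 2.0])
mle = mle_log_density(nn_distances, d=2)
mu = default_mu(nn_distances, d=2)
print(mle, mu)
```

For d = 2 and a nearest neighbor distance of 1 the expression reduces to -log(pi), the log of the inverse area of the unit disk, and larger distances give lower values.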
- Variables:
cov_func_curry – The generator of the Gaussian process covariance function.
n_landmarks – The number of landmark points.
rank – The rank of approximate covariance matrix or percentage of eigenvalues included in approximate covariance matrix.
method – The method to interpret the rank as a fixed number of eigenvectors or a percentage of eigenvalues.
jitter – A small amount added to the diagonal of the covariance matrix for numerical stability.
n_iter – The number of optimization iterations if adam optimizer is used.
init_learn_rate – The initial learn rate when adam optimizer is used.
landmarks – The points to quantize the data.
nn_distances – The nearest neighbor distances for each data point.
d – The local dimensionality of the data.
mu – The Gaussian process mean.
ls – The Gaussian process covariance function length scale.
ls_factor – Factor to scale the automatically selected length scale. Defaults to 1.
cov_func – The Gaussian process covariance function.
L – A matrix such that \(L L^\top \approx K\), where \(K\) is the covariance matrix.
initial_value – The initial guess for Maximum A Posteriori optimization.
optimizer – Optimizer for the maximum a posteriori density estimation.
x – The training data.
transform – A function \(z \sim \text{Normal}(0, I) \rightarrow \text{Normal}(mu, K')\). Used to map the latent representation to the log-density on the training data.
loss_func – The Bayesian loss function.
pre_transformation – The optimized parameters \(z \sim \text{Normal}(0, I)\) before transformation to \(\text{Normal}(mu, K')\), where \(I\) is the identity matrix and \(K'\) is the approximate covariance matrix.
opt_state – The final state of the optimizer.
losses – The history of losses throughout training of adam or final loss of L-BFGS-B.
log_density_x – The log density at the training points.
log_density_func – A function that computes the log density at arbitrary prediction points.
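The relationship between cov_func_curry, rank, L, and transform can be sketched in NumPy. This is an illustration under simplifying assumptions, not mellon's code: it eigendecomposes the full covariance matrix instead of using landmarks, and matern52_curry is a hypothetical stand-in written from the standard Matern 5/2 formula.

```python
import numpy as np

def matern52_curry(ls):
    # Curry: takes a length scale, returns a covariance function k(x, y) -> float.
    def k(x, y):
        r = np.sqrt(np.sum((np.asarray(x) - np.asarray(y)) ** 2)) / ls
        return (1.0 + np.sqrt(5.0) * r + 5.0 * r**2 / 3.0) * np.exp(-np.sqrt(5.0) * r)
    return k

rng = np.random.default_rng(0)
x = rng.normal(size=(30, 2))
cov_func = matern52_curry(ls=1.0)
K = np.array([[cov_func(a, b) for b in x] for a in x])

# Eigendecomposition, sorted from largest to smallest eigenvalue.
eigvals, eigvecs = np.linalg.eigh(K)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negative round-off

# 'percent' interpretation: keep the fewest eigenvalues whose sum reaches
# the requested fraction of the total.
rank = 0.999
fraction = np.cumsum(eigvals) / np.sum(eigvals)
n_keep = min(int(np.searchsorted(fraction, rank)) + 1, len(eigvals))

# n x n_keep factor with L @ L.T approximately equal to K.
L = eigvecs[:, :n_keep] * np.sqrt(eigvals[:n_keep])

# transform: z ~ Normal(0, I) is mapped to L z + mu ~ Normal(mu, K').
mu = -5.0
z = rng.normal(size=n_keep)
log_density_sample = L @ z + mu
print(n_keep, L.shape, log_density_sample.shape)
```

Optimizing over z instead of the log density itself keeps the number of parameters equal to the number of retained eigenvectors.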
- fit(x=None, build_predict=True)#
Fit the model from end to end.
- Parameters:
x (array-like) – The training instances to estimate density function.
build_predict (bool) – Whether or not to build the prediction function. Defaults to True.
- Returns:
self - A fitted instance of this estimator.
- Return type:
Object
- fit_predict(x=None, build_predict=False)#
Perform Bayesian inference and return the log density at training points.
- Parameters:
x (array-like) – The training instances to estimate density function.
build_predict (bool) – Whether or not to build the prediction function. Defaults to False.
- Returns:
log_density_x - The log density at each training point in x.
- predict(x)#
Predict the log density at each point in x.
- Parameters:
x (array-like) – The new data to predict.
- Returns:
log_density - The log density at each test point in x.
- Return type:
array-like
- prepare_inference(x)#
Set all attributes in preparation for optimization, but do not perform Bayesian inference. It is not necessary to call this function before calling fit.
- Parameters:
x (array-like) – The training instances to estimate density function.
- Returns:
loss_func, initial_value - The Bayesian loss function and initial guess for optimization.
- Return type:
function, array-like
- process_inference(pre_transformation=None, build_predict=True)#
Use the optimized parameters to compute the log density at the training points. If build_predict, also build the prediction function.
- Parameters:
pre_transformation (array-like) – The optimized parameters. If None, uses the stored pre_transformation attribute.
build_predict (bool) – Whether or not to build the prediction function. Defaults to True.
- Returns:
log_density_x - The log density at the training points.
- Return type:
array-like
- run_inference(loss_func=None, initial_value=None, optimizer=None)#
Perform Bayesian inference, optimizing the pre_transformation parameters. If you would like to run your own inference procedure, use the loss_func and initial_value attributes and set pre_transformation to the optimized parameters.
- Parameters:
loss_func (function) – The Bayesian loss function. If None, uses the stored loss_func attribute.
initial_value (array-like) – The initial guess for optimization. If None, uses the stored initial_value attribute.
- Returns:
pre_transformation - The optimized parameters.
- Return type:
array-like
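The idea of running a custom inference procedure can be sketched without mellon. Everything in this example is a hypothetical stand-in: the toy loss_func, its gradient loss_grad, and the factor L only mimic the roles of the stored loss_func and initial_value attributes, and plain gradient descent replaces the documented optimizers.

```python
import numpy as np

# Hypothetical stand-ins for the stored loss_func and initial_value attributes.
rng = np.random.default_rng(0)
L = rng.normal(size=(20, 5))
target = rng.normal(size=20)

def loss_func(z):
    # Toy objective: squared data misfit plus a standard normal prior on z.
    residual = L @ z - target
    return 0.5 * residual @ residual + 0.5 * z @ z

def loss_grad(z):
    # Gradient of the toy objective above.
    return L.T @ (L @ z - target) + z

initial_value = np.zeros(5)

# Custom inference procedure: gradient descent with a fixed step size equal
# to the inverse Lipschitz constant of the gradient.
step = 1.0 / (np.linalg.norm(L, 2) ** 2 + 1.0)
z = initial_value.copy()
losses = [loss_func(z)]
for _ in range(500):
    z = z - step * loss_grad(z)
    losses.append(loss_func(z))

# The optimized parameters play the role of pre_transformation.
pre_transformation = z
print(losses[0], losses[-1])
```

With a real estimator, the optimized array would be assigned to the pre_transformation attribute (or passed to process_inference) in place of the result of run_inference.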
- class mellon.model.FunctionEstimator(cov_func_curry=<class 'mellon.cov.Matern52'>, n_landmarks=5000, rank=0.999, method='auto', jitter=1e-06, optimizer='L-BFGS-B', n_iter=100, init_learn_rate=1, landmarks=None, nn_distances=None, d=None, mu=0, ls=None, ls_factor=1, cov_func=None, L=None, sigma=0)#
Bases:
BaseEstimator
Uses a conditional normal distribution to smooth a function and extend it to all cell states using the Mellon abstractions.
- Parameters:
cov_func_curry (function or type) – The generator of the Gaussian process covariance function. Must be a curry that takes one length scale argument and returns a covariance function of the form k(x, y) \(\rightarrow\) float. Defaults to the type Matern52.
n_landmarks (int) – The number of landmark points. If less than 1 or greater than or equal to the number of training points, does not compute or use inducing points. Defaults to 5000.
rank (int or float) – The rank of the approximate covariance matrix. If rank is an int, an \(n \times\) rank matrix \(L\) is computed such that \(L L^\top \approx K\), the exact \(n \times n\) covariance matrix. If rank is a float 0.0 \(\le\) rank \(\le\) 1.0, the rank/size of \(L\) is selected such that the included eigenvalues of the covariance between landmark points account for the specified percentage of the sum of eigenvalues. Defaults to 0.999.
method (str) – Explicitly specifies whether rank is to be interpreted as a fixed number of eigenvectors or a percent of eigenvalues to include in the low rank approximation. Supports ‘fixed’, ‘percent’, or ‘auto’. If ‘auto’, interprets rank as a fixed number of eigenvectors if it is an int and interprets rank as a percent of eigenvalues if it is a float. Provided for explicitness and to clarify the ambiguous case of 1 vs 1.0. Defaults to ‘auto’.
jitter (float) – A small amount to add to the diagonal of the covariance matrix to bound eigenvalues numerically away from 0, ensuring numerical stability. Defaults to 1e-6.
landmarks (array-like or None) – The points to quantize the data for the approximate covariance. If None, landmarks are set as k-means centroids with k=n_landmarks. Ignored if n_landmarks is greater than or equal to the number of training points. Defaults to None.
nn_distances (array-like or None) – The nearest neighbor distances at each data point. If None, computes the nearest neighbor distances automatically, with a KDTree if the dimensionality of the data is less than 20, or a BallTree otherwise. Defaults to None.
mu (float or None) – The mean of the Gaussian process. Defaults to 0.
ls (float or None) – The length scale of the Gaussian process covariance function. If None, sets ls to the geometric mean of the nearest neighbor distances times a constant. If cov_func is supplied explicitly, ls has no effect. Defaults to None.
cov_func (function or None) – The Gaussian process covariance function of the form k(x, y) \(\rightarrow\) float. If None, automatically generates the covariance function cov_func = cov_func_curry(ls). Defaults to None.
L (array-like or None) – A matrix such that \(L L^\top \approx K\), where \(K\) is the covariance matrix. If None, automatically computes L. Defaults to None.
sigma (float) – The white noise standard deviation. Defaults to 0.
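The rule for how method resolves the meaning of rank, including the ambiguous case of 1 vs 1.0, can be written as a small helper. This is a sketch of the documented rule only; interpret_rank is a hypothetical name and not part of mellon.

```python
def interpret_rank(rank, method="auto"):
    """Return 'fixed' or 'percent' following the documented rules for method."""
    if method in ("fixed", "percent"):
        # An explicit method overrides the type of rank.
        return method
    if method != "auto":
        raise ValueError(f"Unsupported method: {method!r}")
    # 'auto': an int is a fixed number of eigenvectors,
    # a float is a fraction of the eigenvalue sum.
    return "fixed" if isinstance(rank, int) else "percent"

print(interpret_rank(1))             # fixed
print(interpret_rank(1.0))           # percent
print(interpret_rank(1, "percent"))  # percent
```

Passing method explicitly avoids relying on whether a value such as 1 was stored as an int or a float.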
- Variables:
n_landmarks – The number of landmark points.
rank – The rank of approximate covariance matrix or percentage of eigenvalues included in approximate covariance matrix.
method – The method to interpret the rank as a fixed number of eigenvectors or a percentage of eigenvalues.
jitter – A small amount added to the diagonal of the covariance matrix for numerical stability.
landmarks – The points to quantize the data.
nn_distances – The nearest neighbor distances for each data point.
d – The local dimensionality of the data.
mu – The Gaussian process mean.
ls – The Gaussian process covariance function length scale.
ls_factor – Factor to scale the automatically selected length scale. Defaults to 1.
cov_func – The Gaussian process covariance function.
L – A matrix such that \(L L^\top \approx K\), where \(K\) is the covariance matrix.
sigma – White noise standard deviation.
x – The cell states.
y – Function values on cell states.
- compute_conditional(x=None, y=None)#
Compute and return the conditional mean function.
- Parameters:
x (array-like) – The training cell states.
y (array-like) – The training function values on cell states.
- Returns:
condition_mean_function - The conditional mean function.
- Return type:
function
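A conditional mean function of this kind can be sketched with the textbook Gaussian process formula mu + K(x_new, x) (K(x, x) + sigma^2 I)^{-1} (y - mu). This is a minimal exact version under stated assumptions, not mellon's implementation: it uses no landmarks or low-rank factor, and matern52 and conditional_mean_function are hypothetical helpers that operate on whole arrays of points.

```python
import numpy as np

def matern52(ls):
    def k(a, b):
        # Pairwise Matern 5/2 covariance between the rows of a and the rows of b.
        r = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1) / ls
        return (1.0 + np.sqrt(5.0) * r + 5.0 * r**2 / 3.0) * np.exp(-np.sqrt(5.0) * r)
    return k

def conditional_mean_function(x, y, cov_func, mu=0.0, sigma=0.0, jitter=1e-6):
    # Solve (K + (sigma^2 + jitter) I) w = y - mu once, then reuse w for any query.
    K = cov_func(x, x)
    noise = (sigma**2 + jitter) * np.eye(len(x))
    weights = np.linalg.solve(K + noise, y - mu)
    return lambda x_new: mu + cov_func(x_new, x) @ weights

x = np.linspace(0.0, 1.0, 8)[:, None]   # 8 cell states in 1 dimension
y = np.sin(2.0 * np.pi * x[:, 0])       # function values on the cell states

predict = conditional_mean_function(x, y, matern52(ls=0.2))
smoothed = predict(x)                    # values at the training points
midpoint = predict(np.array([[0.5]]))    # value at a new point
print(smoothed.shape, midpoint.shape)
```

With sigma = 0 the conditional mean nearly interpolates the training values, while a larger sigma treats them as noisy and yields a smoother result.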
- fit(x=None, y=None)#
Fit the model from end to end.
- Parameters:
x (array-like) – The training cell states.
y (array-like) – The training function values on cell states.
build_predict (bool) – Whether or not to build the prediction function. Defaults to True.
- Returns:
self - A fitted instance of this estimator.
- Return type:
Object
- fit_predict(x=None, y=None)#
Compute the conditional mean and return the smoothed function values at the points x.
- Parameters:
x (array-like) – The training cell states.
y (array-like) – The training function values on cell states.
- Returns:
condition_mean - The conditional mean function value at each training point in x.
- Return type:
array-like
- predict(x)#
Predict the function at each point in x.
- Parameters:
x (array-like) – The new data to predict.
- Returns:
condition_mean - The conditional mean function value at each test point in x.
- Return type:
array-like
- prepare_inference(x, y)#
Set all attributes in preparation for fitting. It is not necessary to call this function before calling fit.
- Parameters:
x (array-like) – The cell states.
y (array-like) – The function values on the cell states.
- Returns:
loss_func, initial_value - The Bayesian loss function and initial guess for optimization.
- Return type:
function, array-like