Fit Tensor to PDB with Models

This example shows how to fit a \({\Delta\chi}\)-tensor to experimental PCS data using an NMR structure that contains many models. The tensor can be fit to ensemble averaged PCS values, or to individual models. An ensemble averaged PCS is the mean calculated PCS of all models. No structural averages are ever taken.

Data for calbindin D9k are used as in the previous example Fit Tensor to PCS Data.

Downloads

  • Download the data files 2bcb.pdb and calbindin_Er_HN_PCS.npc from here:

  • Download the script pcs_fit_models.py

Script + Explanation

Firstly, the standard preamble and loading of data.

from paramagpy import protein, fit, dataparse, metal

# Load data
prot = protein.load_pdb('../data_files/2bcb.pdb')
rawData = dataparse.read_pcs('../data_files/calbindin_Er_HN_PCS.npc')
mStart = metal.Metal()
mStart.position = prot[0]['A'][56]['CA'].position

The default method of fitting is to fit the tensor independently to each model of the PDB file. To achieve ensemble averaging behaviour, this requires setting the argument ensembleAverage to true within the fitting function. The default ensemble averaging behaviour is to average atoms with the same serial number in the PDB file. To manipulate ensemble averaging, you can specify the idx column of the input dataArray for paramagpy.fit.nlr_fit_metal_from_pcs(). The idx array contains common integers for corresponding atoms to be averaged. After fitting the Q-factor is calculated (with the ensembleAverage argument set to True), and ensemble averaging of the calculated data values is achieved with the function paramagpy.fit.ensemble_average()

#### Ensemble average fitting ####
parsedData = prot.parse(rawData)
[mGuess], [data] = fit.svd_gridsearch_fit_metal_from_pcs(
	[mStart],[parsedData], radius=10, points=10, ensembleAverage=True)
[mFit], [data] = fit.nlr_fit_metal_from_pcs([mGuess], [parsedData], ensembleAverage=True)

qfac = fit.qfactor(data, ensembleAverage=True)
dataEAv = fit.ensemble_average(data)

mFit.save('calbindin_Er_HN_PCS_tensor_ensemble.txt')

Fitting a separate tensor to each model of the PDB is the default behaviour of the fitting functions, the average of all fitted tensors is then returned. The model with the minimum Q-factor is then found by looping over the calculated data and sorting them by the calculated Q-factor.

#### Single model fitting ####
[mFitMod], [dataMod] = fit.nlr_fit_metal_from_pcs(
	[mGuess], [parsedData])
qs = {}
for mdl in set(dataMod['mdl']):
	qs[mdl] = fit.qfactor(dataMod[dataMod['mdl']==mdl])

minModel, minQfac = sorted(qs.items(), key=lambda x: x[1])[0]
minData = dataMod[dataMod['mdl']==minModel]

Finally we plot three sets of data:

  • The ensemble average fit calculated for each model (green)

  • The ensemble average of the calculated values of the ensemble fit (red)

  • The best fitting single model (blue)

Note that to calculate the ensemble average of the calculated values we use the function paramagpy.fit.ensemble_average(). This can take any number of arguments, and will average values based on common serial numbers of the list of atoms in the first argument.

# #### Plot the correlation ####
from matplotlib import pyplot as plt
fig, ax = plt.subplots(figsize=(5,5))

# Plot all models
ax.plot(data['exp'], data['cal'], marker='o', lw=0, ms=2, c='g', 
	alpha=0.5, label="All models: Q = {:5.4f}".format(qfac))

# Plot the ensemble average
ax.plot(dataEAv['exp'], dataEAv['cal'], marker='o', lw=0, ms=2, c='r', 
	alpha=0.5, label="Ensemble Average: Q = {:5.4f}".format(qfac))

# Plot the model with minimum Q-factor
ax.plot(minData['exp'], minData['cal'], marker='o', lw=0, ms=2, c='b', 
	alpha=0.5, label="Best Model ({0:}): Q = {1:5.4f}".format(
		minModel, minQfac))

# Plot a diagonal
l, h = ax.get_xlim()
ax.plot([l,h],[l,h],'-k',zorder=0)
ax.set_xlim(l,h)
ax.set_ylim(l,h)

# Make axis labels and save figure
ax.set_xlabel("Experiment")
ax.set_ylabel("Calculated")
ax.legend()
fig.savefig("pcs_fit_models.png")

Output: [pcs_fit_models.png]

../_images/pcs_fit_models.png