## Description

This dataset contains additional material related to the fitting for the article "Cost-effective Potential for Accurate Polarizable Embedding Calculations in Protein Environments" (preprint available at [https://doi.org/10.26434/chemrxiv.8126912](https://doi.org/10.26434/chemrxiv.8126912)). It contains the final fitted atom-centered point charges and atom-centered isotropic dipole--dipole polarizabilities of all the standard amino acids, including alternate protonation states and different N- and C-terminals, as well as the structures and electric potentials used in the parameter fitting. See the [preprint](https://doi.org/10.26434/chemrxiv.8126912) for details of the fitting procedure and of how the structures and electric potentials were produced.

## Files

This dataset contains the following files:

- readme.txt
- CP3.csv
- fitting_data.zip

The *CP3.csv* file contains the final atom-centered point charges and atom-centered isotropic dipole--dipole polarizabilities (both in au) of the standard amino-acid residues, including alternate protonation states. Also included are parameters for terminal residues where the N-terminal is charged, neutral, or terminated with an acetyl group, and the C-terminal is charged, neutral, or terminated with an N-methyl group. In each case, a one-letter prefix is added to the residue name as follows (XXX is the residue name):

- NXXX -- charged N-terminal
- CXXX -- charged C-terminal
- nXXX -- neutral N-terminal
- cXXX -- neutral C-terminal
- AXXX -- N-terminal is capped with an acetyl group
- BXXX -- C-terminal is capped with an N-methyl group

Note that the file contains duplicate entries corresponding to alternate residue and atom names that are used in different programs.

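
As an illustration of the prefix convention listed above, the following minimal Python sketch separates the terminal prefix from the base residue name. The helper name `split_terminal_prefix` is hypothetical and not part of the dataset. The sketch also assumes that every base residue name has exactly three characters, so that a four-character name starting with one of the six prefixes is a terminal variant; alternate residue names from other programs may not follow this pattern.

```python
# Prefix letters and their meaning, copied from the list above.
TERMINAL_PREFIXES = {
    "N": "charged N-terminal",
    "C": "charged C-terminal",
    "n": "neutral N-terminal",
    "c": "neutral C-terminal",
    "A": "N-terminal capped with an acetyl group",
    "B": "C-terminal capped with an N-methyl group",
}


def split_terminal_prefix(resname):
    """Return (terminal description or None, base residue name).

    Assumption: base residue names are three characters long, so only
    four-character names are treated as carrying a terminal prefix.
    """
    if len(resname) == 4 and resname[0] in TERMINAL_PREFIXES:
        return TERMINAL_PREFIXES[resname[0]], resname[1:]
    return None, resname
```

For example, `split_terminal_prefix("NALA")` gives `("charged N-terminal", "ALA")`, while a plain `"ALA"` is returned unchanged with `None` as the description.
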
The *CP3.csv* file contains the following columns:

- RESNAME -- residue name
- ATOMTYPE -- atom type
- q -- point charge
- axx -- xx-component of polarizability
- axy -- xy-component of polarizability
- axz -- xz-component of polarizability
- ayy -- yy-component of polarizability
- ayz -- yz-component of polarizability
- azz -- zz-component of polarizability

Since the polarizabilities are isotropic, the xx-, yy-, and zz-components are equal and the other components are zero.

The *fitting_data.zip* file contains directories named *AAA_XXX_BBB*, where *AAA* and *BBB* are the N- and C-terminal capping groups and *XXX* is the central amino-acid residue (see below for an explanation of the terminology used for capping groups and residues). Each directory contains the structures and electric potentials that were used in the fitting procedure (see preprint for details). The structures are provided in XYZ format (file extension *.xyz*) and the electric potentials in HDF5 format (file extension *.h5*). The names of the HDF5 files are additionally prefixed with *EP_*. The files are numbered in order of decreasing occurrence, with 1 being the most frequently occurring.

The HDF5 files contain five fields:

- "atom_coordinates" -- atom coordinates in au
- "atom_numbers" -- atomic numbers (in the same order as the atom coordinates)
- "atom_symbols" -- element symbols (in the same order as the atom coordinates)
- "electric_potential" -- electric potential in au on a set of grid points
- "grid_coordinates" -- coordinates in au of the grid points (in the same order as the electric potential)

## Terminology

AAA_XXX_BBB -- AAA and BBB are the N- and C-terminal capping groups and XXX is the central amino-acid residue

### Capping groups

Capping-group charges that are different from zero are indicated in square brackets.

- ACE -- acetyl
- NH2 -- amine
- NH3 -- protonated amine [+1]
- NME -- N-methyl
- COOH -- carboxyl
- COO -- carboxylate [-1]

### Amino-acid residues

Residue charges that are different from zero are indicated in square brackets.

- ALA -- alanine
- ARG -- arginine
- ASP -- aspartate [-1]
- ASH -- aspartic acid
- ASN -- asparagine
- CYS -- cysteine
- CYD -- deprotonated cysteine [-1]
- CYX -- disulfide-bonded cysteine (has an additional thiomethyl (SCH3) capping group)
- GLU -- glutamate [-1]
- GLH -- glutamic acid
- GLN -- glutamine
- GLY -- glycine
- HIS -- protonated histidine [+1]
- HID -- delta-protonated histidine
- HIE -- epsilon-protonated histidine
- ILE -- isoleucine
- LEU -- leucine
- LYS -- protonated lysine [+1]
- LYD -- lysine
- MET -- methionine
- PHE -- phenylalanine
- PRO -- proline
- SER -- serine
- THR -- threonine
- TRP -- tryptophan
- TYR -- tyrosine
- VAL -- valine