Comparative electrostatic analysis of adenylyl cyclase for isoform dependent regulation properties

The enzyme adenylyl cyclase (AC) plays a pivotal role in a variety of signal transduction pathways inside the cell, where it catalyzes the cyclization of adenosine triphosphate (ATP) into the second‐messenger cyclic adenosine monophosphate (cAMP). Among other roles, AC regulates processes involved in neural plasticity, innervation of smooth muscles of the heart and the endocrine system of the pancreas. The functional diversity of AC is manifested in its different isoforms, each having a specific regulation pattern. There is an increasing amount of data available concerning the regulatory properties of AC isoforms, however little is known about the interactions on a structural level. Here, we conducted a comparative electrostatic analysis of the catalytic domains of all nine transmembrane AC isoforms with the aim of detecting, verifying and predicting the binding sites of molecular regulators on AC. The results provide support for the positioning of the binding site of the inhibitory protein Giα at a pseudo‐symmetric position to the stimulatory Gsα binding site. They also provide a structural interpretation of the Gβγ interaction with ACs 2, 4, and 7 and suggest a new binding site for RGS2. Comparison of the small molecule binding sites on AC shows that overall they have high electrostatic similarity, but regions of electrostatic differences are identified. These could provide a basis for the development of novel compounds with isoform‐specific modulatory effects on AC. Proteins 2016; 84:1844–1858. © 2016 Wiley Periodicals, Inc.


INTRODUCTION
Cyclic adenosine monophosphate (cAMP) acts as a secondary messenger involved in transducing extracellular signals coming from first-messengers (e.g., hormones) by altering the intracellular environment of a cell. 1 Adenylyl cyclase (AC) and phosphodiesterases (PDEs) are the main regulators of cellular cAMP concentration. ACs catalyze the cyclization of adenosine triphosphate (ATP) to cAMP and are also responsible for the integration of signals conveyed by first-messengers, 2 while PDEs are responsible for cAMP signaling arrest. 3 There are ten known mammalian isoforms of AC, 4 nine of which are membrane-bound (ACs 1-9), while the tenth (AC10 or sAC) is soluble. The isoforms differ in their regulation pattern and tissue distribution and thus differ in their role in cellular signaling. The importance of ACs and their isoform-specific properties is underlined by recent approaches to develop isoform-specific drugs against ACs for a variety of different diseases. 5 All isoforms are expressed in the central nervous system (CNS), where they integrate different neurochemical signals to generate the appropriate cellular responses. 6 The AC-cAMP system is known to play pivotal roles in several processes involved in heart muscle contractility, 7 synaptic plasticity, for example in the NMDA-dependent long-term potentiation (LTP) of the hippocampus and LTP in striatum, 8 and in insulin regulation in the pancreas. 9 All transmembrane ACs consist of a N-terminal domain (N), two membrane spanning domains (M1, M2) and two cytoplasmic domains (C1, C2) of about 20 kDa. 10 The cytoplasmic domains are further subdivided into C1a, C1b, C2a, and C2b domains. The crystal structure of the catalytic unit, formed by cytoplasmic domains C1a and C2a, has been solved for the heterodimer formed between the C1a domain of canine AC5 and the C2a domain of rat AC2. 11,12 These crystal structures revealed the binding sites of the substrate, ATP, the stimulatory protein G s a, and forskolin, a small molecule activator of AC (Fig. 1). C1a and C2a form a pseudo-symmetric heterodimer, with the catalytic site lying at their interface. 12 G s a binds at a groove formed between the a'1 helix, the a'2 helix and a'3 helix of the C2a domain, and also interacts with the N-terminus of the C1a domain (Fig. 1). The switch II helix of G s a inserts rigidly into this groove and induces conformational changes by moving the helices that form the groove apart, and producing a relative rotation of the C1a and C2a domains of 78. 13 The C1b domain is highly divergent and is suggested to play an important role in isoform-specific regulation. The structure of C1b is not resolved in any crystal structure, but there is evidence that C1b interacts with a 10residue long region of the b2-strand of the C2a domain, which is located near the active site, and might play a role in the regulation of catalytic activity. 14 In fact, it has been suggested that C1b alters the G s a-dependent activity in ACs 5 and 7. 14, 15 The regulation of AC isoforms is complex, and involves several groups of different molecules. Nevertheless, the isoforms can be grouped according to their regulation patterns (Table I). 4 G proteins constitute the main family of AC regulators, and they play an important role in signal transduction across the cell membrane. All isoforms are stimulated by the GTP-bound form of the a-subunits of the stimulatory G protein, G s, and the olfactory G protein, G olf . 4 The G s a binding site has been mapped to the C2a domain and the N-terminus of C1a, as seen in the crystal structure (Fig. 1). 11 However, the reaction kinetics of G s a-activated AC point toward the possibility of a second low-affinity binding site. 14 The inhibitory G i a proteins are known to modulate the activity of ACs 1, 5 and 6. 4 A possible role in AC9 regulation cannot be excluded as recent studies show inhibitory effects following activation of G i,o -coupled D2L dopamine receptors. 16 While ACs 5 and 6 are inhibited by G i a in their G s a-activated states, inhibition of AC1 occurs only when it is stimulated by calmodulin (CaM) or forskolin. 17,18 In addition, AC1 is also inhibited by G z a and G o a. 4 All AC isoforms except AC9 are modulated by Gbg subunits of G proteins. Gbg regulation is distinct in each of the AC groups. Group I ACs (ACs 1, 3 and 8) are inhibited by Gbg, 19-21 Group II ACs (ACs 2, 4 and 7) are conditionally stimulated, with the activities of their G s a-stimulated states strongly enhanced, while Gbg alone shows no effect. 19,22 Possible binding sites on AC2 were mapped to the C2 domain 23,24 and the C1b domain. 20 Finally, group III ACs (ACs 5 and 6) are stimulated by Gbg subunits, and the interaction site is suggested to be the N-terminal domain. 25 Several small molecules are known to modulate AC activity, including P-site inhibitors and competitive ATPanalogues, 4,26 although these compounds have shown, at best, weak isoform specificity. Forskolin, a diterpene Cartoon representation of the crystal structure of the cytoplasmic domains C1a (red/yellow/orange) and C2a (blue/green/light green) of AC, which form the catalytic unit (taken from PDB: 1CJK, 12 which contains a C1a domain of canine AC5 and a C2a domain of rat AC2). AC domains C1b and C2b are not resolved in this structure, dashed lines in the figure identify the C termini of the C1a and C2a domains where the protein sequence continues to these missing domains. A flexible loop region is missing from the structure, as it could not be resolved by crystallography. Asterisks mark the truncated termini of this missing region. The a'1, a'2 and a'3 helices of C2a form a groove in which the a subunit of the stimulatory G protein G s binds. The crystal structure includes bovine G s a in this position (grey) bound via its switch II helix (dark grey). The a1, a2 and a3 helices of C1a form a similar groove in a pseudosymmetric position to this G S a binding groove. Also included in the structure is a sulfur derivative of AC's substrate ATP (green) and forskolin (cyan), a promotor of AC activity, which both lie at the interface of the C1a and C2a domains. Two metal ions (magenta), identified as Mn 21 and Mg 21 in the structure, are coordinated by the C1a domain and the phosphate groups of ATP. The G s a subunit includes a sulfur derivative of its substrate GTP (green) which coordinates a single metal ion (magenta). [Color figure can be viewed at wileyonlinelibrary.com] from the root of Coleus forskohlii, is a potent stimulator of ACs 1-8. 4 Its binding pocket has been resolved by X-ray crystallography, 11 and its stimulatory effects are suggested to come from the stabilization of the C1a-C2a interaction. 10 There is a conserved leucine residue in the forskolin binding pockets of ACs 1-8, whereas the forskolininsensitive AC9 has a tyrosine in this position (Tyr1082, S1 Alignment). Mutation of this tyrosine to leucine has been shown to induce forskolin-sensitivity in AC9. 27 Other additional regulators have been reported in recent years. 4 Most notable is the regulator of G protein signaling protein 2 (RGS2), which has been reported to interact with the C1 domain and inhibit AC5 activity. 28 It has also been reported that RGS2 is recruited to the membrane when co-expressed with AC isoforms, and that a direct and specific interaction is found between RGS2 and ACs 2, 3, 5, and 6. 29 In this work, we bring further understanding to the isoform-specific regulation of AC with respect to its structural and physico-chemical properties. By comparison of the electrostatic properties of different AC isoforms, we find that similarities among the groups of isoforms that show similar regulation patterns provide evidence for their likely sites of interaction with regulatory proteins. We also obtain insights that provide a basis for the design of small molecules that can target AC with isoform specificity. The comparisons of electrostatic properties were made using the Protein Interaction Property Similarity Analysis (PIPSA) approach, 30,31 where the pairwise similarity in the protein electrostatic potentials is calculated in pre-defined analysis regions. This approach has previously been used to investigate the interaction properties of a wide variety of protein families, including blue copper proteins 32 and human Rab GTPase proteins, 33 where it has been shown to offer insights that cannot be obtained from analysis at the sequence level alone.

Generation of structural models of AC isoforms
Homology modeling was used to generate structures for all 9 membrane-associated AC isoforms of mouse (Mus musculus), using sequences taken from UniProt (S1 Table). The crystal structure chosen as a template (PDB: 1AZS) contains the C1a domain of canine AC5 dimerized with the C2a domain of rat AC2, with a forskolin derivative located at their interface. An activated bovine G s a subunit, with a GTP analogue, is also present, bound to the C2a domain. 11 This structure was chosen as the template structure due to its higher resolution (2.30 Å ) compared to other available structures. The chimeric AC dimer in the 1AZS structure does not contain a bound ATP molecule, so a second crystal structure (PDB: 1CJK), in which a sulfur-containing analogue of ATP and coordinating Mg 21 and Mn 21 ions are bound, was used to identify the ATP binding site (Fig. 1). To ensure that the G s a binding groove on AC remained in its bound conformation during the modeling and refinement process, the AC models were generated in complex with bovine G s a, which was then removed from the structure prior to the electrostatic analysis.
First, a multiple sequence alignment (MSA) of mouse ACs 1-9 and chains A and B of the crystal structure was performed using the Clustal Omega web server 34 (S1 Alignment). The template structure was missing atomic coordinates for a loop between the b3'-strand and the a3'-helix of the C2a domain (S1 Alignment, S1 Table). Modeller 9.10 35,36 was used to generate the homology models for mouse ACs 1-9 based on the sequence alignment obtained from Clustal Omega. The "automodel" procedure was used to generate 100 initial structural models of each AC isoform. The loop region that is not resolved in the template structure was modeled without  any structural constraints. Variability in the modeled loop conformations was evaluated for each of the isoforms, since large structural variability could influence the surrounding electrostatic potentials, thus making comparisons between isoforms difficult. For each isoform, the non-loop backbone atoms of the 100 generated models were structurally aligned, and the RMSD of the loop backbone atoms, relative to their average positions, was calculated (S1 Table). ACs 1, 5, 6, 8 and 9, which have comparably short loops (4-7 residues, S1 Table, S1 Alignment), showed less variation in their loop conformations, and for these isoforms, the highest scoring models, as defined by the DOPE score, 37 were used for electrostatic analysis. ACs 2, 4 and 7 have loops of 9-10 residues length, and greater loop variability was seen in the models generated for these isoforms (S1 Table). The loop region has been reported to play a role in the binding of Gbg to AC2. 24 Therefore, for these isoforms, additional loop refinement was conducted using the highest scoring initial models as starting points. Loop refinement using the slow simulated annealing molecular dynamics method of Modeller generated 50 alternative loop structures (S2 Table), these were again assessed by their DOPE scores, and the highest scoring structure was used for the electrostatic analysis. AC3, which has the longest loop with a length of 17 residues, showed a high degree of variability in the loop conformations of the initially generated models (S1 Table, S1 Alignment). Loop refinement was also performed for the highest scoring initial AC3 model, which again resulted in a set of highly variable loop models (S2 Table). The highest scoring model was used for electrostatic analysis, however given the lack of convergence of the modeled structures of this loop, the comparisons of the electrostatic properties of AC3 and the other isoforms in this region should be made with care. The final models of each isoform were assessed using ProCheck, 38 to ensure there were no large violations in expected dihedral angle values.
A second set of models was generated with no structural optimization. The "very_fast" option was chosen in Modeller, which omits the initial randomization of sidechain torsions, and any simulated annealing molecular dynamics refinement steps. This approach results in only a single modeled structure for each isoform. These models were used in the analysis of the small molecule binding sites because, despite sequence conservation, significant differences in the side-chain orientations of equivalent residues in these tight binding pockets were observed in the first set of models. These differences would affect the calculation of electrostatic similarity in these regions by making the similarity values lower than they should be. To ensure that the ATP binding site was modeled in a conformation that is available for binding, a crystal structure (PDB ID: 1CJK) that contains a bound ATP analogue was used as the structure template for the creation of these models.

Electrostatic analysis
The protonation states of titratable residues in the modeled AC isoforms were assigned using the PROPKA method 39 in PDB2PQR 40,41 at pH 7.4. Atomic partial charges and van der Waals radii were taken from the Cornell et al. Amber force field. 42 For each isoform, the linearized Poisson-Boltzmann equation was solved using the APBS 1.4 solver 43 to compute the electrostatic potential on a protein-centered grid of 129 points in each dimension, spaced at 1 Å . An ionic strength of 100 mM, a mobile ion radius of 1.5 Å , an interior protein dielectric constant of 1, and a solvent dielectric constant of 78 were assigned. The dielectric boundary between the protein cavity and the solute was defined by a smoothed molecular surface, 44 and a boundary condition was set such that, at the grid boundaries, the electrostatic potential matched that of a single Debye-H€ uckel sphere at the protein center.
The computed electrostatic potentials surrounding each AC isoform were compared using the Protein Interaction Property Similarity Analysis (PIPSA) method. 30,31 In the PIPSA method, potential fields are compared at points in a skin surrounding the proteins. The skin surrounding each protein was defined as the volume bound by two accessible surfaces, described by the centers of probe spheres of radii 2 and 5 Å rolled along the van der Waals surface of the protein. The proteins are first structurally aligned, and then an analysis region is defined as either the whole skin or the intersection of the skin with a sphere of a given radius centered on an interaction center.
First, a global PIPSA analysis was performed. Here, the analysis region included the whole AC structure, allowing the entire skins surrounding each isoform to be compared. For the analysis of regulatory protein interaction sites, PIPSA analyses were performed for skin regions inside spheres of radius 10 Å , centered on each solvent accessible Ca atom of AC5. For the small molecule binding pockets, PIPSA analyses were performed with spheres centered at the geometric centers for the molecules in their bound positions. The radii of these spheres were chosen such that they fully enclosed the atoms of the molecules. For forskolin, a radius of 8 Å was used; while for ATP, a radius of 10 Å was used. Additionally, for the ATP binding pocket, subregions of the pocket were defined using spheres of 6 Å radius centered on the geometric centers of the triphosphate, ribose and adenine moieties.
The Hodgkin similarity index 45 was used to define the similarity between two potentials: where p 1 ; p 2 ð Þ is the scalar product of the potentials surrounding proteins 1 and 2, in the analysis region, which is calculated as: where i, j and k are the three-dimensional spatial coordinates, and / 1 i; j; k ð Þ is the potential at point i; j; k ð Þ on the grid for protein 1. The similarity index runs from 21, for completely anti-correlated potentials, to 1 for identical potentials. At 0, there is no correlation between the two potentials. Alternatively, a distance measure, D 12 , between the potentials was defined as: which runs from 0 for identical potentials to 2 for anticorrelated potentials. Pairwise similarity and distance matrices were created for all interaction regions, and were used for subsequent clustering and scoring.
To investigate the sensitivity of PIPSA analysis of AC isoforms to the choice of protein dielectric constant and solvent ionic strength, we performed initial PIPSA analyses of the entire molecular skin of the AC isoforms, and the molecular skin in the region of the groove formed by the a1, a2 and a3 helices of the C1a domain ( Fig. 1), while varying these values. Using ionic strengths of 100 and 150 mM, and protein dielectric constants of 1, 2, 4, and 8, we observed negligible changes in the calculated SI 12 values (Supporting Information Figs. S1 and S2).

Clustering of electrostatic similarity matrices
The similarity matrices obtained from PIPSA analyses centered on the Ca atom of each AC surface residue were clustered using an affinity propagation algorithm, 46 adapted from the Scikit-learn Python library, 47 using a damping factor of 0.5 and no preference list. The clustering was performed using a feature vector obtained from the upper triangle of the similarity matrices and clustering was performed until there was no change in the number of clusters obtained during 15 iterations of the algorithm. The members of the resulting clusters were mapped to the residues on which the PIPSA analysis was centered, and the mean pairwise isoform similarity was examined to identify clusters of residues that showed similarity patterns that matched the known regulatory patterns of AC (Table I).

Similarity matrix scoring
Whereas clustering is unsupervised and generates a result without any specific objective, a more direct method is often more useful where prior knowledge exists. In the case of AC regulation, some experimental data on regulation patterns are available. Therefore, a procedure to include these data was developed to allow for a direct search for regions where the electrostatic potential similarity pattern matches the grouping of similarly regulated AC isoforms.
We defined two scoring functions that could be used to identify which regions of AC showed conserved electrostatic potential patterns that match the regulation of AC by a given regulatory protein. Using the data in Table  I, we identified groups of similarly regulated AC isoforms. For example, when investigating the inhibition of ACs by G i a, ACs 1, 5, and 6 formed a group. Both scoring functions were used to assign similarity scores to each atom on which PIPSA analysis was centered. The mean scoring function, M S , was developed to calculate the difference between the mean of the pairwise similarities of group isoforms with each other (intra-group similarity) and the mean of the pairwise similarities of group isoforms with nongroup isoforms (inter-group similarity). The value of M S is given by where SI 12 h i intra is the mean intra-group and SI 12 h i inter the mean inter-group pairwise similarity. The M S score ranges from 21 to 1, whereby a value of 1 shows that, at this position, the electrostatic potential surrounding all similarly regulated isoforms is identical, but is not similar in any of the other isoforms.
If all isoforms forming a group are highly similar in a certain region, it is possible that a high M S score could still be achieved, even if some of the nongroup isoforms also show a high similarity to the group, provided other nongroup isoforms have a very low similarity. To account for these cases, a compactness score C S , based on the silhouette coefficient, 48 was employed. This score is defined as where D 12 h i intra and D 12 h i inter are the mean intra-and inter-group distances, respectively. The silhouette coefficient evaluates the compactness and the degree of separation between clusters formed by the grouping. The score ranges from 21, showing bad clustering, to 1 for dense clusters. A score around 0 denotes that there are possibly overlapping clusters. Unlike the simple mean score M S [Eq. (4)], the compactness score C S does not take the magnitude of distances into account, but it is more sensitive to the correct grouping, whereas in the M S score, high similarities can mask a bad grouping.

RESULTS AND DISCUSSION
Global comparison of the electrostatic potentials surrounding AC isoforms reconstructs the sequence-based similarity A global comparison of AC isoform electrostatic potentials was made by comparing the electrostatic potentials on the entire skin surrounding each protein.
Electrostatic similarity was computed after structural alignment to create a matrix containing the pairwise similarity between all isoforms. This matrix was then used to generate a dendrogram of electrostatic potential similarities (an epogram), which was compared to the sequence alignment-based phylogenetic analysis obtained after multiple sequence alignment with the Clustal Omega webserver. Figure 2 shows that the phylogeny obtained from sequence alignment is reproduced in the epogram. For instance, the important classes of AC isoforms (e.g., Group II: ACs 2, 4 and 7; Group III: ACs 5 and 6) are distinctly grouped in both the sequence-based and electrostatics-based comparisons. This confirms that PIPSA detects sequence-based information, while taking into account the spatial structure of the proteins.
The G s a binding site on AC is correctly identified to be isoform-unspecific using local PIPSA analysis The interaction between G s a and ACs has been extensively investigated, and it is known that G s a stimulates the catalysis of ATP cyclization by all AC isoforms through insertion of its switch II loop into the binding groove on the C2a domain of AC. The known locations of G s a and ATP binding on AC thus provide the opportunity to test whether these positions can be confirmed by electrostatic similarity analyses. Clustering and scoring of PIPSA similarity matrices centered on each solvent accessible Ca atom was performed to find regions of high electrostatic similarity across all AC isoforms (Fig.  3). A cluster was observed that contained similarity matrices showing high similarity across all AC isoforms. Mapping the PIPSA analysis centers on to residues on the surface of AC5 [ Fig. 3(B)] shows that within the clusters, the G s a binding site is correctly identified. Furthermore, the region of the ATP binding pocket where the phosphate atoms and coordinated metal ions bind, and the center of the membrane facing side of AC, are also included in the cluster.
The M S scoring function was also used to identify regions of high similarity in all AC isoforms. As all isoforms were included in the group in this analysis, the second term in Eq. (4) was zero, simplifying the score to half the mean pairwise similarity for all isoform pairs. Figure 3(C) shows that the M S score also identifies the G s a binding site, the catalytic region of the ATP binding pocket and the membrane facing surface.
Examining the electrostatic potentials of each isoform directly (Fig. 4), the overall electrostatic conservation of the G s a binding groove across all isoforms can be seen, with the groove showing a negative potential on the catalytic face and a positive potential on the membrane facing side. This distribution complements the electrostatics of the switch II helix of G s a (see Supporting Information Fig. S3).
Electrostatic similarity supports binding of G i a pseudo-symmetrically to G s a The Inhibitory G protein G i a is known to inhibit Group III isoforms (ACs 5 and 6) and the Group I  isoform AC1 (Table I). Similarity scoring of AC isoforms was performed to identify regions of AC where these three isoforms showed high electrostatic similarity, but the potentials surrounding the other isoforms differed [ Fig. 5]. The M S score showed that while overall there was little specific correlation of the electrostatic potentials of the G i a-sensitive AC isoforms, two regions showed similar electrostatic patterns in these isoforms, while other isoforms showed divergence [ Fig. 5(B)]. The first of these (shown in a box in the middle panel of Figure 5, and expanded in the lower panel) lies in the groove formed between the a1 and a2 helices and the a3 helix of C1a, which is pseudo-symmetric to the G s a binding site on C2a. The second site [right side of the top panel and the left of the middle panel in Fig. 5(B)] includes the loop region that was missing in the template structure, which makes analysis of this region difficult.
The C S score, which scores how specific the chosen group of isoforms is in a region, but does not account for the difference in magnitude of the intra-and intergroup distances, also scores the site pseudo-symmetric to the G s a binding groove highly [ Fig. 5(C)]. This score also shows a high grouping of ACs 1, 5 and 6 at the G s a groove, however the very high electrostatic similarity of all isoforms in this region suggest it can be discounted as the site of G i a binding. Interestingly, however, it has been shown that high concentrations of G i a were able to modulate the activity of AC2, which normally does not interact with G i a, and it was assumed that binding of G i a at the G s a binding groove occurs. 18 The C S score also suggests that the similarity grouping of ACs 1, 5 and 6 in the loop region with the missing template structure is less clearly defined than at the pseudo-symmetric groove.
Mutation studies of the inhibition of AC5 by G i a have previously suggested that a number of residues at the pseudo-symmetric groove are important for their interaction. 49 The lower panel of Figure 5(A) shows that these residues lie mainly on the membrane facing side of AC. Similarity scoring shows that isoform specificity is observed around these important residues [ Fig. 5(B,C)], further supporting this groove as the likely interaction site of G i a.
The electrostatic potentials of ACs 1, 5, and 6 [ Fig.  4(B,F,G)] show that the pseudosymmetric groove lies in a large negative lobe of the potential, in contrast to the G s a binding groove which presents a positive potential on the membrane face. The electrostatic potential of G i a    (Supporting Information Fig. S3) shows why the pseudosymmetric groove may be more favorable for G i a binding. Its switch II helix is surrounded by a largely positive potential, with the negative potential patch seen on the upper side of the electrostatic potential of the G s a switch II helix replaced by a small positive patch.
The flexible loop region in the C2a domain may be important for binding of Gbc to group II AC isoforms Gbg exerts diverse effects on AC regulation. Group I ACs (ACs 1, 3, and 8) are inhibited by Gbg, group III ACs (ACs 5 and 6) are stimulated, whereas Group II ACs (ACs 2, 4, and 7) are conditionally stimulated in their G s a-activated state. The distinct effects could arise from one of two scenarios. Either the binding site of Gbg is different in each group of ACs, leading to different effects, or the binding site is the same but the mechanisms by which the binding affects the activity of the enzyme are distinct. Experiments have suggested that different regions are important for the interaction of Gbg with Group III isoforms (ACs 5 and 6) and AC2, with the N-terminal domain thought to be the site of interaction on ACs 5 and 6, 25 and the C1b and C2a domains thought to be the interaction sites on AC2. 20,23,24 Two regions on the C2a domain of AC2 have been identified as interacting with Gbg. The first is a short KF loop, 23 which is conserved across all Group II ACs and AC3, while in ACs 5, 6 and 8, the lysine residue is mutated to Identification of regions of conserved electrostatic potentials in G i a inhibited AC isoforms 1, 5, and 6, mapped onto the surface of AC5 and shown in two orientations and at the pseudo-symmetric groove (A), using the M S score (B), and the C S score (C). The regions in the AC structure discussed in the text are labeled (G s a: G s a binding groove; PG: pseudosymmetric binding groove; KF: K(R)F loop region; FL: flexible loop region for which no template structure was present in modeling). Residues near the pseudosymmetric groove, whose mutation was shown by Dessauer et al. 49 to affect inhibition of AC5 by G i a, are labeled. [Color figure can be viewed at wileyonlinelibrary.com] a similarly charged arginine (Alignment S1). The second region is a QEHA motif, 24 which lies on the flexible loop that was not present in the template structure used to generate our AC models.
Clustering of residues based on pairwise electrostatic distances generated four clusters that surround the KFloop and the flexible loop region [ Fig. 6(A,B)]. Both regions are located on C2a on the opposite side to the G s a binding groove. The cluster which lies on the upper side of the KF-loop [ Fig. 6(B), red] shows high similarity between ACs 2, 4 and 7 [ Fig. 6(C)], while the other three, which lie below the KF loop and include the flexible loop region, showed a high similarity between AC2 and AC4, but lower similarity to AC7 [ Fig. 6(D)]. Notably, within the red cluster, all isoforms other than ACs 2, 4 and 7 appear to be similar to one another [ Fig. 6(C)]. This is particularly notable in the case of ACs 5 and 6, which also exhibit similarity in the other three clusters [ Fig. 6(D)].
Directly comparing the electrostatic potentials of the isoforms [ Fig. 4(C,E,H)], it is apparent that ACs 2, 4 and 7 display a conserved electrostatic pattern of a negative electrostatic potential over the flexible loop region and a positive potential over the KF loop, suggesting this may be important for Gbg recognition. Indeed, the red cluster [ Fig. 6(B)], which has the highest degree of specific similarity for ACs 2, 4 and 7, includes the lysine residue of the KF loop. This pattern is also evident to a certain degree for AC3, which is inhibited by Gbg, although the locations of the positive and negative isopotential contour lobes appear somewhat different in this case. AC3 is the only other isoform that has includes the KF loop motif. ACs 5, 6 and 8 have an arginine in place of the lysine, while AC1 and AC9 have phenylalanine and aspartate residues. In addition, the AC 1, 2, 4, 7, and 9 sequences include a lysine residue in a position two places N-terminal of the KF loop (Alignment S1). Only ACs 2, 4 and 7 have lysines in these two positions, which appears to define the presence and location of the positive potential lobe close to the KF loop of these isoforms. In comparison, all other isoforms have large negative potential regions that lie above the KF loop position (Fig. 4), which accounts for their similarity. The pairwise similarity matrices [ Fig. 6(C,D) and Supporting Information Fig. S4] do not identify any similarity between AC3 and the group II ACs, due to the different position of this positive lobe. Also, the flexible loop of AC3 is considerably longer than that of ACs 2, 4, and 7 (Supporting  Information Table S1), therefore its increased flexibility may prevent binding of Gbg to this region, or, should binding occur, inhibit the propagation of an allosteric signal to the catalytic site.
The lower electrostatic similarity of AC7, compared to ACs 2 and 4, in the clusters that contain the flexible loop region is caused by a large negative potential bulge that traverses the membrane facing side of the a'3 helix of the C2a domain of ACs 2 and 4, which is absent in AC7 [ Fig. 3(C,E,H)]. The green cluster [ Fig. 6(B)] has the largest overlap with this negative potential region, and is identified as having the lowest mean pairwise similarity with ACs 2 and 4 (Supporting Information Fig. S4B). This suggests that the approach of Gbg toward ACs 2, 4 and 7 during binding occurs in a direction parallel to the membrane.
This result supports the previously described experimental data showing the importance of both the KF loop and the QEHA motif, which is located on the flexible loop of AC2, to the binding of Gbg. 23,24 It appears that these two independently reported binding sites describe the same interaction site for Gbg. Due to the spatial proximity of the two binding sites, the flexible loop region might be responsible for the generation of Identification of the potential binding site of Gbg on AC isoforms 2, 4, and 7 in the region of the KF-loop and the flexible loop (FL) on the C2a domain. Viewed with the membrane-facing side (Mem) to the right, and the catalytic face to the left (Cat) (A). Clustering identified a cluster with high similarity between ACs 2, 4, and 7 (red) and three clusters with high similarity between ACs 2 and 4 (blue, green and yellow) (B). The mean pairwise isoform similarity (SI 12 ) of the red cluster was calculated (C) along with the mean pairwise similarity (SI 12 ) across the blue, green and yellow clusters (D) (The individual mean pairwise similarities (SI 12 ) of these clusters are shown in Figure S2). two distinct clusters that suggest a distinct binding mechanism of Gbg on AC7.
A possible binding site of RGS2 on the AC C1a domain can be detected by similarity matrix scoring RGS proteins are known to regulate G protein signaling by acting as GTPase activating proteins (GAPs) on Ga subunits. 50 They, therefore, play a pivotal role in the fast termination of G protein signaling, and they are important in a variety of signaling networks. RGS2 has been shown to inhibit the catalytic G s a-stimulated ACs 3, 5, and 6. 51 However it does not appear to show direct GAP activity on G s a, 29,50 and direct binding of the Nterminal residues to the C1a domain of AC5 has been suggested. 28 The M S score was used to search for possible binding regions of RGS2 on ACs 3, 5 and 6 ( Fig. 7). This group of isoforms resulted in a lower similarity scoring of the C2a domain of AC, compared to the C1a domain, in agreement with the suggested interaction of RGS2 with the C1a domain of AC5. 28 Most parts of the C1a domain are also low-scoring, however one high scoring region was identified that includes the membrane facing sides of the a2 and a3 helices of the pseudosymmetric groove [ Fig. 7(B)]. Binding of RGS2 to AC5 has been shown not to be affected by mutation of the Glu411 and Leu472 residues of AC5 to alanine, suggesting RGS2 does not bind to this region 28 [see Fig. 5(A) for the location of these residues]. A second, higher scoring region, was identified on the membrane facing side of the C1a domain [ Fig. 7(B), boxed area]. The C S score also identifies the region on the membrane facing side as a position that shows a good grouping of ACs 3, 5 and 6 [ Fig.  7(C)]. Interestingly, this suggested binding site for RGS2 also allows the possibility of a ternary complex between AC, RGS2 and G s a, due to close but not overlapping proximity to the native binding site of G s a. Roy et al. 29 reported a weak affinity of RGS2 to AC6, and suggested that stabilization of this interaction may be due to the formation of a ternary complex with G s a. Our findings give further support to this hypothesis, however it should be noted that RGS2 is able to reduce the activity of forskolin activated ACs 3, 5 and 6, in the absence of G s a. 51 Electrostatic properties of the forskolin binding site differ widely among the isoforms and do not correlate with forskolin binding affinity Forskolin, a potent activator of ACs 1-8, binds to a groove formed at the interface between C1a and C2a domains, next to the active site. 11 Due to its stabilizing effects on the dimer structure, forskolin greatly enhances the enzyme activity. The forskolin structure has been used for designing allosteric modulators for AC, most being stimulatory. 52 PIPSA analysis was performed in a sphere of radius 6 Å centered on the geometric center of forskolin in the 1CJK crystal structure [ Fig. 8(A)]. The analysis showed that the forskolin binding pockets of all AC isoforms, except ACs 3 and 7, had very similar electrostatic properties [ Fig. 8(B)]. The AC7 binding pocket displayed a negative electrostatic similarity to all isoforms except AC3, suggesting the electrostatic potential of these isoforms are anticorrelated with that of AC7. Indeed this can be seen when visualizing the electrostatic potentials directly (Supporting Information Fig. S5). The forskolin binding pocket of AC7 has a largely positive electrostatic potential, while all others, except AC3, are largely negative. This is particularly true for ACs 5, 6, and 8, with which AC7 has the most negative similarity indices.  AC9 is the only isoform of membrane-bound AC whose activity is not stimulated by forskolin, however the differences we observed in the electrostatic potentials of the forskolin binding pockets of AC isoforms, do not correlate with this regulation pattern. The lack of sensitivity of AC9 to forskolin has been shown to be due to the mutation of a leucine residue in the forskolin binding pocket, conserved across all other isoforms, to tyrosine. Mutating this tyrosine back to leucine has been shown to induce forskolin sensitivity in AC9. 27 This, along with the electrostatic similarity we observe between AC9 and many of the forskolin simulated isoforms, suggests that the differing regulation of AC9 by forskolin is due to steric effects of the larger tyrosine residue, rather than any electrostatic effects that may affect the binding of forskolin.
Overall, it appears that the electrostatics of the forskolin binding site do not play a large role in determining its affinity for forskolin. Indeed, the forskolin binding cavity is largely hydrophobic, with only two charged residues identified in the 1CJK structure (Supporting Information Fig. S6A), namely Asp505 and Lys896. Neither of these interacts directly with forskolin, and only Asp505 is conserved across all isoforms. The lysine residue, which lies on the a'1 helix of the C2a domain, is present only in ACs 2, 4 and 7, but as the side-chain amino group lies some way from the forskolin atoms, is solvent exposed (Supporting Information Fig. S6A), and therefore it alone does not have a large effect on the electrostatic potential in the vicinity of forskolin, as shown by the overall negative potential of the binding pockets of ACs 2 and 4. In the modeled structure of AC7 this lysine lies close to the arginine residue on the a4 helix of the C1a domain. AC7 is the only isoform to have both of these positively charged residues (Alignment S1), which accounts for the overall positive potential in its forskolin binding pocket. It should be noted that the positive potential of the AC7 pocket is likely to be slightly overestimated. This is because these models were created without refinement, to prevent small side-chain differences masking similar electrostatics in the confined binding pocket, thus the side-chain nitrogen atom of the lysine residue and the closest guanidinium nitrogen of the arginine lie 3.2 Å from each other in the AC7 model (Supporting Information Fig. S7B). In the AC3 structure, which also has a more positively charged forskolin binding pocket, there is a lysine residue in the position on the a4 helix where AC7 has an arginine (Alignment S1). This lysine is also present in AC8, however, in AC3 it lies close to a second lysine on the a5 helix of C1a (Supporting Information Fig. S7A), but with a nitrogen-nitrogen separation of 6.8 Å . PIPSA analysis was performed in a spherical region of radius 6 Å surrounding the geometric center of forskolin (A) resulting in a matrix of pairwise similarity indices (SI 12 ) for the electrostatic potentials of each AC isoform within the forskolin binding pocket (B). A similar analysis, within a spherical region of radius 9 Å surrounding the geometric center of ATP within its binding pocket (C), was also performed (D). The ATP binding pocket was further divided into three subregions: of radius 5 Å , centered on the geometric center of the atoms of the triphosphate group (cyan); of radius 4 Å , centered on the geometric center of the ribose moiety (magenta); and of 4 Å , centered on the geometric center of the atoms of the adenine moiety (yellow) (E). PIPSA analysis was then performed to generate pairwise similarity (SI 12 ) matrices for these subregions (F-H, respectively).
Isoform specificity in the electrostatic properties of the adenine binding part of the ATP binding site Although ATP binds to all AC isoforms, as is required for catalytic activity, finding regions of isoform specificity within the ATP binding pocket could be of interest for the design of compounds that selectively target AC isoforms. Since direct binding to the enzyme's active site is an effective way of modulating enzymatic activity, many small molecules targeting AC, which are based on the ATP structure, have been reported, 26 . However, these compounds often show poor isoform specificity, and are therefore of limited therapeutic use. Therefore, we investigated the electrostatic landscape of the ATP binding pocket to evaluate the possibility of creating isoform specific molecules to target this site.
PIPSA analysis was performed on the ATP binding site, within a spherical region of radius 9 Å , centered on the geometric center of the ATP heavy atoms in the 1CJK crystal structure [ Fig. 8(C)]. The analysis showed that overall, the ATP binding site displays a very high similarity across all AC isoforms [ Fig. 8(D)]. To further investigate the isoform similarity of the binding site, it was subdivided into three smaller regions of radii 5, 4 and 4 Å , centered on the triphosphate, ribose and adenine atoms of ATP, respectively [ Fig. 8(E)]. The electrostatic potential in the triphosphate region was found to be highly conserved across all AC isoforms [ Fig. 8(F)]. In all isoforms, the electrostatic potential in this region is dominated by a pair of arginine residues and a lysine, that interact directly with the oxygen atoms of the phosphate group, and glutamate and aspartate residues that interact with the phosphates via a pair of coordinated metal ions, which were not included in the electrostatic calculations. This results in a potential that has a positive character on one side of the phosphate atoms and a negative character on the other (Supporting Information Fig. S8).
In contrast to the phosphate-binding side of the ATP pocket, the adenine-binding region displays less conservation of electrostatics across isoforms [ Fig. 8(H)]. While the pairwise similarity indices between many of the isoforms remains high, divergence is seen in some cases. In particular, AC7 appears to have significantly different electrostatic properties to ACs 8 and 9, and to a lesser extent AC4. Looking directly at the electrostatic potentials of the AC isoforms in this region (Supporting Information Fig. S8), the reason for these differences is clear. In the AC8 and AC9 binding pockets, the adenine atoms lie within a negative potential region, while in AC7 they lie in a largely positive region, with the only exception being a small negative patch due to the presence of an aspartate residue, which is conserved across all isoforms, that interacts with the distal aromatic amine of ATP (Supporting Information Fig. S6B). The electrostatic potentials of the other isoforms lie somewhere between these two extremes. PIPSA analysis was also performed centered on the ribose atoms of ATP [ Fig. 8(G)]. As this region in part overlaps with the phosphate-centered and adenine-centered regions, the isoform similarity patterns observed showed intermediate behavior, and appeared to show no additional information.

CONCLUDING REMARKS
As the formation of favorable electrostatic interactions guides the recognition of protein-protein complexes, it is reasonable to assume that proteins that interact with the same binding partners should show similar electrostatic properties along the binding interface. The nine membrane-bound isoforms of the enzyme adenylyl cyclase show a diverse pattern of regulation by a number of regulatory proteins. 4 By analyzing the conservation of the electrostatic potentials surrounding similarly regulated isoforms, we have been able to provide insights into the binding of regulatory proteins to these isoforms.
We have extended the previously reported PIPSA approach 30,31 for comparing the electrostatic properties of homologous proteins by incorporating experimental data on known protein-protein interactions. This approach allows the identification of regions on the protein surfaces where the electrostatic potentials are conserved across similarly regulated proteins, while showing divergence in other proteins. It enabled us to identify the known interaction site of the protein G s a, which stimulates all AC isoforms, as well as the binding site of the substrate of the enzyme. The previously suggested binding site of G i a, which inhibits the activity of ACs 1, 5 and 6, on the C1a domain of AC was also confirmed by our approach. We also identified residues that have been shown to be responsible for the binding of Gbg to ACs 2, 4 and 7. For the more recently discovered regulatory RGS2, we predicted a binding region on the C1a domain, close to the G s a binding site. This site allows the possibility of the formation of a ternary complex between AC, G s a and RGS2, as has been suggested previously. 29 PIPSA analysis of the small molecule binding pockets of AC showed that, overall, there was a high degree of conservation between isoforms. Regions of electrostatic differences were identified that could be exploited for the design of compounds that act on AC in an isoform specific manner.
In summary, in this work we have introduced a new approach to using PIPSA that integrates experimental and computational data for the comparative analysis of protein electrostatic properties. The results provide a basis for future work on the isoform-specific regulation of AC, and the design of novel compounds that target AC isoforms.