Identification of Interaction Hot Spots in Structures of Drug Targets on the Basis of Three‐Dimensional Activity Cliff Information

Activity cliffs are defined as pairs or groups of structurally similar or analogous compounds that share the same specific activity but have large differences in potency. Although activity cliffs are mostly studied in medicinal chemistry at the level of molecular graphs, they can also be assessed by comparing compound binding modes. If such three‐dimensional activity cliffs (3D‐cliffs) are studied on the basis of X‐ray complex structures, experimental ligand–target interaction details can be taken into account. Rapid growth in the number of 3D‐cliffs that can be derived from X‐ray complex structures has made it possible to identify targets for which a substantial body of 3D‐cliff information is available. Activity cliffs are typically studied to identify structure–activity relationship determinants and aid in compound optimization. However, 3D‐cliff information can also be used to search for interaction hot spots and key residues, as reported herein. For six of seven drug targets for which more than 20 3D‐cliffs were available, series of 3D‐cliffs were identified that were consistently involved in interactions with different hot spots. These 3D‐cliffs often encoded chemical modifications resulting in interactions that were characteristic of highly potent compounds but absent in weakly potent ones, thus providing information for structure‐based design.

Activity cliffs are formed by structurally similar or analogous compounds that share the same specific activity but have significant differences in potency (1,2). So defined activity cliffs are of prime interest for medicinal chemistry because they often reveal determinants of structure-activity relationships (SARs) and provide valuable information for compound optimization (2,3). For a meaningful and systematic analysis of activity cliffs, it is essential to clearly define molecular similarity and potency difference criteria (2,3). Activity cliffs can be studied in two or three dimensions on the basis of molecular graphs or three-dimensional (3D) compound structures, respectively (3). A conventional way of studying activity cliffs based on molecular graphs is the use of fingerprints as molecular representations combined with the calculation of Tanimoto similarity (3)(4)(5). Alternatively, for activity cliff analysis, graph-based similarity can also be assessed by determining substructure relationships, for example, on the basis of shared scaffolds (3) or matched molecular pairs (6). The study of 3D activity cliffs (3D-cliffs) requires the use of predicted or experimentally determined compound binding modes. Predicted structures include modeled conformations (5) or docking poses of ligands (7). Experimental binding modes are obtained from X-ray structures of protein-ligand complexes (7,8). The assessment of 3D-cliffs also requires the quantitative comparison of compound binding modes, for example, using interaction fingerprints (7) or 3D similarity measures (8,9). Given the uncertainties associated with predicted compound binding modes, we strongly prefer the evaluation of 3D-cliffs on the basis of well-refined X-ray structures of protein-small molecule complexes. In 2012, a first systematic activity cliff analysis was reported (10) on the basis of publicly available X-ray complex structures (11). 
To identify 3D-cliffs, X-ray structures of specific targets in complex with different ligands were optimally superposed and an atomic property density function taking conformational and positional differences into account (9) was calculated for ligands to provide a measure of 3D similarity. 3D-cliffs were then defined as pairs of ligands bound to the same target with at least 80% 3D similarity and an at least 100-fold difference in potency, leading to the identification of 216 3D-cliffs formed by 269 crystallographic ligands with activity against a total of 38 targets (10). Recently, the survey of public domain 3D-cliffs was repeated applying the same criteria for cliff formation and high-confidence criteria for compound potency data (12). This analysis identified 630 3D-cliffs involving 580 small molecule ligands with activity against 61 human targets from 25 different families (12) and thus revealed an astonishing increase in the number of available high-confidence 3D-cliffs within only approximately 3 years.
Herein we present a different application of 3D-cliff analysis. In addition to studying SAR determinants (3), comparing individual interactions, and deriving pharmacophore hypotheses (7), 3D-cliff information can also be used to delineate interaction hot spots in targets. Therefore, making use of the large pool of high-confidence 3D-cliffs (12), we have identified targets for which at least 20 3D-cliffs were available, compared differences in crystallographic interactions between cliff-forming ligands, and identified hot spots in active site regions on which these interactions were centered. Selected targets included human thrombin (THR), factor Xa (FXa), beta-secretase 1 (BACE1), cyclin-dependent kinase-2 (CDK2), carbonic anhydrase II (CAII), leukotriene A4 hydrolase (LTA4H), and heat-shock protein 90 alpha (Hsp90α), all of which are prominent drug targets. Hot spot analysis was complemented by calculating ligand efficiencies (LE) and lipophilic ligand efficiencies (LLE) (13) of compounds forming 3D-cliffs. The analysis identified ligand-target interactions and key residues in different targets that were involved in differentiating between structurally similar compounds with large potency differences and might thus be considered as interaction hot spots in structure-based design.

Methods and Materials
3D-cliffs
3D-cliffs were taken from our recent survey (12). Briefly, high-confidence compound activity data for human targets from ChEMBL (release 19) (14) were associated with complex X-ray structures available in the Protein Data Bank (PDB) (11) with the help of UniProt (15) target accession identifiers. A total of 3083 X-ray complex structures involving 340 human targets with high-confidence compound activity data were analyzed. Pairs of co-crystallized ligands qualified as 3D-cliffs if they displayed at least 80% 3D similarity (9) and an at least 100-fold difference in potency (10). In total, 630 unique 3D-cliffs formed by 580 different crystallographic ligands with activity against 61 human targets were identified. From this pool, seven targets were selected for our current analysis for which at least 20 3D-cliffs were available (Table 1).
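The two cliff criteria stated above can be written as a simple check. The following is a minimal sketch and not the authors' implementation: it assumes that the 3D similarity value (from the property density function of ref. 9, which is not reproduced here) has already been computed and scaled to the range 0 to 1, and that potencies are supplied as negative decadic logarithms (pKi or pIC50). The function and parameter names are hypothetical.

```python
import math


def is_3d_cliff(sim_3d, p_pot_a, p_pot_b, min_sim=0.80, min_fold=100.0):
    """Return True if a ligand pair meets the 3D-cliff criteria.

    sim_3d   : precomputed 3D similarity in [0, 1] (assumed input)
    p_pot_a/b: potencies as pKi or pIC50 values (assumed input)
    min_sim  : similarity threshold (80% in the text)
    min_fold : potency ratio threshold (100-fold in the text)
    """
    # A 100-fold potency ratio corresponds to 2 log units.
    delta_log_pot = abs(p_pot_a - p_pot_b)
    return sim_3d >= min_sim and delta_log_pot >= math.log10(min_fold)


# Illustrative values only, not data from the study.
print(is_3d_cliff(0.85, 9.0, 6.5))  # similar pair, 2.5 log units apart
print(is_3d_cliff(0.79, 9.0, 6.0))  # fails the similarity threshold
```

Working in log units keeps the potency test independent of whether Ki or IC50 values are used, although the two measurement types are not strictly comparable.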

Ligand efficiency and lipophilic ligand efficiency
Ligand efficiency and LLE values were calculated according to Hopkins et al. (13):

$$\mathrm{LE} = \frac{2.303\,RT \times \mathrm{p}K_i}{\mathrm{HA}}$$

$$\mathrm{LLE} = \mathrm{p}K_i - \log P$$

In the LE formula, R is the universal gas constant (8.3145 J/mol K), T is the absolute temperature (300 K), and HA is the number of heavy (non-hydrogen) atoms in a molecule. The term 2.303RT × pKi approximates the free energy of binding (13).
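As a worked illustration of these efficiency measures, the sketch below assumes the commonly used definitions LE = 2.303RT × pKi / HA and LLE = pKi − logP, with R and T as given in the text. The function names are hypothetical, and the source of the logP value (calculated or measured) is left to the user.

```python
R = 8.3145   # universal gas constant in J/(mol K), as given in the text
T = 300.0    # absolute temperature in K, as given in the text


def ligand_efficiency(p_pot, heavy_atoms, temperature=T):
    """LE in J/mol per heavy atom (assumed definition: 2.303*R*T*pKi / HA)."""
    if heavy_atoms <= 0:
        raise ValueError("heavy_atoms must be positive")
    return 2.303 * R * temperature * p_pot / heavy_atoms


def lipophilic_ligand_efficiency(p_pot, logp):
    """LLE (assumed definition: pKi - logP), dimensionless."""
    return p_pot - logp


# Illustrative compound: pKi = 9.0, 30 heavy atoms, logP = 3.5
le = ligand_efficiency(9.0, 30)
lle = lipophilic_ligand_efficiency(9.0, 3.5)
print(f"LE  = {le:.1f} J/mol per heavy atom ({le / 4184.0:.2f} kcal/mol)")
print(f"LLE = {lle:.1f}")
```

Dividing by 4184 converts J/mol to kcal/mol, the unit in which LE values are often quoted.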

Search for interaction hot spots
In the context of our analysis, we define interaction hot spots as regions or residues in active/binding sites that substantially contribute to interactions with highly potent ligands (as opposed to weakly potent ones). The search for such regions was carried out as follows. After optimal superposition of structures of a given target available in different complexes using the protein alignment function of MOE, short-range interaction patterns of cliff-forming ligands were calculated, visualized, and compared for each 3D-cliff using MOE. Interaction hot spots were then identified based on visual analysis and comparison of 3D-cliffs, detected interactions, and the residues involved. As each 3D-cliff was individually analyzed and visually inspected, the results reported herein have a high level of confidence.

3D-cliff information
A search for interaction hot spots using activity cliff information requires the availability of multiple 3D-cliffs for a given target involving different ligands whose interaction patterns can be compared. For individual targets, more than 20 distinct 3D-cliffs could only be identified if high-confidence Ki and IC50 measurements were jointly considered as potency annotations (although, under rigorous conditions, these measurements cannot be directly compared). Accordingly, consistency of interaction patterns differentiating between highly and weakly potent cliff-forming compounds on the basis of Ki or IC50 values was considered an important criterion in our analysis. Seven human targets were identified that qualified for our analysis including three proteases (THR, FXa, and BACE1), a kinase (CDK2), CAII, LTA4H, and Hsp90α (Table 1). For these targets, between 24 (CAII) and 166 (THR) 3D-cliffs were obtained that involved between 19 (LTA4H) and 63 (THR) distinct crystallographic ligands. Notably, the 57 3D-cliffs formed by BACE1 inhibitors involved almost the same number of distinct ligands (62 compounds) as the 166 3D-cliffs formed by THR inhibitors (63 compounds); the latter was by far the largest number of 3D-cliffs for an individual target in our collection. Taken together, the 3D-cliffs summarized in Table 1 were thought to provide a sound basis for hot spot analysis.
Table 1: The number of 3D-cliffs and ligands involved in the formation of these cliffs is reported for the seven targets for which more than 20 3D-cliffs were available.

Principles of hot spot analysis
For hot spot analysis, it must be taken into consideration that short-range interaction patterns revealed by ligand-target X-ray structures only provide an incomplete account of binding events, as desolvation or entropy effects and long-range interactions cannot be assessed by analyzing complex structures. In addition, the strength of individual interactions and their contributions to binding can only be reliably determined in follow-up experiments. However, the analysis and comparison of multiple 3D-cliffs might identify recurrent interaction patterns that distinguish structurally similar compounds with large potency differences and active site regions or individual residues that are frequently involved in these interactions. Such insights can hardly be obtained by only investigating one or two cliffs. However, with the availability of larger numbers of 3D-cliffs, recurrent interaction patterns and hot spots might emerge.
Interactions shared by weakly and highly potent cliff-forming compounds might point at regions within active sites that are essential for binding and basic (inhibitory) activity against a given target. By contrast, frequently occurring interactions only formed by highly but not weakly potent cliff-forming compounds are indicative of subsites or residues that are likely to be important or essential for engagement with highly potent ligands and thus represent interaction hot spots, as defined herein.
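The distinction drawn above, between interactions shared by both cliff partners and interactions formed only by the highly potent partner, can be illustrated with a simple counting scheme. This is a hypothetical sketch, not the procedure used in this study, which relied on visual inspection of interaction patterns in MOE. It assumes that the residues contacted by each cliff partner have already been extracted as sets of residue labels; the function name and the `min_cliffs` threshold are illustrative choices.

```python
from collections import Counter


def hot_spot_candidates(cliffs, min_cliffs=2):
    """Count residues contacted only by the highly potent cliff partner.

    cliffs    : iterable of (high_contacts, low_contacts) pairs, each a
                collection of residue labels (assumed precomputed)
    min_cliffs: minimum number of cliffs in which a residue must show a
                differential interaction to be reported (illustrative)
    """
    counts = Counter()
    for high_contacts, low_contacts in cliffs:
        # Residues seen for the highly potent but not the weakly potent ligand
        counts.update(set(high_contacts) - set(low_contacts))
    return {res: n for res, n in counts.items() if n >= min_cliffs}


# Toy input with made-up contact sets (residue labels for illustration only)
example_cliffs = [
    ({"Asp189", "Gly219"}, {"Asp189"}),
    ({"Trp215", "Gly219"}, {"Trp215"}),
    ({"Asp189", "Gly216"}, {"Asp189"}),
]
print(hot_spot_candidates(example_cliffs))  # {'Gly219': 2}
```

Residues that appear in both contact sets of a cliff are ignored by design, since shared interactions point at regions needed for basic activity rather than at potency-differentiating hot spots.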
To further qualify 3D-cliff information, LE and LLE values were also calculated for compounds forming individual cliffs. If activity cliffs are a consequence of specific interactions and contributions to binding, rather than only increasing size or hydrophobicity of ligands, cliff formation should be accompanied by increases in LE and LLE, as has been shown previously (16). Of course, there are also specific lipophilic interactions that determine activity cliffs, which must be taken into consideration when interpreting LLE values.
Target-based evaluation
Following the analysis scheme outlined above, 3D-cliffs were characterized and compared for each of the seven targets, with a particular focus on the consistency of observed interaction differences between highly and weakly potent ligands across multiple cliffs. For one of seven targets, Hsp90α, no consistent interaction pattern was detectable, despite the availability of 43 3D-cliffs involving 32 unique inhibitors. For Hsp90α, most cliff-forming compounds displayed differences in shape complementarity without detectable interaction preferences with specific residues. In many cases, it was difficult to conclude which active site residues might be responsible for improved lipophilic contacts to the highly potent inhibitors. Furthermore, no recurrent interaction patterns were observed and therefore no hot spot areas could be assigned. In order to avoid potential over-interpretation of the structural data, we decided to focus on targets with clearly defined short-range interaction patterns. Hence, Hsp90α was omitted from further analysis. By contrast, for the six remaining targets, consistent interaction patterns differentiating highly and weakly potent ligands were detected for subsets of cliffs and hot spots were identified. In the majority of cases, the formation of 3D-cliffs with conserved interaction patterns was also accompanied by consistent increases in LE and LLE values, as further discussed below.

Interaction patterns and hot spots
Thrombin
The trypsin-like serine protease THR plays a key role in the blood coagulation cascade, and the direct THR inhibitor dabigatran was marketed as an oral anticoagulant in 2008 (17). With 166 3D-cliffs formed by 63 distinct ligands, an unusually large number of 3D-cliffs was available for this target. All cliff-forming compounds bound in a substrate-like manner, extending from the S1 to the S3 and S4 pocket within the THR active site. The analysis of the 3D-cliffs revealed different regions on which recurrent interaction differences between cliff partners were centered. Consistent with known interaction hot spots for serine proteases, many 3D-cliffs revealed interaction differences within the S1 and the S3/S4 pockets (139 of 166 3D-cliffs), whereas differences in the S2 pocket were only rarely detected. However, in addition to the well-known key role of inhibitor interactions within the S1 pocket involving Asp189 or Tyr228 and of hydrophobic residues forming the S3/S4 binding site, compounds forming several 3D-cliffs were also distinguished by hydrogen bonding interactions with residues Gly216 and Gly219, the presence of which was characteristic for highly potent inhibitors. Figure 1A shows an exemplary 3D-cliff identifying a hot spot within the S1 pocket. Although both compounds had high potency (and a >100-fold potency difference), ionic interactions with the side chain of residue Asp189 in the more potent compound were stronger than the interaction of the chloro substituent with residue Tyr228 in the less potent compound. Nonetheless, a chlorobenzyl group interacting with residue Tyr228 was still very well accommodated by the S1 pocket of THR and present in many of the highly potent cliff compounds. Figure 1B shows a further region that was frequently involved in interactions distinguishing highly and weakly potent cliff partners. Many highly potent cliff partners reached deeper into the S3/S4 pocket, enabling lipophilic interactions with residues Leu99, Ile174, and Trp215. Furthermore, Figure 1C shows an example highlighting the critical role of hydrogen bond formation to Gly219 in the presence of comparable interactions within the S3/S4 pocket. In this case, the presence of the additional hydrogen bond increased the potency of a structurally analogous inhibitor by more than 130-fold. As one would anticipate on the basis of the observed interaction differences discussed above, potency increases were consistently accompanied by increases in LE and LLE values, as also reported in Figure 1.
Factor Xa
Similar to THR, the trypsin-like serine protease FXa also plays a key role in the blood coagulation cascade, and the direct FXa inhibitors rivaroxaban and apixaban were approved as oral anticoagulants in recent years (17,18). Compared to THR, a much smaller number of 3D-cliffs (28), formed by 25 unique inhibitors, was available for FXa. FXa has a slightly larger S1 pocket than THR, which has been exploited to generate inhibitors with selectivity for FXa over THR (18). All 3D-cliff-forming FXa inhibitors were bound to the S1 and the S4 pocket of the enzyme. In contrast to THR, available FXa 3D-cliffs did not reveal consistent interaction differences within the S1 pocket. This might be explained by the fact that most FXa inhibitors with at least 100-fold differences in potency were already potent in the nanomolar range and optimized for binding to S1, in accord with the general FXa inhibitor paradigm (18). However, more than half of the 3D-cliffs available for FXa (16 of 28 3D-cliffs) revealed distinguishing hydrogen bond interactions between inhibitors with higher potency and residues Gly216 and/or Gly218, as illustrated in Figure 2A. In this example, an unusual decrease in LLE was observed accompanying 3D-cliff formation. However, this decrease was due to the replacement of a sulfonyl group with a phenyl ring in the more potent compound that was partly solvent exposed and not involved in key S1 or S4 site interactions. A second hot spot region was observed in the S4 pocket. Highly potent inhibitors reached deeply into this pocket and formed extensive lipophilic/aromatic interactions with side chains of residues Tyr99, Trp215, and Phe174, as shown in Figure 2B.
Figure 1: 3D-cliffs formed by thrombin inhibitors identify interaction hot spots. (A-C) Three exemplary 3D-cliffs are shown that encode distinguishing interactions with different hot spots within the active site region of thrombin. The highly potent cliff partner is depicted in cyan and the weakly potent partner in orange. The active site is rendered using a transparent gray surface representation and selected residues forming interaction hot spots are shown in magenta and encircled (red). Compound identifiers [Protein Data Bank-IDs], potency, ligand efficiency and lipophilic ligand efficiency values (as well as differences) are given on the left. 'Δ-logPot' refers to the difference in pKi or pIC50 values between 3D-cliff-forming compounds.
Beta-secretase 1
BACE1 belongs to the family of aspartic proteases and is involved in the production of amyloid β-peptides. Therefore, it is considered a promising target for the treatment of Alzheimer's disease (19,20). With 57 3D-cliffs formed by 62 distinct inhibitors, there was a large knowledge base available for the exploration of interaction hot spots for this target. Most cliff-forming compounds bound to and bridged between the S2′ and the S3/S4 pockets of BACE1. In this area, three interaction hot spots became apparent that were associated with the formation of most BACE1 3D-cliffs. As illustrated in Figure 3A, additional hydrogen bonding interactions to residues Asn233 and Ser325 in the S2 pocket distinguished many highly potent cliff compounds from their less potent partners. In this representative example, the formation of these interactions was accompanied by a 250-fold improvement in potency compared to the less potent cliff partner. A second hot spot was located in the S2′ region shown in Figure 3B. In this case, both compounds displayed a nearly identical binding mode, the only difference being an iodobenzyl moiety in the highly potent cliff partner occupying a lipophilic pocket formed by residues Tyr71 and Ile126. This additional interaction resulted in a 230-fold increase in potency (i.e., transforming a micromolar into a low-nanomolar compound). Similar interaction differences in this region were observed in various 3D-cliffs (11 of 57 3D-cliffs). Furthermore, Figure 3C shows an exemplary 3D-cliff where detectable interactions involving the two cliff partners only differed by a single hydrogen bond to the side chain NH group of residue Trp76. The two inhibitors forming this 3D-cliff exhibited a more than 140-fold difference in potency. This characteristic interaction difference was observed in five BACE1 3D-cliffs.
In the three BACE1 hot spots, distinguishing interactions encoded by structural modifications of inhibitors in 3D-cliffs had predictable effects on LE and LLE. Additional interactions within the lipophilic Tyr71/Ile126 subsite led to increased LE but constant or reduced LLE values. By contrast, distinguishing hydrogen bond interactions with the Asn233/Ser325 or Trp76 hot spots resulted in significant increases in both LE and LLE values for the more potent cliff partners.
Figure 2: 3D-cliffs formed by factor Xa inhibitors and interaction hot spots. (A and B) Two exemplary 3D-cliffs are shown that encode distinguishing interactions with different hot spots within the active site region of factor Xa. The representation is according to Figure 1.
Leukotriene A4 hydrolase
LTA4H is a zinc-containing enzyme with hydrolase and aminopeptidase activity (21). It hydrolyzes leukotriene A4 to the pro-inflammatory leukotriene B4 and is associated with a variety of leukotriene-related allergic, respiratory, or cardiovascular diseases (21,22). For LTA4H, 37 3D-cliffs formed by 19 distinct ligands were identified. Different from BACE1, most of these 3D-cliffs were formed by very weakly potent (milli- or micromolar) and highly potent (nanomolar) inhibitors, hence spanning very large differences in potency. Interestingly, despite these large potency variations, differences in interaction patterns between cliff-forming compounds were limited to hydrogen bonds to residues close to the catalytic center at the terminus of the L-shaped ligand binding site of LTA4H. As illustrated in Figure 4A, a basic amino function of the highly potent cliff partner often formed ionic or hydrogen bonding interactions with residues Glu271 and/or Glu318 and an additional hydrogen bond to Gln136. In the presence of these interaction differences, very large potency changes were observed (such as a ~2800-fold potency increase for the exemplary 3D-cliff in Figure 4A). Compounds forming about half of the LTA4H 3D-cliffs (18 of 37 3D-cliffs) only differed by well-defined hydrogen bond interactions with residue Gln134 and/or Gln136, as depicted in Figure 4B. The formation of such cliffs was typically accompanied by a potency increase of three to four orders of magnitude. As an exception to this interaction pattern revealed by LTA4H 3D-cliffs, the most potent inhibitor was found to complex the catalytic zinc ion via a carboxylate group. Consistent with the exclusive presence of hydrogen bonds as distinguishing interactions with LTA4H hot spots, the formation of 3D-cliffs consistently led to significant increases in LE and LLE values.

Carbonic anhydrase II
Carbonic anhydrases maintain the pH homeostasis by catalyzing the reversible hydration of carbon dioxide (23). CAII is among the most intensely investigated isoforms of carbonic anhydrases and involved in the regulation of intraocular pressure (24). In addition, this isoform has also been implicated in cancer development (25). For CAII, 24 3D-cliffs formed by 25 distinct inhibitors were obtained.
Carbonic anhydrases are a textbook example of cation-dependent enzymes that are effectively inhibited by complexing their catalytic zinc ion coordination sphere formed by residues His94, His119, and Thr199. Accordingly, strong interactions with the zinc cation and its coordination sphere are a hallmark of potent carbonic anhydrase inhibitors. Therefore, essentially all potent inhibitors contain a sulfonamide moiety that extensively interacts with the coordination sphere. In addition, inhibitory interactions were modulated by residues at the entrance of the active site and might differentiate between isoforms. Figure 5 shows an exemplary CAII 3D-cliff that nicely illustrates conserved and variable interactions. The highly potent cliff partner, dorzolamide (used as an antiglaucoma agent), shared the critically important sulfonamide group with its less potent partner, but formed two additional hydrogen bonds with residues Gln92 and Thr200, resulting in a ~670-fold difference in potency. In 11 of 24 CAII 3D-cliffs, highly potent compounds formed these hydrogen bond interactions with either residue Gln92 alone or both residues Gln92 and Thr200, which were absent in weakly potent cliff compounds, hence making these residues potency-modulating hot spots.
Figure 5: 3D-cliff of carbonic anhydrase II (CAII) inhibitors and hot spots. A representative 3D-cliff is shown that highlights conserved and variable interactions in the active site of CAII.
Cyclin-dependent kinase-2
CDK2 is a member of the serine/threonine kinase family and a popular anticancer target, given its regulatory role in the mammalian cell cycle (26,27). For CDK2, a total of 41 3D-cliffs formed by 51 distinct inhibitors were obtained. The majority of these cliffs revealed interaction differences within two hot spot regions. Figure 6A shows a representative 3D-cliff encoding differential interactions with the first hot spot at the entrance of the ligand binding site in CDK2. The sulfonamide moiety of the highly potent cliff-forming compound was in hydrogen bonding distance to the side chain of residue Lys89 and both the backbone NH group and side chain of Asp86. The presence of this interaction pattern was accompanied by a ~230-fold increase in compound potency. Similar differences in hydrogen bond patterns to one or both residues in this region were observed in 19 of 41 CDK2 3D-cliffs. A second hot spot was delineated by the majority of 3D-cliffs and represented a small hydrophobic pocket close to the hinge region formed by residues Ala31, Val64, and, most importantly, Phe80, the gatekeeper residue. Figure 6B shows an exemplary 3D-cliff with interactions involving this hydrophobic hot spot. The iodo substituent of the highly potent cliff partner, which was 2600-fold more potent than its counterpart, occupied this pocket and closely interacted with the side chain of Phe80. This type of interaction was also detected in several highly potent cliff-forming compounds (for 15 of 41 3D-cliffs). In addition to the compound containing an iodo substituent occupying the pocket depicted in Figure 6B, there were several other highly potent cliff-forming compounds with lipophilic substituents, for example, methyl, cyclopropyl or isopropyl, that filled this lipophilic cavity. Accommodating these substituents in the pocket resulted in higher shape complementarity and additional lipophilic interactions. As residue Phe80 closed this pocket, these substituents directly faced the side chain of Phe80, making it the most important residue for lipophilic interactions within this region.
Only a few 3D-cliffs encoded interaction differences at the hinge region of the CDK2 active site (a well-known focal point of kinase inhibitor design). 3D-cliff-forming compounds with interactions in this region mimicked the hydrogen bond network of the adenine moiety in the cofactor ATP (a conserved interaction pattern in many kinase inhibitors). A CDK2 3D-cliff with interaction differences in the hinge region of CDK2 is shown in Figure 6C. The highly potent cliff compound was 2500-fold more potent than its partner and formed two hydrogen bonds to the backbone NH and CO groups of residue Leu83, which were absent in the weakly potent cliff compound. The formation of CDK2 3D-cliffs engaging all three hot spots consistently resulted in increasing LE and LLE values for highly potent cliff compounds.

Conclusions
Activity cliffs are of interest in medicinal chemistry because they often reveal SAR determinants. Although activity cliffs can be assessed in two or three dimensions, most of their use in SAR analysis has focused on molecular graph representations. Given the significant increase in the number of activity cliffs that can be evaluated on the basis of experimentally determined compound binding modes, 3D-cliffs become increasingly relevant as a source of SAR knowledge taking details of ligand-target interactions into account. Herein, we have described another application of 3D-cliffs, that is, their use to delineate interaction hot spots in active sites. Well-defined interactions within these regions were a characteristic feature of highly potent cliff-forming compounds that distinguished them from weakly potent partners.
For seven human targets, sufficiently large numbers of different 3D-cliffs were available to search for characteristic interaction patterns and hot spots that distinguished highly and weakly potent compounds. For six of these targets, series of 3D-cliffs encoded characteristic interaction patterns and identified different hot spots. In some instances (e.g., THR and FXa), interactions involving these hot spots were well known, and in others (e.g., LTA4H), they were not. Although specific ligand-target interactions are typically well analyzed in reports of X-ray complex structures, their potential relevance for compound potency improvements often becomes apparent only if multiple series of ligands are compared. For this purpose, 3D-cliffs provide a very informative data structure because they help to focus on individual interactions that are encoded by structural modifications distinguishing compounds having very similar binding modes.
In our analysis, 3D-cliff comparisons identified a number of residues and interactions that consistently differed between similar compounds having an at least 100-fold difference in potency, thereby providing additional focal points for structure-based design efforts. The study demonstrated the utility of structure-based activity cliff analysis for targets for which multiple 3D-cliffs are available, enabling the identification of recurrent patterns of differentiating interactions involving specific residues. To support SAR exploration and structure-based design efforts in the scientific community, the collection of 3D-cliffs reported herein and all associated information is made freely available for further study as a deposition on the open access ZENODO platform under the authors' names (28).