Synthesis of a HyCoSuL peptide substrate library to dissect protease substrate specificity

This protocol describes HyCoSuL, an approach that uses tetrapeptides containing natural and >100 unnatural amino acids to screen for protease substrate specificity and to engineer highly active and selective substrates and activity-based probes. Many biologically and chemically based approaches have been developed to design highly active and selective protease substrates and probes. It is, however, difficult to find substrate sequences that are truly selective for any given protease, as different proteases can demonstrate a great deal of overlap in substrate specificities. In some cases, better enzyme selectivity can be achieved using peptide libraries containing unnatural amino acids such as the hybrid combinatorial substrate library (HyCoSuL), which uses both natural and unnatural amino acids. HyCoSuL is a combinatorial library of tetrapeptides containing amino acid mixtures at the P4–P2 positions, a fixed amino acid at the P1 position, and an ACC (7-amino-4-carbamoylmethylcoumarin) fluorescent tag occupying the P1′ position. Once the peptide is recognized and cleaved by a protease, the ACC is released and produces a readable fluorescence signal. Here, we describe the synthesis and screening of HyCoSuL for human caspases and legumain. We also discuss possible modifications and adaptations of this approach that make it a useful tool for developing highly active and selective reagents for a wide variety of proteolytic enzymes. The protocol can be divided into three major parts: (i) solid-phase synthesis of the fluorescence-labeled HyCoSuL, (ii) screening of protease P4–P2 preferences, and (iii) synthesis of the optimized activity probes equipped with an AOMK (acyloxymethyl ketone) reactive group and a biotin label for easy detection. Beginning with the library design, the entire protocol can be completed in 4–8 weeks (HyCoSuL synthesis: 3–5 weeks; HyCoSuL screening per enzyme: 4–8 d; and activity-based probe synthesis: 1–2 weeks).

However, biological diversity approaches are innately composed of only natural amino acids, and most chemical diversity approaches use natural amino acids during peptide synthesis, making it very difficult to obtain individual selectivity for closely related enzymes with overlapping substrate specificities (e.g., caspases, cathepsins, and deubiquitinating enzymes) [13][14][15]. These difficulties create a great opportunity to develop new strategies for profiling proteases in order to find substrates with better selectivity and higher activity. The objective of profiling protease specificity preferences is not limited to the identification of new substrates. The optimal peptide sequence can be further converted into an inhibitor by attaching a mechanism-based warhead, or into an activity-based probe (ABP) equipped with both a warhead and an easily detected tag (e.g., biotin, isotope, or fluorophore). For proteases that display covalent catalysis (serine, cysteine, threonine), the mechanism-based warhead is an electrophilic reactive group that forms a covalent, irreversible complex with the protease catalytic amino acid (Ser, Cys or Thr) 1,16. Both inhibitors and ABPs are essential for monitoring protease activity in a wide range of biological systems, from in vitro assays, through cell cultures, to whole organisms [17][18][19].

Development of the protocol
We hypothesized that the chemical space in protease active sites can be explored and substantially expanded by the use of unnatural amino acids 20,21; thus, we developed the HyCoSuL approach (Fig. 1). This chemical tool is built on a classic PS-SCL scaffold in which the P4-P2 positions in the peptide substrate are varied with natural amino acids 12,22. In HyCoSuL, the P4-P2 positions of a tetrapeptide use a wide set of unnatural amino acids (>100 derivatives) 19,23, and the peptide is conjugated with the ACC (7-aminocoumarin-4-acetic acid) fluorescent tag. The library is composed of three sublibraries: the P4 sublibrary (Ac-Aaa-Mix-Mix-P1-ACC), the P3 sublibrary (Ac-Mix-Aaa-Mix-P1-ACC), and the P2 sublibrary (Ac-Mix-Mix-Aaa-P1-ACC). Here, Mix is an equimolar mixture of 19 natural amino acids (cysteine is omitted because of oxidation problems, and Nle is used instead of Met) defined by Ostresh 24, and Aaa is one of the 19 natural or (>100) unnatural amino acids fixed in the position of interest. In many protease families (such as the serine protease and caspase clans), the primary specificity is defined by the S1 pocket, which accommodates the P1 residue, allowing us to fix P1 as a predefined amino acid. The ACC tag is a coumarin-derivative molecule that emits a high fluorescence signal (emission max. = 460 nm) when excited at an appropriate wavelength (excitation max. = 355 nm). In the peptide library, the fluorescence of ACC is quenched by the amide bond formed between the P1 amino acid (-COOH) and the ACC amine group. Once the substrate is cleaved by a protease, the fluorophore is released from the peptide and the increase of fluorescence signal is monitored over time. We emphasize that although HyCoSuL is synthesized on the solid support, after the synthesis, the whole library is cleaved from the resin and used for the enzymatic studies in solution. The use of HyCoSuL was reported for the first time by Kasperkiewicz et al., who designed a highly active substrate and probe for human neutrophil elastase (the P1 position in the library was occupied by Ala) 19. Shortly after, Poreba et al. reported the application of HyCoSuL to discriminate between human apoptotic caspases (Asp at the P1 position) 23. These two primary studies, which applied HyCoSuL to a serine and a cysteine protease family, demonstrate that this approach is very useful for obtaining highly active and selective substrates and substrate-derived activity-based probes for important human proteases.
Here, we present a HyCoSuL screening protocol to obtain highly active and selective substrates for human caspases and legumain. The entire technique consists of several major steps: (i) solid-phase synthesis of an ACC-labeled hybrid combinatorial P4-P2 library with a predefined Asp at the P1 position, (ii) screening and analysis of the caspase and legumain preferences at the P4-P2 positions and (iii) selection of optimal sequences (selective and active) for further enzyme investigation. Caspases (CD clan) are key players in cell death, in which they are responsible for driving the apoptotic program by cleaving protease substrates 25 (initiator and executioner caspases) or for the generation of inflammatory signals that lead to cell swelling and membrane breakdown 26 (inflammatory caspases; pyroptosis). We also selected legumain (AEP, asparaginyl endopeptidase), a CD clan cysteine protease, to be characterized by HyCoSuL, as this enzyme also recognizes Asp at the P1 position; thus, its substrate preferences overlap with caspases 3. Legumain is mainly involved in antigen presentation, and its upregulation is linked to a plethora of diseases (inflammation, arteriosclerosis, tumorigenesis and more) 27. The key feature unique to this protocol is the use of unnatural amino acids in the library structure to create a detailed map of the interactions between substrates and substrate-binding pockets of caspases, legumain, and other proteases. In this protocol, we use the term 'unnatural amino acids' for all amino acids except the 20 proteinogenic ones. This would include post-translationally modified and chemically synthesized amino acids, highlighting the context-specific differences between the PS-SCL and HyCoSuL approaches.

Figure 1 | Outline of the hybrid combinatorial substrate library (HyCoSuL) method. HyCoSuL is composed of three ACC-labeled sublibraries (P4, P3 and P2) that are synthesized on solid phase (Stage 1). Next, the library is used to dissect protease preferences in S4, S3 and S2 active-site pockets, which refer to the P4, P3, and P2 positions, respectively, of the substrates (Stage 2). The most active (or most selective) amino acids are then extracted and used for the synthesis of optimal (active or selective) protease substrate (Stage 3).

Advantages and limitations of the protocol
As HyCoSuL is a new concept in protease research, it is still being validated and improved in order to create a chemical model that covers the structural requirements of multiple proteases. However, we have already found HyCoSuL to be a versatile tool for profiling proteases of different classes, catalytic mechanisms, origins or activities 19,23,[28][29][30][31]. The library structure allows for detailed exploration of the protease active site; thus, the substrate-specificity map is much more informative than the classic analysis based on PS-SCL, phage display, or natural substrate cleavages. Furthermore, the number of individual peptides that can be synthesized and validated is much higher than in traditional approaches, in which only natural amino acids are used. For example, the traditional PS-SCL can create 19×19×19 (P4×P3×P2, all natural; around 7,000) individual structures that can be selected for further analysis, whereas HyCoSuL can produce 120×120×120 (natural + unnatural; around 1.7 million) structures or more, depending on the number of unnatural amino acids used. Moreover, the selection of unnatural amino acids used for the synthesis is protease and library dependent. In our studies, we selected a very broad range of diverse chemical structures to gain a detailed knowledge of protease interaction sites; however, this selection can be more focused. For example, more D-amino acids can be selected to study some bacterial proteases, or only bulky and hydrophobic amino acids can be used if the protease active site has a strongly hydrophobic character. Unnatural amino acids can also be used to study post-translational modifications in proteins that influence protease substrate specificity and activity. At present, a large number of amino acids that carry various post-translational modifications (PTMs) and are suited to solid-phase peptide synthesis are commercially available.
The other advantage of our protocol is that the synthesis of peptide libraries of different types on solid phase (or in solution) is very well described in the literature 12,32 . All the chemicals (including resins, coupling reagents, or natural and unnatural amino acids) are commercially available and affordable. HyCoSuL (and PS-SCL) uses positional substrate libraries, in which some positions are fixed with one amino acid, whereas others are randomized. Each position (P4, P3 and P2) is screened separately, and the protease substrate-specificity map is the sum of three sublibrary analyses. Such methodology does not disclose protease potential subsite cooperativity; thus, after determination of protease substrate specificity, several individual substrates must be synthesized in order to validate the screening results. In this protocol, we address this issue in detail and provide the best solutions. Importantly, the method, by its nature, uses unnatural amino acids and therefore cannot be used to predict the location of cleavage sites in naturally occurring proteins. On the other hand, the use of unnatural amino acids promotes the discovery of sites on proteases that natural amino acids are incapable of exploring, as demonstrated in the structure of a HyCoSuL-derived ABP in complex with neutrophil elastase 33 .
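The library-size arithmetic quoted in this section can be reproduced with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes three varied positions (P4, P3 and P2) with a fixed P1, as in the tetrapeptide libraries described here. It also contrasts the number of possible sequences with the much smaller number of positional-scanning mixtures that are actually synthesized (one well per fixed amino acid per scanned position).

```python
def sequence_space(n_amino_acids: int, n_positions: int = 3) -> int:
    """Number of distinct P4-P3-P2 sequences when each varied position
    can hold any of n_amino_acids residues (P1 is fixed)."""
    return n_amino_acids ** n_positions


def sublibrary_wells(n_amino_acids: int, n_positions: int = 3) -> int:
    """Number of positional-scanning mixtures to synthesize:
    one well per fixed residue (Aaa) per scanned position."""
    return n_amino_acids * n_positions


if __name__ == "__main__":
    # 19 and 120 are the example set sizes quoted in the text.
    for label, n in (("PS-SCL, natural only", 19),
                     ("HyCoSuL, natural + unnatural", 120)):
        print(f"{label}: {sequence_space(n):,} sequences, "
              f"{sublibrary_wells(n)} wells")
```

With 120 amino acids, roughly 1.7 million candidate sequences are sampled through only 360 mixtures, which is also why individual substrates must be synthesized afterward to validate the screening results.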

Comparison with existing methods for protease screenings
The past 20-25 years yielded multiple methods for profiling protease substrate preferences 7. The two most prominent are phage display, developed by Smith 34 and adapted to protease research by Matthews and Wells 10, and PS-SCL, developed by Rano, Thornberry, and coworkers 11,22. These methods allow for the precise determination of protease preferences at the prime and nonprime regions (phage display) or only at the nonprime region (PS-SCL). The great advantage of phage display is that up to 10^10 individual peptides can be displayed and subjected to protease analysis, which would be very difficult to achieve through chemical synthesis. However, these peptides are label-free, and additional steps to determine the optimal substrate sequences are needed, making this method labor intensive. Protease analysis through PS-SCL is much faster and more reliable, as the peptides are equipped with a fluorescent reporter tag. However, this method is suitable only for profiling the nonprime regions of the protease catalytic cleft. Other methods for protease substrate-specificity investigations include internally quenched fluorescent substrate libraries (to profile prime and nonprime regions) 35, microarrays that transform fluorescent substrate libraries into microscale formats 36, and multiple proteomic approaches using mass spectrometry for fishing out protease substrates from biological samples [37][38][39]. All these methods have their pros and cons; for example, biological proteomics-based methods are more informative of natural targets, whereas phage display and related techniques, in common with chemical methods, are more comprehensive. Nevertheless, all these methods are based on natural amino acids. The HyCoSuL concept is based on the use of unnatural amino acids that allow for a more extensive exploration of the protease active site. As this approach (similarly to PS-SCL) is of chemical origin, it provides fast and reliable protease screening and can be easily adapted to study most proteolytic enzymes.

Applications and modifications of HyCoSuL
As this methodology consists of two main parts, (i) chemical synthesis of HyCoSuL and (ii) its use for dissecting protease substrate specificity, its application is broad and can be adapted to the context of the research conducted. Here, we present several examples of HyCoSuL applications that we have successfully used in protease studies.
Application/case study 1 (human neutrophil elastase). Human neutrophil elastase is a serine protease that is released by neutrophils during inflammation 40. For many years, the activity of this enzyme was measured by classic short substrates containing the Ala-Ala-Pro-Val (P4-P1) tetrapeptide sequence. Screening of elastase with HyCoSuL revealed that the chemical space of amino acids that occupy its active site can be substantially expanded to reveal exquisitely sensitive substrates; one of the best contained all unnatural amino acids, Nle(O-Bzl)-Met(O2)-Oic-Abu 19. This unnatural substrate was almost 10,000-fold more sensitive than the classic natural epitope (Ac-AAPV-ACC, kcat/KM = 4.92 × 10^3 M^-1 s^-1; Ac-Nle(O-Bzl)-Met(O2)-Oic-Abu-ACC, kcat/KM = 4.70 × 10^7 M^-1 s^-1). In the next step, this substrate was converted into a biotin-labeled phosphonate activity-based probe (PK101) that selectively detected active elastase in neutrophils during neutrophil extracellular trap formation 19. Structural analysis of PK101 bound to elastase revealed that some of the side chains explored pockets on the enzyme not available to natural amino acids, accounting for the enhanced activity and specificity 33. This example demonstrates that an optimized peptide sequence from HyCoSuL profiling can be used as a scaffold to develop ultrasensitive probes for protease detection.
Application/case study 2 (human apoptotic caspases). Apoptotic caspases were the first to be characterized with a PS-SCL approach by Thornberry and coworkers 11. This analysis revealed that these enzymes have overlapping substrate specificity and cannot be distinguished from each other by substrates consisting of natural amino acids 13. The prototypic caspase substrates and inhibitors (Asp-Glu-Val-Asp for caspase-3, Leu-Glu-His-Asp for caspase-9, or Ile-Glu-Thr-Asp for caspase-8) are commonly used as 'caspase-specific' tools, which has led to severe data misinterpretation in several studies 41. Several broad studies have clearly demonstrated that these tetrapeptide-based sequences display a limited degree of selectivity; thus, they are not appropriate for dissecting individual caspases in complex mixtures 13,42,43. Recently, we profiled six human apoptotic caspases, demonstrating that the use of >100 unnatural amino acids can overcome the overlapping preferences of this group of enzymes 23. Several substrates with unnatural amino acids were shown to display much higher selectivity than commercially available structures. We validated their utility in a paradigm of cell-free apoptosis by demonstrating that one of the caspase-9 substrates (Ac-Oic-Tle-His-Asp-ACC) is hydrolyzed only by the initiator caspase-9, and not by executioner caspase-3, caspase-6 and caspase-7. This was the first example demonstrating that caspase-9 activation can be monitored by a small-molecule fluorescent substrate.

Application/case study 3 (ZIKA virus NS2B-NS3 protease).
Recently, we used HyCoSuL to profile the P4-P2 positions of ZIKA virus NS2B-NS3 protease 29. To do this, we used a hybrid library with Arg at the P1 position. We found that the amino acids best recognized by this protease are nonproteinogenic ornithine (Orn) at P2, lysine at P3 and unnatural D-Arg at P4. The screening results mirrored the kcat/KM values for several substrates that were synthesized based on the screening data (the Ac-D-Arg-Lys-Orn-Arg-ACC substrate displayed the highest cleavage efficiency). In this study, we also extended the HyCoSuL concept by synthesizing a P1 library with >100 unnatural amino acids. The structure of this library (Ac-Ala-Arg-Leu-P1-ACC) was not optimal for ZIKA virus NS2B-NS3 protease, as we wanted to make it useful also for P1 screening of many other proteases. Nevertheless, the P4-P1 profiling of NS2B-NS3 protease with unnatural amino acids allowed us to develop the first potent and irreversible activity-based probe for this enzyme.
These three examples illustrate the scope of HyCoSuL as a broad and diverse discovery and development platform. On the one hand, it can be used to design much more active substrates and probes for proteases of interest, and on the other it can be used to distinguish between closely related proteases. However, HyCoSuL is not only a simple Ac-P4-P3-P2-P1-fluorophore library. We propose HyCoSuL as a general concept for the use of a wide range of unnatural amino acids in an organized manner to dissect protease active-site preferences. HyCoSuL architecture can be general and unified (as we presented in our case studies), but it can also be adapted to protease structural requirements, for example, for dipeptides (for diaminopeptidases) and pentapeptides (for, e.g., caspase-2), as well as extended into the prime region of the protease active sites (for matrix metalloproteases, for example). The choice of the unnatural amino acids in the library structure can also be tailored to the protease preferences. Moreover, the use of unnatural amino acids in protease screening does not have to be limited to combinatorial peptide mixtures. HyCoSuL was published in 2014 (ref. 19); however, before that our group presented HyCoSuL-related strategies in which individual substrate libraries with unnatural amino acids were successfully applied in protease substrate screens (H2N-P1-ACC for aminopeptidases 44,45 or H2N-P2-P1-ACC for dipeptidyl dipeptidases 46). Recently, we have also reported on the use of tripeptide libraries with unnatural amino acids tailored for ClpP proteases 47,48.
We have already demonstrated that HyCoSuL is a concept that opens doors for biochemical and biological experiments, providing an approach for chemists who synthesize substrate libraries of various sizes and lengths, biochemists who need new tools for the more accurate and rapid protease active-site analysis, and finally biologists who use chemical tools (substrates and activity-based probes) for protease detection in cellulo or in vivo.
Experimental design
ACC fluorophore. The classic combinatorial fluorescent substrate libraries were equipped with an AMC (7-amino-4-methylcoumarin) fluorescence tag 22. In 2000, Ellman and Craik synthesized the first PS-SCL library with the ACC fluorophore 12. The bifunctional characteristic of this molecule allows for the solid-phase synthesis of the entire library with virtually all amino acids at the P1 position; the carboxy end of the ACC is attached to the resin before adding the first amino acid. In this protocol, we use an ACC tag synthesized according to the protocol described by Maly 49. However, in another Nature Protocols article Patterson et al. described the synthesis and use of another bifunctional fluorophore (AMCA (N-acyl 7-amino-4-methylcoumarin acetic acid)), which can be used in the HyCoSuL approach as well 50.
Unnatural amino acids. The main concept of HyCoSuL is to use unnatural amino acids to better explore protease active-site preferences. In this protocol, we describe the use of >100 unnatural amino acids in a P1-Asp library for screening caspases and legumain. These amino acids represent a very wide range of chemical structures, in order to increase the number of possible interactions with the enzyme's active site. Importantly, these amino acids must be compatible with our synthetic approach (Fmoc/Boc chemistry); thus, they must remain stable through all the synthetic steps, as well as in the enzymatic assay (Supplementary Table 1). However, the palette of these amino acids can be customized to the protease of interest. On the basis of the selection criteria, we can divide all amino acids into different groups 51; however, for our purposes we usually distinguish six groups of amino acids (Fig. 2). One group comprises amino acids with different protecting groups (e.g., Asp(benzyl), Arg(methyl)). These amino acids can be considered to be unnatural analogs if they are stable under the conditions of the library synthesis. As our protocol is based on the widely used Fmoc/Boc strategy, these groups must be resistant to the reagents used for Fmoc (piperidine) and tBu (trifluoroacetic acid (TFA)) deprotection. For instance, aspartic acid protected with -methyl, -cyclohexyl, -benzyl, or -β-menthyl groups will stay intact during the whole synthesis, whereas -tert-butyl or -2-phenylisopropyl groups are TFA-labile. Similarly, basic amino acids, such as arginine, can be protected with -methyl, -benzyl, or -nitro groups that are not hydrolyzed, whereas several other protecting groups, such as -Pbf and -bis-Boc, are easily removed upon TFA treatment. A comprehensive list of chemical groups for amino acid protection, and the conditions for their removal, was reviewed by Isidro-Llobet et al. 52.

Figure caption fragment: Red numbers indicate natural amino acids (cysteine was replaced with norleucine) and blue numbers indicate unnatural amino acids. On Plate B, several good natural substrates are screened again in order to combine the results from two plates into one diagram.

Library synthesis.
In this protocol, we use orthogonal Fmoc/Boc chemistry for the synthesis of a tetrapeptide ACC-labeled combinatorial library. This library contains aspartic acid at P1 (suitable for caspase, legumain, and proteasome caspase-like subunit screening), but any other amino acid (natural or unnatural) can be used. This synthesis is calculated for three sublibraries: P4, P3 and P2, each of them containing 19 natural and 110 unnatural amino acids-the size of the library depends on the number of unnatural amino acids used. This synthesis is performed in 48-well cartridges for solid phase, and thus the substrates are synthesized in parallel. If the number of amino acids (natural + unnatural) exceeds 48, the synthesis for other amino acids (49-96, 97-144 and so on) can be repeated under the same experimental setup as for the first (1-48) set, or by using three 48-well cartridges at the same time. The general procedure for the synthesis of peptides on the solid support has been known for many years and is very well described in protocols for the synthesis of individual peptides. In this protocol, we applied this general procedure and optimized it for the synthesis of combinatorial peptides with unnatural amino acids.
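The batching described above (48 wells per cartridge, repeated for amino acids 49-96, 97-144 and so on) can be sketched as a small helper. This is only an illustration under stated assumptions: the function name is hypothetical, and the default of 48 wells and the example of 19 natural plus 110 unnatural amino acids are taken from the paragraph above.

```python
import math


def cartridge_batches(n_natural: int, n_unnatural: int,
                      wells_per_cartridge: int = 48):
    """Split the fixed amino acids of one sublibrary into cartridge-sized
    batches. Returns (number_of_batches, [(first, last), ...]) using
    1-based, inclusive amino acid indices."""
    total = n_natural + n_unnatural
    n_batches = math.ceil(total / wells_per_cartridge)
    batches = [
        (start + 1, min(start + wells_per_cartridge, total))
        for start in range(0, total, wells_per_cartridge)
    ]
    return n_batches, batches


if __name__ == "__main__":
    # Example sizes quoted in the text: 19 natural + 110 unnatural.
    count, ranges = cartridge_batches(19, 110)
    print(count, ranges)  # 3 [(1, 48), (49, 96), (97, 129)]
```

Each of the three sublibraries (P4, P3 and P2) needs the same number of batches, run either sequentially or on several cartridges at the same time.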
The key element of this procedure is to obtain 100% coupling of the ACC fluorophore, individual amino acids, or isokinetic mixture to the solid support. Here we highlight some critical points. (i) The best reagent for coupling Fmoc-ACC-OH to Rink amide resin is the 1-hydroxybenzotriazole and diisopropylcarbodiimide (HOBt/DICI) pair. Although 1-[bis(dimethylamino)methylene]-1H-1,2,3-triazolo[4,5-b]pyridinium 3-oxid hexafluorophosphate (HATU)/2,4,6-collidine is more potent, the final product is not pure, as side reactions occur between Fmoc-ACC-OH and HATU. (ii) The optimal reagents for coupling of the first amino acid to the H2N-ACC-resin are the HATU/2,4,6-collidine pair. However, even when using HATU, some amino acids are not coupled completely to the resin 49. In these cases, the unreacted N-terminal amine of the H2N-ACC-resin must be acetylated following the protocol described by Maly et al. 49. (iii) In our primary study 19, we used 2-(1H-benzotriazol-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate (HBTU)/N,N-diisopropylethylamine (DIPEA) for coupling of the individual amino acids and isokinetic mixtures to the solid support; however, as the price of HATU is falling, this reagent can be used in place of HBTU (which shortens the coupling time). (iv) Published protocols describe several ways to remove Fmoc with piperidine in N,N-dimethylformamide (DMF), differing in piperidine concentration, number of deprotection cycles and timing. As the HyCoSuL product is a combinatorial peptide mixture that cannot be purified by HPLC, we must be sure that all the Fmoc groups are removed after each cycle. Thus, for Fmoc deprotection, we routinely use 20% (vol/vol) piperidine in DMF in three cycles (5, 5 and 30 min).
Isokinetic mixture. An isokinetic mixture contains 19 amino acids at a ratio that corresponds to their individual kinetics of coupling to the free amine (for example, the slower the rate of coupling with the amine, the higher the proportion of the amino acid in the mixture). Using such a mixture ensures equal distribution of all amino acids in the product after coupling. In our protocol, we use the isokinetic mixture determined by Ostresh 24. As HyCoSuL can contain various numbers of unnatural amino acids, the amounts of reagents (including the isokinetic mixture) must be scaled to the number of positions in the library. To create our library, we used the following ratios of Fmoc-protected amino acids (ratio in mol%): Fmoc-L-Ala-OH: 3.

Library screening and data analysis. Library screening of proteases of interest can be performed on any multiwell (96 or 384) plate appropriate for fluorescence assays. We pipette our libraries manually (1 µl of library in DMSO), and thus we use 96-well plates (Fig. 3). However, a 384-well plate format can also be used if automated pipetting equipment is available. In such a case, the whole sublibrary with >100 unnatural amino acids can be screened on one plate. Once the proper plates are chosen, the library is pipetted into the wells, enzyme is added with a multichannel pipette, and the increase of fluorescence is monitored over time. The critical factor in the library screening is to identify and use only the linear part of the plot for determination of reaction rates. This is especially important for enzymes with narrow specificity at certain positions, for example human caspase-3, which is highly stringent for aspartic acid at the P4 position. For a comprehensive analysis, it is important to also probe nonoptimal substrates. To achieve this, it is sometimes essential to work with high enzyme concentrations, at which the cleavage of the optimal substrate may be nonlinear. This will ensure that nonoptimal (weak) amino acids are read in a linear range (Fig. 4).
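The advice to use only the linear part of each progress curve can be illustrated with a simple fitting routine. The sketch below is not the authors' analysis software: the function names, the minimum window of four reads and the R² cutoff of 0.99 are all assumptions chosen for illustration. It fits a least-squares line to a growing window of initial time points and keeps the slope of the longest window that still looks linear.

```python
def slope_and_r2(times, signal):
    """Ordinary least-squares slope and R^2 for one progress curve.
    Assumes at least two distinct time points."""
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(signal) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, signal))
    syy = sum((y - mean_y) ** 2 for y in signal)
    slope = sxy / sxx
    # A perfectly flat trace has no variance; treat it as linear.
    r2 = 1.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return slope, r2


def initial_rate(times, signal, min_points=4, r2_min=0.99):
    """Slope (e.g., RFU per second) of the longest initial window whose
    linear fit has R^2 >= r2_min. min_points and r2_min are illustrative
    defaults, not values taken from the protocol."""
    best = slope_and_r2(times[:min_points], signal[:min_points])[0]
    for end in range(min_points, len(times) + 1):
        slope, r2 = slope_and_r2(times[:end], signal[:end])
        if r2 < r2_min:
            break  # curve has started to bend; keep the previous slope
        best = slope
    return best


if __name__ == "__main__":
    t = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    # Hypothetical well in which the substrate is depleted after ~4 reads.
    rfu = [0, 10, 20, 30, 40, 42, 43, 43, 43, 43]
    print(initial_rate(t, rfu))
```

For a weakly cleaved substrate the whole trace usually passes the linearity check, whereas for the optimal substrate at high enzyme concentration only the first few reads are used.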
Thus, to overcome this limitation, we recommend synthesizing several individual substrates with the selected amino acids and determining the kinetic parameters toward the enzyme of interest. Although strong subsite cooperativity is not a common issue in proteases, we strongly encourage validation of screening results, especially when new proteases are tested. More information about protease subsite cooperativity can be found elsewhere 53 .
In this protocol, we describe the use of HyCoSuL to design short substrates that can discriminate between human apoptotic caspases and legumain. After screening the whole P4-P2 library toward six apoptotic caspases and legumain, and creating a specificity matrix (heat maps) for all these enzymes (Fig. 5), we were able to design selective sequences for almost all tested enzymes.
Adapting the substrate library for different proteases. This protocol focuses on the screening of caspase preferences with a P1-Asp hybrid combinatorial library. This library can also be used for other proteases that accept Asp at the P1 position (legumain or the proteasome caspase-like subunit). However, for other enzymes, different libraries are needed. We have already synthesized and validated two other libraries with Arg and Ala amino acids at the P1 position (Fig. 6). These three libraries satisfy the primary requirements of several proteolytic enzymes; however, in some cases other P1 libraries are also needed. On the other hand, many proteases display broad P1 preferences; thus, unnatural amino acids can also be incorporated at the P1 position. As it is very challenging to synthesize combinatorial libraries with different amino acids at the P1 position (especially with unnatural amino acids), we have proposed another strategy: the synthesis of individual substrate libraries with predefined amino acids at the P4-P2 positions and various amino acids at P1. Such libraries were successfully applied for P1 screening of human neutrophil elastase and ZIKA virus NS2B-NS3 protease. In Figure 7, we present a simplified algorithm for the synthesis of (i) a P1-defined combinatorial library and (ii) individual libraries with various amino acids at the P1 position.
From substrate to inhibitor and activity-based probe. Highly selective and active substrates for proteolytic enzymes can be converted into an inhibitor and/or activity-based probe that can be used to block or track a selected protease in biological systems (e.g., cells or whole organisms). There are multiple protocols and procedures for the synthesis of inhibitors/probes with different warheads and tags for all classes of proteolytic enzymes 16,19,28,30,[54][55][56][57].
In our protocol, we describe the synthesis of the first P1-Asp biotin-labeled activity-based probe specific for legumain 28. The synthesis of the biotin-labeled activity-based probe is divided into three parts: (block A) synthesis of the biotin-6-ahx-D-Tyr(tBu)-L-Tic-L-Ser(tBu)-OH peptide on solid phase (using 2-chlorotrityl chloride resin), (block B) synthesis of the H2N-Asp(Bzl)-acyloxymethyl ketone (AOMK) electrophilic warhead (in solution) and (block C) joining of these molecules to form the final product. This example presents a classic approach for the synthesis of protease inhibitors/activity-based probes.

REAGENTS
! CAUTION Most of the reagents used in this protocol are toxic and require wearing of proper gloves, goggles and lab coats.
CRITICAL Below we present the list of reagents and suppliers that we use to complete this protocol; however, all the reagents can be purchased from other suppliers, as long as they display at least the same level of purity as those indicated below.
Fmoc

REAGENT SETUP
Caspase buffer. Prepare caspase buffer by mixing 10% (wt/vol) sucrose, 20 mM PIPES, 10 mM NaCl, 1 mM EDTA and 10 mM DTT (pH 7.2-7.4) in deionized water 23. The buffer for initiator caspases (-8, -9 and -10) is supplemented with 0.75 M sodium citrate (to promote caspase dimerization) 61,62. Buffers are prepared at room temperature. Add DTT to the buffer just before the assay. The buffer without DTT can be stored at room temperature (23 °C) for up to several weeks, or at +4 °C for several months.

Legumain buffer. Prepare legumain buffer by mixing 40 mM citric acid, 1 mM EDTA, 120 mM Na2HPO4 and 10 mM DTT at pH 5.8 (ref. 28). Add DTT to the buffer just before the assay. The buffer without DTT can be stored at room temperature for up to several weeks, or at +4 °C for several months.

Extraction reagents. For the extraction steps mentioned in the PROCEDURE, use the following reagents: (i) brine (saturated solution of NaCl in deionized water); (ii) saturated solution of NaHCO3 in water; (iii) 5% (wt/vol) solution of NaHCO3 in deionized water (for a 5% (wt/vol) solution, weigh 5 g of NaHCO3 in a glass beaker, fill it with water up to 100 ml, and mix until all the powder is dissolved); and (iv) 5% (wt/vol) citric acid (prepare in the same way as 5% (wt/vol) NaHCO3). These reagents can be stored for up to 2 months at room temperature, or up to one year at +4 °C.

4| Wash the resin six times with DMF (60 ml per wash) to remove all piperidine and side-reaction products.
CRITICAL STEP It is important to remove all the piperidine from the resin, as even a small amount of this amine can remove the Fmoc group from the amino acids used in the next step or catalyze side reactions during the coupling of amino acids.
7| Pour this mixture onto the resin and shake the vessel gently until all liquid and resin are mixed well. Add DMF, if needed. Protect the cartridge from light by covering it with aluminum foil.
CRITICAL STEP It is very important to dissolve all reagents in a minimal amount of DMF to increase the reagent concentrations in the mixture and allow for high-yield coupling. However, the mixture must be diluted enough to allow easy mixing.
PAUSE POINT Turn the shaker on, and gently shake the reaction vessel for 24 h.
8| Remove the mixture from the resin by vacuum filtration and wash the resin three times with DMF (60 ml per wash).

9| To ensure a high yield of Fmoc-ACC-OH coupling to the resin, repeat the coupling using half the amount of the reagents from the first coupling. In a 50-ml Falcon tube, place 3.18 g of Fmoc-ACC-OH (7.2 mmol, 1.25 equiv.) and 1.1 g of HOBt (7.2 mmol, 1.25 equiv.), and dissolve them in a minimal amount of DMF. Then, add 1 ml of DICI (7.2 mmol, 1.25 equiv.) and preactivate this mixture for 5 min by gentle stirring.
10| Pour this mixture onto the resin and shake the vessel gently until all liquid and resin are mixed well. Add DMF if needed. Protect the cartridge from light by covering it with aluminum foil.
PAUSE POINT Gently shake the reaction vessel for 24 h on the shaker.
11| Remove the mixture from the resin by vacuum filtration and wash the resin three times with DMF (60 ml per wash).
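The equivalents quoted for the second Fmoc-ACC-OH coupling (Step 9) fix the resin scale, and the statement that the second coupling uses half the reagents of the first lets the first-coupling amounts be back-calculated. The sketch below illustrates this arithmetic; the doubled amounts are an inference from that statement, not values quoted in this section, and all function names are hypothetical.

```python
# Sketch: back-calculate resin scale and first-coupling amounts from Step 9.
# Values for the second coupling are taken from Step 9; the first-coupling
# amounts are INFERRED from "half the amount of the reagents".

SECOND_COUPLING_G = {"Fmoc-ACC-OH": 3.18, "HOBt": 1.1}
SECOND_COUPLING_MMOL = 7.2
SECOND_COUPLING_EQUIV = 1.25


def resin_scale_mmol(reagent_mmol, equiv):
    """Resin loading (mmol) implied by a reagent amount and its equivalents."""
    return reagent_mmol / equiv


def implied_molar_mass(mass_g, mmol):
    """Molar mass (g/mol) implied by a quoted mass and amount."""
    return mass_g / mmol * 1000.0


def scale_amounts(masses_g, factor):
    """Scale every reagent mass by the same factor."""
    return {name: mass * factor for name, mass in masses_g.items()}


if __name__ == "__main__":
    scale = resin_scale_mmol(SECOND_COUPLING_MMOL, SECOND_COUPLING_EQUIV)
    print(f"Implied resin scale: {scale:.2f} mmol")

    # First coupling assumed to be twice the second (2.5 equiv.).
    first = scale_amounts(SECOND_COUPLING_G, 2.0)
    for name, grams in first.items():
        print(f"First coupling, {name}: {grams:.2f} g")

    # Sanity check on the quoted masses.
    for name, grams in SECOND_COUPLING_G.items():
        mw = implied_molar_mass(grams, SECOND_COUPLING_MMOL)
        print(f"{name}: implied molar mass {mw:.0f} g/mol")
```

Comparing the implied molar masses with the values on the reagent labels is a quick way to catch a weighing or transcription error before starting a 24-h coupling.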
14| Perform a ninhydrin test for free NH2-ACC-resin (Step 5). This is an aromatic amine, so the free amine groups are indicated by an orange to red color of the resin beads.
15| Wash the resin six times with DMF (60 ml per wash) to remove all piperidine and side-reaction products.
37| Perform a ninhydrin test on resin samples from several randomly selected wells (Step 5).
CRITICAL STEP For some amino acids (proline and its derivatives, such as hydroxyproline and azetidine), the ninhydrin test is not a suitable method for the detection of free N termini. To detect free proline (and its derivatives), perform an acetaldehyde/chloranil test (ref. 63). In two separate tubes, prepare a 2% (wt/vol) solution of p-chloranil in DMF and a 2% (vol/vol) solution of acetaldehyde in DMF. Then place a few beads of resin in a glass tube and add 1-3 drops of each solution. Mix and incubate the resin at room temperature for 5 min. Beads that turn blue indicate a free N-terminal proline. Perform this test for every sample that contains proline (or one of its derivatives) at the N terminus.
? TROUBLESHOOTING
38| Prepare an isokinetic mixture of natural amino acids for the P3 position (see Experimental design-Isokinetic mixture). In a 50-ml Falcon tube, prepare an isokinetic mixture (9.6 mmol, 240 equiv.), add 1.5 g of HOBt (9.6 mmol, 240 equiv.), and dissolve the contents in DMF to 48 ml (per multiwell cartridge). Add 1.25 ml of DICI (9.6 mmol, 240 equiv.) to the mixture and activate it for 3 min.
CRITICAL STEP It is very important to activate the mixture for 3 min to ensure that all amino acids from the isokinetic mixture are activated equally.
39| Divide the preactivated mixture into aliquots in the multicartridge wells (1 ml per well) using a Pasteur or plastic pipette. Close the multicartridge with a top lid.
PAUSE POINT Gently shake the multiwell cartridge for 3 h on a shaker.
40| Wash all the wells three times with DMF (1-3 ml per wash) using a wash bottle and filter by using a vacuum.
41| Perform a ninhydrin test on resin samples from several randomly selected wells (Step 5).
? TROUBLESHOOTING
42| Remove the Fmoc-protecting group from the P3 amino acid by using the procedure from Step 3. For a multiwell system, you can use a wash bottle with 20% (vol/vol) piperidine in DMF or a plastic transfer pipette. Add ~1-2 ml of deprotecting solution to each well.
43| Wash all the wells six times with DMF (1-3 ml per wash) using a wash bottle and filter by using a vacuum.
44| Perform a ninhydrin test on the resin samples from several randomly selected wells (Step 5).
? TROUBLESHOOTING
45| Prepare an isokinetic mixture of natural amino acids for the P4 position (see Experimental design-Isokinetic mixture). In a 50-ml Falcon tube, prepare an isokinetic mixture (9.6 mmol, 240 equiv.), add 1.5 g of HOBt (9.6 mmol, 240 equiv.), and dissolve the contents in DMF to 48 ml (per multiwell cartridge). Add 1.24 ml of DICI (9.6 mmol, 240 equiv.) to the mixture and activate it for 3 min.
CRITICAL STEP It is very important to activate the mixture for 3 min to ensure that all amino acids from the isokinetic mixture are activated equally.
46| Divide the preactivated mixture into aliquots in the multicartridge wells (1 ml per well) using a Pasteur or plastic pipette. Close the multicartridge with a top lid.
PAUSE POINT Gently shake the multiwell cartridge for 3 h on the shaker.
47| Wash all the wells three times with DMF (1-3 ml per wash) using a wash bottle and filter by using a vacuum.
48| Perform a ninhydrin test on the resin samples from several randomly selected wells (Step 5).
? TROUBLESHOOTING
49| Remove the Fmoc-protecting group from the P4 amino acid by using the procedure from Step 3. For a multiwell system, you can use a wash bottle with 20% (vol/vol) piperidine in DMF or a plastic transfer pipette. Add ~1-2 ml of deprotecting solution to each well.
50| Wash all the wells six times with DMF (1-3 ml per wash) using a wash bottle and filter by using a vacuum.
51| Perform a ninhydrin test on the resin samples from several randomly selected wells (Step 5).
? TROUBLESHOOTING
52| For acetylation of the N-terminal end of the peptide library, place 3.6 g of HBTU (9.6 mmol, 240 equiv.) in a 50-ml Falcon tube and dissolve it in DMF to 48 ml (one Falcon tube per cartridge). Then add 550 µl of acetic acid (9.6 mmol, 240 equiv.) and 1.7 ml of DIPEA (9.6 mmol, 240 equiv.). Preactivate this mixture for 1 min by gentle shaking.
53| Divide the preactivated mixture into aliquots in the multicartridge wells (1 ml per well) using a Pasteur or plastic pipette.
PAUSE POINT Gently shake the multiwell cartridge for 45 min on the shaker.
54| Wash all the wells three times with DMF (1-3 ml per wash) using a wash bottle and filter by using a vacuum.
55| Perform a ninhydrin test on the resin samples from several randomly selected wells (Step 5).
? TROUBLESHOOTING
56| Wash all the wells three times with DCM (1-3 ml per wash) using a wash bottle and filter by using a vacuum.
57| Wash all the wells three times with methanol (1-3 ml per wash) using a wash bottle and filter by using a vacuum.
60| Add 1-ml aliquots of cleavage solution to each well using a Pasteur pipette, and shake the cartridge once every 10-15 min for 2 h. Save the remaining 50 ml of cleavage solution on ice or at +4°C.

61| Filter the contents of each well separately and collect the filtrate from each well into a separate 15-ml Falcon tube (one well per tube).
CRITICAL STEP Use a permanent marker to label all the 15-ml Falcon tubes, as these samples will subsequently be frozen at −80°C and lyophilized. We highly recommend also scratching the numbers onto the Falcon tubes with a sharp knife, scalpel, or similar tool.
62| Wash each well with the remaining cleavage solution (1 ml per well) using a Pasteur pipette.
63| Filter and collect the mixture from each well into the same 15-ml Falcon tube as was used in Step 61. Now each tube contains around 2 ml of substrate solution.
64| Add 13 ml of ice-cold diethyl ether to each 15-ml Falcon tube. Close the tube, shake it vigorously, and allow the substrate to precipitate at −20°C for 30 min.
66| Add 5 ml of ice-cold diethyl ether to each tube, shake it vigorously, allow the substrate to precipitate at −20°C for 30 min, and centrifuge the tubes using the same conditions as in Step 65.
CRITICAL STEP As HyCoSuL is a combinatorial library, it is important to ensure that the coupling of the isokinetic mixture provides an equimolar distribution of amino acids in the final product. In this protocol, we used the isokinetic mixture developed by Ostresh (ref. 24), which was demonstrated to be very accurate. Nevertheless, additional quality control can be performed using Edman degradation, as we showed in our previous work (ref. 64) (Supplementary Fig. 1).
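The amounts quoted for the isokinetic mixture couplings (Steps 38 and 45: 9.6 mmol, 240 equiv., made up to 48 ml and dispensed at 1 ml per well) can be broken down per well. The sketch below does this arithmetic under the assumption that 1 equiv. corresponds to the resin in a single well and that the whole 48 ml is dispensed; neither is stated explicitly in this section, and the function name is hypothetical.

```python
# Sketch: per-well breakdown of the isokinetic mixture coupling.
# ASSUMPTIONS: 1 equiv. = resin in one well; all 48 ml are dispensed.


def per_well_breakdown(total_mmol, total_equiv, total_volume_ml, ml_per_well):
    """Return wells served, mmol of one equivalent, and per-well excess."""
    wells = total_volume_ml / ml_per_well
    one_equiv_mmol = total_mmol / total_equiv   # resin amount taken as 1 equiv.
    mmol_per_well = total_mmol / wells          # amino acid mixture per well
    equiv_per_well = mmol_per_well / one_equiv_mmol
    return {
        "wells": wells,
        "one_equiv_mmol": one_equiv_mmol,
        "mmol_per_well": mmol_per_well,
        "equiv_per_well": equiv_per_well,
    }


if __name__ == "__main__":
    result = per_well_breakdown(
        total_mmol=9.6, total_equiv=240, total_volume_ml=48, ml_per_well=1
    )
    for key, value in result.items():
        print(f"{key}: {value:.3g}")
```

Under these assumptions, each well receives a fivefold excess of the amino acid mixture over resin, which is consistent with the excess typically used to drive mixture couplings to completion.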
Preparation of initial screening assay • TIMING 4-6 h
CRITICAL Before any kinetic assay, each enzyme should be active-site titrated in order to obtain reproducible data.
73| Remove the P4-P2 HyCoSuL from the −80°C freezer and allow it to warm to room temperature for at least 3-4 h.
74| Prepare 200 ml of the caspase buffer (see Reagent Setup). Add DTT to the buffer.
75| Remove a caspase-3 aliquot from the −80°C freezer and thaw it on ice.
76| Turn on the spectrofluorometer and set the temperature to 37 °C.
77| Before screening the whole library, perform an initial screening (only natural amino acids; 3 × 19 = 57 samples) to determine the optimal enzyme concentration. Vortex each substrate and, using a micropipette, place 1 µl of it into a 96-well plate (Fig. 3). A plate prepared in this way can be stored at room temperature for 2-3 h (avoid direct exposure to light).

78| To test all peptides containing only natural amino acids (3 positions × 19 amino acids = 57 peptide mixtures), incubate caspase-3 in 6 ml (100 µl per sample) of caspase buffer for 15 min in a 37 °C water bath (or in an incubator). Use information from the literature (e.g., substrate kinetics, inhibitor kinetics) to select the initial enzyme concentration. We recommend using three different enzyme concentrations for this step, as shown in Figure 4, because the appropriate enzyme concentration is assay-dependent and must always be determined empirically.
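The buffer volumes quoted for enzyme preincubation follow from the number of samples at 100 µl each, rounded up to leave a small excess (57 samples give 6 ml). The sketch below reproduces that calculation and adds a simple C1V1 = C2V2 dilution for the enzyme stock. The stock and final concentrations in the usage example are purely hypothetical placeholders, since the protocol leaves the enzyme concentration to be determined empirically; the function names are also hypothetical.

```python
import math

# Sketch: volume of enzyme-containing buffer and enzyme stock to add.
# The concentrations used in the example are HYPOTHETICAL placeholders.


def master_mix_volume_ml(n_samples, ul_per_sample=100):
    """Buffer volume (ml) for n samples, rounded up to the next whole ml."""
    needed_ml = n_samples * ul_per_sample / 1000
    return math.ceil(needed_ml)


def stock_volume_ul(stock_conc_nM, final_conc_nM, total_volume_ml):
    """Volume of enzyme stock (µl) from C1*V1 = C2*V2."""
    return final_conc_nM / stock_conc_nM * total_volume_ml * 1000


if __name__ == "__main__":
    volume = master_mix_volume_ml(57)  # initial screen, natural amino acids
    print(f"Buffer for 57 samples: {volume} ml")

    # Three trial concentrations, as recommended for the initial screen.
    for final_nM in (5, 10, 20):  # hypothetical values
        v = stock_volume_ul(stock_conc_nM=10_000, final_conc_nM=final_nM,
                            total_volume_ml=volume)
        print(f"{final_nM} nM final: add {v:.1f} µl of a 10 µM stock")
```

Note that adding 99 µl of this solution to 1 µl of substrate lowers the enzyme concentration in the well by 1%, which is usually negligible for screening purposes.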

80| Using an eight-channel pipette, transfer 99 µl of the caspase-containing buffer to wells containing 1 µl of substrate library.
The optimal enzyme concentration is the one at which cleavage of the best substrate generates a linear signal (RFU/s) for at least 10-15 min, and the other (weaker) substrates also produce a readable and linear signal (see Fig. 4 for details).
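Whether a progress curve is linear over the required 10-15 min window can be checked numerically once the fluorescence readings are exported. The following is a minimal sketch using an ordinary least-squares fit; the R² threshold of 0.99 and the 600-s minimum duration are assumptions chosen for illustration, not criteria given in the protocol, and the function names are hypothetical.

```python
# Sketch: estimate the cleavage rate (RFU/s) and test linearity of a trace.
# The R^2 cutoff (0.99) is an ASSUMED threshold, not part of the protocol.


def linear_fit(times_s, rfu):
    """Least-squares slope (RFU/s) and R^2 for a fluorescence trace."""
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_y = sum(rfu) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_s, rfu))
    syy = sum((y - mean_y) ** 2 for y in rfu)
    slope = sxy / sxx
    # A completely flat trace carries no signal; report R^2 as 0.
    r_squared = 0.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return slope, r_squared


def is_linear(times_s, rfu, min_duration_s=600, min_r_squared=0.99):
    """True if the trace spans the minimum time and fits a straight line."""
    _, r_squared = linear_fit(times_s, rfu)
    duration = times_s[-1] - times_s[0]
    return duration >= min_duration_s and r_squared >= min_r_squared


if __name__ == "__main__":
    times = [60 * i for i in range(16)]              # 0-900 s, 1-min reads
    good = [100 + 2.5 * t for t in times]            # steady signal
    saturated = [min(2.5 * t, 600) for t in times]   # plateaus after 4 min
    print(linear_fit(times, good)[0], is_linear(times, good))
    print(is_linear(times, saturated))
```

A trace that fails this check for the best substrate suggests that the enzyme concentration is too high (substrate depletion); a very low slope for the weaker substrates suggests that it is too low.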

81|
Preparation of an assay for the entire HyCoSuL screening (P2 sublibrary) • TIMING 2-3 h per screening
84| Once you determine the optimal caspase concentration, perform the whole HyCoSuL screening. Start from the P2 sublibrary.
85| If the sublibrary contains up to 96 substrates, you can do the whole sublibrary screening in one 96-well plate. If the library is larger than 96 substrates, the assay must be performed in two (or more) plates. Make sure that the second (and the next) plate contains several 'good' substrates from the first plate (control) so that the results can be combined.
Regardless of the size of the library (<96 or >96 substrates), each screening must be performed at least three times, and the result must be presented as an average value (Step 92).
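The replicate averaging and the use of shared control substrates to combine plates, as described in Step 85, can be sketched as follows. The way the plates are combined here (scaling the second plate by the ratio of control rates) and the normalization to the best substrate are assumptions made for illustration; the protocol states only that controls allow the results to be combined and that results are reported as averages. The substrate labels and function names are hypothetical.

```python
from statistics import mean

# Sketch: average replicate screens and merge plates via shared controls.
# The rescaling and percent-of-best normalization are ASSUMED approaches.


def average_replicates(replicates):
    """Average rates (RFU/s) per substrate across replicate screens."""
    names = replicates[0].keys()
    return {name: mean(rep[name] for rep in replicates) for name in names}


def rescale_plate(plate, reference_plate, controls):
    """Scale a plate so its control substrates match the reference plate."""
    factor = mean(reference_plate[c] / plate[c] for c in controls)
    return {name: rate * factor for name, rate in plate.items()}


def percent_of_best(rates):
    """Express each rate as a percentage of the best substrate."""
    best = max(rates.values())
    return {name: 100 * rate / best for name, rate in rates.items()}


if __name__ == "__main__":
    # Hypothetical rates for three replicate screens of plate 1.
    plate1 = average_replicates([
        {"sub_A": 10, "sub_B": 5},
        {"sub_A": 12, "sub_B": 7},
        {"sub_A": 14, "sub_B": 6},
    ])
    # Plate 2 repeats sub_A as the control and adds sub_C.
    plate2 = rescale_plate({"sub_A": 6, "sub_C": 3}, plate1, ["sub_A"])
    combined = {**plate1, "sub_C": plate2["sub_C"]}
    print(percent_of_best(combined))
```

Using several control substrates rather than one makes the scaling factor less sensitive to a single noisy well.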
86| Vortex each substrate, and, using a micropipette, place 1 µl of it into a 96-well plate (Fig. 3). A plate prepared in this way can be stored at room temperature for 2-3 h (avoid direct exposure to light).

87| To screen 96 substrate mixtures on one plate, incubate caspase-3 in 10 ml (100 µl per well) of caspase buffer for 15 min in a 37°C water bath (or in an incubator). Use the caspase concentration that you determined in the initial screening (natural library). If your sublibrary contains more than 96 substrates, screen the rest of them on the second plate (Step 85).

88| Pour the assay buffer with caspase-3 into the reagent reservoir.

90|
112| Wash the resin six times with DMF (2-4 ml per wash) to remove all piperidine and the side-reaction products.
116| Pour this mixture onto the resin and shake the vessel gently until all liquid and resin are mixed well. Add DMF/DMSO (1:1 (vol/vol)) mixture if needed.
PAUSE POINT Turn the shaker on, and gently shake the reaction vessel for 3 h.
117| Remove the mixture from the resin by vacuum filtration and wash the resin three times with DMF (2-4 ml per wash).
119| Wash the resin three times with DCM (2-4 ml per wash) and three times with MeOH (2-4 ml per wash).
120| Dry the resin in a desiccator over P2O5 overnight. Replace the P2O5 if needed.
CRITICAL STEP The resin must be dry in order to cleave the final product from it.
121| Cleave the peptide from the resin. Prepare 10 ml of cleavage solution (8 ml of DCM, 1 ml of TFE, and 1 ml of AcOH), pour 5 ml of it onto the resin, and shake the cartridge once every 10 min for 45 min.
122| Filter and collect the mixture into a 100-ml round-bottom flask, and wash the resin with the remaining 5 ml of cleavage solution. Vacuum-filter it and collect it into the same flask.
123| Remove the cleavage mixture on a rotary evaporator under reduced pressure until a white/yellow oil forms.
124| Dissolve the oil in 20 ml of a water/acetonitrile mixture (1:1), freeze the mixture at −80°C, and lyophilize to obtain biotin-6-Ahx-D-Tyr(tBu)-L-Tic-L-Ser(tBu)-COOH as a white powder. Overall yield (based on 2-chlorotrityl chloride resin loading capacity) is >70%, and peptide purity is >90%.
CRITICAL STEP To confirm that you obtained the actual product, perform mass spectrometry and HPLC analysis (see Equipment Setup and Supplementary Fig. 3 for details).

Problem: There is no clear answer as to which sequence is the most active/selective.
Possible reason: Several amino acids can be equally recognized by the enzyme at certain positions.
Solution: Synthesize several substrates with the most promising sequences and measure their kinetic parameters. Detailed kinetic analysis will provide information about possible subsite cooperativity and will indicate the 'champion' substrate.

Step 130
Problem: Reaction is not 100% complete.
Possible reason: The mixture of 10 ml of HBr/AcOH and water (1:2 (vol/vol)) was not sufficient to transform diazomethylketone into bromomethylketone.
Solution: Prepare an additional 5 ml of HBr/AcOH and water (1:2 (vol/vol)), and add it to the reaction flask dropwise over 5 min. Continue the reaction for 5 min more and check the progress on an analytical HPLC system (254 nm).

Step 143
Problem: Reaction progress stops at some point (HPLC analysis shows that both substrates are still present in the reaction mixture).
Possible reason: pH is <8, as there is some TFA left over from Step 137.
Solution: Adjust the reaction pH to the optimal value by adding base (2,4,6-collidine).

Step 149
Problem: Reaction progress stops at some point (HPLC analysis shows that Asp(Bzl) is not fully deprotected).
Possible reason: Pd/C is quenched by one of the reaction products/substrates; thus, it loses its catalytic activity.
Solution: Replace all the hydrogen in the flask with an inert gas, add an additional portion of Pd/C, and continue bubbling hydrogen through the reaction mixture.
Possible reason: Pd/C has been contaminated or has partially decomposed during storage.
Solution: Use Pd/C from a freshly opened bottle.