The Mobility of the Cap Domain Is Essential for the Substrate Promiscuity of a Family IV Esterase from Sorghum Rhizosphere Microbiome

ABSTRACT Metagenomics offers the possibility to screen for versatile biocatalysts. In this study, the microbial community of the Sorghum bicolor rhizosphere was spiked with technical cashew nut shell liquid, and after incubation, the environmental DNA (eDNA) was extracted and subsequently used to build a metagenomic library. We report the biochemical features and crystal structure of a novel esterase from family IV, EH0, retrieved from an uncultured sphingomonad after a functional screen in tributyrin agar plates. EH0 (optimum temperature [Topt], 50°C; melting temperature [Tm], 55.7°C; optimum pH [pHopt], 9.5) was stable in the presence of 10 to 20% (vol/vol) organic solvents and exhibited hydrolytic activity against p-nitrophenyl esters from acetate to palmitate, preferably butyrate (496 U mg−1), and a large battery of 69 structurally different esters (up to 30.2 U mg−1), including bis(2-hydroxyethyl)-terephthalate (0.16 ± 0.06 U mg−1). This broad substrate specificity contrasts with the fact that EH0 showed a long and narrow catalytic tunnel, whose access appears to be hindered by a tight folding of its cap domain. We propose that this cap domain is a highly flexible structure whose opening is mediated by unique structural elements, one of which is the presence of two contiguous proline residues likely acting as hinges, which together allow for the entrance of the substrates. Therefore, this work provides a new role for the cap domain, which until now was thought to be an immobile element that contained hydrophobic patches involved in substrate prerecognition and in turn substrate specificity within family IV esterases.

IMPORTANCE A better understanding of the structure-function relationships of enzymes helps reveal key structural motifs or elements. Here, we studied the structural basis of the substrate promiscuity of EH0, a family IV esterase, isolated from a sample of the Sorghum bicolor rhizosphere microbiome exposed to technical cashew nut shell liquid. The analysis of EH0 revealed the potential of the sorghum rhizosphere microbiome as a source of enzymes with interesting properties, such as pH and solvent tolerance and remarkably broad substrate promiscuity. Its structure resembled those of homologous proteins from mesophilic Parvibaculum and Erythrobacter spp. and hyperthermophilic Pyrobaculum and Sulfolobus spp. and had a very narrow, single-entry access tunnel to the active site, with access controlled by a capping domain that includes a number of nonconserved proline residues. These structural markers, distinct from those of other substrate-promiscuous esterases, can help in tuning substrate profiles beyond tunnel and active site engineering.

out by applying a metagenomic functional screening approach. The strategy involved the use of LB agar plates containing tributyrin (22). The pooling of 40 clones per well dispensed into 96-well microtiter plates (approximately 3,800 clones in one plate) facilitated the colony screening at a high throughput. After screening of more than 100,000 clones, 1 hit from the SorRhizCNSL3 W library showed lipase/esterase activity. The fosmid insert was extracted using the Qiagen Large-Construct kit (Qiagen, Hilden, Germany) according to the manufacturer's protocol and digested with restriction enzymes for insert size estimation, and the insert was sequenced by Illumina technology. Upon the completion of sequencing, the reads were quality filtered and assembled to generate nonredundant metasequences, and genes were predicted and annotated as described previously (12). One gene encoding a predicted carboxylic ester hydrolase was identified. The gene was amplified with specific primers, cloned into the p15TV-L vector, and transformed into Escherichia coli BL21(DE3) cells for expression of the N-terminal His6-tagged protein. The deduced amino acid sequence of the enzyme (324 amino acids long) was used in homology searches for taxonomic and functional assignment. A database search indicated that EH0 showed 99% identity with the α/β hydrolase enzyme from Sphingomonas pruni (protein identifier [ID] WP_066587239), both of which are classified as members of the α/β hydrolase-3 family (PF07859). The typical H-G-G-G motif and the G-X-S-X-G catalytic motif are conserved, and EH0 clustered together with family IV esterases.
Biochemical characterization. The recombinant protein was successfully expressed in soluble form and purified by nickel affinity chromatography. Purified protein was desalted by ultrafiltration, and its enzymatic activity was assessed. Four model p-nitrophenyl (p-NP) ester substrates with different chain lengths were used to determine the substrate specificity of the enzyme and therefore to determine whether it is in fact a true lipase or an esterase. Lipases hydrolyze ester bonds of long-chain triglycerides more efficiently than esterases, which instead exhibit the highest activity toward water-soluble esters with short fatty acid chains (23). The substrates used for the hydrolytic test were p-NP acetate (C2), p-NP butyrate (C4), p-NP dodecanoate (C12), and p-NP palmitate (C16). The hydrolytic activity was recorded under standard assay conditions (Fig. 1). EH0 showed a specific activity of 496.5 U mg−1 for p-NP butyrate, which was the best substrate. Lower levels of activity were observed with longer-chain esters (≥C12). The esterase followed Michaelis-Menten kinetics, and its kinetic parameters are shown in Table 1. A comparison of the catalytic efficiency values (kcat/Km) indicated a high reactivity toward p-NP butyrate followed by p-NP acetate.
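As an illustration of the Michaelis-Menten treatment mentioned above, the following minimal Python sketch computes an initial rate and a catalytic efficiency. The function names and all numerical values are hypothetical and chosen only for illustration; they are not the parameters reported in Table 1.

```python
def michaelis_menten_rate(substrate_mM, vmax, km_mM):
    """Initial rate v = Vmax * [S] / (Km + [S]); v has the units of Vmax."""
    if substrate_mM < 0 or km_mM <= 0:
        raise ValueError("substrate must be >= 0 and Km must be > 0")
    return vmax * substrate_mM / (km_mM + substrate_mM)


def catalytic_efficiency(kcat_per_s, km_mM):
    """kcat/Km in s^-1 M^-1 (Km is converted from mM to M)."""
    if km_mM <= 0:
        raise ValueError("Km must be > 0")
    return kcat_per_s / (km_mM * 1e-3)


# Hypothetical parameters for illustration only (not taken from Table 1).
example_vmax = 100.0  # U mg^-1
example_km = 0.5      # mM
example_kcat = 10.0   # s^-1

# At [S] = Km the rate is half of Vmax by definition.
rate_at_km = michaelis_menten_rate(example_km, example_vmax, example_km)
efficiency = catalytic_efficiency(example_kcat, example_km)
```

Comparing `catalytic_efficiency` values across substrates is what ranks them when both kcat and Km differ, which is how the comparison between p-NP butyrate and p-NP acetate is framed in the text.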
Its voluminous (volume of the active site cavity, 5,133 Å³) but low-exposed (solvent-accessible surface area [SASA], 5.07 on a dimensionless percentage scale of 0 to 100) active site allows hydrolysis of a broad range of 68 out of 96 structurally and chemically diverse esters (see Table S1 in the supplemental material), as determined by a pH indicator assay (pH 8.0, 30°C). Phenyl acetate (30.23 U mg−1) and glyceryl tripropionate (29.43 U mg−1) were the best substrates. We also found that EH0 efficiently hydrolyzed bis(2-hydroxyethyl)-terephthalate (BHET; 163.6 ± 6.2 U g−1), an intermediate in the degradation of polyethylene terephthalate (PET) (24); high-performance liquid chromatography (HPLC) analysis (Fig. S1), performed as described previously (25), confirmed the hydrolysis of BHET to mono-(2-hydroxyethyl)-terephthalic acid (MHET) but not to terephthalic acid (TA). However, using previously described conditions (25), we found that the enzyme did not hydrolyze large plastic materials such as amorphous and crystalline PET film and PET nanoparticles from Goodfellow. According to the number of hydrolyzed esters, EH0 can thus be considered an esterase with a wide substrate specificity, similar to other enzymes of family IV (19,20). EH0 showed maximal activity at 50°C, retaining more than 80% of the maximum activity at 40 to 55°C (Fig. 2A), suggesting that it is moderately thermostable. This was confirmed by circular dichroism (CD) analysis, which revealed a denaturing temperature of 55.7 ± 0.2°C (Fig. 2B). Its optimal pH for activity was 9.5 (Fig. 2C). The effect of organic solvents at different concentrations on the enzymatic activity was evaluated (Table S2). An activation effect was observed for EH0 when 10% methanol (60% activity increase) and 10 to 20% dimethyl sulfoxide (DMSO) (22 to 40% increase) were added to the reaction mixture. The presence of bivalent and trivalent cations did not have a remarkable positive effect on the activity of the enzyme, which showed, in some cases, tolerance to high concentrations of cations (Table S3). A prominent inhibiting effect was shown for all cations, except for magnesium, which was well tolerated at 1 to 10 mM (<5% inhibition).

The Cap Domain of an Esterase from Sorghum Rhizosphere Applied and Environmental Microbiology

FIG 1
Specific activities (mean ± standard deviation from triplicates) are shown. Reaction mixtures contained 1 mM concentrations of the corresponding p-NP esters, and reactions were conducted in the presence of 1 to 5% DMSO-acetonitrile (see Materials and Methods), under standard conditions described in Materials and Methods. At the solvent concentration used, the enzyme showed 100% of its activity compared to a control without solvent (see Table S2 in the supplemental material).
ᵃ Assays were performed in the presence of 5.0 to 7.5% DMSO-acetonitrile (see Materials and Methods), concentrations at which the enzyme showed 100% of its activity compared to a control without solvent (see Table S2 in the supplemental material).

FIG 2
Optimal parameters for the activity and stability of purified EH0. (A) Temperature profile. (B) The thermal denaturation curve of EH0 at pH 7.0 was measured by ellipticity changes at 220 nm and obtained at different temperatures. (C) pH profile. The maximal activity was defined as 100%, and the relative activity is shown as the percentage of maximal activity (mean ± standard deviation from triplicates) determined under standard reaction conditions with p-NP butyrate as the substrate. Graphics were created with SigmaPlot version 14.0 (the data were not fitted to any model).

EH0 presents tight folding of its cap domain. The crystal structure of wild-type EH0 was obtained at 2.01-Å resolution, with the P2₁2₁2₁ space group and two crystallographically independent molecules in the asymmetric unit. Molecular replacement was performed using Est8 as a template (PDB code 4YPV) (26), and the final model was refined to a crystallographic R factor of 0.1717 and R free of 0.2019 (Table S4). As with other reported family IV esterases, EH0 has an α/β hydrolase fold with two different domains, a cap domain (residues 1 to 43 and 208 to 229) and a catalytic domain (residues 44 to 207 and 230 to 324), constituted by a total of 9 α-helices and 8 β-strands (Fig. 3A). The catalytic domain was composed of a central β-sheet with eight β-strands, of which β1, β3, β4, β5, β6, β7, and β8 were parallel and β2 was antiparallel, surrounded by five α-helices (α3, α4, α5, α8, and α9). The cap domain involved four α-helices (α1, α2, α6, and α7) (Fig. 3B). There were two cis peptides, Ala122-Pro123 and Trp127-Pro128, located at the β4-α4 turn within the catalytic domain. The DALI server (27) was used to search for structural homologs of EH0. The closest homologs are E53 isolated from Erythrobacter longus, with 46% identity and a root mean square deviation (RMSD) of 2.3 Å on 296 Cα atoms (PDB code 7W8N) (28), Est8 isolated from Parvibaculum, with 38% identity and an RMSD of 2.5 Å on 291 Cα atoms (PDB code 4YPV) (26), PestE isolated from Pyrobaculum calidifontis, with 33% identity and an RMSD of 2.6 Å on 294 Cα atoms (PDB code 3ZWQ) (29), and EstA isolated from Sulfolobus islandicus REY15A, with 31% identity and an RMSD of 2.6 Å on 281 Cα atoms (PDB code 5LK6). The structural superimposition of these proteins reveals a high conservation of the corresponding catalytic domains and also a common spatial arrangement of the helices at the cap domains in all the proteins except EH0, where α2 and the long α7 are visibly shifted very close to the active site and are apparently impeding the entrance of substrates (Fig. 3C). This was unexpected considering the broad substrate specificity of esterase EH0, approaching that of the most promiscuous esterases (19). Thus, withdrawal of the cap domain seems a necessary requirement for allowing access of the bulky substrates to the EH0 catalytic site.
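For readers unfamiliar with the RMSD values quoted for the structural homologs, the minimal Python sketch below shows how an RMSD over paired Cα coordinates is computed. It assumes the two coordinate sets are already superimposed and matched atom by atom; it does not perform the structural alignment that a server such as DALI carries out, and the function name and toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (same length units as the input, e.g. Å)
    between two equally long lists of (x, y, z) tuples.

    Assumes the structures are already superimposed and atoms are paired
    by index; no fitting or alignment is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy coordinates for illustration only (not taken from any PDB entry).
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
toy_rmsd = rmsd(toy_a, toy_b)  # one atom displaced by 2 Å, averaged over 2 atoms
```

Because the value is averaged over all paired atoms, a local displacement such as a shifted cap helix can coexist with a modest global RMSD when the catalytic domains superimpose well.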
In agreement with this assumption, a substantial rearrangement of the cap domain was previously described in the homolog esterase EST2 (having 31% sequence identity), with its M211S/R215L double variant being trapped in the crystal in a conformation resembling the open form of lipases (30). However, the authors did not assign any biological relevance to this issue, considering this state an artifact derived from the crystal packing. In our study, some flexibility at this region has been found from the two conformations adopted by loop α1-α2 observed in subunits A and B of EH0 (Fig. 3D). In addition, the inspection of residues within the cap domain showed a high number of proline residues (Pro8, Pro21, Pro23, Pro31, Pro46, and Pro47) (Fig. 3B) that are mostly nonconserved and could confer flexibility on the N-terminal cap domain, allowing entrance of the substrate. This feature will be discussed below.
EH0 is a dimeric enzyme. While E53 and Est8 are monomeric enzymes, and EstA is a tetramer, EH0 is a biological homodimer with approximate dimensions of 6.3 by 4.9 by 2.7 nm, which is assembled in a twofold axis symmetry arrangement (Fig. 4A) that buries 5.6% of its total surface area. Hydrogen bonds mainly involve β8 and α8, while salt bridges involve motifs β8 and α9 (Fig. 4B and Table S5). Similarly to EH0's dimeric homolog PestE, oligomerization occurs through a tight interaction of β8 strands from both subunits. However, while only the subsequent helix α12 was involved in the PestE interface (29) (Fig. 4C), both the preceding α8 and subsequent α9 helices form the EH0 interface (Fig. 4D). These different interactions observed in PestE and EH0 were reflected in a different orientation between the monomers that, nevertheless, present a similar distance between the catalytic serines of 35 to 38 Å and an equivalent disposition of the tunnels giving access to the catalytic site at two edges of the dimer (Fig. 4E and F). Therefore, as seen in Fig. 4A, the two cap domains are far from the interface and project out from the dimer, revealing that dimerization does not affect the cap function.
The peculiar EH0 active site. EH0 has a long catalytic tunnel with a very narrow entrance (approximately 16.8-Å depth) (Fig. 5A). The catalytic triad of EH0 is formed by Ser161 (in the conserved motif ¹⁵⁹G-D-S-A-G¹⁶³ in the nucleophilic elbow), Asp253 (in the conserved motif ²⁵³D-P-I-R-D²⁵⁸), and His283 (Fig. 5B). To analyze the active site, a series of soaking and cocrystallization experiments were performed with different suicide inhibitors, all of which were unsuccessful. A deep inspection of the active site showed that the nucleophilic Ser161 is hydrogen bonded to Glu226 from the nearby α7 and a movement of loop β3-α3, including the oxoanion in the conserved motif  large and voluminous esters such as dodecanoyl acetate, pentadecyl acetate, vinyl laurate, methyl-2,5-dihydroxycinnamate, and ethyl-2-chlorobenzoate, which were not accepted by the wild-type enzyme (Table S1). This apparently supports that removal of the Ser161-Glu226 hydrogen bond increases cap flexibility and enhances the enzymatic efficiency toward large esters at either the acyl or the alcohol sites.
Indeed, it appears clear that helix α7 must retract from the catalytic site to allow substrate entrance, but this helix seems fixed by many atomic interactions. Close to the Ser161-Glu226 hydrogen bond, Tyr223 makes additional hydrogen bonds to the Asp194 and Lys197 main chain, from loop β6-α6, and with the side chain of Gln258 from α8 (Fig. 5C). A mutation, EH0Y223A, was generated by directed mutagenesis and found, unlike the EH0E226A mutation, to show little effect on conversion rates compared to the wild-type enzyme. However, this mutation allowed the hydrolysis of large and voluminous esters, such as dodecanoyl acetate, pentadecyl acetate, methyl-2,5-dihydroxycinnamate, and ethyl-2-chlorobenzoate, which were also hydrolyzed by the variant EH0E226A. Unlike EH0E226A, the EH0Y223A variant was unable to hydrolyze vinyl laurate, but it was able to hydrolyze methyl-3-hydroxybenzoate, not accepted by the EH0E226A variant. This suggested that Tyr223 may play a role in accepting large esters, particularly at the acyl side, but may also play an additional role in substrate specificity different from that of Glu226.
Additionally, at the beginning of α7, Trp218 is within a hydrophobic pocket surrounded by residues Phe13, Ile17, Leu28, and Phe219 from the cap domain and Phe91 from the catalytic domain, which is also anchored by interaction with loop α1-α2 and Pro23 (Fig. 5D). Thus, to depict how this tight molecular packing may be disrupted by the proposed cap motion, molecular dynamics were applied to crystallographic refinement through the ensemble refinement strategy, which has been shown to model the intrinsic disorder of macromolecules, giving more accurate structures. The ensemble models obtained for molecules A and B within the asymmetric unit are shown in Fig. 5E and H, respectively. The analysis of the molecule A conformers (Fig. 5E) revealed that the region comprising α1 and α2 shows a wide spectrum of possible pathways from more "open" to more "closed" conformations. At one edge, Pro23 is in an extended α1-α2 loop far from Trp218, therefore releasing α7, which consequently could retract from the catalytic pocket ("open-like conformation") (Fig. 5F). In fact, the ensemble refinement models a very flexible conformation, being unstructured even at regions corresponding to α1 and α2. At the other edge, the second scenario is the entrapment of Trp218 by Pro23 in loop α1-α2, hindering substrate entrance ("closed-like conformation") (Fig. 5G). This last scenario is equivalent to the three-dimensional (3D) structure captured by crystallography (Fig. 5D). Furthermore, three prolines can be found within this α1-α2 loop, Pro21 (at the end of α1), Pro23 (at the middle), and Pro31 (at the beginning of α2), all of them unique to EH0, which are probably behind the two different conformations observed at this loop in both subunits within the asymmetric unit (Fig. 3B and D) and explain the ensemble of conformers modeled for molecule B (Fig. 5H).
Furthermore, as seen in Fig. 3B, Pro46 and Pro47 are potential hinges that would involve flexibility of a larger region of the EH0 cap domain, including the whole N-terminal peptide chain up to the end of α2. This is consistent with the more "open" conformations resulting from the ensemble refinement shown in Fig. 5F. In fact, the sequence comparison of EH0 to its closest homologs reveals that only EH0 has two sequential prolines at this region and that Pro46 is unique to EH0 (Fig. 6), which could be a reason behind the high EH0 promiscuity. Interestingly, EST2 also presents the two contiguous Pro residues (Pro38 and Pro39), which can also confer high mobility on its cap domain and facilitate the "open-like" conformation captured in the crystal mentioned above (30). Therefore, the mutation EH0P46A was generated by directed mutagenesis and submitted to crystallization experiments to investigate the putative role of Pro46. However, the crystals grown from this variant, EH0P46A, failed to diffract, suggesting that removal of Pro46 introduces some structural instability to the polypeptide chain, resulting in crystal disorder. Moreover, analysis of the activity profile showed that Pro46 is a critical residue for the entry and hydrolysis of bulky substrates, as its mutation to Ala extends the substrate specificity from 68 to 84 esters. Additionally, the hydrolytic rate increased by 1.2- to 18,000-fold (average, 335-fold) for most esters. This variant was also able to hydrolyze large glyceryl trioctanoate and 2,4-dichlorophenyl 2,4-dichlorobenzoate, which were not hydrolyzed by the wild-type enzyme or the EH0E226A and EH0Y223A variants. Consequently, although the proposed role of Pro46 and Pro47 as putative hinges enabling the opening of the cap domain seems appealing, other mechanisms promoting EH0 plasticity to bulky substrates may also operate.
Furthermore, as seen below, it should be noted that the two proline residues are located at the entrance of the narrow tunnel giving access to the active site, an issue that ascribes a prominent role to both residues in binding activity and specificity. Structural details of the EH0 active site and assignment of the acyl and alcohol moieties were explored by comparison with its homologs E53 complexed with 4-nitrophenyl hexanoate (PDB code 6KEU) (28) and EH1AB1 in complex with a derivative of methyl-4-nitrophenylhexylphosphonate (PDB code 6RB0) (31). However, the acyl and alcohol moieties of these complexes are located in opposite sites (Fig. 5I and J). Therefore, as we could not obtain complexes from EH0, the activity experiments were crucial to correctly assign acyl/alcohol moieties. As mentioned above, experimental evidence demonstrated that Tyr223 produces a steric hindrance at the acyl moiety, and consequently, the acyl and alcohol sites correspond to those observed in EH1AB1 (Fig. 5J). On the basis of this assumption, the acyl binding site seems to be a small cavity bordered by the Tyr199 and Ile255 side chains, which produce steric hindrance for substrates with large acyl moieties. The long and narrow alcohol binding site is surrounded by both hydrophobic (Phe13, Phe91, Val92, and Leu288) and hydrophilic (Asp45, His99, Asp160, Tyr190, and Thr287) residues, with Pro46 and Pro47 being at the entrance of the tunnel (Fig. 5K). Most residues at the acyl and alcohol moieties are conserved among EH0 homologs, with the exception of Met40, Pro46, Tyr199, Glu226, and Thr287. Remarkably, the bulky Met40 and Tyr199 residues are replaced by smaller residues in the EH0 homologs. As previously mentioned, Pro46 is unique in EH0, while Glu226 is replaced by a conserved Asp, and finally, most homologs show an Asn residue instead of Thr287 (Fig. 6).
Therefore, as the retraction of the cap domain must be performed to allow entrance by the substrate, residues Phe91, Tyr190, and Tyr199 from the catalytic domain, which are located close to the catalytic triad, seem essential for substrate specificity (Fig. 5K).

DISCUSSION
In this study, the microbial community of the Sorghum bicolor rhizosphere was exposed to a chemical treatment prior to environmental DNA (eDNA) extraction to construct a metagenomic library. Plant roots can secrete exudates composed of a large variety of compounds into the soil, some of which may play important roles in the rhizosphere (32,33), and with effects that involve multiple targets, including soil microorganisms. This is why an amendment of the soil with technical cashew nut shell liquid (tCNSL), containing a mixture of phenolic compounds with long aliphatic side chains (up to C22:0), was carried out directly in the rhizosphere and, later, for 3 weeks under controlled laboratory conditions, as we were interested in screening lipolysis-like activity. By applying metagenomics techniques, we retrieved an esterase, EH0, highly similar (99% identity) to the predicted α/β hydrolase from the genome of S. pruni (accession no. WP_066587239). The most homologous, functionally characterized protein is actually P95125.1, a carboxylic ester hydrolase LipN from Mycobacterium tuberculosis H37Rv, which shows only 41% amino acid sequence identity with EH0. That said, given the high identity of WP_066587239 and EH0, we expect both hydrolases to have similar properties, although this is yet to be experimentally confirmed. Indeed, we observed that there are only three changes in their sequences, which are located on the outside and in loops away from the key residues and the dimerization interface (see Fig. S2 in the supplemental material).
EH0 was classified within the previously described hormone-sensitive lipase (HSL) type IV family, which is one of at least 35 families and 11 true lipase subfamilies known to date (10,23,34). This family is reported to contain ester hydrolases with relative SASA values ranging from 0% to 10% and high levels of substrate specificity (19). Note that SASA, computed as a (dimensionless) percentage (0 to 1 or 0 to 100) of the ligand SASA in solution (19), is a parameter that describes the solvent exposure of the cavity containing the catalytic triad and the capacity of a cavity to retain/stabilize a substrate (19). For example, a SASA of 40% (over 100%) implies that 40% of the surface is accessible to the solvent, which facilitates the escape of the substrate to the bulk solvent; this is the case for enzymes with an active site on the surface where the catalytic triad is highly exposed. In contrast, enzymes that have a larger but almost fully occluded site that can better maintain and stabilize the substrate inside the cavity are characterized by relative SASA values of approximately 0 to 10%. This is the case for EH0, which has a SASA of 5.07% because of a large but almost fully occluded active site, an architecture that is known to better maintain and stabilize a higher number of substrates inside the cavity (19). Indeed, this enzyme houses a very long and narrow catalytic pocket, where helix α7 is very close to the catalytic triad, with residue Glu226 making a direct hydrogen bond to the catalytic nucleophile Ser161. Therefore, it appeared clear that the cap domain must retract to allow entrance of the substrate to the active site. The movement of this domain was modeled by combining X-ray diffraction data with molecular dynamics simulation through the ensemble refinement procedure.
This strategy showed a broad range of putative conformations at the cap domain, with Pro46 and Pro47 likely acting as hinges conferring a high plasticity on the N-terminal region of the cap. Remarkably, the presence of a number of prolines at this region, particularly these two sequential prolines, is a unique feature of EH0 compared to its homologs and other substrate-promiscuous members of family IV (19). Mutational analysis confirmed the role of one of these prolines in the access and hydrolysis of the large and voluminous substrates and thus in the increase in the substrate promiscuity level.
The above structural features differ from those of other substrate-promiscuous family IV esterases, tested over the same set of ester substrates, namely, EH1AB1 (31) and EH3 (35,36), capable of hydrolyzing a similar number of esters. The comparison of EH1AB1 and EH3 with EH0 shows major differences related to the lid (Fig. 7A). Whereas EH1AB1 and EH3 show large and wide catalytic pockets with two possible points of access to the binding site (Fig. 7B and C), EH0 has a single, narrow, and long entrance to the catalytic active site, as noted above (Fig. 7D). Therefore, in the case of EH0 only a structural rearrangement of the cap domain would allow its adaptation to all different substrates, which likely implies that the cap domain of EH0 exhibits more flexibility than those from EH1AB1 and EH3. This is consistent with the fact that of the 80 esters that the three enzymes together are able to hydrolyze, 61 (or 76%) were common to all three, and EH0 was the only one able to hydrolyze such bulky substrates as 2,4-dichlorobenzyl 2,4-dichlorobenzoate or diethyl-2,6-dimethyl 4-phenyl-1,4-dihydropyridine-3,5-dicarboxylate.
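The overlap figure quoted above (61 of 80 esters, or 76%) is a shared-set fraction over the union of the three substrate profiles. The minimal Python sketch below shows that calculation; the substrate labels in the toy profiles are hypothetical placeholders and do not reproduce the actual profiles in Table S1.

```python
def shared_substrates(profiles):
    """Return (n_common, n_union, fraction) for a dict that maps an enzyme
    name to the set of esters it hydrolyzes."""
    sets = [set(esters) for esters in profiles.values()]
    if not sets:
        raise ValueError("at least one profile is required")
    union = set().union(*sets)
    common = set.intersection(*sets)
    fraction = len(common) / len(union) if union else 0.0
    return len(common), len(union), fraction


# Toy profiles with placeholder ester labels (illustration only).
toy_profiles = {
    "EH0": {"ester_1", "ester_2", "ester_3", "ester_4"},
    "EH1AB1": {"ester_1", "ester_2", "ester_3"},
    "EH3": {"ester_1", "ester_2", "ester_5"},
}
n_common, n_union, fraction = shared_substrates(toy_profiles)

# Percentage from the counts reported in the text: 61 shared esters of 80.
reported_percent = round(100 * 61 / 80)
```

Esters found in the union but in only one profile (here the placeholder `ester_4` for EH0) correspond to the enzyme-specific substrates highlighted in the text.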
Biochemical characterization of the novel esterase also revealed that the activity of EH0 was in most cases stimulated in the presence of 10% organic solvents, particularly in 10% methanol and DMSO. Such activation is also a characteristic of some lipases. For example, the analysis of the lipase from Thermus thermophilus revealed that although the overall structure was kept stable with or without polar organic solvent, the lid region was more flexible in the presence of the latter. The flexible lid facilitates the substrate's access to the catalytic site inside the lipase, and the lipase displays enhanced activity in the presence of a polar organic solvent (37). The use of organic solvents offers several advantages over canonical aqueous biocatalysis: higher solubility of hydrophobic substrates, lower risk of contamination, and higher thermal stability (38–40). EH0 has a potential advantage in applications that require alkaline conditions due to its ability to act at the optimal pH of 9.5. Temperature-controlled tests indicated a mesophilic/slightly thermophilic profile of the esterase, as expected from the original habitat at moderate temperatures. In addition, the structure of EH0 is similar (31 to 33% identity) to those of enzymes from extremophiles, namely, Pyrobaculum (3ZWQ) and Sulfolobus (5LK6) species.
In summary, the present study evaluated the conformational plasticity of the cap domain in members of family IV and the role of several nonconserved prolines as putative structural factors underlying a substrate specificity broader than that of other members of the 35 families and 11 true lipase subfamilies reported so far (10). This high molecular flexibility is markedly different from that found in other family IV esterases and in a family VIII β-lactamase fold hydrolase (EH7), which has been recently shown to be highly substrate promiscuous. In this case, the broad substrate specificity is given by the presence of a more open and exposed S1 site having no steric hindrance for the entrance of substrates to the active site and more flexible R1, R2, and R3 regions allowing the binding of a wide spectrum of substrates into the active site.
Conclusions. An activity-based metagenomics approach was used to study the microbial enzyme diversity in rhizosphere soil of Sorghum plants amended with CNSL. A novel esterase was found, which possessed a broad substrate promiscuity in combination with a significant pH and solvent tolerance. This work helps in deciphering the structural markers responsible for the outstandingly broad substrate specificity of EH0. Indeed, this work further provides important insights into the role of cap domains and their contribution to the diverse selectivity profiles and thus versatility of family IV esterases/lipases toward the conversion of multiple substrates.

MATERIALS AND METHODS
Plant material and outdoor seed germination. Seeds of Sorghum bicolor genotype BTx623 were obtained from the Agricultural Research Service of the United States Department of Agriculture (USDA) (Georgia, USA) as a gift. Field soil was sampled from the Henfaes Research Centre (53°14′21.0″N, 4°01′06.5″W, Gwynedd, Wales) in September 2014. The soil sample was composed of a mixture of five topsoil samples collected from randomly selected positions in the field. The soil was air dried, mixed thoroughly, and stored at room temperature for use in subsequent experiments. Two-liter pots were filled with soil, and two seeds of S. bicolor BTx623 were planted per pot. Plants were cultivated in a greenhouse at 20°C, and the soil moisture content was maintained with tap water.
Enrichment with CNSL. Three grams of technical cashew nut shell liquid (tCNSL) dissolved in 70% ethanol was added to a pot of 20-day-old plants and thoroughly mixed with the soil. After 60 days, plants were pulled out of the pot, and the soil was shaken off; samples of rhizosphere soil attached to the plant roots were then brushed off and collected. tCNSL was provided by the BioComposites Centre at Bangor University (Wales, UK). Three biological replicates of laboratory microcosm enrichment were set up in conical 1-L Erlenmeyer flasks by mixing 10 g of the collected rhizosphere soil with 300 mL of sterile Murashige Skoog basal medium (Sigma) and 10 mg/L cycloheximide. tCNSL was dissolved in 70% ethanol and added to the medium to a final concentration of 0.1 g/L; flasks were incubated at 20°C in an orbital shaker. Fifty grams of soil slurry was sampled every 7 days, and fresh tCNSL-containing (0.1 g/kg) medium was added to replace the volume of the medium.
Extraction of DNA and generation of metagenomic library. Samples collected after 3 weeks of flask microcosm enrichment were used for the construction of fosmid metagenomic libraries. Environmental DNA was extracted using the Meta-G-Nome DNA isolation kit (Epicentre Biotechnologies, WI, USA) according to the manufacturer's instructions. Briefly, 50 mL of the soil suspension from the flask enrichment was centrifuged at 400 × g for 5 min. The supernatant was filtered through 0.45-µm and 0.22-µm membrane filters. This procedure was repeated with the initial soil sample four times: the remaining soil was resuspended in phosphate-buffered saline (PBS) and centrifuged, and the supernatant was filtered as before. Filters were combined, and the sediment on the filter was resuspended in extraction buffer and collected. DNA extraction was carried out according to the protocol described by the manufacturer. The quality of the extracted DNA was evaluated on an agarose gel and quantified with the Quant-iT double-stranded DNA (dsDNA) assay kit (Invitrogen) on a Cary Eclipse fluorimeter (Varian/Agilent) according to the manufacturer's instructions. The extracted metagenomic DNA was used to prepare two different metagenomic fosmid libraries using the CopyControl fosmid library production kit (Epicentre). DNA was end repaired to generate blunt-ended, 5′-phosphorylated double-stranded DNA using reagents included in the kit according to the manufacturer's instructions. Subsequently, fragments of 30 to 40 kb were selected by electrophoresis and recovered from a low-melting-point agarose gel using GELase 50× buffer and GELase enzyme preparation, also included in the kit. Nucleic acid fragments were then ligated to the linearized CopyControl pCC2FOS vector in a ligation reaction performed at room temperature for 4 h, according to the manufacturer's instructions.
After in vitro packaging into phage lambda (MaxPlax lambda packaging extract; Epicentre), the transfected phage T1-resistant EPI300-T1R E. coli cells were spread on Luria-Bertani (LB) agar medium (hereinafter, unless mentioned otherwise, the agar content was 1.5% [wt/vol]) containing 12.5 µg/mL chloramphenicol and incubated at 37°C overnight to determine the titer of the phage particles. The resulting library, SorRhizCNSL3 W, has an estimated titer of 1.5 × 10^6 clones. For long-term storage, the library was plated onto solid LB medium with 12.5 µg/mL chloramphenicol, and after overnight growth, colonies were washed off from the agar surface using LB broth with 20% (vol/vol) sterile glycerol, and aliquots were stored at −80°C.
Screening metagenomic libraries: agar-based methods. Fosmid clones obtained by plating the constructed libraries on LB agar plates were arrayed in 384-well microtiter plates (1 clone/well) or alternatively in 96-well microtiter plates (pools of approximately 40 clones/well) containing LB medium and chloramphenicol (12.5 µg/mL). The plates were incubated at 37°C overnight, and the next day, replica plates were produced and used in the screening assay. Glycerol (20% [vol/vol], final concentration) was added to the original plates, which were stored at −80°C. Gel diffusion and colorimetric assays were adapted for the screening of the desired activities. The detection of lipase/esterase activity was carried out on LB agar supplemented with chloramphenicol (12.5 µg/mL), fosmid autoinduction solution (2 mL/L) (Epicentre), and 0.3% (vol/vol) tributyrin emulsified with gum arabic (2:1, vol/vol) by sonication. The previously prepared microtiter plates were printed on the surface of large (22.5 cm by 22.5 cm) LB agar plates using 384-pin polypropylene replicators and incubated for 18 to 48 h at 37°C. Lipolytic activity was identified as a clear zone around the colonies where tributyrin was hydrolyzed (12).
Extraction of fosmids, DNA sequencing, and annotation. The fosmid DNA of the positive clone was extracted using the Qiagen plasmid purification kit (Qiagen). To reduce the host chromosomal E. coli DNA contamination, the sample was treated with ATP-dependent exonuclease (Epicentre). The purity and approximate size of the cloned fragment were assessed by agarose gel electrophoresis after endonuclease digestion simultaneously with BamHI and XbaI (New England Biolabs; in NEBuffer 3.1 at 37°C for 1 h using 1 U of enzyme per 1 µg DNA). DNA concentration was quantified using the Quant-iT dsDNA assay kit (Invitrogen), and DNA sequencing was then outsourced to Fidelity Systems (NJ, USA) for shotgun sequencing using the Illumina MiSeq platform. GeneMark software (41) was employed to predict protein coding regions from the sequences of each assembled contig, and deduced amino acid sequences were annotated via BLASTP and the PSI-BLAST tool (42).
Cloning, expression, and purification of proteins. The selected nucleotide sequence was amplified by PCR using Herculase II fusion enzyme (Agilent, USA) with specific oligonucleotide primer pairs incorporating p15TV-L adapters. The corresponding fosmid was used as a template to amplify the target genes. The primers used to amplify the esterase gene characterized in this study were as follows: EH0F, TTGTATTTCCAGGGCATGACCGAGCTCTTCGTCCGC; EH0R, CAAGCTTCGTCATCATGCCGCCGCCTGTGCCATC. PCR products were visualized on a 1% Tris-acetate-EDTA (TAE) agarose gel and purified using the NucleoSpin PCR cleanup kit (Macherey-Nagel) following the manufacturer's instructions. Purified PCR products were cloned into the p15TV-L vector, transformed into E. coli NovaBlue GigaSingles competent cells (Novagen, Germany), and plated on LB agar with 100 µg/mL ampicillin. The correctness of the DNA sequence was then verified by Sanger sequencing at Macrogen Ltd. (Amsterdam, The Netherlands). 3D models of the proteins were generated by Phyre2. The intensive mode attempts to create a complete full-length model of a sequence through a combination of multiple template modeling and simplified ab initio folding simulation (43). The nucleotide and deduced amino acid sequences of the selected gene are available at GenBank under accession no. MK791218. For recombinant protein expression, the plasmids were transformed into E. coli BL21(DE3) cells and subsequently plated on LB agar with 100 µg/mL ampicillin. To confirm the esterase activity of recombinant proteins, E. coli clones harboring the recombinant plasmid were streaked onto LB agar plates containing 0.5% (vol/vol) tributyrin and 0.5 mM isopropyl-β-D-thiogalactopyranoside (IPTG), or purified enzymes were spotted directly on the agar. The plates were then incubated at 37°C overnight and visually inspected for signs of substrate degradation. E. coli clones were grown at 37°C to an absorbance of 0.8 at 600 nm, induced with 0.5 mM IPTG, and allowed to grow overnight at 20°C with shaking. Cells were harvested by centrifugation at 5,000 × g for 30 min at 4°C. For purification of recombinant protein, the following protocol was applied. Cell pellets were resuspended in cold binding buffer (50 mM HEPES, pH 7.5, 400 mM NaCl, 5% glycerol, 0.5% Triton X-100, 6 mM imidazole, pH 7.5, 1 mM β-mercaptoethanol, 0.5 mM phenylmethylsulfonyl fluoride [PMSF]) and extracted by sonication. The lysates were then centrifuged at 22,000 × g for 30 min at 4°C, and the supernatant was purified by affinity chromatography using nickel-nitrilotriacetic acid (Ni-NTA) His-bind resin (Novagen).
The column packed with the resin was equilibrated with binding buffer, and after the addition of supernatant, it was washed with 6 volumes of wash buffer (50 mM HEPES, pH 7.5, 400 mM NaCl, 5% glycerol, 0.5% Triton X-100, 26 mM imidazole, pH 7.5, 1 mM β-mercaptoethanol, 0.5 mM PMSF) to remove nonspecifically bound proteins. His-tagged proteins were then eluted with elution buffer (50 mM HEPES, pH 7.5, 400 mM NaCl, 5% glycerol, 0.5% Triton X-100, 266 mM imidazole, pH 7.5, 1 mM β-mercaptoethanol, 0.5 mM PMSF). The size and purity of the proteins were estimated by SDS-PAGE. Protein solutions were desalted through the Amicon Ultra-15 10K centrifugal filter device. Protein concentrations were determined using the Bradford reagent (Sigma) and the BioMate 3S spectrophotometer (Thermo Scientific, USA).
Biochemical assays. Hydrolytic activity was determined by measuring the amount of p-nitrophenol released by catalytic hydrolysis of p-nitrophenyl (p-NP) esters through a modified method of Gupta et al. (44). Stock solutions of p-NP esters (100 mM p-NP acetate, 100 mM p-NP butyrate, 20 mM p-NP dodecanoate, and 20 mM p-NP palmitate) were prepared in DMSO-acetonitrile (1:1, vol/vol). Unless stated otherwise, the enzymatic assay was performed under standard conditions: a 1-mL reaction mixture (50 mM potassium phosphate buffer, pH 7.0, 0.3% [vol/vol] Triton X-100, 1 mM substrate) was agitated in a water bath at 50°C until the substrate was completely solubilized (note that solvents were present in the assays at a concentration of 1% for 100 mM stocks of p-NP butyrate and acetate or 5% for 20 mM stocks of p-NP dodecanoate and palmitate). After that, the multititer plate was preincubated at 30°C for 5 min in a BioMate 3S spectrophotometer (Thermo Scientific, USA), which was set up at this temperature, and then an appropriate volume of purified enzyme containing 0.25 µg was added to start the reaction. The reaction mixture was incubated at 30°C for 5 min, and the absorbance was then measured at 410 nm in the same spectrophotometer. The incubation time for p-NP dodecanoate and p-NP palmitate was extended to 15 min. All experiments were performed in triplicate, and a blank with denatured enzyme was included. The concentration of product was calculated from the linear regression equation of a standard curve prepared with the reference compound p-nitrophenol (Sigma). One unit of enzyme activity was defined as 1 µmol of p-nitrophenol produced per minute under the assay conditions.
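The conversion from absorbance at 410 nm to specific activity can be sketched as follows. This is a minimal illustration and not the authors' calculation: the standard-curve points, the readings, the enzyme mass, and the function names are hypothetical, and one unit is taken as 1 µmol of p-nitrophenol per minute (the conventional definition, assumed here).

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def specific_activity(a_sample, a_blank, slope, minutes, enzyme_ug):
    """Return U/mg, taking 1 U as 1 umol p-nitrophenol released per minute."""
    # The curve intercept cancels when the blank is subtracted.
    nmol_product = (a_sample - a_blank) / slope
    umol_per_min = nmol_product / 1000.0 / minutes
    return umol_per_min / (enzyme_ug / 1000.0)


# Hypothetical p-nitrophenol standard curve: nmol per assay vs. A410.
std_nmol = [0.0, 20.0, 40.0, 80.0]
std_a410 = [0.010, 0.110, 0.210, 0.410]
slope, intercept = fit_line(std_nmol, std_a410)

# Hypothetical sample and denatured-enzyme blank readings.
activity = specific_activity(0.66, 0.04, slope, minutes=5.0, enzyme_ug=0.25)
print(f"{activity:.1f} U/mg")
```

Subtracting the denatured-enzyme blank before applying the slope removes both spontaneous substrate hydrolysis and the intercept of the standard curve in one step.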
Kinetic parameters were determined under standard conditions and calculated by nonlinear regression analysis of raw data fit to the Michaelis-Menten function using GraphPad Prism software (version 6.0). For the kinetics with p-NP butyrate and acetate, substrate concentrations of 0.1, 0.5, 1, 2, and 10 mM were used, with stock concentrations of 100 mM in DMSO-acetonitrile (a maximum concentration of 5% DMSO-acetonitrile was used in the assay). For p-NP dodecanoate, concentrations of 0.05, 0.1, 0.5, 1, 2, and 3 mM were used, with a stock concentration of 20 mM substrate (a maximum concentration of 7.5% DMSO-acetonitrile was used in the assay). Raw data and information about precision and calculations are provided in Table S6 in the supplemental material.
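For readers without access to GraphPad Prism, a nonlinear least-squares fit to the Michaelis-Menten function v = Vmax·[S]/(Km + [S]) can be sketched in plain Python. This is a stand-in for, not a reproduction of, the Prism analysis: it solves Vmax in closed form for each trial Km and searches Km by golden-section search, which assumes the residual sum of squares has a single minimum within the search bounds. The rates below are synthetic, and the function names and bounds are hypothetical.

```python
import math


def best_vmax(km, conc, rates):
    """Closed-form least-squares Vmax for a fixed Km (model is linear in Vmax)."""
    f = [s / (km + s) for s in conc]
    return sum(v * fi for v, fi in zip(rates, f)) / sum(fi * fi for fi in f)


def sse(km, conc, rates):
    """Residual sum of squares at a given Km, using the optimal Vmax."""
    vmax = best_vmax(km, conc, rates)
    return sum((v - vmax * s / (km + s)) ** 2 for s, v in zip(conc, rates))


def fit_michaelis_menten(conc, rates, km_lo=1e-4, km_hi=100.0, tol=1e-9):
    """Golden-section search over log(Km); returns (Vmax, Km)."""
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = math.log(km_lo), math.log(km_hi)
    while b - a > tol:
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if sse(math.exp(c), conc, rates) < sse(math.exp(d), conc, rates):
            b = d
        else:
            a = c
    km = math.exp((a + b) / 2.0)
    return best_vmax(km, conc, rates), km


# Substrate concentrations (mM) as listed for p-NP butyrate and acetate;
# the rates are synthetic, generated from assumed Vmax = 500 and Km = 0.5 mM.
conc = [0.1, 0.5, 1.0, 2.0, 10.0]
rates = [500.0 * s / (0.5 + s) for s in conc]

vmax_fit, km_fit = fit_michaelis_menten(conc, rates)
print(f"Vmax = {vmax_fit:.2f}, Km = {km_fit:.3f} mM")
```

Fitting the untransformed data avoids the distortion of experimental error that linearizations such as the Lineweaver-Burk plot introduce.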
The optimal pH for enzyme activity was evaluated with p-NP butyrate by performing the assay in different buffers, specifically 20 mM sodium acetate buffer (pH 4.0), sodium citrate buffer (pH 5.5), potassium phosphate buffer (pH 7.0), and Tris-HCl (pH 8.0 to 9.0). The enzyme reactions were stopped by adding 1 mL of cold stop solution (100 mM sodium phosphate buffer, pH 7.0, 10 mM EDTA) to neutralize the pH and avoid changes in the equilibrium between p-nitrophenol and the deprotonated form p-nitrophenoxide, which would result in a decrease in absorption at the applied wavelength of 410 nm (45). The optimal enzymatic temperature was investigated with p-NP butyrate by performing the hydrolytic assay at different temperatures under standard conditions (see above). To determine the denaturation temperature, circular dichroism (CD) spectra were acquired between 190 and 270 nm with a Jasco J-720 spectropolarimeter equipped with a Peltier temperature controller in a 0.1-mm cell at 25°C. The spectra were analyzed, and melting temperature (Tm) values were determined at 220 nm between 10 and 85°C at a rate of 30°C per hour in 40 mM HEPES buffer at pH 7.0. CD measurements were performed at pH 7.0 and not at the optimal pH (8.5 to 9.0) to ensure protein stability. A protein concentration of 0.5 mg mL−1 was used. Tm (and the standard deviation of the linear fit) was calculated by fitting the ellipticity (millidegrees [mdeg]) at 220 nm at each of the different temperatures using a 5-parameter sigmoid fit with SigmaPlot 13.0.
Stability in organic solvents was assayed with p-NP butyrate under standard conditions in the presence of 10, 20, 40, and 60% (vol/vol) of the water-miscible organic solvents ethanol, methanol, isopropanol, acetonitrile, and DMSO and a mixture of acetonitrile-DMSO (50% each). The effect of cations was investigated with p-NP butyrate under standard conditions by the addition of MgCl2, CuCl2, FeCl3, CoCl2, CaCl2, MnCl2, and ZnSO4 at concentrations in the range of 1 to 10 mM. In all cases, the measured values were then expressed as the relative activity in comparison to the control reaction performed under standard conditions. The hydrolysis of esters other than p-NP esters, including bis(2-hydroxyethyl)-terephthalate (BHET), was assayed using a pH indicator assay in 384-well plates at 30°C and pH 8.0 in a Synergy HT multimode microplate reader in continuous mode at 550 nm over 24 h (extinction coefficient of phenol red, 8,450 M−1 cm−1). The acid produced after ester bond cleavage by the hydrolytic enzyme induced a color change in the pH indicator that was measured at 550 nm. The experimental conditions were as detailed previously (35), with the absence of activity defined as at least a 2-fold background signal. Briefly, the reaction conditions for 384-well plates (catalog no. 781162; Greiner Bio-One GmbH, Kremsmünster, Austria) were as follows: protein, 0.2 to 2.0 µg per well; ester, 20 mM; temperature (T), 30°C; pH, 8.0 [5 mM 4-(2-hydroxyethyl)-1-piperazinepropanesulfonic acid (EPPS) buffer, plus 0.45 phenol red]; reaction volume, 40 µL. The reactions were performed in triplicate, and data sets were collected from a Synergy HT multimode microplate reader with Gen5 2.00 software (BioTek). One unit of enzyme activity was defined as the amount of enzyme required to transform 1 µmol of substrate in 1 min under the assay conditions. Raw data and information about precision and calculations are provided in Table S7.
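How a rate of absorbance change at 550 nm translates into specific activity in the pH indicator assay can be sketched with the Beer-Lambert law. This is an illustrative calculation under stated assumptions, not the authors' procedure: it assumes that each mole of acid released changes the phenol red signal according to the quoted extinction coefficient (8,450 M−1 cm−1), and the optical pathlength, well volume, protein mass, and slopes used below are hypothetical placeholders.

```python
PHENOL_RED_EPSILON = 8450.0  # M^-1 cm^-1 at 550 nm, as quoted in the text


def specific_activity_ph_assay(slope_sample, slope_blank, pathlength_cm,
                               volume_ul, protein_ug,
                               epsilon=PHENOL_RED_EPSILON):
    """Return U/mg (umol substrate converted per min per mg protein).

    slope_* are absorbance changes per minute; the signal falls as acid
    is produced, so the magnitude of the blank-corrected slope is used.
    """
    delta_a_per_min = abs(slope_sample - slope_blank)
    molar_per_min = delta_a_per_min / (epsilon * pathlength_cm)  # Beer-Lambert
    umol_per_min = molar_per_min * (volume_ul * 1e-6) * 1e6      # M * L -> umol
    return umol_per_min / (protein_ug / 1000.0)


# Hypothetical inputs: slopes in A550 per minute, 0.3-cm pathlength,
# 40-uL well, 2 ug protein.
u_per_mg = specific_activity_ph_assay(
    slope_sample=-0.020, slope_blank=-0.002,
    pathlength_cm=0.3, volume_ul=40.0, protein_ug=2.0,
)
print(f"{u_per_mg:.3f} U/mg")
```

The pathlength in a partially filled microplate well depends on the fill volume and well geometry, so in practice it would need to be measured or taken from the plate reader's pathlength correction rather than assumed.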
Crystallization and X-ray structure determination of EH0. Initial crystallization conditions were explored by high-throughput techniques with a NanoDrop robot (Innovadyne Technologies) using 24 mg mL−1 protein concentrations in HEPES (40 mM, pH 7, 50 mM NaCl), protein/reservoir ratios of 1:1, 1.5:1, and 2:1, and commercial screens Crystal Screen I and II, SaltRx, Index (Hampton Research), JBScreen Classic, JBScreen JCSG, and JBScreen PACT (Jena Bioscience). Further optimizations were carried out, and bar-shaped crystals of EH0 were grown after 1 day of mixing 1.2 µL of a mixture of protein (1 µL, 24 mg mL−1) and seeds (0.2 µL, 1:100) with guanidine hydrochloride (0.2 µL, 0.1 M) and reservoir (0.5 µL, 11% polyethylene glycol 8000, 100 mM Bis-Tris, pH 5.5, 100 mM ammonium acetate). For data collection, crystals were transferred to cryoprotectant solution consisting of mother liquor and glycerol (20% [vol/vol]) before being cooled in liquid nitrogen. Diffraction data were collected using synchrotron radiation on the XALOC beamline at ALBA (Cerdanyola del Vallés, Spain). Diffraction images were processed with XDS (46) and merged using AIMLESS from the CCP4 package (47). The crystal was indexed in the P2₁2₁2₁ space group, with two molecules in the asymmetric unit and 62% solvent content within the unit cell. The data collection statistics are given in Table S4. The structure of EH0 was solved by molecular replacement with MOLREP (48) using the coordinates from Est8 as a template (PDB code 4YPV). Crystallographic refinement was performed using the program REFMAC (49) within the CCP4 suite, with medium noncrystallographic symmetry (NCS) restraints and excluding residues 17 to 36. The free R factor was calculated using a subset of 5% randomly selected structure-factor amplitudes that were excluded from automated refinement.
Subsequently, heteroatoms were manually built into the electron density maps with Coot (50), and water molecules were included in the model, which, combined with more rounds of restrained refinement, reached the R factors listed in Table S4. The figures were generated with PyMOL. The crystallographic statistics of EH0 are listed in Table S4. To extract dynamical details from the X-ray data, the coordinates of EH0 were first refined using PHENIX (51) and then were used as input models for a time-averaged molecular dynamics refinement as implemented in the phenix.ensemble_refinement routine, which was performed as described previously (52).
Data availability. The sequence encoding EH0 was deposited in GenBank with the accession number MK791218. The atomic coordinates and structure factors for the EH0 structure have been deposited in the RCSB Protein Data Bank with accession code 7ZR3.