Overproduction, crystallization and preliminary crystallographic analysis of a novel human DNA-repair enzyme that recognizes oxidative DNA damage

DNA glycosylases repair oxidative DNA damage caused by free radicals. Recently, NEIL1, a human homolog of Escherichia coli DNA glycosylase endonuclease VIII, has been identified and shown to exhibit broad substrate specificity for a variety of types of pyrimidine-base damage. An active C-terminal deletion construct of NEIL1 was overexpressed in E. coli and crystallized. The unliganded NEIL1 crystallizes in space group R3, with unit-cell parameters a = b = 132.2, c = 51.1 Å. Complete data sets were collected from native, selenomethionyl and iodinated NEIL1 to 2.1, 2.3 and 2.4 Å resolution, respectively.

Received 4 March 2004. Accepted 1 April 2004.

© 2004 International Union of Crystallography. Printed in Denmark – all rights reserved.


Introduction
Reactive oxygen species are generated endogenously; when produced in the vicinity of DNA, these free radicals cause a plethora of types of DNA damage, including base modifications, base loss and single-strand breaks (reviewed in Wallace, 2002). A battery of DNA glycosylases initiate the repair process by first removing the oxidatively damaged bases and then, in a concerted lyase reaction, cleaving the DNA backbone to generate a single-nucleotide gap. DNA polymerization followed by ligation completes the base-excision repair process.
In Escherichia coli, two different DNA glycosylases, endonuclease III (EcoNth) and endonuclease VIII (EcoNei), recognize and repair oxidized pyrimidine DNA bases (Wallace, 2002). Single E. coli mutants defective in either EcoNth or EcoNei do not show a mutator phenotype; however, nth nei double mutants exhibit elevated spontaneous mutation frequencies, suggesting that the two enzymes serve as backup activities for each other (Blaisdell et al., 1999; Jiang et al., 1997; Saito et al., 1997). hNTH1, a homolog of EcoNth, was the first human pyrimidine DNA glycosylase to be characterized (Aspinwall et al., 1997; Hilbert et al., 1997); we and others have recently identified three new human homologs (NEIL1, 2 and 3) of EcoNei (Bandaru et al., 2002; Morland et al., 2002; Takao et al., 2002; Wallace et al., 2003). Multiple sequence alignments showed that the active-site residues, the helix–two-turns–helix and zinc-finger motifs are conserved between EcoNei and the human NEIL proteins, with the exception of NEIL1, which lacks the zinc-finger motif (Bandaru et al., 2002). Unlike EcoNei, the NEIL1 amino-acid sequence does not contain any of the four cysteine residues that coordinate zinc in the E. coli enzyme. The substrate specificity of NEIL1, however, resembles that of EcoNei in that NEIL1 recognizes all oxidized pyrimidine bases (Bandaru et al., 2002). As a first step towards elucidating the nature of substrate recognition by the human NEIL1 enzyme, we have overexpressed, purified and crystallized this glycosylase.

Protein expression and purification
A pET vector carrying full-length NEIL1 was transformed into Rosetta (DE3)pLysS cells (Novagen) and the cells were grown at 289 K for ~16 h. Recombinant C-terminally His-tagged NEIL1 protein (398 amino acids, with an estimated molecular weight of 44.7 kDa) was purified as described previously (Bandaru et al., 2002). Crystallization trials for full-length C-terminally His-tagged NEIL1 failed to yield any crystals. This inability to grow crystals was consistent with dynamic light-scattering (DLS) experiments performed on a DynaPro MS-X instrument (Protein Solutions), which showed that the protein was polydisperse regardless of the temperature or buffer conditions used. Hence, two different approaches were employed to engineer a protein construct that would be more amenable to crystallization.
We had originally identified human NEIL1 in the genomic database with PSI-BLAST using Arabidopsis thaliana formamidopyrimidine DNA glycosylase (AthFpg) as the seed sequence. Subsequently, we cloned and expressed human NEIL1 and showed it to be a homolog of E. coli endonuclease VIII (Bandaru et al., 2002). Moreover, it was known that the C-terminal 109 amino acids of AthFpg are not required for either substrate binding or DNA-glycosylase activity (Ohtsubo et al., 1998). We therefore surmised that, based on the sequence alignment of AthFpg and NEIL1, a C-terminal deletion construct of NEIL1 could be designed that might retain glycosylase activity. Multiple sequence alignments of AthFpg and NEIL1 along with other members of the Fpg/Nei family predicted that deletion of the C-terminal 56 amino acids in NEIL1 should not affect the glycosylase activity or DNA-binding capability. Accordingly, a deletion construct of NEIL1 missing the C-terminal 56 amino acids (NEIL1CΔ56) was constructed by PCR and subcloned into a pET30a vector (Novagen) between NdeI and XhoI sites.
In the second approach, the Predictor of Naturally Disordered Regions (PONDR; Li et al., 1999) was used to detect disordered region(s) in NEIL1 that may hinder crystallization. PONDR analysis of AthFpg and NEIL1 showed that both proteins have a disordered C-terminal region (Fig. 1). Interestingly, the disordered region predicted in AthFpg by the PONDR analysis corresponds to the C-terminal 109 residues shown to be dispensable for glycosylase activity (Ohtsubo et al., 1998). Based on this observation, we designed a construct missing the entire disordered region (C-terminal 106 amino acids) in NEIL1 (Fig. 1). Little or no soluble protein expression was observed with this construct. A series of shorter C-terminal deletion constructs were cloned and checked for expression. These studies showed that deletions of >100 amino acids did not yield any protein expression. Therefore, a NEIL1 construct missing the C-terminal 100 amino acids (NEIL1CΔ100) was constructed for crystallization studies (Fig. 1).
The truncated NEIL1CΔ56 and NEIL1CΔ100 protein constructs carrying a C-terminal hexa-His tag, which comprises eight amino acids (LEHHHHHH), were expressed and purified as described above. As shown in Fig. 2, both protein constructs were purified to homogeneity. Both freshly prepared and flash-frozen truncated NEIL1 proteins retained DNA glycosylase/lyase activity on a thymine glycol-containing double-stranded substrate, although NEIL1CΔ100 was less stable and lost activity over time.

Crystallization and X-ray diffraction experiments
Preliminary crystallization conditions for both NEIL1CΔ56 and NEIL1CΔ100 were obtained using a sparse-matrix screen (Jancarik & Kim, 1991; Crystal Screen I, Hampton Research) by the hanging-drop vapor-diffusion method, mixing 1 µl protein solution (5 mg ml⁻¹ in 20 mM HEPES pH 7.6, 150 mM NaCl, 1 mM DTT, 0.1 mM EDTA, 10% glycerol) with 1 µl well solution and equilibrating against 0.6–1 ml reservoir solution. The incubation temperatures for the crystal trays were determined based on DLS experiments performed on a DynaPro MS-X instrument (Protein Solutions). Temperatures at which the protein was monodisperse were deemed to be suitable for crystallization. Despite the fact that it was missing most of the C-terminal disordered region, unliganded NEIL1CΔ100 protein precipitated in most of the conditions, even at low protein concentration. In contrast, NEIL1CΔ56, which still retains some of the region predicted to be disordered (Fig. 1), crystallized in Crystal Screen I condition 17 (30% polyethylene glycol 4000, 0.1 M Tris–HCl pH 8.5, 0.2 M lithium sulfate) at 285 K. After microseeding, unliganded NEIL1CΔ56 yielded diffraction-quality crystals at 285 K in 2–7 d (Fig. 3).
A selenomethionyl variant of NEIL1CΔ56 was prepared by inhibiting methionine biosynthesis (Doublié, 1997) and was purified as described above. A litre of induced culture yielded ~0.5 mg pure SeMet-NEIL1CΔ56 protein, enough to grow crystals for diffraction studies. Crystallization conditions for SeMet-NEIL1CΔ56 were similar to those used for the native protein.
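Because the hanging drops were set up by mixing equal volumes of protein and well solution, every component starts at half its stock concentration before vapor diffusion concentrates the drop. The following Python sketch illustrates that arithmetic for condition 17; the function name and dictionary layout are hypothetical, and each value is simply carried in the unit noted in its key.

```python
def initial_drop_concentrations(protein_stock, well_stock, v_protein, v_well):
    """Concentrations in the drop immediately after mixing (before equilibration).

    Each dictionary maps a component name to its stock concentration; every
    component keeps its own unit, so only the dilution factor is applied.
    """
    total = v_protein + v_well
    drop = {}
    for name, conc in protein_stock.items():
        drop[name] = drop.get(name, 0.0) + conc * v_protein / total
    for name, conc in well_stock.items():
        drop[name] = drop.get(name, 0.0) + conc * v_well / total
    return drop


# Stock compositions taken from the text; key names are illustrative only.
protein_stock = {
    "protein (mg/ml)": 5.0,
    "HEPES pH 7.6 (mM)": 20.0,
    "NaCl (mM)": 150.0,
    "DTT (mM)": 1.0,
    "EDTA (mM)": 0.1,
    "glycerol (%)": 10.0,
}
well_stock = {
    "PEG 4000 (%)": 30.0,
    "Tris-HCl pH 8.5 (M)": 0.1,
    "lithium sulfate (M)": 0.2,
}

# Equal volumes of protein and well solution; only the ratio matters here.
drop = initial_drop_concentrations(protein_stock, well_stock, 1.0, 1.0)
for name, conc in drop.items():
    print(f"{name}: {conc:g}")
```

The sketch covers only the moment of mixing; the final drop composition after equilibration against the reservoir depends on the vapor-diffusion endpoint and is not modeled.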
Crystals were harvested from their crystallization drops and flash-cooled in liquid nitrogen. Data from NEIL1CΔ56 crystals were collected on a MAR345 image-plate detector using Cu Kα radiation generated by a Rigaku HR-300 X-ray generator. Assuming one monomer per asymmetric unit, a Matthews coefficient (Matthews, 1968) of 2.25 Å³ Da⁻¹ was calculated, which corresponds to ~45% solvent content. Molecular-replacement attempts were made using other members of the Nei family as search models, but they uniformly failed to produce a clear solution. In addition to producing the selenomethionyl protein, we chose to attempt phasing with halides (Dauter & Dauter, 2001; Dauter et al., 2000), which we performed while waiting for synchrotron beamtime to collect multiwavelength data from the selenomethionyl crystals. The native NEIL1CΔ56 crystals were soaked in sodium iodide solutions (0.125–0.25 M NaI for 2–15 min; Dauter & Dauter, 2001; Dauter et al., 2000). SeMet crystals were also soaked in iodide solutions, which provided a double derivative (Rould, 1997). All derivative data were collected at 100 K using Cu Kα radiation on a MAR345 image-plate detector and care was taken to collect accurate and redundant data. A summary of the data-collection statistics is shown in Table 1. Data were integrated and scaled using DENZO and SCALEPACK (Otwinowski & Minor, 1997).
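The Matthews coefficient quoted above can be approximately reproduced from the unit-cell parameters. The Python sketch below is illustrative only and rests on several assumptions not stated in the text: the R3 cell is taken in the hexagonal setting (nine asymmetric units per cell), the molecular weight of the tagged NEIL1CΔ56 construct is estimated by scaling the 44.7 kDa full-length value by residue count (342 of 398), and the solvent fraction uses the common approximation 1 − 1.23/V_M.

```python
import math


def hexagonal_cell_volume(a, c):
    """Volume in cubic angstroms of a cell with a = b, alpha = beta = 90, gamma = 120 degrees."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews_coefficient(cell_volume, asu_per_cell, mol_weight_da, mol_per_asu=1):
    """V_M = cell volume / (total molecules per cell x molecular weight)."""
    return cell_volume / (asu_per_cell * mol_per_asu * mol_weight_da)


def solvent_fraction(v_m, protein_constant=1.23):
    """Approximate solvent fraction from V_M (1.23 is the usual protein constant)."""
    return 1.0 - protein_constant / v_m


A, C = 132.2, 51.1          # unit-cell parameters from the text (angstroms)
ASU_PER_CELL_R3_HEX = 9     # assumption: R3 indexed on hexagonal axes
# Assumption: molecular weight scaled by residue count from the tagged
# full-length protein (398 residues, 44.7 kDa) to the tagged deletion construct.
MW_ASSUMED_DA = 44700.0 * (398 - 56) / 398

volume = hexagonal_cell_volume(A, C)
v_m = matthews_coefficient(volume, ASU_PER_CELL_R3_HEX, MW_ASSUMED_DA)
solvent = solvent_fraction(v_m)

print(f"cell volume  = {volume:.0f} A^3")
print(f"V_M          = {v_m:.2f} A^3/Da")
print(f"solvent      = {100 * solvent:.0f} %")
```

With these assumed inputs the estimate lands close to the reported 2.25 Å³ Da⁻¹ and ~45% solvent; the small difference reflects the approximate molecular weight used here.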

Phasing
CNS was used to locate and refine the four Se sites in the diffraction data collected using Cu Kα radiation (Brünger et al., 1998). The resulting selenium phases were used in isomorphous difference Fourier calculations to identify the iodide sites. All sites were then refined with SOLVE; RESOLVE was used for density modification (Terwilliger, 2002). It should be noted here that combining the selenomethionyl and iodide derivatives provided enough phasing information to yield an interpretable electron-density map, confirming the usefulness of halide soaks (Dauter & Dauter, 2001; Dauter et al., 2000) and selenium substitution in the phasing of protein structures, even in cases where the number of methionines is below average (one methionine in 86 amino acids compared with an average of one in 50; Lemke et al., 2002). In addition, all of the data sets were collected using Cu Kα radiation, giving credence to the assertion that when selenomethionyl protein crystals are in hand, one should not wait for synchrotron time to attempt phasing, provided of course that the diffraction data are accurate (Lemke et al., 2002). Model building and refinement of the structure are in progress and structural details will be described elsewhere.
We thank Dr Jeffrey Bond for stimulating discussions and Dr Mark A. Rould for critically reading the manuscript. This work was supported by an NIH award (PHS R37CA33657) to SSW. The crystallographic work was supported by an award to the University of Vermont under the Howard Hughes Medical Institute Biomedical Research Support Program for Medical Schools.

† $R_{\mathrm{merge}} = \sum |I - \langle I \rangle| / \sum I$, where $\langle I \rangle$ is the average intensity from multiple observations of symmetry-related reflections.
‡ $R_{\mathrm{iso}} = \sum |F_{PH} - F_{P}| / \sum |F_{P}|$, where $F_{P}$ is the observed structure-factor amplitude for the native data set and $F_{PH}$ is the observed structure-factor amplitude for the heavy-atom derivative.
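The two statistics defined in the footnotes can be sketched in a few lines of Python. This is an illustration of the formulas only, not the procedure used by SCALEPACK; the function names, the input layout (observations already grouped by unique reflection, amplitudes already matched and scaled) and the toy numbers are assumptions made for the example.

```python
def r_merge(groups):
    """R_merge = sum |I - <I>| / sum I over all observations.

    `groups` is a list of lists; each inner list holds the measured
    intensities of symmetry-related observations of one unique reflection.
    """
    numerator = 0.0
    denominator = 0.0
    for observations in groups:
        mean_i = sum(observations) / len(observations)
        numerator += sum(abs(i - mean_i) for i in observations)
        denominator += sum(observations)
    return numerator / denominator


def r_iso(f_native, f_derivative):
    """R_iso = sum |F_PH - F_P| / sum |F_P| over matched reflections."""
    numerator = sum(abs(f_ph - f_p) for f_p, f_ph in zip(f_native, f_derivative))
    denominator = sum(abs(f_p) for f_p in f_native)
    return numerator / denominator


# Toy data, invented purely to exercise the formulas.
toy_groups = [[100.0, 110.0, 90.0], [50.0, 54.0]]
toy_native = [10.0, 20.0, 30.0]
toy_derivative = [11.0, 18.0, 33.0]

print(f"R_merge = {r_merge(toy_groups):.4f}")
print(f"R_iso   = {r_iso(toy_native, toy_derivative):.4f}")
```

In practice the derivative amplitudes must be placed on the same scale as the native data before R_iso is meaningful; that scaling step is omitted here.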