5‐Hydroxymethylcytosine as a clinical biomarker: Fluorescence‐based assay for high‐throughput epigenetic quantification in human tissues

Epigenetic transformations may provide early indicators for cancer and other disease. Specifically, the amount of genomic 5‐hydroxymethylcytosine (5‐hmC) was shown to be globally reduced in a wide range of cancers. The integration of this global biomarker into diagnostic workflows is hampered by the limitations of current 5‐hmC quantification methods. Here we present and validate a fluorescence‐based platform for high‐throughput and cost‐effective quantification of global genomic 5‐hmC levels. We utilized the assay to characterize cancerous tissues based on their 5‐hmC content, and observed a pronounced reduction in 5‐hmC level in various cancer types. We present data for glioblastoma, colorectal cancer, multiple myeloma, chronic lymphocytic leukemia and pancreatic cancer, compared to corresponding controls. Potentially, the technique could also be used to follow response to treatment for personalized treatment selection. We present initial proof‐of‐concept data for treatment of familial adenomatous polyposis.

Introduction
5-Hydroxymethylcytosine (5-hmC) is an epigenetic modification of the DNA base cytosine, discovered in mammalian genomes in 2009. 1,2 5-hmC was shown to have a functional role in gene expression regulation, 3-5 and its global levels were found to be predominantly stable and highly tissue-specific. 6-8 A significant global reduction in 5-hmC level was reported for various human cancers, such as melanoma, colorectal, pancreatic, breast, liver, lung, prostate, brain and blood cancers. 9-13 This reduction suggests that 5-hmC may serve as an important cancer biomarker, potentially enabling early-stage detection.
Several methods for 5-hmC detection and quantification have been developed in recent years; however, none meet the requirements for routine clinical diagnostics. The most reliable technique today for global quantification is liquid chromatography coupled with mass spectrometry (LC-MS/MS). 2,14 This technique is relatively accurate but requires expensive equipment, expertise and large amounts of DNA when assessing tissues with low 5-hmC levels. Other available assays are antibody-based, including DNA dot-blot, 8 immunohistochemical staining 9 and commercially available enzyme-linked immunosorbent assay (ELISA) kits. 15 These protocols are relatively simple, but lack sensitivity and resolution. Consequently, they often fail to detect small but significant differences in 5-hmC content between samples. Resolving such differences is crucial for identification, staging and risk prediction of various medical conditions. 13
Epigenetics-based diagnostic assays may be integrated into existing cancer screening tests such as colonoscopy, potentially providing additional molecular information. Colorectal cancer (CRC) is the third most commonly diagnosed cancer and the third leading cause of cancer mortality in the U.S. for both men and women. 16 The current gold standard for CRC prevention is colonoscopy and precancerous polyp removal, which has been shown to prevent CRC morbidity and mortality. 17 The relatively long adenoma-to-carcinoma interval in the colon enables timely diagnosis and removal of colonic adenomas that are found in more than 20% of screened adults over the age of 50, some of which already display asymptomatic early-stage CRC. 18-20 Biopsies obtained via colonoscopy also enable diagnosis of CRC by combining histopathological examination with associated molecular markers. Measurements of epigenetic marks may potentially provide an additional molecular diagnostic layer for risk evaluation and more appropriate surveillance frequency to prevent interval cancers. However, additional evidence is needed to determine if complementary 5-hmC quantification may assist in proper disease staging, choice of therapy, patient management and prognosis assessment.
In high-risk populations (hereditary cancer syndromes and inflammatory bowel diseases), colonoscopy is performed annually or biannually starting at an early age. 21,22 For example, familial adenomatous polyposis (FAP) is an inherited colorectal syndrome, characterized by the onset of hundreds to thousands of adenomatous polyps in the colon and rectum during the second decade of life. Untreated, patients will develop colorectal cancer by their early 40s. In order to prevent its onset, disease progression has to be monitored by timely and frequent screening examinations and polyp excision. 23,24 Additionally, FAP can be treated by preventive administration of therapeutic drugs 25,26 (Kariv et al., 2019, submitted). Potentially, the addition of 5-hmC quantification as a part of the screening process may provide frequent risk evaluations and an indication for treatment response. In such cases, important decisions regarding the choice of therapy and patient management could be made based on this measure.
We present a simple, high-throughput and sensitive platform for the quantification of 5-hmC in DNA samples. It employs a robust chemoenzymatic reaction for fluorescently tagging the 5-hmC residues 7,13,27,28 and a chemically activated slide for the analysis of multiple samples simultaneously. We demonstrate the assay's sensitivity in detecting various cancers and its applicability for CRC diagnosis, as well as its potential to monitor response to treatment in neoplastic lesions.

Materials and Methods
For DNA samples and study design, see Supporting Information.
All experiments involving animals were approved by the TAU Institutional Animal Care and Use Committee. All human samples were collected with informed consent for research use and approved by institutional review boards in accordance with the Declaration of Helsinki.

5-hmC labeling
5-hmC residues were fluorescently labeled via a two-step chemoenzymatic reaction. 7,13,27,28 In each reaction tube, 1 μg of genomic DNA was mixed with 3 μl of 10× buffer 4 (New England Biolabs, Ipswich, MA), uridine diphosphate-6-azideglucose (UDP-6-N3-Glu) 29 to a final concentration of 45 μM, 2 μl (20 units) of T4 phage β-glucosyltransferase (T4-BGT, New England Biolabs) and ultrapure water to a final volume of 30 μl. The reaction mixture was incubated overnight at 37 °C. The following day, dibenzocyclooctyl (DBCO)-PEG4-5/6-TAMRA (Jena Bioscience, Jena, Germany) was added to a final concentration of 150 μM, and the reaction was incubated overnight at 37 °C. The labeled DNA samples were purified from excess fluorophores using Oligo Clean & Concentrator columns (Zymo Research), according to the manufacturer's recommendations, with three washing steps for optimal results. For best yield, not more than two micrograms of DNA (two identical reaction tubes combined) were loaded on one column. Alternatively, samples were cleaned with isopropanol/ethanol DNA precipitation. Samples were kept at 4 °C until analyzed.
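The pipetting volumes for the first labeling step follow from the stated final concentrations by simple dilution arithmetic (C1·V1 = C2·V2). The sketch below is illustrative only: the function names, the DNA stock concentration (100 ng/μl) and the UDP-6-N3-Glu stock concentration (300 μM) are assumptions chosen for the example and are not given in the protocol.

```python
def stock_volume_ul(final_conc_um, final_volume_ul, stock_conc_um):
    """Volume of stock (in ul) needed to reach final_conc_um in final_volume_ul."""
    if stock_conc_um <= final_conc_um:
        raise ValueError("stock must be more concentrated than the final mix")
    return final_conc_um * final_volume_ul / stock_conc_um


def labeling_mix(dna_ng_per_ul, udp_stock_um,
                 dna_ng=1000.0, total_ul=30.0,
                 buffer_ul=3.0, enzyme_ul=2.0, udp_final_um=45.0):
    """Per-tube volumes (ul) for the glucosylation step described above.

    Defaults mirror the protocol (1 ug DNA, 3 ul 10x buffer, 2 ul T4-BGT,
    45 uM UDP-6-N3-Glu, 30 ul total); the two stock concentrations are
    supplied by the user and are hypothetical in the example below.
    """
    dna_ul = dna_ng / dna_ng_per_ul
    udp_ul = stock_volume_ul(udp_final_um, total_ul, udp_stock_um)
    water_ul = total_ul - (dna_ul + buffer_ul + enzyme_ul + udp_ul)
    if water_ul < 0:
        raise ValueError("DNA stock too dilute to fit in the reaction volume")
    return {"dna": dna_ul, "buffer": buffer_ul, "enzyme": enzyme_ul,
            "udp_6_n3_glu": udp_ul, "water": water_ul}


# Hypothetical stocks: DNA at 100 ng/ul, UDP-6-N3-Glu at 300 uM.
mix = labeling_mix(dna_ng_per_ul=100.0, udp_stock_um=300.0)
```

With these assumed stocks, the water volume is whatever remains of the 30 μl after DNA, buffer, enzyme and cofactor are accounted for; a negative remainder signals that the DNA stock must be concentrated first.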

Controls for 5-hmC labeling
We performed an identical reaction without the T4-BGT enzyme for control samples. The result is a nonlabeled DNA sample with fluorescence signal originating from remaining free fluorescent dye molecules. This residual signal representing the assay's noise level was subtracted during analysis to reliably represent the signal derived from labeled 5-hmC residues.

Slide preparation
Teflon-coated microscope slides (Tekdon, customized well formation, 2 mm diameter wells, 90 wells per slide) were immersed in 0.005% poly-L-lysine solution in water (Sigma, St. Louis, MO), in order to positively charge the surface. The immersed slides were incubated for 1 hr at 37 °C with light shaking (25 rpm) and then overnight at 4 °C (no shaking). The following day a blocking step was performed. The slides were washed twice with PBST (0.05% Tween 20) solution and twice with PBS (Sigma), and immersed in a 5% w/v bovine serum albumin (Sigma) solution in PBS. The immersed slides were incubated for 1 hr at 37 °C with light shaking (25 rpm) and then overnight at 4 °C (no shaking). In the concluding step, the slides were washed twice with PBS and then three times with ultrapure water and dried under a flow of nitrogen gas. The slides were used immediately upon drying.

Applying DNA samples to activated slides
One microliter of each DNA sample (labeled DNA, control or calibration) was placed in a well. The optimal DNA concentration for attachment is 5-30 ng/μl; however, samples of up to 300 ng/μl were applied. The slides contained 3-6 replicates of each sample. Slides were incubated for 14 min at 42 °C and then for 24 min at 30 °C, in humid conditions to avoid rapid drying of the wells. The slides were then washed with water and dried under a flow of nitrogen gas. Slides were kept in the dark to avoid exposure to light.

Calibration samples
For accurate 5-hmC quantification, each slide contained a calibration sample alongside the tested samples. Calibration samples are DNA samples containing relevant levels of 5-hmC as determined by LC-MS/MS (see Supporting Information). The calibration samples were labeled side-by-side with the tested samples for each experiment in order to account for variations in the labeling procedure between experiments. The stock of these standards can be prepared once and can be used for multiple slides.

Total DNA staining
Total DNA was stained with EvaGreen DNA binding dye (Biotium, Fremont, CA). One microliter of 1.25 μM dye (90% water, 10% DMSO) was added to the wells containing the bound DNA. Wells containing only water and no DNA were also stained, in order to obtain the background signal of the EvaGreen dye in the absence of DNA. Slides were covered to avoid exposure to light and incubated for 30 min at room temperature. The slides were then washed with water and dried under a flow of nitrogen gas.

Slide imaging
Slides were imaged using fluorescence imaging devices (FLA-5100, Fujifilm and InnoScan 1100 AL, INNOPSYS). 532 nm lasers were used to image the TAMRA fluorophore (with a 575 nm long-pass filter in the FLA-5100 device and a 582/75 nm filter in the InnoScan 1100 AL), and 473 nm and 488 nm lasers were used to image the EvaGreen stain (with a 510 nm long-pass filter in the FLA-5100 device and a 520/5 nm filter in the InnoScan 1100 AL). We imaged the TAMRA-labeled DNA before EvaGreen staining to avoid coexcitation of the two dyes by the green laser. Scanning parameters were optimized to fit the entire range of fluorescence intensities on the scanned slide and to avoid technical artifacts such as saturation and photomultiplier nonlinearity.

Data analysis
Images obtained from the FLA-5100 image analyzer were converted to 16-bit grayscale TIFF files that keep the original image resolution using the ImageGauge software (Fujifilm). These TIFF images and the TIFF images generated by the InnoScan 1100 AL were analyzed using ImageJ. 30 The mean fluorescence intensity inside each well in both channels was extracted. The background signal was determined from the control replicates and subtracted from the TAMRA fluorescence signal (5-hmC labels) in each sample well. To account for background noise in the EvaGreen signal (total DNA), a mean fluorescence signal of all wells containing EvaGreen and no DNA was calculated. This mean signal was subtracted from the EvaGreen signal in each sample well. We divided the background-corrected TAMRA signal in each well by the background-corrected fluorescence intensity in the EvaGreen channel of the same well, in order to normalize the signal to the actual amount of DNA in the well. Next, the average and standard deviation for each sample were calculated over 3-6 replicates. The absolute 5-hmC level in each sample was determined by comparing this value to the value of the calibration sample. The following equations describe this process (n is the number of replicates). We used a t-test to evaluate the significance of the differences between the healthy samples and the colon tumors (p-value).
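The analysis steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes that the TAMRA background is the mean raw TAMRA intensity of the no-enzyme control wells, that the calibration is a single-point proportional scaling to the LC-MS/MS value, and that the standard deviation is the sample standard deviation over replicates. Function names and all numbers in the example are hypothetical.

```python
from statistics import mean, stdev


def normalized_ratios(tamra, evagreen, tamra_background, evagreen_background):
    """Background-corrected TAMRA / EvaGreen ratio for each replicate well."""
    ratios = []
    for t, e in zip(tamra, evagreen):
        dna_signal = e - evagreen_background
        if dna_signal <= 0:
            raise ValueError("no detectable DNA in well; cannot normalize")
        ratios.append((t - tamra_background) / dna_signal)
    return ratios


def percent_5hmc(sample_ratios, calibration_ratios, calibration_percent):
    """Scale replicate ratios to % 5-hmC using a calibration sample.

    Assumes the normalized signal is proportional to 5-hmC content, so one
    calibration sample of known content (from LC-MS/MS) fixes the scale.
    Returns (mean, sample standard deviation) over the replicates.
    """
    scale = calibration_percent / mean(calibration_ratios)
    values = [r * scale for r in sample_ratios]
    return mean(values), stdev(values)


# Hypothetical well intensities (arbitrary units), three replicates each.
TAMRA_BG, EVAGREEN_BG = 100.0, 50.0
calib = normalized_ratios([1100, 1100, 1100], [1050, 1050, 1050],
                          TAMRA_BG, EVAGREEN_BG)
sample = normalized_ratios([600, 350, 850], [1050, 1050, 1050],
                           TAMRA_BG, EVAGREEN_BG)
level, sd = percent_5hmc(sample, calib, calibration_percent=0.1)
```

Normalizing each well before averaging, rather than averaging the two channels separately, is what lets well-to-well differences in adsorbed DNA cancel out of the replicate variance.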

Results
Our 5-hmC quantification assay workflow is depicted schematically in Figure 1. First, DNA was extracted from a tissue of interest (Fig. 1a). Next, we chemoenzymatically labeled 5-hmC residues with fluorescent molecules (Fig. 1b). We then deposited the labeled DNA in predetermined regions on a chemically activated glass slide, allowing electrostatic attachment of the sample. Figure 1c depicts a photograph of the 90-sample slide used in our experiments. After deposition, we imaged the slide to determine the fluorescence intensity of each sample. In the final step, we calculated 5-hmC levels according to the recorded intensity (Fig. 1d). To compensate for differences in the amount of DNA adsorbed from each sample, DNA was stained with a different color. We used the calculated amount of DNA to normalize the levels of 5-hmC per sample. A detailed schematic representation of the various steps involved in the procedure is presented as Supporting Information Figure S1. For assessment of the assay's reproducibility, see Supporting Information Figure S2.
In order to validate the assay and assess its sensitivity, we prepared a set of DNA samples with varying levels of labeled 5-hmC for analysis. The amount of DNA was set to 25 ng per well after serial dilution of a standard 5-hmC sample. The standard 5-hmC level was determined by LC-MS/MS to be 0.1161% ± 0.0001%, and the percentages of labeled 5-hmC in the diluted samples were calculated according to the dilution factor. The images and analyzed data from the resulting slide are presented in Figure 2. These results indicate that the activated surface consistently binds DNA, as evident from the steady signal in the DNA channel. Moreover, the fluorescence signal of the 5-hmC residues increases according to the rise in the percentage of labeled 5-hmC. Normalizing the 5-hmC signal according to the amount of adsorbed DNA significantly reduces the variance between replicates, as evidenced by the smaller error bars in Figure 2c. The low noise level of this protocol, portrayed in the fluorescence intensity of the control sample (0% labeled 5-hmC), enables the detection of extremely low 5-hmC levels (0.0035%), lower than the level found in blood or various cancer tissues. 13
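The nominal 5-hmC percentage of each diluted sample follows directly from the dilution factor. The sketch below assumes the labeled standard is mixed with unlabeled DNA so that total DNA stays constant, and uses a two-fold series purely as an example; neither the dilution scheme nor the function names are taken from the study.

```python
STANDARD_PERCENT = 0.1161  # 5-hmC content of the standard by LC-MS/MS (%)


def diluted_percent(fraction_labeled, standard_percent=STANDARD_PERCENT):
    """Nominal % labeled 5-hmC when a fraction of the DNA is labeled standard."""
    if not 0.0 <= fraction_labeled <= 1.0:
        raise ValueError("fraction_labeled must be between 0 and 1")
    return standard_percent * fraction_labeled


def twofold_series(steps):
    """Nominal levels for a hypothetical two-fold serial dilution."""
    return [diluted_percent(0.5 ** k) for k in range(steps)]


levels = twofold_series(4)  # undiluted standard, then 1/2, 1/4, 1/8
```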

5-hmC in mouse tissues
In order to further validate the method, we measured 5-hmC levels in various mouse tissues and compared our results to previous reports regarding the tissue-specific 5-hmC content (Fig. 3a). The trend revealed in these experiments correlates with current literature, with the highest levels measured in the central nervous system, a medium level for kidney and lung, and a lower level in colon and spleen. 6,7,31 The 5-hmC value of all tissues was validated using LC-MS/MS, and the correlation between the LC-MS/MS result and the optical signal of the samples is shown in Figure 3b. The correlation between the results is highly linear, indicating that the assay is applicable through a wide range of 5-hmC levels, and that the LC-MS/MS calibration is appropriate.
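A linear relationship of this kind can be checked with an ordinary least-squares fit and the Pearson correlation coefficient. The following is a generic sketch in plain Python; the function name is ours, and the data points are synthetic placeholders, not the measurements behind Figure 3b.

```python
from math import sqrt


def linear_fit(x, y):
    """Least-squares slope and intercept of y on x, plus Pearson's r."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need at least three paired measurements")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Synthetic example: x = LC-MS/MS 5-hmC (%), y = normalized optical signal.
lcms_percent = [0.01, 0.05, 0.10, 0.30]
optical_signal = [2.0 * p + 1.0 for p in lcms_percent]  # perfectly linear by construction
slope, intercept, r = linear_fit(lcms_percent, optical_signal)
```

On real data, an r close to 1 together with residuals that show no systematic curvature would support using a single proportional calibration across tissues.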

5-hmC in a glioblastoma multiforme mouse model and human hematological cancers
To demonstrate the assay's compatibility with the full physiological range of 5-hmC levels, we performed analysis on two extreme cases, brain and blood. 5-hmC levels were first assessed in a glioblastoma multiforme (GBM) mouse model that closely recapitulates human GBMs in their highly invasive characteristics and aggressiveness. 32,33 We evaluated the 5-hmC levels in mouse GBMs and corresponding normal brain tissue samples (same anatomical areas where the tumors were initiated; Fig. 4a). The tumors (n = 4) showed lower 5-hmC levels compared to healthy brain tissue controls (n = 4). These results are in line with previous findings showing decreased levels of 5-hmC in human GBM. 11,12,34
In contrast to brain tissue, blood is the most accessible tissue for diagnostic purposes, with potential for diagnosing hematological malignancies. However, blood is known for its extremely low level of 5-hmC, posing a challenge for most existing detection methods. Previous reports have shown the reduction of 5-hmC in hematological cancers, although the small differences between samples were difficult to resolve. 13,35 To assess the ability of our assay to resolve small differences in 5-hmC content between blood samples, we evaluated the 5-hmC level in DNA extracted from the blood of three human individuals: a healthy donor, a multiple myeloma (MM) patient and a chronic lymphocytic leukemia (CLL) patient. Both cancerous samples display about 40% lower 5-hmC levels compared to the healthy donor (Fig. 4b). Although a single individual per group does not allow statistical analysis, these results suggest that the method is sensitive enough to detect the reduction in 5-hmC content even for samples that display a low 5-hmC level in their normal state.

5-hmC in human colorectal cancer
To assess the performance of the assay for analysis of colorectal samples, we measured 5-hmC levels in over 25 samples from healthy human colon compared to CRC tumor. We observed a significant reduction in 5-hmC for CRC tissue relative to healthy colon (0.0133 ± 0.0096%, n = 29 vs. 0.04709 ± 0.0133%, n = 27, p < 0.001; Fig. 5a).
We also assessed the 5-hmC levels of colon tissue adjacent to the tumors (matched tissues, n = 25; Fig. 5b). We show that despite the high variability between individuals, in most cases, the adjacent tissue already displays molecular reduction of 5-hmC level relative to the healthy baseline average. These results are in line with recent studies reporting that the gene expression profile of such tissues may indicate an intermediate state, between healthy and tumor. 36 We note that the absolute 5-hmC levels of tumor and adjacent tissue are highly patient dependent, suggesting a need for personalized diagnostics.
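The size of the difference between CRC tumors and healthy colon can be appreciated by recomputing a t statistic from the reported summary values. The sketch below uses Welch's unequal-variance form and treats the ± values as standard deviations; both are assumptions, since the text specifies only that a t-test was used.

```python
from math import sqrt


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2  # squared standard errors
    t = (mean1 - mean2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Reported values (% 5-hmC): CRC tumor vs. healthy colon.
t, df = welch_t(0.0133, 0.0096, 29, 0.04709, 0.0133, 27)
```

Under these assumptions the statistic comes out near -10.8 with roughly 47 degrees of freedom, far beyond the magnitude of about 3.5 needed for a two-sided p < 0.001, which is consistent with the significance level reported above.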

5-hmC in pancreatic cancer and familial adenomatous polyposis
To complement our study, we analyzed pairs of pancreatic tumor and matched tissue from the tumor periphery of four pancreatic ductal adenocarcinoma (PDAC) patients. In all four patients, the 5-hmC level of the tumors was significantly reduced compared to the adjacent tissue (Fig. 6a). These results indicate that the characteristic reduction in 5-hmC can also be reliably determined for pancreatic tissues.
Finally, we performed a proof-of-concept experiment to test the changes in 5-hmC levels in response to therapy for two familial adenomatous polyposis (FAP) patients. Several high-risk conditions such as FAP are treated by preventive administration of therapeutic drugs 25,26 (Kariv et al., 2019, submitted). We measured the 5-hmC level of polyps and adjacent tissues in two FAP patients who were enrolled in a clinical trial using antibiotic read-through treatment for their APC nonsense mutations (Kariv et al., 2019, submitted). Polyp and normal tissues were collected before and after the treatment (Fig. 6b). 5-hmC levels measured after treatment were higher compared to untreated polyps and closer to the levels of normal tissue for both patients. Despite the small sample size, the results suggest the potential use of this assay for evaluation of response to polyp chemoprevention or other cancer prevention strategies.

Discussion
Epigenetic analysis is emerging as an important diagnostic tool, and DNA methylation-based assays are already established and compatible with current diagnostic pipelines. 37,38 In recent years, demethylation of DNA via oxidation to 5-hmC has been recognized as a fundamental process in development and disease. 5,39 The global quantity of 5-hmC may indicate a number of disorders, and its reduction in various human cancers is well documented. 9-13 The global loss of 5-hmC may therefore be used as a molecular marker to diagnose or predict neoplastic lesions. A major setback for the utility of 5-hmC as a biomarker stems from the fact that available assays are not suitable for clinical applications. The main limitations of current assays are their low sensitivity, experimental complexity and high costs. Here, we present a simple platform for 5-hmC analysis based on fluorescent labeling. The assay is performed on a partitioned microscope slide that in our case accommodates 90 samples. This feature is easily modified according to the specific experimental needs, allowing high-throughput quantification of 5-hmC. In terms of sensitivity, we present results demonstrating our ability to quantify samples with ultralow 5-hmC levels such as blood. Furthermore, we could resolve small differences in 5-hmC levels between blood samples that were otherwise indistinguishable using commercially available assays. For example, no significant differences in 5-hmC levels were found in the blood of CLL patients compared to healthy individuals using a commercial ELISA kit. 35 Moreover, available kits were assessed by our group and presented low sensitivity and poor reproducibility. 13 Other methods, including epigenetic next-generation sequencing, provide locus-specific epigenetic profiles but are extremely expensive and labor intensive. These attributes are limiting for routine diagnostic screening. In contrast, we show here that using only 5-30 ng of DNA, we are able to reliably monitor 5-hmC levels with high sensitivity across multiple samples.
While blood displays ultralow 5-hmC levels, other tissues display up to 100-fold higher 5-hmC levels, with the brain and central nervous system displaying the highest levels. We show that our assay is compatible with the full physiological dynamic range of 5-hmC content, quantitatively characterizing various tissues from blood to brain. In this context, the results presented here for glioblastoma (GBM) reaffirm previous studies showing a reduced level of 5-hmC in human GBM compared to healthy brain tissues. 11,12,34 One study even associated 5-hmC levels with tumor grade and prognosis. 40 GBM is the most common and aggressive primary brain tumor found in humans 41 with survival of only 12-15 months from diagnosis. 42 Despite extensive research, the mechanisms of glioblastoma progression and recurrence remain elusive, and many tumors show resistance to current therapeutic approaches. 33,43 The novel mouse model used here recapitulates the pathophysiology of human GBM, by creating region- and cell type-specific tumors. 32,33 Measuring 5-hmC levels in this mouse model or in human tumor biopsies will provide additional molecular information about tumor epigenetics, and contribute to the study and profiling of the disease.
Patients with various hereditary colorectal cancer syndromes, as well as sporadic colonic adenoma patients, may particularly benefit from our assay due to the ease of accessing tissue for diagnosis. For example, colonoscopies are routinely performed for CRC screening, coupled with polyp removal and histopathological examination. 18,44 Our assay could utilize the residual biopsy for 5-hmC quantification and determination of risk profile, polyp resection adequacy and areas at risk in the colon. These data may complement histopathological examinations of colon specimens and assist pathologists in cases for which morphological properties are controversial. One example is the 5-hmC level measured in biopsies from colon tissue adjacent to tumor. Such tissue regions, despite displaying normal morphology, may in some cases present reduced 5-hmC levels, providing a quantitative molecular indication for the state of the examined tissue. Such intermediate states were recently revealed by differential gene expression and protein-protein interaction analysis. 36 We show similar changes in pancreatic tumors and matched adjacent tissue.
An attractive potential application for our assay is the monitoring of disease progression and response to therapy. Reliable and timely feedback regarding the response to a given therapeutic is of utmost importance for efficient treatment. Incorrect drug administration for high-risk conditions can lead to poor prognosis. The reported assay may assist in monitoring disease progression at regular and frequent intervals. A proof-of-concept example is given here with the monitoring of FAP patients before and after the administration of a new drug. Our results show that 5-hmC levels were increased for the treated patients, approaching the levels measured in adjacent tissue.
The results presented above exemplify the utility of 5-hmC for clinical diagnostics and the ability of this method to overcome the barriers encountered using other assays. We stress the compatibility of the assay with clinical requirements, namely, simplicity of use, low cost and the need for only conventional instrumentation for analysis.
With the future of medicine progressing toward personalized diagnostics and treatment, the presented assay may provide a complementary layer of personalized epigenetic information with potential for early diagnosis and disease staging. The overall properties of the technique allow facile integration into existing diagnostic workflows for population-level screening and monitoring.