Crystalline Supramolecular Organic Frameworks via Hydrogen-bonding between Nucleobases

We report a crystalline supramolecular framework assembled by H-bonding interactions between covalently fused monomers equipped with two guanine-cytosine nucleobase pairs.


Synthesis and characterization
The synthesis of compounds C2,[2] C1,[2] G4[3] and G3[2] was previously described in the literature. The synthesis of compounds C,[2] G2,[2] G1[2] and B1[4] was modified slightly with respect to the methods described in the literature.
Standard Procedure A (Sonogashira coupling with TMSA). A dry THF/NEt3 or DMF/NEt3 (4:1) solvent mixture was deoxygenated by three freeze-pump-thaw cycles with argon. This solvent was then added to the system containing the corresponding halogenated nucleobase, Pd(PPh3)2Cl2 and CuI. The mixture was stirred at room temperature for a few minutes, and trimethylsilylacetylene (TMSA) was then added dropwise. The reaction was stirred under argon at the temperature and for the time indicated in each case until completion, which was monitored by TLC.

Standard Procedure B (Sonogashira coupling between central block and nucleobase).
A dry DMF/NEt3 (4:1) solvent mixture was deoxygenated by three freeze-pump-thaw cycles with argon. This solvent was then added to the system containing the corresponding halogenated central block and ethynyl-nucleobase derivative, Pd(PPh3)2Cl2 and CuI. The reaction was stirred under argon at the temperature and for the time indicated in each case until completion, which was monitored by TLC.

Standard Procedure C (Sonogashira coupling between central block and nucleobase with controlled addition).
A dry DMF/NEt3 (4:1) solvent mixture was deoxygenated by three freeze-pump-thaw cycles with argon. This solvent was then added to the system containing the corresponding halogenated central block, Pd(PPh3)2Cl2 and CuI, and to a second one containing the corresponding ethynyl-nucleobase derivative. The solution of the ethynyl-nucleobase derivative was added to the first one over the time and at the temperature indicated. The reaction was stirred under argon at the temperature and for the time indicated in each case until completion, which was monitored by TLC.
C2 was synthesized according to the literature[2] using cytosine (10.0 g, 90.0 mmol), I2 (34.3 g, 135.1 mmol) and HIO3 (22.2 g, 126.2 mmol) in glacial AcOH (300 mL), obtaining C2 as a white solid (16.38 g, 77%).
C1 was synthesized according to the literature[2] using C2 (6.0 g, 21.1 mmol), Bu4NOH (1 M in MeOH, 31.0 mL, 31.0 mmol) and 6-iodohexane (3.4 mL, 42.3 mmol) in dry DMF (80 mL), obtaining C1 as a yellowish solid (5.53 g, 68%).
G2 was synthesized by modifying the literature procedure.[2] In a round-bottom flask equipped with a magnetic stirrer, NBS (2.95 g, 16.6 mmol) was added in portions over 1 hour to a solution of G3 (4.19 g, 15.0 mmol) in Et2O (130 mL). The solution was then filtered through a Celite plug and the solvent was removed under reduced pressure. Finally, the crude product was purified by flash column chromatography on silica gel using a Cy/AcOEt mixture (2:1) as eluent, obtaining G2 as a white solid (4.94 g, 92%).
G. In a round-bottom flask equipped with a magnetic stirrer, K2CO3 (0.70 g, 5.03 mmol) was added to a solution of G1 (0.63 g, 1.68 mmol) in MeOH (50 mL). The mixture was stirred at rt for 1 hour. CHCl3 (50 mL) was then added, the solution was filtered through a Celite plug and the solvent was removed under reduced pressure. Finally, the crude product was purified by flash column chromatography on silica gel using a Cy/AcOEt mixture (2:1) as eluent, obtaining G as a beige solid (0.49 g, 96%).

B.
In a round-bottom flask equipped with a magnetic stirrer, 2-aminopyridine (1.15 g, 12.2 mmol) was added to a solution of B1 (4.00 g, 10.0 mmol) in toluene (100 mL) and NEt3 (2.0 mL). The mixture was stirred at reflux with a Dean-Stark apparatus for 24 hours. The solvent was then removed under reduced pressure and the crude product was purified by flash column chromatography on silica gel using a toluene/AcOEt mixture (10:1) as eluent, obtaining B as a beige solid (4.45 g, 93%).
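As a quick cross-check of the stoichiometry quoted in the procedures above, the molar equivalents of each reagent can be recomputed from the stated millimole amounts. The following is only an illustrative sketch: the helper names (`equivalents`, `summarize`) are hypothetical, and it assumes that the nucleobase or central-block precursor is the limiting substrate in each step.

```python
# Molar amounts (mmol) as quoted in the procedures for G2, G and B.
# Assumption: the listed substrate is the limiting reagent in each step.
AMOUNTS_MMOL = {
    "G2": {"substrate": ("G3", 15.0), "reagents": {"NBS": 16.6}},
    "G": {"substrate": ("G1", 1.68), "reagents": {"K2CO3": 5.03}},
    "B": {"substrate": ("B1", 10.0), "reagents": {"2-aminopyridine": 12.2}},
}


def equivalents(reagent_mmol: float, substrate_mmol: float) -> float:
    """Molar equivalents of a reagent relative to the substrate."""
    if substrate_mmol <= 0:
        raise ValueError("substrate amount must be positive")
    return reagent_mmol / substrate_mmol


def summarize(amounts=AMOUNTS_MMOL) -> dict:
    """Return {product: {reagent: equivalents}} rounded to two decimals."""
    return {
        product: {
            name: round(equivalents(mmol, entry["substrate"][1]), 2)
            for name, mmol in entry["reagents"].items()
        }
        for product, entry in amounts.items()
    }


if __name__ == "__main__":
    for product, reagents in summarize().items():
        print(product, reagents)
```

With the quoted amounts this gives roughly 1.1 equiv of NBS, 3.0 equiv of K2CO3 and 1.2 equiv of 2-aminopyridine.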

¹³C NMR (DMSO-d6). Due to the extremely low solubility of this compound, a ¹³C NMR spectrum of sufficient quality could not be acquired even after 24 h on a 500 MHz instrument.

GCGC.
In a round-bottom flask equipped with a magnetic stirrer, an aqueous solution of HCl (37%, 3 drops) was added to a solution of GCGC1 (10.0 mg, 0.007 mmol) in DMF (3 mL). The mixture was stirred at rt for 48 hours. A saturated aqueous NaHCO3 solution (5 drops) was then added and the mixture was stirred for 10 minutes. The solid was then filtered and washed with water, DMF, MeOH and CH2Cl2, obtaining GCGC as an orange solid (7.5 mg, 88%).

S1. NMR studies
At low temperatures, the ¹H NMR spectra of GCGC showed only broad signals owing to the strong aggregation and precipitation of the fused monomers. As the temperature increases, the signals of the different protons of the non-aggregated GCGC molecule become visible. The typical signals of hydrogen-bonded protons could not be seen because the concentration of aggregates small enough to be soluble is too low.

S2. UV/visible spectroscopy studies
The absorption spectra show a maximum at 400 nm regardless of the concentration. The decrease in absorption in the most concentrated sample is due to the presence of undissolved material, which is visible to the naked eye in the picture.
The emission spectra show a marked decrease in intensity as the concentration increases, because of the presence of small aggregates. In the normalized spectra, a blue shift from 538 to 458 nm can be observed as the sample is diluted.

S3. Crystallization process
For the crystallization protocol, we followed an approach similar to that described in other publications.[5] Because of the extremely low solubility of GCGC, we used the protected fused monomer GCGC1 and deprotected the carbonyl groups of the guanosine fragment in situ. In this way, the fused monomer is generated and assembles in a controlled manner, which should in principle produce materials with better crystallinity.
For the preparation of the samples, fused monomer GCGC1 was dissolved in DMF to a concentration of 1 × 10⁻² M. Then, a solution of HCl (0.1 M in DMF, 0.2 mL) was added to a vial containing the previous solution (0.8 mL). After that, different quantities of cosolvent (CHCl3, MeOH or MeCN) and DMF (to reach a total volume of 2.5 mL) were added, and the vial was sealed and thermally treated. Finally, the supernatant was removed and the crystals were washed several times with MeOH.
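The final concentrations in the crystallization vial follow from the volumes given above by simple dilution. The sketch below is illustrative only: the function name `final_concentration` is hypothetical, and it assumes that volumes are additive and that the total volume is the stated 2.5 mL.

```python
def final_concentration(stock_molar: float, aliquot_ml: float, total_ml: float) -> float:
    """Concentration (M) after diluting an aliquot of stock to total_ml.

    Assumes additive volumes (an idealization for mixed solvents).
    """
    if total_ml <= 0 or aliquot_ml < 0 or aliquot_ml > total_ml:
        raise ValueError("volumes must satisfy 0 <= aliquot <= total, total > 0")
    return stock_molar * aliquot_ml / total_ml


# Values taken from the sample preparation described above.
GCGC1_STOCK_M = 1e-2   # GCGC1 stock in DMF
GCGC1_ALIQUOT_ML = 0.8
HCL_STOCK_M = 0.1      # HCl in DMF
HCL_ALIQUOT_ML = 0.2
TOTAL_ML = 2.5         # after adding cosolvent and DMF

gcgc1_final = final_concentration(GCGC1_STOCK_M, GCGC1_ALIQUOT_ML, TOTAL_ML)
hcl_final = final_concentration(HCL_STOCK_M, HCL_ALIQUOT_ML, TOTAL_ML)
hcl_per_monomer = hcl_final / gcgc1_final

if __name__ == "__main__":
    print(f"GCGC1: {gcgc1_final * 1e3:.1f} mM")
    print(f"HCl:   {hcl_final * 1e3:.1f} mM")
    print(f"HCl equivalents per GCGC1: {hcl_per_monomer:.1f}")
```

Under these assumptions the vial contains about 3.2 mM GCGC1 and 8 mM HCl, that is, about 2.5 equivalents of acid per fused monomer.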
The best results were obtained using 0.5 mL of MeOH as cosolvent and annealing the sample at 65°C for 12 hours (25°C to 65°C at 0.5°C/min, 65°C for 12 hours, 65°C to 25°C at 0.2°C/min).
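The duration of the thermal program quoted above can be estimated from the ramp rates. This is a minimal sketch with hypothetical helper names (`ramp_minutes`, `program_minutes`); it assumes ideal linear ramps with no equilibration delays.

```python
def ramp_minutes(start_c: float, end_c: float, rate_c_per_min: float) -> float:
    """Time (min) for a linear ramp between two temperatures."""
    if rate_c_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    return abs(end_c - start_c) / rate_c_per_min


def program_minutes(low_c=25.0, high_c=65.0, heat_rate=0.5,
                    hold_hours=12.0, cool_rate=0.2) -> dict:
    """Break the annealing program into heating, hold and cooling times (min)."""
    heating = ramp_minutes(low_c, high_c, heat_rate)
    hold = hold_hours * 60.0
    cooling = ramp_minutes(high_c, low_c, cool_rate)
    return {
        "heating": heating,
        "hold": hold,
        "cooling": cooling,
        "total": heating + hold + cooling,
    }


if __name__ == "__main__":
    for step, minutes in program_minutes().items():
        print(f"{step:8s} {minutes:7.1f} min")
```

With the stated rates, heating takes about 80 min and cooling about 200 min, so the full cycle lasts close to 1000 min (roughly 16.7 h) including the 12-hour hold.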

Figure S3. Crystals observed under the optical microscope using polarized light.

S5. FT-IR analysis
The FT-IR spectrum of the crystals (in red in Figure S5) was compared with that of a 1:1 mixture of the G and C nucleobase derivatives. In the 3000-3500 cm⁻¹ region a band centered at 3125 cm⁻¹ can be seen, while in the 1500-1800 cm⁻¹ region a pair of bands at 1650 and 1640 cm⁻¹ can be detected. These bands are absent in the individual nucleobases and in the protected fused monomer but are present in the 1:1 nucleobase mixture, so they can be assigned to N-H and C=O vibrations of H-bonded species.

S7. Theoretical study
In order to elucidate the crystal structure, we followed a strategy similar to that of Wasielewski et al.[5] using Materials Studio (MS) 2017 R2. Given the presence of a peak corresponding to a possible π-π stacking (3.5 Å), we restricted the study to 2D networks, for which we examined different types of packing as well as different H-bonding interactions between the nucleobases.
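For orientation, a d-spacing such as the 3.5 Å stacking distance can be converted into a diffraction angle with Bragg's law, λ = 2d sin θ. The sketch below assumes Cu Kα radiation (λ = 1.5406 Å); the radiation source is not stated in this section, so that wavelength and the function name `two_theta_degrees` are assumptions made for illustration.

```python
import math

# Assumption: Cu K-alpha radiation; the wavelength is not given in the text.
CU_K_ALPHA_ANGSTROM = 1.5406


def two_theta_degrees(d_angstrom: float,
                      wavelength_angstrom: float = CU_K_ALPHA_ANGSTROM) -> float:
    """First-order Bragg angle 2-theta (degrees) for a given d-spacing."""
    if d_angstrom <= 0:
        raise ValueError("d-spacing must be positive")
    sin_theta = wavelength_angstrom / (2.0 * d_angstrom)
    if sin_theta > 1.0:
        raise ValueError("d-spacing too small to diffract at this wavelength")
    return 2.0 * math.degrees(math.asin(sin_theta))


if __name__ == "__main__":
    print(f"d = 3.5 A -> 2-theta = {two_theta_degrees(3.5):.2f} deg")
```

Under the Cu Kα assumption, a 3.5 Å spacing corresponds to a reflection at about 25.4° in 2θ.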
The first step was thus the construction of a 2D network. For this purpose, the fused monomer GCGC was built and optimized using the Forcite module with the DREIDING[6] force field, with the atomic charges obtained via QEq calculations. The monomer was then assembled in different ways to obtain various hydrogen-bonded 2D networks. These networks were also optimized with the Forcite module to refine short-range interactions such as hydrogen bonds. We studied three different possibilities (Figure S7A): the cyclic tetramer network (Figure S7Aa); one in which interactions are established between cytosines on the one hand and between guanines on the other (Figure S7Ab); and a third one with small cyclic tetramers, similar to the G-quadruplex, that forms a very compact network (Figure S7Ac). All 2D networks were screened with different relative dispositions of the layers, always at an interlayer distance of 3.5 Å, the distance estimated experimentally from the XRD pattern.
We simulated the PXRD patterns of the three different possibilities in simple stackings (Figure S7B). Comparison of the simulated PXRD patterns with the experimental data allowed us to discard the cyclic tetramer H-bonding network because of its low similarity to the experimental data. We also discarded the third possibility (Figure S7Ac) given the high steric hindrance between the alkyl chains once the network is formed. In the second case, both simulations fitted the experimental data. Therefore, we focused on this arrangement for our structural study.
In addition to the networks studied previously for the second option (Figure S7Ab), in the case of AB stacking, different shifts along the x and y axes were tried (Figure S7C). Systematic translations of 5, 10 and 15 Å allowed us to see the trend in the simulated XRD patterns. Thus, a large translation along the c axis was necessary to move the principal peak to the position shown in the experimental pattern. On the other hand, movement along the a axis led to only minor improvements.
Regarding the packing of the layers, different stacking possibilities were studied as a function of the number of layers, as well as the formation of 1D channels along the b axis (Figure S7D). However, the simulated diffractograms of the latter case differed too much from the experimental data and, thus, porous structures were not considered further. Eventually, this information led us to a possible structure whose simulated PXRD pattern was in accordance with the experimental data. In this structure, the second layer was shifted 12 and 18 Å along the a and c axes, respectively (Figure S7E). Once we had obtained a probable structure, we carried out a profile fitting of the PXRD data to evaluate the feasibility of the model. Whereas previous models yielded poor fittings, the profile-fitting refinement of the model shown in Figure S7E converged with excellent residual values for a monoclinic crystal system with cell parameters a = 22.1582 Å, b = 7.2195 Å, c = 19.3841 Å, β = 97.085°, V = 3077.3 Å³, Rp = 2.23%, Rwp = 2.55%, Rexp = 1.53%, χ² = 2.76 (Figure S8). However, owing to the broad diffraction peaks, it was not possible to estimate the most likely space group from the diffraction data, and thus the refinement was carried out in the triclinic P1 space group, keeping α = γ = 90°.
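The reported cell volume and goodness of fit can be checked against the other refined values. For a cell with α = γ = 90°, the volume is V = a·b·c·sin β, and a common definition of the goodness-of-fit indicator is χ² = (Rwp/Rexp)². The sketch below uses hypothetical function names, and the χ² definition is an assumption: refinement programs differ slightly in how they compute and report it.

```python
import math


def monoclinic_volume(a: float, b: float, c: float, beta_deg: float) -> float:
    """Unit-cell volume (cubic angstrom) for alpha = gamma = 90 degrees."""
    return a * b * c * math.sin(math.radians(beta_deg))


def chi_squared(r_wp: float, r_exp: float) -> float:
    """Goodness of fit as (Rwp / Rexp)^2; one common convention (assumed)."""
    if r_exp <= 0:
        raise ValueError("Rexp must be positive")
    return (r_wp / r_exp) ** 2


# Refined values quoted in the text.
volume = monoclinic_volume(22.1582, 7.2195, 19.3841, 97.085)
gof = chi_squared(2.55, 1.53)

if __name__ == "__main__":
    print(f"V    = {volume:.1f} A^3 (reported 3077.3)")
    print(f"chi2 = {gof:.2f} (reported 2.76)")
```

Both values land close to the reported figures; the small difference in χ² is within what rounding of the R factors or a slightly different software convention would produce.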

Scheme S1. Synthetic route to the cytosine derivative C.

Scheme S2. Synthetic route to the guanosine derivative G.

Scheme S3. Synthetic route to the central block B.

Figure S1. ¹H NMR spectra of the fused monomer GCGC in DMSO-d6 at different temperatures.

Figure S2. a) Samples of GCGC at different concentrations (5 × 10⁻⁴ M, 1 × 10⁻⁴ M and 5 × 10⁻⁵ M from left to right) in DMSO; b) absorption spectra of GCGC in DMSO at 90°C; c) emission spectra of GCGC in DMSO at 90°C (the insets show the normalized spectra).

Figure S4. SEM images of the GCGC crystals obtained using 0.5 mL of MeOH as cosolvent.

Figure S5. FT-IR spectra of (top to bottom) cytosine derivative C, guanosine derivative G, 1:1 mixture of G and C (prepared in solution and then evaporated), fused monomer GCGC and protected fused monomer GCGC1.

Figure S7A. Different possibilities studied in the formation of a 2D network.

Figure S7D. Simulated diffractograms of the b network with shifts, as a function of the number of layers in the stacking.

Figure S7E. Comparison between the experimental PXRD diffractogram and that simulated from the proposed structure.