Hairpin DNA-AuNPs as molecular binding elements for the detection of volatile organic compounds.

Hairpin DNA (hpDNA) loops were used for the first time as molecular binding elements in gas analysis. The unpaired-base sequences of the hpDNA loops were studied in-silico to evaluate their binding to four chemical classes (alcohols, aldehydes, esters and ketones) of volatile organic compounds (VOCs). The virtual binding score trend was correlated to the oligonucleotide size and increased by about 25% from tetramer to hexamer. Two tetramer, two pentamer and three hexamer loops were selected to test the recognition ability of the DNA motif. The selection was carried out so as to maximize differences among chemical classes, in order to evaluate the ability of the sensors to work as an array. All oligonucleotides showed similar trends, with the best binding scores for alcohols followed by esters, aldehydes and ketones. The seven ssDNA loops (CCAG, TTCT, CCCGA, TAAGT, ATAATC, CATGTC and CTGCAA) were then extended with the same double helix stem of four DNA base pairs (GAAG at the 5' end and CTTC at the 3' end) and covalently bound to gold nanoparticles (AuNPs) using a thiol spacer attached to the 5' end of the hpDNA. HpDNA-AuNPs were deposited onto 20 MHz quartz crystal microbalances (QCMs) to form the piezoelectric gas sensors. An estimation of relative binding affinities was obtained using different amounts of eight VOCs (ethanol, 3-methylbutan-1-ol, 1-pentanol, octanal, nonanal, ethyl acetate, ethyl octanoate, and butane-2,3-dione) representative of the four chemical classes. In agreement with the simulation, hexamer DNA loops improved the binding affinity by two orders of magnitude, highlighting the key role of the hpDNA loop size. Using the sensors as an array and analyzing the experimental data with principal component analysis (PCA), a clear discrimination of VOCs on the basis of molecular weight and functional groups was achieved, demonstrating that hpDNA is a promising molecular binding element for the analysis of VOCs.


Introduction
In the past decade, DNA has been extensively used in sensor design, fabrication, characterization and application, providing new impulses to analytical research (Bettazzi et al., 2017; Rasheed and Sandhyarani, 2017). Through the selection of the DNA sequence, a wide variety of analytical applications have been proposed, the majority of which were applied to liquid samples. Currently, very few gas sensors use DNA as functional material (Wasilewski et al., 2017). The first attempts to use DNA in gas sensing were reported a few years ago and were based on the decoration of carbon nanotubes (Khamis et al., 2012; Kybert et al., 2013; Staii et al., 2005; Su et al., 2013). In a recent work, DNA extracted from fish sperm was introduced between a gate dielectric and an organic semiconducting layer to build an organic field-effect transistor sensor for NO₂ detection (Shi et al., 2016). Another interesting work evaluated the ability of double-stranded (dsDNA) and single-stranded DNA (ssDNA) to detect odors delivered in the vapor phase; ssDNA exhibited sequence-specific responses for a variety of volatile compounds (White et al., 2008).
To date, no gas sensor work has explored the use of hpDNA for sensing of VOCs. In fact, hpDNA has been used for sensor applications only in liquid media, mainly coupled to electrochemical transducers (Martín-Fernández et al., 2015; Wang et al., 2014). The particular shape of hpDNA is very interesting since it appears ideal to maximize the orientation of the binding element (via the stem) while using the loop as sensing element. In this work, hpDNA conjugated with AuNPs was used as a molecular binding element in piezoelectric gas detection (Fig. 1S). Piezoelectric transducers can monitor the frequency change of functionalized QCMs when gaseous molecules are adsorbed, providing the relationship between mass and resonant frequency shift (Skládal, 2016). In piezoelectric gas sensors, the use of AuNPs as a platform for VOC binding was found to increase the sensitivity by two orders of magnitude versus monolayer-modified QCMs.
The new hpDNA-AuNP piezoelectric gas detection strategy described in this paper is based on in-silico calculation of the hpDNA loop binding. The hpDNA loop shape is ideal for building sensing molecules since it provides a large combinatorial complexity of structures; the latter can be tailored with the help of computer-aided methods to respond to different volatile chemicals. In-silico rationally designed molecular traps have been demonstrated to have a strong impact on the development of analytical strategies, since they minimize experimental issues such as reagent stability and nonspecific recognition, also for separation procedures (Baggiani et al., 2013; Mascini et al., 2013; Narcisi et al., 2011; Uzun and Turner, 2016). A computational approach was also recently used to reduce the large number of attempts necessary to select the right combination of tools for different gas sensing applications (Gustafson and Wilmer, 2017; Mascini et al., 2017; Pizzoni et al., 2014).
AuNPs were used as immobilization platform for the hpDNA sequences. The relative binding affinities of the hpDNA loops vs. different VOCs belonging to relevant chemical classes were evaluated. The results showed a significant increase of the binding affinity versus VOCs with the increase of the hpDNA loop size highlighting the key role of the molecular geometry.
The hpDNA sensors were then used in a sensor array format using the principle of combinatorial selectivity. The combination of such sensors into arrays was intended to overcome the limited selectivity of a single oligonucleotide, a limitation common to the majority of gas sensing devices. When complemented by a multivariate data analysis technique, this sensor array allows for the classification and the identification of compounds with a performance well beyond that of a single selective sensor.
As a practical use, the pattern recognition of these new sensors was estimated by using the unsupervised multivariate algorithm PCA, a convenient tool often used in sensors post-processing analysis (Akamatsu et al., 2017;Compagnone et al., 2015;Di Natale et al., 2014;Imamura et al., 2017). Data obtained demonstrated that the hpDNA sensors, used as array, were able to discriminate the eight molecules on the basis of molecular weight and functional groups.
The entire DNA library of tetramer, pentamer and hexamer ssDNA was generated using Hyperchem 8.0.5 software on a Microsoft Windows 10 laptop. Calculations of the in-silico screening process, including molecular docking run and data preparation were performed using a desktop computer with 19 processors Intel Xeon X5690 at 3.47 GHz each, with 94.5 GiB RAM, running Kernel Linux 2.6.32-642.1.1el6.x86_64, GNOME 2.28.2. Tools from OpenEye Scientific Software package under academic license, were used at different stages of the in-silico procedure. VOCs were obtained via LEXICHEM 2.1.0 package, by converting ligands standard IUPAC names into their corresponding structures (LEXICHEM version 2.1.0). SZYBKI 1.5.7 with default parameterization was used to optimize molecular geometries (SZYBKI version 1.5.7). Conformational space for both ssDNA and VOCs was taken into account with OMEGA 2.4.6 (Hawkins and Nicholls, 2012;Hawkins et al., 2010;OMEGA version 2.4.6). Multi-conformer rigid body docking was carried out using OEDocking 3.0.0, having Chemgauss4 as scoring function (Kelley et al., 2015; OEDocking version 3.0.0). Structures visualization and generation of molecular surfaces were performed using VIDA 4.1.1 (VIDA version 4.1.1).
The entire DNA molecular surface was included in the active site box defining the area where VOCs were expected to bind. For each ssDNA receptor, a dedicated box (10-20 nm³) was generated. The time elapsed for processing each DNA conformer was about 2 min per processor, from the initial 3D structures generation to final docking results. Ten conformers per ssDNA and a maximum of 200 conformers for each of the 50 VOCs were considered. The binding score average for each DNA was calculated over all the conformers. The entire process was automated using a bash script and using a freeware BASIC-like scripting language (AutoIT V3) for post processing data analysis.
The piezoelectric measurements were carried out using an Enose-UTV from the Sensor group, University of Rome Tor Vergata (Italy). The 20 MHz QCM sensors were from KVG GmbH (Germany).
Colloidal AuNPs were synthesized using the trisodium citrate reduction method (Frens, 1973). In brief, 50 mL of 0.3 mM tetrachloroauric acid solution (in water) was stirred vigorously and heated. At boiling point, 1.5 mL of 40 mM trisodium citrate solution was added. The mixture was left boiling for 20 min (the color turned from clear liquid to wine red). The colloidal suspension was then cooled down to 4°C. Ultraviolet-visible spectrophotometry was used to confirm the AuNPs formation and verify the AuNPs dispersion. The concentration of AuNPs was 3.5 × 10⁻⁹ M considering an average diameter of 15 nm as reported by Sanghavi et al. (2016).
Immobilization of the oligonucleotides on the AuNPs surface was carried out covalently using a C6 thiol modifier group attached to 5' phosphate end of the hpDNA. Each hpDNA was dissolved in deionized water and added to 1 mL of the AuNPs colloidal solution at a final concentration of 0.678 μM. The hpDNA-AuNPs colloidal suspensions were incubated at + 5°C for 12 h. HpDNA-AuNPs were then centrifuged at 13,000 rpm for 30 min at 4°C. The colorless supernatant was discarded and the solid pellet was resuspended in 1 mL of deionized water. All steps were monitored via UV-Vis spectrophotometry.
The QCM sensors were modified by drop casting 5 μL of the hpDNA-AuNP suspension on each side of the crystal and letting it dry for a few minutes. Before the first use, the QCM sensors were completely dried under N₂ at a flow rate of 2 L/h; they were stored at room temperature in the dark when not in use. A single crystal unit resonating at about 20 MHz was a circular plate with a diameter of 8 mm (Zampetti et al., 2008). The QCM background noise in all cases was ± 1 Hz.
The piezoelectric measurements were carried out using N₂ as carrier gas at a flow rate of 2 L/h. Measurements of the VOCs were carried out using different amounts of compounds introduced in a gas-tight lab bottle (100 mL) connected via three-way stop-cocks to the measuring chamber containing the sensor array. The liquid organic compound was completely evaporated by placing the bottle at 45°C for 10 min. The temperature was then brought back to 25°C and the measurement was started by opening the stop-cocks and flowing the analyte to the sensor chamber. The frequency shift (ΔF), taken as the analytical signal, was recorded. Steady state was reached between 100 and 200 s after opening the stop-cocks. After each measurement, a complete recovery of the signal was achieved under N₂ flow in about 400 s. The dataset of piezoelectric responses was analyzed by the unsupervised multivariate technique principal component analysis (PCA) using MatLab R2011 (USA). The data were autoscaled (zero mean and unit variance) before analysis. PCA was applied to inspect the multivariate data structure by decomposing a data matrix of eight rows (the VOCs) and seven columns (the hpDNA-AuNP sensors).
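The autoscaling and PCA step described above can be sketched as follows. This is a minimal illustration in Python with NumPy, not the MatLab routine used in this work. The matrix values are random placeholders rather than measured frequency shifts, and the use of the sample standard deviation for autoscaling is an assumption.

```python
import numpy as np

def autoscale(X):
    """Scale each column (sensor) to zero mean and unit variance.

    Assumption: the sample standard deviation (ddof=1) is used.
    """
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def pca(X):
    """PCA of the autoscaled matrix via singular value decomposition.

    Returns the scores (rows = VOCs), the loadings (rows = sensors) and
    the fraction of variance explained by each principal component.
    """
    Z = autoscale(X)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U * s          # coordinates of the VOCs on the components
    loadings = Vt.T         # contribution of each sensor to the components
    explained = s**2 / np.sum(s**2)
    return scores, loadings, explained

# Placeholder data only: 8 VOCs (rows) x 7 hpDNA-AuNP sensors (columns).
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 7))

scores, loadings, explained = pca(X)
print(np.round(explained * 100, 2))  # percent variance per component
```

With real data, the first three entries of `explained` would be the quantities reported as percentage variance of PC 1, PC 2 and PC 3.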

Results and discussion
3.1. In-silico screening: ssDNA vs chemical classes
The binding properties of the ssDNA libraries were calculated against 50 VOC molecules belonging to four different chemical classes (alcohols, aldehydes, esters and ketones). Only the four natural bases adenine (A), cytosine (C), guanine (G) and thymine (T) were used to build the ssDNA libraries. Four DNA bases was the minimum oligomer size giving a significant library of loops, so the tetramer library was the first one tested. Considering that more bases can contribute synergistically in binding the VOCs, the oligomer size was increased by one base in each successive library. All the possible combinations of the four DNA bases were tested; the libraries thus consisted of 256 elements for tetramers, 1024 elements for pentamers and 4096 elements for hexamers. Increasing the oligomer size further with a purely combinatorial approach generates too many structures to be calculated; therefore the hexamer was the largest structure tested in this work.
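The library sizes quoted above follow directly from the combinatorics (4^n sequences of length n). A minimal Python sketch of the enumeration is given here for illustration only; the actual 3D structures were generated with Hyperchem, and the function name is hypothetical.

```python
from itertools import product

BASES = "ACGT"  # the four natural bases used to build the libraries

def build_library(length):
    """Return every sequence of the given length over A, C, G and T."""
    return ["".join(p) for p in product(BASES, repeat=length)]

libraries = {n: build_library(n) for n in (4, 5, 6)}
for n, seqs in libraries.items():
    print(n, len(seqs))
```

Each additional base multiplies the library size by four, which is why a purely combinatorial screening quickly becomes expensive beyond hexamers.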
The molecular docking functions used screened compounds that potentially interacted with the binding site predominantly through noncovalent interactions, particularly hydrogen bonds. Therefore, only the hpDNA loops having unpaired bases were virtually screened. An in-silico screening of the entire hairpin DNA would have enormously increased the machine time; this was avoided considering that the shape of double-stranded DNA has no preferential sites for VOC binding. Fig. 1 reports the binding score trend of tetramer, pentamer and hexamer hpDNA loop libraries for the four chemical classes tested. The binding score was reported as the average calculated over 10 conformers for each ssDNA sequence. The score values were calculated using the Chemgauss4 scoring function; thus lower values represented higher ssDNA-ligand affinity. Chemgauss4, the new scoring function from OpenEye software, is a modification of Chemgauss3 with improved hydrogen bonding and metal chelator functions. This scoring function was particularly suitable for the focus of this work, namely the comparison between unpaired DNA bases in binding the VOCs, particularly via electrostatic interactions (hydrogen bond, van der Waals forces).
The oligonucleotide virtual binding score trend was correlated to the oligonucleotide size for all chemical classes, with values increasing by about 25% from tetramer to hexamer. All oligonucleotides had a common trend, with the best binding scores for alcohols followed by esters, aldehydes and ketones. In all libraries, the binding scores for alcohols were two times higher than those for ketones. The minimum-maximum dynamic range for each chemical class was quite narrow for tetramers, becoming relevant only for the hexamer DNA library (-3.07 kcal/mol). In all cases, average and median were very close to each other, indicating a symmetric, approximately normal distribution.
Fig. 1. Binding score trend of tetramer, pentamer and hexamer hpDNA loop libraries for the four chemical classes tested. The data were sorted in ascending order of score; thus the positions of the ssDNA in each curve do not necessarily correspond.
Structural analysis was carried out to study the occurrence of the four DNA bases in each oligonucleotide position. The top-ranked 5% of the tetramer, pentamer and hexamer unpaired DNA structures were examined for each of the four chemical classes. The structural data exhibited a very high level of similarity in DNA base distribution. Top-ranked tetramer and pentamer DNA had a higher amount of adenine and thymine; in the hexamer DNA, however, the occurrence of both purines was higher than that of the pyrimidines.
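A positional base-occurrence analysis of this kind can be sketched as below. The scores in the example are hypothetical toy values, not docking results, and rounding the 5% cut-off upward is an assumption. Following the Chemgauss4 convention stated above, lower scores are treated as better.

```python
import math
from collections import Counter

def top_fraction(scores, fraction=0.05):
    """Return the best-ranked sequences (lower score = higher affinity).

    Assumption: the cut-off is rounded up, with at least one sequence kept.
    """
    n_top = max(1, math.ceil(len(scores) * fraction))
    return sorted(scores, key=scores.get)[:n_top]

def base_occurrence(sequences):
    """Count how often each base occurs at each position.

    All sequences are expected to have the same length.
    """
    length = len(sequences[0])
    return [Counter(seq[i] for seq in sequences) for i in range(length)]

# Hypothetical toy scores for four tetramers (illustration only).
toy_scores = {"AATT": -4.0, "CCGG": -2.0, "ATAT": -3.5, "GGCC": -1.0}

best = top_fraction(toy_scores, fraction=0.5)
counts = base_occurrence(best)
for position, counter in enumerate(counts, start=1):
    print(position, dict(counter))
```

Applied to a full library with the default `fraction=0.05`, the per-position counters give the base distribution of the top-ranked structures.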
Due to the small combinations generated by only four different DNA bases, the binding difference within the DNA library was likely due to the steric/conformational effects. Increasing the DNA in size enhanced the internal flexibility of specific DNA regions and the target accessibility to the binding box conformational space. Fig. 2 reports the specific positions of the DNA bases contributing cooperatively to target binding. Top binding scores were obtained when DNA docked VOCs with a saddle shaped binding pocket, allowing oligonucleotide to bury the entire ligand in its surface. On the other hand, inefficient binding was found when DNA docked VOCs with a planar interaction. This confirms that the degree of freedom to move around the DNA backbone of the single bases was the major effect to explain the binding score data; this is particularly true for hexamer DNA where the probability of synergic cooperation is higher.
The results of the virtual screening were used to select some oligonucleotides with different affinities for the VOCs in order to evaluate their potential applicability in gas analysis by using QCM sensors.
Since the final aim is the use of the sensors as array (as electronic nose), the loops for the experimental data were selected taking into account not the absolute "best" binding scores but the minimum cross reactivity. This was done looking at the largest differences among the chemical classes.
The selection was aimed at maximizing the recognition properties of the DNA motif between chemical classes. Thus, two tetramers, two pentamers and three hexamers were finally chosen. Table 1 reports the binding score of the DNA versus the VOCs selected for the experimental part. The binding score average obtained by the simulations of the ssDNA versus the chemical classes (14 alcohols, 13 aldehydes, 18 esters and 5 ketones) is also reported in order to emphasize the differences between the chemical class average and the single compounds of the same class.
The selected oligonucleotides have the same trend of the entire DNA library with better interaction for alcohols followed by esters, aldehydes, and ketones showing always the lowest interactions. According to the binding score data, all the DNA sequences exhibited similar trend for alcohols except for ethanol; binding scores varied significantly for the interaction with esters and aldehydes and, in the case of one hexamer, also for ketones. Three of the seven oligonucleotides selected for the experimental part, CCAG, TAAGT and CTGCAA, were oligonucleotides supposed to be poor candidates in binding VOCs. These oligonucleotides were selected to test the matching between in-silico and experimental data.
The oligonucleotide TTCT showed a good interaction particularly for aldehydes and ethyl octanoate. The other tetramer CCAG exhibited a clear difference in binding alcohols and the other VOCs selected in experimental part. The pentamer TAAGT was selected because of the very low interaction with all the molecules compared to its counterpart CCCGA that had almost two-fold more interaction energy for each of the VOC.
A clear difference in affinity scores was observed using the hexameric DNA. As reported also considering the entire DNA library, increasing the number of bases, there was a considerable increase of docking scores. The hexamer ATAATC showed better binding score than the other oligonucleotide receptors for most of the ligands and, in particular, for ethyl octanoate and both aldehydes (nonanal and octanal). This hexamer and CATGTC exhibited the same pattern in docking the alcohols, aldehydes and esters, showing a significant difference among small compounds, as ethanol and ethyl acetate, and the other molecules. All oligonucleotides exhibited affinity properties inversely correlated to the molecular weight except the hexamer CTGCAA that had good affinity only for ethanol and half interaction energy for all the other VOCs when compared to the other two hexamer DNA.
It should be noted that the same stem DNA sequence was used for the realization of the hpDNAs in order to evaluate the contribution of the loop. Thus, some oligonucleotides, particularly in hexamer DNA, were discarded due to stem-loop intramolecular base pairing.

AuNPs-DNA functionalization and QCM sensors modification
The selected sequences were extended with the same double helix stem of four DNA base pairs, attaching the sequence GAAG to the 5' end and the sequence CTTC to the 3' end. Each secondary structure was analyzed using the Mfold Web Server (www.unafold.rna.albany.edu) to check the stem-loop intramolecular base pairing. All selected DNA had an unpaired loop in standard conditions. The AuNP functionalization with hpDNA was followed by UV-Vis spectroscopy. The amount of hpDNA for the AuNP functionalization was selected by testing different concentrations of hpDNA (0.136, 0.271, 0.678 and 1.355 μM).
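The construction of the seven hairpin sequences from the common stem can be illustrated with a short Python sketch. It only assembles the sequences and checks that the two stem arms are reverse complements of each other; it is not a substitute for the secondary-structure analysis performed with Mfold, and the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

STEM_5 = "GAAG"  # stem arm attached to the 5' end of the loop
STEM_3 = "CTTC"  # stem arm attached to the 3' end of the loop
LOOPS = ["CCAG", "TTCT", "CCCGA", "TAAGT", "ATAATC", "CATGTC", "CTGCAA"]

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def build_hairpin(loop):
    """Assemble the full hpDNA sequence (5'->3'): stem arm, loop, stem arm."""
    return STEM_5 + loop + STEM_3

hairpins = {loop: build_hairpin(loop) for loop in LOOPS}

# The two arms can pair into a four-base-pair stem only if they are
# reverse complements of each other.
print(reverse_complement(STEM_5) == STEM_3)
for loop, sequence in hairpins.items():
    print(loop, sequence, len(sequence))
```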
The UV-Vis spectra after AuNP functionalization with hpDNA are reported in the supplementary material (Fig. 2S). Similar absorption spectra were obtained in the 350-800 nm range for bare AuNPs and all the different amounts of hpDNA-AuNPs tested, demonstrating that the functionalization did not cause AuNP aggregation. Similar results were obtained using all DNA loops. The UV-Vis spectra of the hpDNA-AuNPs resuspended in the same volume of water after centrifugation showed that centrifugation was essential to remove excess chemicals after nanoparticle functionalization. The spectra (Fig. 2S B) showed that the AuNPs were stabilized by the negatively charged DNA, which acted as an electrostatically repulsive capping agent among the AuNPs, as reported in the literature (Baldock and Hutchison, 2016; Xu et al., 2016). Moreover, the nanoparticle functionalization with hpDNA was confirmed by a sharp peak at 260 nm indicating the presence of DNA. Unmodified AuNPs were not easily resuspended in water, showing an irreversible aggregation due to the centrifugation step. The spectra of the supernatant (Fig. 2S C) gave an indirect indication of the maximum amount of hpDNA necessary to saturate the binding sites of the AuNPs. In fact, the peak at 260 nm indicated unbound hpDNA. The latter increased significantly for hpDNA concentrations higher than 0.678 μM, indicating saturation of the binding sites. Therefore, this concentration was chosen for the functionalization of all the AuNPs.
After functionalization, the surfaces of the 20 MHz QCM sensors were modified by drop-casting 2.5 μL of hpDNA-AuNP suspension on each side of the crystal and letting it dry at room temperature. This procedure was repeated to assess the maximum loadable amount. Every 2.5 μL addition of hpDNA-AuNP suspension on each side of the sensor led to a variation of approximately 2.5 kHz for all the sensors, demonstrating the reproducibility of the deposition procedure. After four additions (20 μL total volume), the QCM crystal frequency crashed and no variation was detectable. Thus, a total amount of 20 μL of hpDNA-AuNP suspension was selected for further work, leading to a variation of 10 kHz in all cases.

QCM sensors response to VOCs
QCM frequency shifts (ΔF) were used to calculate the relative experimental binding constants of the eight VOCs and to assess the correlation between the virtual screening and real binding data. For this reason, pure VOCs were tested by using N₂ as carrier gas directly in the measuring chamber.
The relative binding affinities of the hpDNA-VOC complexes were calculated by adding different amounts of pure liquid VOCs to the gas-tight lab bottle. After complete evaporation (achieved by incubating at 45°C for 10 min), the analyte was sent to the sensor measuring chamber using N₂ as carrier. The VOC binding to the sensor surface was estimated by recording the frequency shift. A quantitative evaluation of the mass captured by the QCM sensors was achieved through the oscillation constant (Kq = −4.8 Hz/ng). From the estimated mass in nanograms it was possible to calculate the moles bound by the sensor. Fig. 3 shows, as an example, the frequency shifts measured with the sensor modified with CTGCAA as loop, for different amounts of 1-pentanol. The piezoelectric sensorgram was similar for all hpDNA-AuNPs and VOCs, showing a rapid decrease of the signal after the stop-cocks were opened, followed by a slower rise up to the steady state.
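The conversion from frequency shift to bound moles can be written out explicitly. The sketch below uses the oscillation constant given above; the frequency shift in the example is a hypothetical value, and the molar mass of 1-pentanol (88.15 g/mol) is supplied only for illustration.

```python
KQ_HZ_PER_NG = -4.8  # oscillation constant reported in the text (Hz/ng)

def bound_mass_ng(delta_f_hz):
    """Mass captured by the sensor (ng) from the frequency shift (Hz)."""
    return delta_f_hz / KQ_HZ_PER_NG

def bound_moles(delta_f_hz, molar_mass_g_per_mol):
    """Moles bound by the sensor, from the frequency shift and molar mass."""
    mass_g = bound_mass_ng(delta_f_hz) * 1e-9
    return mass_g / molar_mass_g_per_mol

# Hypothetical example: a -48 Hz shift measured for 1-pentanol.
M_PENTANOL = 88.15  # g/mol
print(bound_mass_ng(-48.0))            # about 10 ng
print(bound_moles(-48.0, M_PENTANOL))  # about 1.1e-10 mol
```

A negative frequency shift corresponds to a positive bound mass because Kq is negative.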
The steady state was reached between 100 and 200 s after the start of the measurement. The adsorption kinetics was similar for all the VOCs tested. The frequency shift (ΔF), taken as analytical signal, was recorded for all cases before desorption.
The bound compound was determined assuming 1:1 complexation stoichiometry. Using the Scatchard model, the ratio between bound and free compound versus the bound was plotted and the relative binding affinity was calculated by linear regression fitting. The results are reported in Table 2.
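For a 1:1 complex, the Scatchard relation is bound/free = Ka·(Bmax − bound), so the slope of the regression line gives −Ka. A minimal Python sketch of the fit follows; the data are synthetic and noise-free, generated from hypothetical values of Ka and Bmax, and the function name is an assumption rather than the routine used in this work.

```python
import numpy as np

def scatchard_fit(bound, free):
    """Linear Scatchard fit: bound/free = Ka * (Bmax - bound).

    Returns (Ka, Bmax) from the slope (-Ka) and intercept (Ka * Bmax)
    of the regression of bound/free against bound.
    """
    bound = np.asarray(bound, dtype=float)
    free = np.asarray(free, dtype=float)
    slope, intercept = np.polyfit(bound, bound / free, 1)
    ka = -slope
    bmax = intercept / ka
    return ka, bmax

# Synthetic data from a 1:1 binding isotherm (hypothetical parameters,
# arbitrary but consistent units).
KA_TRUE, BMAX_TRUE = 1.0e3, 5.0
free = np.array([1e-4, 2e-4, 5e-4, 1e-3, 2e-3])
bound = BMAX_TRUE * KA_TRUE * free / (1.0 + KA_TRUE * free)

ka_fit, bmax_fit = scatchard_fit(bound, free)
print(ka_fit, bmax_fit)
```

With experimental data the points scatter around the line, and the quality of the linear fit indicates how well the 1:1 assumption holds.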
Despite their different structure, both tetramer DNA loops had very similar binding affinity for all VOCs. The tetramer TTCT exhibited slightly better affinity for aldehydes, leading to a significant correlation with simulated results. On the contrary, there was no correlation for the other DNA tetramer loop looking at 1-pentanol and ethyl octanoate, respectively. These two molecules and 3-methylbutan-1-ol were bound by the pentamer CCCGA with an affinity one order of magnitude higher than both DNA tetramers. The other DNA pentamer loop, TAAGT, had the lowest binding affinity for all molecules. The correlation coefficient of this pentamer DNA was only 0.37 because of the lack of correlation for alcohols: the affinity was high in-silico and low in the experimental data.
The DNA loop size played an important role in the observed experimental behavior, improving the binding affinities, as revealed by the DNA hexamer loop binding data. Both hpDNAs having ATAATC and CATGTC as loops showed a significant interaction with ligands, which was approximately two fold higher than the smaller DNA loop. This was in good agreement with the prediction by virtual screening. Strong interaction with larger molecular weight molecules such as 1-pentanol, octanal, nonanal and ethyl octanoate was observed. On the other hand, the binding behavior of the other DNA hexamer, CTGCAA, was in agreement with the virtual screening results only for alcohols. The different responses of these DNA hexamers to the VOCs emphasized the importance of the chemical nature of the DNA loop. Such a heterogeneous data set demonstrated that the binding affinities did not depend on the presence of the stem, which was the same for all hpDNA. It should be noted that the seven ssDNA were extended with the same double helix stem of four DNA base pairs, which acted as a spacer. The hairpin structure was chosen to have unpaired DNA bases in the loop, and therefore a higher probability of binding the target via electrostatic interactions (hydrogen bond, van der Waals forces). By increasing the oligomer size, it was supposed that more bases contributed synergistically in binding the VOCs. From the results obtained, it was clear that larger structures gave better results in terms of chemical class detection and discrimination. However, compared with peptides from a previous work (Mascini et al., 2017), the detection specificity of oligonucleotides appeared limited. As a consequence, the oligonucleotides proposed in this work can only be used for the classification of different VOC patterns rather than to track individual VOCs.
Table 1
Binding score average (kcal/mol) of the tetramer, pentamer and hexamer DNA versus the VOCs tested in the experimental part. In italic-bold, the binding score obtained by the simulations of the ssDNA versus the chemical classes (14 alcohols, 13 aldehydes, 18 esters and 5 ketones). The average and standard deviation were calculated over 10 conformers.
Fig. 3. Frequency shifts recorded by the sensor modified with CTGCAA after introducing different micromolar amounts of 1-pentanol in the 100-mL glass bottle. In all cases, the relative binding affinity between hpDNA and VOCs was calculated taking the frequency shift before desorption of the compound adsorbed on the QCM surface modified with hpDNA-AuNP.
The inter-relationships between the sequence-specific responses of hpDNA to VOCs were highlighted by considering all sensor measurements in a multivariate analysis format. The data set was represented by the hpDNA-AuNPs-QCMs frequency shifts obtained using 900 μmoles of each VOC.
The data were autoscaled and then analyzed making use of unsupervised PCA. Fig. 4 shows the scores and loading plots of the first three principal components. The first component represented 46.84% of the variance, the second 28.19% and the third 17.37% displaying together a cumulative variance of 92.40%.
The score points, representing the new coordinates of the VOCs were interpreted assuming that close distance in plot plane is a measure of the similitude between samples. PC 1 separated well both aldehydes and the ketone butane-2,3-dione from alcohols and esters. PC 2 highlighted the differences within alcohol and ester classes grouping the low molecular weight molecules ethanol and ethyl acetate. PC 3 contributed to the separation between ethanol and ethyl acetate, the two small molecules of the group. This separation was influenced by the synergic contribution of all sensors.
The loadings, representing the contribution of each DNA sensor to the principal components, contributed mostly to the spread of the scores on PC 2. The PC 2 axis highlighted the differences among sensors. Both pentamer DNA contributed significantly to the separation of the small alcohol and ester from the other members of their families. On the other hand, the hexamer ATAATC and the tetramer TTCT played an important role in clustering the molecules with higher molecular weight on PC 2, confirming the predictions obtained from the virtual data. The other two hexamers had very similar pattern recognition performance, contributing only to the spread of the VOCs on PC 1. All sensors contributed to the spread on PC 3.
It is important to note that hpDNA-AuNPs-QCM sensors can discriminate molecular classes and separate molecules on the basis of the molecular weight. The PCA algorithm highlighted that the DNA sensors, used as array, can be effectively applied to those cases where the difference between VOC patterns plays a crucial role in classification purposes.

Table 2
HpDNA-AuNP sensors relative binding affinities vs the VOCs, estimated using piezoelectric response. The correlation coefficient between experimental and simulated binding is reported in the last row. The standard deviation was calculated using three measurements taken in three different days.

Conclusions
This work contributes to the growth of DNA applications in the biotechnological and analytical fields. For the first time, the interaction between hpDNA loops and VOCs was rationally calculated by virtual assessment and then experimentally tested. A good match between in-silico selection and experimental results was found, especially with hexamer hpDNA.
Multivariate data elaboration showed that beyond interesting differences between chemical classes, molecules could also be clearly discriminated based on the molecular weight.
The key parameter for increasing the affinities of sensors versus VOCs was found to be the size of the DNA loop within the hairpin structure.
This work represents the starting point for the selection of hpDNA used as molecular binding elements in gas sensors. In the near future, taking advantage of the fast progress in computing, larger ssDNA loops with more complex shapes can be screened in short times, tailoring the efficiency and effectiveness of the gas analysis.