CpG Methylation in G-Quadruplex and I-Motif DNA Structures

Aberrant hypomethylation in DNA regions with noncanonical folding potential (ncDNA motifs) is believed to predetermine tumor development, presumably by facilitating G-quadruplex (G4) and/or i-motif (IM) formation via altered nucleosome positioning (stable G4s induce subsequent genomic rearrangements). We questioned whether CpG methylation per se affects the dsDNA-ncDNA equilibrium. Thermodynamic studies of genomic and model oligonucleotides with methylated CpG sites at different positions are reported. The genomic oligonucleotides analyzed in this work are DNA fragments with reportedly different methylation statuses in colorectal cancer and normal cells. Free energies of duplex and ncDNA formation from single strands were calculated based on melting curve analyses. Polyethylene glycol was used to imitate the crowding effect. Our results suggest that CpG methylation may alter the energetic barrier for dsDNA-IM transitions.


Introduction
DNA methylation in mammalian somatic cells occurs predominantly through cytosine modification in CpG dinucleotides. The methylation pattern distinguishes two different fractions of CpG composition: the main fraction, in which CpG dinucleotides are infrequent (on average, 1 CpG per 100 bp) but heavily methylated; and the secondary fraction, known as CpG islands (CGIs), which is represented by short DNA fragments (~1000 bp) with an average CpG frequency of 1 per 10 bp [1]. Most CGIs contain sites of transcription initiation, which are typically demethylated (i.e., contain no methyl groups at the C5 position of cytosine residues) in an otherwise heavily methylated genome, even when the corresponding gene is transcriptionally inactive. There are, however, examples of promoter CGIs that become methylated, leading to stable silencing of the respective genes. Both hypermethylation of promoter CGIs (i.e., the excess of 5-methylcytosine residues) and genome-wide hypomethylation (the loss of the methyl groups in normally methylated CpG sites) are regarded as hallmarks of cancer genomes [2][3][4]. In this work we describe the impacts of colorectal cancer-associated aberrant methylation on the propensity of the respective DNA fragments for conformational rearrangements.
One possible mechanism of aberrant hypomethylation-induced oncogenesis has been proposed previously based on the analysis of genomic distributions of noncanonical secondary structures (ncDNA) and DNA breakpoints associated with somatic copy number alterations in cancer tissues [5]. The proposed mechanism includes the following steps: (i) a random epigenetic mutation changes the tissue-specific methylation pattern; (ii) hypomethylation adjacent to ncDNA motifs favors ncDNA folding in the presence of stabilizing proteins and negative supercoiling; (iii) ncDNA increases the likelihood of further alterations (DNA damage and genomic rearrangements).
Four-stranded noncanonical structures of G/C-rich DNA, i.e., G-quadruplexes (G4s: planar arrangements of guanine tetrads) and i-motifs (IMs: intercalated parallel duplexes stabilized by hemiprotonated C-C pairs), have recently gained significant attention as possible regulatory elements and hotspots of genomic or epigenetic instability [6][7][8][9]. Although IMs are typically stable at pH 3-6.5, recent data support the possibility of IM formation under physiological conditions [10]. Both G4s and IMs can repress or activate gene expression at the transcriptional level, which is, in some cases, beneficial for genome maintenance. For instance, c-myc expression is presumably controlled by the interplay between protein factors recognizing mutually exclusive G4 and IM sites in the promoter region of this oncogene [11].
Methylation effects on folding and thermal stability of G4 motifs colocalized with CGIs in promoter regions of several genes have been investigated previously [12,13]. The epigenetic modification was shown to alter G4 topology and thermal stability. However, the data are somewhat inconclusive and no general tendencies can be outlined. For instance, methylation reportedly stabilized the G4 structure in the BCL-2 promoter, but had the opposite effect on the MEST promoter G4. There is also some controversy about IMs: single and double Me-CpG insertions have been shown to enhance thermal stabilities of telomeric IM structures, while further methylation caused destabilization [14].
The influence of CpG methylation in G4 motifs or the opposing IMs on the 'B-DNA-ncDNA' equilibrium remains unclear at the moment. In this paper, we report the comparative analysis of ncDNA and B-DNA thermodynamic stabilities in methylated and demethylated states. The impacts of methylation rate (single or several Me-CpG sites within the same G4 / IM) and Me-CpG positions, as well as molecular crowding effects, are discussed.
Our major goal was to clarify the dependence of DNA refolding potential on methylation status, which may shed new light on the role of epigenetic alterations in oncogenesis.

Oligonucleotide synthesis, purification and MS analysis
Oligonucleotides (ONs) were synthesized on an ASM-800 DNA synthesizer (Biosset, Russia) following standard phosphoramidite protocols using standard reagents. For methylated ONs, we additionally used 5-Me-dC-CE phosphoramidite (Glen Research, USA). The ONs were purified by preparative-scale reverse-phase HPLC as described in [15] (for IMs) and [16] (for G4s). The final purity of all ONs was determined to be 95% by HPLC. MALDI-TOF MS analysis of the ONs was performed as described in [16]; the data were acquired on a Microflex mass spectrometer (Bruker, USA).

UV absorption and circular dichroism spectroscopy
UV absorption and circular dichroism (CD) spectra were recorded on a Chirascan spectrophotometer (Applied Photophysics, UK) equipped with a thermostated cuvette holder. Solutions of G4- or IM-forming ONs (2.5 μM) in 20 mM Tris-HCl buffer (pH 7.6 or 5.5, respectively) containing 100 mM KCl were annealed rapidly (heated to 95 °C and snap-cooled on ice) prior to measurements to facilitate intramolecular folding. Molar CD per nucleotide residue was calculated as follows: Δε = θ/(32.982 × C × l × n), where θ is ellipticity (degrees), C is the ON concentration (M), l is the optical path length (cm) and n is the number of nucleotide residues in the ON. Thermal difference spectra (TDS) were obtained by subtracting absorption spectra registered at 15 °C from the spectra registered at 90 °C.
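The two calculations described above can be sketched in a few lines of Python. This is only an illustration of the stated formulas; the function names and the list-based spectrum representation are our assumptions and do not reflect the instrument software.

```python
def molar_cd_per_residue(theta_deg, conc_molar, path_cm, n_residues):
    """Molar CD per residue: delta_eps = theta / (32.982 * C * l * n).

    theta_deg is ellipticity in degrees, conc_molar the ON concentration (M),
    path_cm the optical path length (cm), n_residues the ON length.
    """
    if conc_molar <= 0 or path_cm <= 0 or n_residues <= 0:
        raise ValueError("concentration, path length and residue count must be positive")
    return theta_deg / (32.982 * conc_molar * path_cm * n_residues)


def thermal_difference_spectrum(abs_90c, abs_15c):
    """TDS: absorbance at 90 °C minus absorbance at 15 °C, wavelength by wavelength.

    Both inputs are equally long lists sampled at the same wavelengths (assumed).
    """
    if len(abs_90c) != len(abs_15c):
        raise ValueError("spectra must be sampled at the same wavelengths")
    return [hot - cold for hot, cold in zip(abs_90c, abs_15c)]
```

For example, a hypothetical 20-mer at 2.5 μM in a 1 cm cuvette would be processed as `molar_cd_per_residue(theta, 2.5e-6, 1.0, 20)`, with `theta` in degrees.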

Rotational relaxation time assay
Rotational relaxation times (RRT) of the fluorescent intercalator ethidium bromide (EtBr) in complexes with G4s/IMs (1:1) were calculated from EtBr fluorescence polarization and lifetime values using the Perrin-Weber equation [17,18]. The polarization and lifetime values were estimated as described in [19] using a Cary Eclipse spectrophotometer (Agilent Technologies, USA) and an Easy Life V fluorescence lifetime fluorometer (Horiba, Japan).
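For orientation, the Perrin-Weber relation is commonly written as 1/P − 1/3 = (1/P0 − 1/3)(1 + 3τ/ρ), where P is the measured polarization, P0 the limiting polarization, τ the fluorescence lifetime and ρ the rotational relaxation time. The sketch below solves this form for ρ. The exact form and the P0 value used in [17-19] are not given here, so both are assumptions, as is the function name.

```python
def rotational_relaxation_time(p, p0, tau_ns):
    """Solve 1/P - 1/3 = (1/P0 - 1/3) * (1 + 3*tau/rho) for rho.

    p      : measured fluorescence polarization
    p0     : limiting polarization (no rotation); user-supplied assumption
    tau_ns : fluorescence lifetime in ns
    Returns rho in ns.
    """
    if not (0.0 < p < p0 <= 0.5):
        raise ValueError("expected 0 < p < p0 <= 0.5")
    if tau_ns <= 0:
        raise ValueError("lifetime must be positive")
    ratio = (1.0 / p - 1.0 / 3.0) / (1.0 / p0 - 1.0 / 3.0)  # equals 1 + 3*tau/rho
    return 3.0 * tau_ns / (ratio - 1.0)
```

Note that some texts report the rotational correlation time instead, which is ρ/3 in this convention.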

Melting experiments
In UV-melting experiments, absorption at 295 nm (for G4s and IMs) or 260 nm (for duplexes) was registered from 5 °C to 90 °C every 1 °C at a heating rate of 0.5 °C/min. Polyethylene glycol with an average molecular weight of 200 (PEG 200) was added to ON solutions to a final concentration of 40% to imitate molecular crowding conditions. All G4 and IM samples were annealed rapidly prior to the experiments as described above for the spectroscopic measurements. Duplex samples were heated to 95 °C and cooled slowly to room temperature.
The melting curves were analyzed and fitted using DataFit9 software (Oakdale Engineering, USA).
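As a rough illustration of how a melting temperature can be read from such a curve, the sketch below takes the midpoint between the low- and high-temperature plateaus and interpolates linearly. It is not the DataFit fitting procedure used in this work; flat baselines, the function name and the synthetic test curve are all assumptions.

```python
import math


def estimate_tm(temps, absorbances, n_baseline=5):
    """Estimate Tm as the temperature where the curve crosses the plateau midpoint.

    Plateaus are approximated by the mean of the first and last n_baseline points
    (flat baselines assumed). Works for rising (260 nm) and falling (295 nm) curves.
    """
    if len(temps) != len(absorbances) or len(temps) < 2 * n_baseline + 1:
        raise ValueError("need matching arrays with enough points for both baselines")
    low = sum(absorbances[:n_baseline]) / n_baseline
    high = sum(absorbances[-n_baseline:]) / n_baseline
    mid = (low + high) / 2.0
    for i in range(len(temps) - 1):
        a0 = absorbances[i] - mid
        a1 = absorbances[i + 1] - mid
        if a0 == 0:
            return temps[i]
        if a0 * a1 < 0:  # sign change: midpoint lies between points i and i+1
            frac = a0 / (a0 - a1)
            return temps[i] + frac * (temps[i + 1] - temps[i])
    raise ValueError("curve never crosses the plateau midpoint")


# Synthetic sigmoidal curve sampled every 1 °C from 5 to 90 °C, midpoint at 50 °C
temps = list(range(5, 91))
curve = [1.0 / (1.0 + math.exp(-(t - 50.0) / 3.0)) for t in temps]
tm = estimate_tm(temps, curve)
```

Real curves usually have sloping baselines, in which case a proper two-state fit with linear baselines is preferable to this midpoint estimate.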

Oligonucleotide design
In this study we analyzed model sequences and fragments of the human genome that are differentially methylated in colorectal cancer and normal tissues. The latter sequences were selected based on previously published results of genome-scale methylation profiling with Infinium HumanMethylation450 chips [20]. The coordinates of CpG sites with replicable and statistically significant differential methylation (candidate tumor markers) were overlaid with the coordinates of putative G4/IM sites obtained using ImGQFinder software [21]. Secondary structures of the presumed G4s and IMs containing candidate CpG/Me-CpG tumor markers were verified by optical methods, and the structures with ambiguous folding were excluded from subsequent analysis.
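The overlay of CpG coordinates with putative G4/IM sites amounts to a point-in-interval lookup. A minimal sketch for a single chromosome is given below; the function name, the inclusive-interval convention and the data layout are hypothetical and are not taken from ImGQFinder or the array annotation.

```python
from bisect import bisect_left, bisect_right


def cpgs_in_motifs(cpg_positions, motif_intervals):
    """Map each motif interval to the CpG positions that fall inside it.

    cpg_positions   : iterable of integer coordinates on one chromosome
    motif_intervals : iterable of (start, end) tuples, both ends inclusive (assumed)
    Returns a dict {(start, end): [positions]} containing only motifs with hits.
    """
    cpgs = sorted(cpg_positions)
    hits = {}
    for start, end in motif_intervals:
        lo = bisect_left(cpgs, start)   # first CpG at or after start
        hi = bisect_right(cpgs, end)    # one past the last CpG at or before end
        if hi > lo:
            hits[(start, end)] = cpgs[lo:hi]
    return hits
```

Sorting once and using binary search keeps the lookup fast even for hundreds of thousands of array probes.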
Sequences of the selected genomic fragments (methylated and demethylated variants), as well as model ONs, are provided in Table 1. G4 oplah1 is a fragment of the 5-oxoprolinase gene OPLAH; IMs Lmo1 and Fli1 are fragments of the respective transcription factor-coding genes, and ppp is a fragment of the protein phosphatase 1 gene PPP1R16. Model ONs Ib (IM), mbb (G4) and their derivatives were designed to contain methylation sites at different positions: in G4/IM loops, at loop/G-tetrad boundaries or inside the IM core (Me-C in hemiprotonated C-C pairs).

Secondary structure verification
The G4 and IM structures were characterized by optical methods. The results are summarized in Figures 1 and 2 for genomic and model ONs, respectively.
Thermal difference and circular dichroism spectra (TDS and CD) were used to confirm G4/IM formation. Rotational relaxation time (RRT) assay with EtBr as a fluorescent intercalator was used to distinguish between inter- and intramolecular structures (RRT is roughly proportional to the hydrodynamic volume of the intercalator/ON complex). UV melting experiments were performed to assess thermal stabilities of the structures, and the melting temperatures (Tm) are provided in Table 1.
CD spectra and TDS of oplah1 (top panels in Figure 1A) and mbb (top panels in Figure 2A) contain specific signatures of G4 DNA [22,23]. Positive CD bands at 265 and 295 nm point to hybrid G4 topologies. RRTs of EtBr in complexes with the ONs (left bottom panels in the figures) suggest intramolecular folding. Melting profiles (right bottom panels) indicate moderate and high thermal stabilities of oplah1 and mbb, respectively.
ONs ppp, Fli1, Lmo1 and Ib adopt thermodynamically stable intramolecular IMs, as evident from Figure 1B and Figure 2B. Characteristic features of IM structures are present in the CD spectra (positive band at 285 nm) and TDS (negative band at 295 nm); RRT values are close to the calibration line (this implies intramolecular folding) in all cases except Lmo1. The EtBr/Lmo1 RRT value is reduced, presumably due to a particularly compact folding of the Lmo1 IM.

Methylation effects on thermal stabilities of the secondary structures
Methylation of the genomic G4 oplah1 at the loop/tetrad boundary enhanced thermal stability of the structure, as evident from the somewhat shifted melting curve and increased CD amplitude of M_oplah1 (ΔTm ≈ +10 °C). The general shapes of oplah1 and M_oplah1 CD spectra are rather similar, suggesting no principal changes in the G4 topology (Figure 1A).
In the case of the model G4 mbb, both loop CpG methylation and loop/tetrad CpG methylation had minor effects on Tm values, while the portion of the folded structure appears to decrease with increasing methylation rate (see CD amplitudes and hyper/hypochromism changes in Figure 2A). The latter effect is particularly pronounced for Me-CpG sites at loop/tetrad boundaries. The extensively methylated G4 GQall seems to be mostly unfolded (the characteristic negative TDS band at 295 nm is absent) despite the relatively high apparent Tm value. The increase of RRT values with increasing methylation rate can be attributed to the fact that methylation per se contributes to the G4 hydrodynamic volume. However, partial aggregation, i.e., intermolecular (possibly non-G4) folding of heavily methylated mbb derivatives, cannot be excluded.
As concerns IMs, methylation had moderate (mostly stabilizing) effects on the genomic structures (Figure 1B) and profound stabilizing effects on the model structure (Figure 2B) for both core and loop cytosine modifications (ΔTm ≈ +10 °C for a single Me-C in the core; further modification had less pronounced effects). Overall, the results obtained for the IM structures agree with the data in the literature, except that no negative impact of extensive methylation [14] was observed. Subsequent in-depth analysis was performed for IMs exclusively.

Molecular crowding effects
We questioned whether the substantial influence of CpG methylation on IM thermal stabilities is maintained under molecular crowding conditions. Polyethylene glycol of a relatively low average molecular weight (PEG 200) was used as a crowding agent (higher molecular weight PEG is known to form nonspecific complexes with DNA, which may interfere with IM folding [24]). As evident from Figure 3A, PEG 200 tends to mitigate (Lmo1 and ppp) or even reverse (Ib and Fli1) methylation effects in IMs. Genomic IMs (both methylated and demethylated) exhibited decreased Tm values in the presence of PEG, and the opposite effect was observed for model structures.
According to the data in the literature, crowding usually has positive impacts on IM thermal stabilities at near-physiological pH [24,25]. Stabilization is sometimes attributed to general

Comparative analysis of B-DNA and i-motif stabilities
To clarify whether methylation may shift the B-DNA-ncDNA equilibrium, we compared thermal stabilities of IMs and respective duplexes for both methylated and demethylated variants. To obtain duplexes, the IM mixtures with complementary strands (1:1) were annealed slowly, and CD spectra were recorded to verify B-DNA folding. Spectra of the duplexes contained B-DNA signatures rather than IM/G4 signatures or their superposition at both acidic and neutral pH. Notable exceptions included Lmo1 and its methylated analog M_Lmo1 (under acidic non-crowded conditions these remarkably stable IMs appear to dominate even in the presence of the complementary strand).
All duplexes (demethylated variants in particular) displayed high thermal stabilities under non-crowded conditions (Figure 3B). PEG destabilized the duplexes in most cases, which is in line with the previously published data [26]. Predictably, duplexes are in general thermodynamically more favorable than IMs even at pH 5.5, irrespective of the methylation status. However, methylation alters the Tm ratio and free energies of secondary structure formation from single strands. This implies a possible impact on the activation energy of the B-DNA↔ncDNA rearrangements. We addressed this matter for the model ON Ib and its extensively methylated analog I4P using a simplified approach (possible intermediates other than ssDNA were not taken into account). Enthalpic and entropic contributions to free energies of IM or dsDNA formation from single strands were obtained from the melting curves using a two-state model [28], and the free energies were normalized to the physiological temperature. The results are summarized in Figure 4. Methylation of Ib increased the energetic barrier for the IM→duplex transition from 0.6 to 2.5 kcal/mol at 37 °C in a non-crowded microenvironment (left panels in Figure 4). The effects were reversed in the presence of PEG: methylation lowered the IM→duplex barrier from 1.8 to 0.6 kcal/mol and prohibited reverse transitions (right panels in Figure 4).
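The two-state bookkeeping behind such estimates can be sketched as follows. The code only illustrates the textbook relations ΔG(T) = ΔH − TΔS and, for an intramolecular transition, ΔS = ΔH/Tm; the function names and the numerical inputs are hypothetical and are not the values obtained for Ib or I4P.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def entropy_from_tm(dh_kcal, tm_celsius):
    """Intramolecular two-state folding: dG(Tm) = 0, hence dS = dH / Tm (kcal/(mol*K))."""
    return dh_kcal / (tm_celsius + 273.15)


def free_energy(dh_kcal, ds_kcal_per_k, temp_celsius=37.0):
    """Free energy of folding at the given temperature: dG = dH - T*dS (kcal/mol)."""
    return dh_kcal - (temp_celsius + 273.15) * ds_kcal_per_k


def fraction_folded(dg_kcal, temp_celsius=37.0):
    """Equilibrium folded fraction for an intramolecular two-state transition."""
    k_fold = math.exp(-dg_kcal / (R_KCAL * (temp_celsius + 273.15)))
    return k_fold / (1.0 + k_fold)


# Hypothetical IM: folding enthalpy of -50 kcal/mol and Tm of 60 °C
ds = entropy_from_tm(-50.0, 60.0)
dg_37 = free_energy(-50.0, ds, 37.0)
```

The relation ΔS = ΔH/Tm holds only for intramolecular structures; for bimolecular duplexes Tm also depends on the total strand concentration, so the entropy term has to be corrected accordingly.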

Conclusions
We have shown that CpG methylation has diverse and in some cases pronounced effects on thermal stabilities of ncDNA structures and ncDNA-dsDNA transition probabilities. Substantial stabilization upon single-point methylation was demonstrated for a G4 structure and two IM structures representing genomic fragments with differential methylation in normal and colorectal cancer cells, but the crowding agent mitigated those stabilizing effects. Extensive methylation of the model IM structure caused dramatic stabilization. Importantly, the effect was reversed in a crowded microenvironment.
Our findings support the hypothesis of methylation-induced conformational rearrangements of genomic DNA and illustrate the importance of accounting for crowding effects.