Searching HPV genome for methylation sites involved in molecular progression to cervical precancer

Background: Human papillomavirus (HPV) is considered the main cause of cervical cancer. In this study we investigated epigenetic changes, in particular methylation of specific sites of the HPV genome. The main goal was to correlate methylation status with histological grade and to determine its accuracy in predicting disease severity by establishing optimum methylation cutoffs. Methods: In total, sections from 145 cases genotyped as HPV16 were obtained from formalin-fixed, paraffin-embedded tissue of cervical biopsies, conization or hysterectomy specimens. Highly accurate pyrosequencing of bisulfite-converted DNA was used to quantify the methylation percentages of the UTR promoter, enhancer and 5' UTR, E6 CpGs 494, 502, 506 and E7 CpGs 765, 780, 790. The samples were separated into different groupings based on the histological outcome. Statistical analysis was performed with SAS 9.4 for Windows and methylation cutoffs were identified using the MATLAB programming language. Results: The most important methylation sites were at the enhancer, especially UTR sites 7535 and 7553. Specifically for CIN3+ (i.e. HSIL or SCC) discrimination, a balanced sensitivity vs. specificity (68.1% and 66.2%, respectively), with positive predictive value (PPV) and negative predictive value (NPV) of 66.2% and 68.2%, respectively, and overall accuracy of 67.1%, was achieved for UTR 7535 methylation at a cutoff of 6.1%. For UTR 7553, a sensitivity of 60.9%, specificity of 69.0%, PPV of 65.6%, NPV of 64.5% and overall accuracy of 65.0% were observed at a threshold of 10.1%. Conclusion: The viral HPV16 genome was found to be methylated at the NF-1 binding sites of the UTR in cases with high-grade disease. Methylation percentages of E6 and E7 CpG sites were elevated in the cancer group.


Introduction
Cervical cancer is the fourth most common cancer in women, and the seventh overall, with an estimated 528,000 new cases in 2012. A large majority (around 85%) of the global burden occurs in the less developed regions, where it accounts for almost 12% of all female cancers. Since 1999, human papillomavirus (HPV) has been considered the main cause of cervical cancer [1]. Over 100 different types of the virus have been isolated from human samples, of which 20 are considered oncogenic and have been epidemiologically linked to this cancer [2].
A persistent HPV infection, aided by other parameters, results in HPV integration and progression to high-grade lesions. Most HPV infections are cleared by the host's immune system before they progress to lesions. Prophylactic HPV vaccines and traditional Pap-smear screening are undoubtedly capable of decreasing the incidence and mortality of cervical cancer. However, a large number of women succumb to the disease each year because of late diagnosis and resistance to conventional treatments. Nowadays, the more extensive use of modern molecular biological methods has added to cervical cancer screening approaches [3]. To understand the biology of this tumor, it is of utmost importance to analyze its molecular dynamics, which may help to improve the clinical outcome. Epigenetics is a well-established phenomenon that plays a major role in virus-associated neoplasms [4,5], and one of the most widely studied epigenetic changes is DNA methylation. During cervical carcinogenesis, substantial changes in methylation are observed in both the host cell and the viral genome. Methylation of the viral DNA has recently been proposed as a novel biomarker with encouraging results [6]. The quantification of the percentage of cytosines with a covalently added methyl group at individual CpG (Cytosine-phosphate-Guanine) dinucleotides reflects the degree of epigenetic change in the viral genome. Many studies, mainly focused on the L1 viral gene, have already shown that the methylation percentage of HPV 16 specific CpG sites along this viral gene increases gradually and is highest in women with high-grade cervical neoplasia [7,8,9,10].
This study aims to assess whether quantitative measurement of methylation of CpGs along the HPV 16 UTR, E6 and E7 genes, could predict the presence of high-grade disease at histology in women testing positive for the HPV 16 genotype. More specifically, we aim to correlate specific sites with the histological grade and to determine their accuracy in predicting the disease severity by establishing optimum methylation cutoffs.

Material and Methods
In total, sections from 145 non-pregnant women, 21-62 years of age genotyped as HPV16 were obtained from formalin-fixed, paraffin-embedded tissue of cervical biopsies, conization or hysterectomy specimens of females that visited the gynecology clinic of Attikon University General Hospital, Athens, Greece, between May 2014 and May 2016. Women were included irrespective of their ethnicity, smoking habits, phase in their cycle, menopausal status and contraceptive practices. Women who were HIV or hepatitis B/C positive, with autoimmune disorders, or had a previous history of cervical treatment were excluded. The histological diagnoses included the following groups: normal, CIN1, CIN2, CIN3, squamous cell carcinoma (SCC), and adenocarcinomas.
DNA was extracted from formalin-fixed, paraffin-embedded tissue sections using the QIAamp DNA FFPE Tissue Kit (Qiagen, Heidelberg Germany). All steps in the purification procedure were done using the automated QIAcube technology (Qiagen, Heidelberg Germany). DNA typing was performed with the HPV Genotypes 14 Real-TM Quant (Sacace Biotechnologies, Como Italy) for the quantitative detection and genotyping of Human Papillomavirus types (16, 18, 31, 33, 35, [...]). DNAs were then bisulfite converted using the EpiTect Bisulfite Kit (Qiagen, Heidelberg Germany), according to the manufacturer's instructions, and stored at -80°C. Biotin-labeled primer sets, sequencing primers and polymerase chain reaction conditions were used as previously described [11] to amplify the HPV16 UTR region, while biotin-labeled primer sets and sequencing primers for the E6 and E7 genomic regions were designed in the present study with the aid of PyroMark Assay Design 2.0 ([...]), which provides a site-specific quantification of methylation at individual CpG sites. Pyrosequencing protocols were first applied to bisulfite-converted SiHa cells. This cervical cell line contains a single genome of HPV-16 integrated into chromosomal DNA, which is completely unmethylated at the LCR, E6 and E7 genomic regions [12], a finding that was confirmed by our protocols. To establish a limit of blank for each specific methylation site, we performed a series of ten measurements using SiHa DNA. The highest value reported in the series of measurements was 3.4%. DNA from SiHa cells was run in every experiment as an unmethylated control. Because no artificially methylated control was available for the studied CpG dinucleotides spanning the HPV16 genome, we randomly chose a clinical sample that was highly methylated at all sites and performed ten independent experiments to check the reproducibility of our protocols in detecting methylation. The measurements had at most 3.2% standard deviation around the mean value.
The UTR methylation analysis included CpG sites located within the p97 HPV 16 promoter (31, 37, 43, 52, 58, 7862), the enhancer (7535, 7553) and the 5' UTR (7270, 7461, 7455, 7428). The E6 methylation analysis included CpGs 494, 502 and 506, while the E7 methylation analysis included CpGs 765, 780 and 790 (reference sequence NC_001526).

Statistical analysis
The statistical analysis was performed by programming in SAS 9.4 for Windows (SAS Institute Inc. NC, USA) [13,14]. Microsoft Excel for Windows was used for data storage and preprocessing. We applied Student's t-test to examine whether the methylation percentages differed significantly between the various histological groupings. Finally, the algorithms for the determination of the optimum threshold values were developed in-house within the MATLAB software environment and programming language (The MathWorks, Inc. Natick, Massachusetts, U.S.A.).
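As an illustration of the group comparison mentioned above, the following minimal Python sketch computes the classical pooled-variance Student's t statistic for two groups of methylation percentages. It is not the SAS code used in the study; the function name `student_t` and the example values are hypothetical, and the equal-variance form is an assumption, since the text does not state which variant of the test was applied.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b):
    """Pooled-variance (equal-variance) two-sample t statistic and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical methylation percentages for two histological groupings.
high_grade = [8.2, 12.5, 6.9, 15.1, 9.4]
low_grade = [3.1, 5.0, 4.4, 2.8, 6.2]
t_stat, dof = student_t(high_grade, low_grade)
print(f"t = {t_stat:.3f}, df = {dof}")
```

Turning the statistic into a p-value requires the t distribution with the returned degrees of freedom; a statistics package would normally be used for that step.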
We calculated different accuracy parameters for the ability of the mean methylation to detect the presence of disease for the previous histological cutoffs. These included the sensitivity (Sens), specificity (Spec), positive (PPV) and negative predictive value (NPV), the false positive (FPR) and false negative (FNR) rate, the overall accuracy (OA) and the positive (PLR) and negative likelihood ratio (NLR). These parameters were calculated starting from a methylation percentage of 0% and increasing up to 100% in steps of 0.1%, as described in our previous studies [7,15,16,17]. Subsequently, graphs depicting sensitivity, specificity and overall accuracy for all methylation positions were produced, and the above measures were reported for two thresholds: the one that maximizes OA and the one that produces a more balanced result, i.e. where the difference between sensitivity and specificity is minimal.
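The threshold search described above can be sketched in Python as follows. This is not the authors' in-house MATLAB code. The names `Performance`, `evaluate` and `choose_thresholds`, the example data, and the rule that a sample is called positive when its methylation is greater than or equal to the threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Performance:
    threshold: float
    sens: float
    spec: float
    ppv: float
    npv: float
    accuracy: float


def evaluate(values, diseased, threshold):
    """Accuracy parameters when 'methylation >= threshold' is called positive (assumed rule)."""
    pairs = list(zip(values, diseased))
    tp = sum(1 for v, d in pairs if v >= threshold and d)
    fp = sum(1 for v, d in pairs if v >= threshold and not d)
    fn = sum(1 for v, d in pairs if v < threshold and d)
    tn = sum(1 for v, d in pairs if v < threshold and not d)

    def ratio(a, b):
        return a / b if b else float("nan")  # undefined when the denominator is empty

    return Performance(threshold, ratio(tp, tp + fn), ratio(tn, tn + fp),
                       ratio(tp, tp + fp), ratio(tn, tn + fn), ratio(tp + tn, len(pairs)))


def choose_thresholds(values, diseased, steps_per_percent=10):
    """Sweep 0% to 100% in 0.1% steps; return (max-accuracy, balanced) operating points."""
    grid = [i / steps_per_percent for i in range(100 * steps_per_percent + 1)]
    results = [evaluate(values, diseased, t) for t in grid]
    best = max(results, key=lambda r: r.accuracy)                 # maximizes overall accuracy
    balanced = min(results, key=lambda r: abs(r.sens - r.spec))   # minimal |Sens - Spec|
    return best, balanced


# Hypothetical methylation percentages and CIN3+ status (True = CIN3+).
values = [1.0, 2.5, 3.0, 4.2, 5.9, 6.5, 7.1, 8.0, 12.4, 20.3]
diseased = [False, False, False, True, False, True, False, True, True, True]
best, balanced = choose_thresholds(values, diseased)
print(best)
print(balanced)
```

When several thresholds tie, this sketch keeps the lowest one; the text does not say how ties were resolved, so that choice is arbitrary.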
To identify possible correlations of age or disease severity with the methylation levels and the mean methylation per region, the Spearman correlation coefficient (Rs) was used. For disease severity, the histological status was numerically coded as 1 to 5 for the histological categories normal, CIN1, CIN2, CIN3 and carcinomas (SCC and adenocarcinomas), respectively.
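The severity coding and rank correlation described above can be illustrated with a short, dependency-free Python sketch. It computes Spearman's Rs as the Pearson correlation of average ranks, which handles the tied grades. The names `GRADE_CODE`, `average_ranks`, `pearson` and `spearman` and the example data are hypothetical; the study itself used SAS.

```python
from statistics import mean

# Numeric coding of histology as described in the text (1 = normal ... 5 = carcinoma).
GRADE_CODE = {"normal": 1, "CIN1": 2, "CIN2": 3, "CIN3": 4, "carcinoma": 5}


def average_ranks(xs):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and xs[order[end + 1]] == xs[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(xs, ys):
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman's Rs: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical samples: histological grade and methylation percentage at one CpG site.
grades = ["normal", "CIN1", "CIN2", "CIN3", "carcinoma"]
methylation = [2.0, 3.5, 3.1, 8.0, 15.2]
rs = spearman([GRADE_CODE[g] for g in grades], methylation)
print(f"Rs = {rs:.2f}")
```

The p-values reported alongside Rs in the text would come from a statistics package and are not reproduced by this sketch.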

Results
The methylation of E6 CpG sites ranged from 8.0±6.9 to 40.6±27.8 among the histological groups (Table 3), and the methylation of E7 CpG sites ranged from 3.6±1.8 to 29.6±27.2 among the various histological groups (Table 3).

Methylation and age
Spearman correlation analysis indicated that neither any specific site of the 5' UTR region nor the mean methylation of all 5' UTR sites was related to the women's age. For the enhancer region, a rather low correlation of age with site 7535 was found (Rs=0.23, p=0.0053), while the correlation for site 7553 was Rs=0.15 (p=0.0802) and for the mean methylation of the enhancer region Rs=0.20 (p=0.0248). Among the promoter sites, age showed a weak correlation with methylation levels at site 7862 (Rs=0.18504, p=0.0280), and there was no correlation with the remaining sites or with the mean methylation of the promoter region. Furthermore, no significant relation of age with the methylation of individual E6 or E7 sites, or with the mean methylation of the E6 or E7 region, was identified (p>0.05 for all cases).

Methylation and disease severity
The statistical tests correlating disease severity (coded numerically) with methylation percentages indicated generally weak correlations. Specifically, for the 5' UTR sites 7270, 7461 and 7428 and for the mean methylation of all 5' UTR sites, the correlation was Rs=0.22, 0.36, 0.24 and 0.29, respectively (p<0.05), while for sites 7535 and 7553 and for the mean enhancer methylation a correlation of Rs=0.33, 0.29 and 0.35, respectively (p<0.001), was found. A negative correlation was indicated for site 43 of the promoter (Rs=-0.23, p=0.0052). As far as E6 is concerned, a weak correlation existed for sites 494 and 502 and for the mean E6 methylation (Rs=0.18, 0.20 and 0.17, respectively, p<0.05). Finally, for site 790 of the E7 region the correlation was Rs=0.19 (p=0.0265).

Calculation of thresholds discriminating between histological groups and performance characteristics
Accuracy parameters were determined for different histological cut-offs. We searched for a methylation threshold optimizing overall accuracy as well as for a threshold producing a balanced sensitivity vs. specificity (Table 4). The most important methylation sites identified were on the enhancer region, at sites 7535 and 7553. Specifically for CIN3+ discrimination, a balanced sensitivity vs. specificity (68.1% and 66.2%, respectively), with positive predictive value (PPV) and negative predictive value (NPV) of 66.2% and 68.2%, respectively, and overall accuracy of 67.1%, was achieved for UTR 7535 methylation at a cutoff of 6.1%. For UTR 7553, a sensitivity of 60.9%, specificity of 69.0%, PPV of 65.6%, NPV of 64.5% and overall accuracy of 65.0% were observed at a threshold of 10.1%. For the remaining methylation sites we also calculated the thresholds that maximized overall accuracy or produced a balanced sensitivity vs. specificity. However, these sites did not perform well (data not shown).
Table 4. Performance characteristics for discriminating CIN3+ cases using two cut-offs, one that maximizes overall accuracy (Optimal threshold) and one that balances sensitivity vs. specificity (Balanced threshold).
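The positive and negative likelihood ratios listed among the accuracy parameters follow directly from sensitivity and specificity. The short Python sketch below applies the standard definitions, PLR = Sens / (1 - Spec) and NLR = (1 - Sens) / Spec, to the rounded percentages quoted above for CIN3+ discrimination. The function name `likelihood_ratios` is hypothetical, and because the inputs are rounded, the results may differ slightly from values computed on the raw counts.

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios from sensitivity and specificity (as fractions)."""
    plr = sensitivity / (1 - specificity)
    nlr = (1 - sensitivity) / specificity
    return plr, nlr


# Sensitivity and specificity as reported in the text for CIN3+ discrimination.
reported = {
    "UTR 7535 (cutoff 6.1%)": (0.681, 0.662),
    "UTR 7553 (cutoff 10.1%)": (0.609, 0.690),
}

for site, (sens, spec) in reported.items():
    plr, nlr = likelihood_ratios(sens, spec)
    print(f"{site}: PLR = {plr:.2f}, NLR = {nlr:.2f}")
```

For both sites the positive likelihood ratio comes out near 2 and the negative likelihood ratio near 0.5, which is consistent with the moderate discrimination described in the text.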

Discussion
Many epigenetic alterations, including DNA hypomethylation, hypermethylation of tumour suppressor genes and histone modifications, are observed during all stages of cervical cancer [18]. One of the epigenetic mechanisms increasingly studied is methylation of HPV genes. At present, researchers report a consistent correlation of increased methylation of capsid viral genes with histological severity [7,9,11,19,20]. On the other hand, studies on the methylation status of the UTR, E6 and E7 regions reveal heterogeneous and rather inconclusive results [21,22,23,24,25].
DNA methylation has been recognized as a frequent event in cervical cancer and, as such, is regarded as a valuable tool in the early detection of precancerous disease. In the present study we attempted to elucidate the methylation profile of the HPV 16 genome at 4 CpG sites of the 5' UTR, 2 of the enhancer, 6 of the promoter, 3 of the E6 gene and 3 of the E7 gene in clinical specimens of different severities in a population of Greek women. The studied sites were located in the 5' UTR at nt 7270, 7461, 7455 and 7428, in the enhancer at 7535 and 7553, in the promoter at 7862, 31, 37, 43, 52 and 58, in E6 at 494, 502 and 506, and in E7 at 765, 780 and 790.
The HPV16 UTR plays an important role in the regulation of viral gene expression. The HPV 16 E6 and E7 oncogenes are transcribed from the P97 promoter, which is located at the 3' UTR and is regulated by products of the viral E2 gene through a feedback mechanism. The transcription of the viral oncogenes also depends on the activity of the enhancer, which is located between positions 7454 and 7854 and acts as a cis-acting element that drives the transcription of the early HPV genes [26]. Several host transcription factors, such as AP-1, NF1, SP1, TFIID, TF1 and Oct-1, bind to specific sites of this viral region, triggering the over-production of E6 and E7 oncoproteins and gradually leading to neoplastic progression [27,28]. There are studies supporting that this region is highly or moderately methylated [29,30,31,32] and that the methylation is associated with the severity of cervical neoplasia. It is assumed that if the E2 viral protein fails to bind at specific sites because cytosines within its binding site are methylated, the repression of oncogene transcription will be diminished. On the other hand, several studies claim that this region has an overall low percentage of methylation and that there is no correlation of methylation with different severities of cervical carcinogenesis [10,21,24,33,34]. According to the results of the present study, the mean methylation of the HPV 16 UTR was consistently low across the different histological groups. The only sites with remarkable results were 7535 and 7553, mapped at the enhancer, where a correlation with CIN3+ can be proposed. These sites are part of the binding positions of NF-1 [22,35,36]. One could assume that such an increase in methylation could eliminate NF-1 binding activity, affecting the enhancer's cis-acting efficiency for the transcription of the early HPV genes.
However, according to our results, although these specific enhancer sites become methylated as the disease progresses, we can tentatively suggest that the massive production of E6 and E7 oncoproteins is not affected by the addition of methyl groups at these sites, and that these sites may be used selectively to discriminate CIN3+ cases.
In the present study we also investigated the methylation of CpG sites located in the E6 and E7 viral genes, specifically in those regions that are considered to be immunostimulatory motifs [25]. According to Hacker et al. [37], a sequence motif that contains CpGs has the capacity to stimulate certain immune cells. As far as HPV 16 is concerned, it has been shown that TLR9 is capable of recognizing a CpG motif between nt 496 and 514 of the E6 gene [38], and methylation of the sites located in this genomic part could possibly lead to an escape from immune surveillance, thus having a significant biological impact on cervical carcinogenesis. One of the objectives of Sen et al. [25] was to analyze the influence of methylation within two immunostimulatory CpG motifs of the HPV16 E6 and E7 genes on cervical carcinogenesis. Elevated methylation was shown in cervical cancer samples, with higher proportions in samples that had an integrated HPV 16 infection. In the present study, although we have no information on the integration of the viral genome into the host genome, the percentages of methylation at CpG sites of the E6 and E7 genes were elevated in the cervical cancer histology group compared with precancer samples, a finding that is in accordance with published results. As the statistical analysis revealed, these specific sites of the E6 and E7 genes cannot be proposed as biomarkers that could distinguish precancerous HPV 16 infections that have a true oncogenic potential from those that will resolve without leading to disease.
In conclusion, knowledge of the alterations of the viral genome during the viral life cycle adds valuable information for understanding the biology of cervical cancer and for exploring new biomarkers. Different methodological approaches to the methylation study of viral genes may frequently lead to inconsistent results, so the scientific community should continue to report findings from this area of HPV research. According to our results, methylation of the HPV 16 UTR is not highly associated with the severity of cervical neoplasia. However, two specific sites mapped at the enhancer region of the UTR could probably act as biomarkers and molecular determinants that distinguish the rare HPV 16 infections with a true oncogenic malignant potential from the common HPV 16 infections that resolve spontaneously without leading to disease. Because only a modest number of samples were studied, the reproducibility of these results should be assessed in large validation sets. Future studies should also analyze serial samples from larger cohorts to further assess the value of methylation as a predictive and diagnostic molecular determinant.