Association of microRNA biosynthesis genes XPO5 and RAN polymorphisms with cancer susceptibility: Bayesian hierarchical meta-analysis

The XPO5/RAN-GTP complex mediates the nuclear transport of pre-miRNAs in the miRNA processing system, and its altered expression has been reported to correlate with cancer risk. Several studies have examined the association between XPO5 or RAN polymorphisms and the risk of various cancers, but the findings remain controversial. A Bayesian hierarchical meta-analysis was carried out to review and analyze the effect of XPO5 and RAN polymorphisms on cancer risk. The association was estimated by calculating the logarithm of the odds ratio (Log OR) and its 95% credible interval (95% CrI). Expression quantitative trait loci (eQTL) analysis was used for in silico functional validation of the significant susceptibility loci identified. In total, 38 case-control studies (from 27 citations) with 27,459 cancer cases and 25,151 controls were included in the meta-analysis of the five most prevalent SNPs (rs11077 A/C, rs2257082 G/A, rs3803012 A/G, rs14035 C/T, rs3809142 C/T). For the XPO5 gene rs11077 SNP, the minor C allele significantly increased the risk of cancer (Log OR = 0.120, 95% CrI = 0.013, 0.241), and a strong association between rs11077 and cancer risk was also found in the dominant model (CC + AC vs. AA: Log OR = 0.132, 95% CrI = 0.009, 0.275). In addition, the homozygous GG genotype of the RAN gene rs3803012 SNP significantly increased cancer risk (Log OR = 0.707, 95% CrI = 0.059, 1.385). A statistically significant association between rs3803012 and cancer risk was also observed in the recessive model (GG vs. AG + AA: Log OR = 0.708, 95% CrI = 0.059, 1.359). Furthermore, the eQTL analysis revealed that rs11077 was significantly correlated with XPO5 mRNA expression, which provides an additional biological basis for the observed positive association. Our results suggest that XPO5 rs11077 may be a functional susceptibility locus for cancer risk.


Introduction
MicroRNAs (miRNAs) are a highly conserved class of small, noncoding RNAs that mediate post-transcriptional gene silencing [1]. Over the past decade, they have increasingly been recognized as being involved in the initiation and progression of human carcinogenesis [2]. The biosynthesis of miRNAs is a multistep process that starts in the nucleus of the cell, where miRNA genes are initially transcribed as primary miRNAs (pri-miRNAs) and then converted into precursor miRNAs (pre-miRNAs). Subsequently, with the assistance of the GTP-binding nuclear protein Ran/exportin-5 (XPO5) complex, the pre-miRNAs are exported from the nucleus to the cytoplasm, where the mature miRNA molecule exerts its main function [3,4].

Ivyspring International Publisher
XPO5 and RAN are crucial components of miRNA maturation: the XPO5/RAN-GTP complex mediates the nuclear transport of pre-miRNAs.
XPO5, a member of the nucleo-cytoplasmic exportins, is related to the human export receptor that uses the Ran-GTPase to control cargo association [5,6]. Studies have shown that overexpression of XPO5 improves transport efficiency and further enhances miRNA activity, while downregulation of XPO5 leads to a loss of pre-miRNA function [7,8]. RAN encodes a small G protein that is crucial for the translocation of RNA and proteins through the nuclear pore complex. If RAN is depleted, the export of pre-miRNA is greatly reduced [9,10]. Therefore, impaired miRNA processing caused by dysregulated expression of the miRNA biosynthesis genes XPO5 or RAN can noticeably promote tumorigenesis [11].
Increasing evidence shows that single-nucleotide polymorphisms (SNPs) in core components of miRNA biogenesis may impair or enhance miRNA processing efficiency or function, and may thereby act in an oncogenic or tumor-suppressive manner [12]. Several studies have assessed the association between XPO5 and RAN SNPs and cancer susceptibility in diverse populations. However, their conclusions remain inconsistent. Hence, a meta-analysis is required to combine data from all the individual studies and obtain a more comprehensive and effective estimate. A previous study reviewed the relationship between polymorphisms of miRNA processing genes and cancer risk using a classical meta-analysis approach [13]. However, its results did not indicate a correlation between SNPs in the XPO5 and RAN genes and cancer risk. This might be a result of the small number of articles included. Indeed, classical meta-analysis requires the sample to be large enough to ensure the asymptotic normality of the effect size and thus to obtain accurate and realistic results [14]. Bayesian hierarchical meta-analysis, by contrast, provides a more accurate pooled effect size than classical meta-analysis approaches, especially when the number of studies is small [15]. Therefore, in the present study, we carried out a Bayesian hierarchical meta-analysis that includes newly published articles to obtain a clear and precise estimate of the association between SNPs in the XPO5 and RAN genes and cancer risk based on all available eligible studies. We also used expression quantitative trait loci (eQTL) analysis to validate the potential function of the significant susceptibility loci identified.

Retrieval strategy
To identify all potentially eligible publications, PubMed, PubMed Central (PMC), Web of Science, Embase, China National Knowledge Infrastructure (CNKI), Chinese Wanfang databases, Wiley, Google Scholar, Cochrane, and the Cochrane Central Register of Controlled Trials were searched using a combination of the following keywords: 'XPO5/exportin 5/exp5/RAN/ARA24/Gsp1/TC4'; 'SNP/polymorphism/variation/variant'; and 'tumor/cancer/carcinoma/neoplasm'. The search was limited to articles published in English or Chinese through April 9, 2019. References of the relevant literature and review articles were also evaluated to identify all potentially eligible articles. This meta-analysis was carried out in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement (Supplementary PRISMA 2009 Checklist) [16].
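As an illustration of how the three keyword groups above combine, the following minimal sketch joins the terms within each group with OR and the groups with AND. The helper names and the generic boolean syntax are assumptions made for illustration; each database has its own query syntax, and the exact query strings used in this study are not given in the text.

```python
# Hypothetical helpers (not from the study): build one generic boolean query
# from the three keyword groups listed in the retrieval strategy.
GENE_TERMS = ["XPO5", "exportin 5", "exp5", "RAN", "ARA24", "Gsp1", "TC4"]
VARIANT_TERMS = ["SNP", "polymorphism", "variation", "variant"]
DISEASE_TERMS = ["tumor", "cancer", "carcinoma", "neoplasm"]


def or_group(terms):
    """Join synonyms with OR; quote multi-word terms so they stay phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(*groups):
    """Require one hit from every group by joining the groups with AND."""
    return " AND ".join(or_group(g) for g in groups)


query = build_query(GENE_TERMS, VARIANT_TERMS, DISEASE_TERMS)
print(query)
```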

Inclusion and exclusion criteria
Eligible publications were selected based on the following inclusion criteria: (i) evaluation of the genetic association between XPO5 or RAN and susceptibility to cancer; (ii) a case-control study design. Articles meeting any of the following criteria were excluded: (i) reviews, meta-analyses, conference reports, or editorial articles; (ii) duplicate records; (iii) no available data to extract; (iv) control subjects departing from Hardy-Weinberg equilibrium (HWE).
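Exclusion criterion (iv) can be checked with a chi-square goodness-of-fit test on the control genotype counts. The sketch below is one common way to do this and is an assumption, since the text does not state which HWE test or significance threshold was used; the function name and the example counts are illustrative only.

```python
import math


def hwe_chi_square(n_ww, n_wv, n_vv):
    """Chi-square test (1 df) of Hardy-Weinberg equilibrium in controls.

    Arguments are observed genotype counts (WW, WV, VV). Returns the
    chi-square statistic and its p-value.
    """
    n = n_ww + n_wv + n_vv
    p = (2 * n_ww + n_wv) / (2 * n)   # frequency of the W allele
    q = 1.0 - p                       # frequency of the V allele
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_ww, n_wv, n_vv)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Made-up control counts; 0.05 is an assumed threshold, not taken from the text.
chi2, p_value = hwe_chi_square(360, 480, 160)
departs_from_hwe = p_value < 0.05
```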

Data extraction and Quality assessment
The following information was extracted by two reviewers independently, and disagreements were resolved through discussion: first author's name, publication year, country, ethnicity, cancer type, polymorphisms, sample size of cases and controls, genotype distribution, source of control groups (population-based (PB) or hospital-based (HB)), genotyping method, and HWE in controls. If more than one type of cancer or a multistage design was involved in a single article, data for each type of cancer were extracted independently. When the data in eligible articles were unavailable, we contacted the corresponding authors to request the original data. Quality assessment of the articles was conducted using the Newcastle-Ottawa Quality Assessment Scale (NOS) [17]. NOS scores range from 0 to 9. To the best of our knowledge, there are no established cut-offs for low, moderate and high quality. Hence, we relied on previous literature [18] to define low quality as a score ≤ 5, moderate quality as a score of 6 or 7, and high quality as a score of 8 or 9.
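The NOS cut-offs described above translate directly into a small lookup. The function name is hypothetical; only the thresholds come from the text.

```python
def nos_quality(score):
    """Map a Newcastle-Ottawa Scale score (0-9) to the quality band used here."""
    if not 0 <= score <= 9:
        raise ValueError("NOS scores range from 0 to 9")
    if score <= 5:
        return "low"
    if score <= 7:
        return "moderate"
    return "high"
```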

Statistical methods
In this meta-analysis, XPO5 and RAN polymorphisms were evaluated under five common genetic models (W denotes the wild-type allele and V the variant allele): the allele model (V vs. W), heterozygote model (WV vs. WW), homozygote model (VV vs. WW), dominant model (WV + VV vs. WW), and recessive model (VV vs. WW + WV).
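To make the five comparisons concrete, the sketch below derives a Log OR and its standard error for each model from the genotype counts of one study. The function names, the 0.5 continuity correction for zero cells, and the example counts are assumptions for illustration; the text does not describe how zero cells were handled.

```python
import math


def log_or(a, b, c, d):
    """Log odds ratio and standard error from a 2x2 table.

    a, b: cases with / without the exposure; c, d: controls with / without.
    A 0.5 continuity correction is applied if any cell is zero (assumption).
    """
    if min(a, b, c, d) == 0:
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return math.log((a * d) / (b * c)), math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)


def genetic_models(case, control):
    """case, control: (WW, WV, VV) genotype counts for one study."""
    c_ww, c_wv, c_vv = case
    k_ww, k_wv, k_vv = control
    return {
        # V vs. W, counting two alleles per subject
        "allele": log_or(2 * c_vv + c_wv, 2 * c_ww + c_wv,
                         2 * k_vv + k_wv, 2 * k_ww + k_wv),
        "heterozygote": log_or(c_wv, c_ww, k_wv, k_ww),              # WV vs. WW
        "homozygote": log_or(c_vv, c_ww, k_vv, k_ww),                # VV vs. WW
        "dominant": log_or(c_wv + c_vv, c_ww, k_wv + k_vv, k_ww),    # WV+VV vs. WW
        "recessive": log_or(c_vv, c_ww + c_wv, k_vv, k_ww + k_wv),   # VV vs. WW+WV
    }


# Made-up counts, for illustration only.
results = genetic_models(case=(100, 100, 100), control=(200, 100, 50))
```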

Bayesian meta-analysis method
Bayesian meta-analysis is a modeling approach in which the model parameters are assigned a hierarchical prior distribution and statistical inference is then based on the resulting posterior distribution [19]. Compared with classical meta-analysis, the Bayesian hierarchical random-effects model can yield accurate pooled effects, especially when the number of studies is small [14,20-22]. A Bayesian approach allows the uncertainty in the heterogeneity parameter to be handled coherently while inference focuses on the effect parameters, and its results can be interpreted more intuitively [23]. To compare the effect magnitude between different studies, the pooled logarithmic odds ratio (Log OR), the between-study heterogeneity (τ²) and their respective 95% credible intervals (CrIs) were estimated. The 95% CrI is the Bayesian counterpart of the standard confidence interval. In particular, the model assumes that the mean of the Log ORs follows a weakly informative normal distribution (mean = 0, variance = 100) and that the variance of the Log ORs follows a weakly informative inverse-gamma distribution (0.01, 0.01) [24]. Sensitivity analyses with different choices of weakly informative prior distributions showed the robustness of this choice [20]. In addition, we estimated the I² statistic, which measures the proportion of total variation attributable to between-study heterogeneity [25]. Forest plots, which illustrate Log ORs and 95% CrIs for both the individual studies and the pooled results, were included in our meta-analysis. Moreover, the heterogeneity plot displays the joint posterior density of the two parameters, Log OR and τ, with darker shading corresponding to higher probability density. All statistical analyses were performed using the "bayesmeta" R package (https://cran.r-project.org/web/packages/bayesmeta/index.html).
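The model described above can be illustrated with a small grid approximation. This is a sketch under stated assumptions, not the bayesmeta implementation used in the study: it takes study-level Log ORs y_i with standard errors s_i, assumes y_i ~ N(μ, s_i² + τ²), places the N(0, 100) prior on μ and the inverse-gamma(0.01, 0.01) prior on τ², integrates μ out analytically, and evaluates the posterior of τ² on a grid in log(τ²). The grid limits, the equal-tailed interval, and the input data are illustrative choices.

```python
import math


def norm_cdf(x, mean, sd):
    return 0.5 * math.erfc(-(x - mean) / (sd * math.sqrt(2.0)))


def bayes_random_effects(y, se, mu_prior_var=100.0, a=0.01, b=0.01, n_grid=400):
    """Posterior median and equal-tailed 95% CrI of the pooled Log OR (mu)."""
    lo, hi = math.log(1e-5), math.log(1e2)        # assumed grid range for tau^2
    log_w, means, sds = [], [], []
    for k in range(n_grid):
        u = lo + (hi - lo) * k / (n_grid - 1)     # u = log(tau^2)
        tau2 = math.exp(u)
        v = [s * s + tau2 for s in se]            # marginal variance per study
        prec = 1.0 / mu_prior_var + sum(1.0 / vi for vi in v)
        s_wy = sum(yi / vi for yi, vi in zip(y, v))
        # log marginal likelihood of tau^2 with mu integrated out (prior mean 0)
        log_lik = (-0.5 * sum(math.log(2.0 * math.pi * vi) for vi in v)
                   - 0.5 * sum(yi * yi / vi for yi, vi in zip(y, v))
                   + 0.5 * s_wy ** 2 / prec
                   - 0.5 * math.log(mu_prior_var * prec))
        # inverse-gamma(a, b) density of tau^2, rewritten as a density in u
        log_prior = -a * u - b * math.exp(-u)
        log_w.append(log_lik + log_prior)
        means.append(s_wy / prec)                 # conditional posterior mean of mu
        sds.append(math.sqrt(1.0 / prec))         # conditional posterior sd of mu
    top = max(log_w)
    w = [math.exp(lw - top) for lw in log_w]
    total = sum(w)
    w = [wi / total for wi in w]

    def mu_cdf(x):                                # mixture of conditional normals
        return sum(wi * norm_cdf(x, m, s) for wi, m, s in zip(w, means, sds))

    def mu_quantile(prob):                        # bisection on the mixture CDF
        left, right = -20.0, 20.0
        for _ in range(80):
            mid = 0.5 * (left + right)
            if mu_cdf(mid) < prob:
                left = mid
            else:
                right = mid
        return 0.5 * (left + right)

    return {"median": mu_quantile(0.5),
            "cri95": (mu_quantile(0.025), mu_quantile(0.975))}


# Made-up study-level Log ORs and standard errors, for illustration only.
fit = bayes_random_effects([0.10, 0.20, 0.15, 0.05, 0.25], [0.1] * 5)
```

An association is read as significant in this framework when the 95% CrI of the pooled Log OR excludes 0. Note that bayesmeta may report a different interval type by default, so intervals from this sketch need not match its output exactly.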

In silico functional validation
To validate the potential impact of the cancer risk SNPs, we examined their association with the expression of the corresponding genes using eQTL databases. The eQTL analysis was performed using the genotyping and expression data of lymphoblastoid cells from 373 European individuals available in the 1000 Genomes Project [26]. Considering that many eQTLs are population-specific, we also extracted eQTL data of East Asian individuals from a study by Stranger et al. [27], in which genome-wide mRNA expression in lymphoblastoid cell lines of 726 individuals from eight global populations in the HapMap3 project was analyzed. Genotype and gene expression data for 17 hepatocellular carcinoma cases were obtained from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo) (GSE65373). Choosing hepatocellular carcinoma and breast cancer as representatives, we also downloaded mRNA sequencing datasets of 154 paired cancer tissue samples and normal adjacent tissue samples from The Cancer Genome Atlas (TCGA-LIHC and TCGA-BRCA) (https://tcga-data.nci.nih.gov/tcga/). A linear regression model was fitted to evaluate the correlation between SNPs and specific mRNA expression levels. A paired t-test was used to test for differences in gene mRNA expression levels between cancer tissue and adjacent normal tissue from the TCGA database. All analyses were performed using R (version 3.5.1).
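The two tests named above can be sketched as follows. This is an illustrative Python version, not the R code used in the study: the additive 0/1/2 coding of the genotype is an assumption, the function names are hypothetical, and the data are made up. Each function returns a t statistic; the p-value would be read from a t distribution with the stated degrees of freedom.

```python
import math


def eqtl_regression(dosage, expression):
    """Simple linear regression of expression on allele dosage (0, 1, 2).

    Returns the slope and its t statistic (n - 2 degrees of freedom).
    """
    n = len(dosage)
    mx = sum(dosage) / n
    my = sum(expression) / n
    sxx = sum((x - mx) ** 2 for x in dosage)
    sxy = sum((x - mx) * (y - my) for x, y in zip(dosage, expression))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((y - intercept - slope * x) ** 2
              for x, y in zip(dosage, expression))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope / se_slope


def paired_t(tumor, normal):
    """Paired t statistic (n - 1 degrees of freedom) for tumor vs. normal."""
    diffs = [t - c for t, c in zip(tumor, normal)]
    n = len(diffs)
    mean = sum(diffs) / n
    sd = math.sqrt(sum((d - mean) ** 2 for d in diffs) / (n - 1))
    return mean / (sd / math.sqrt(n))


# Made-up values, for illustration only.
slope, t_slope = eqtl_regression([0, 0, 1, 1, 2, 2],
                                 [1.0, 1.2, 2.0, 2.2, 3.0, 3.2])
t_pair = paired_t([5.1, 6.0, 5.5, 6.2], [4.0, 5.2, 4.9, 5.0])
```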

Study characteristics
The process of selecting eligible studies is depicted in Figure 1. A total of 194 articles were identified by our search strategy, of which 118 were duplicates. After screening of the titles and abstracts, 24 articles were excluded as irrelevant (5 were reviews, 19 were not related to our topic). After reading the full text of the remaining 52 articles, we excluded a further 25 (16 were related to prognosis; 5 had overlapping study populations; 1 had unavailable data; 3 departed from HWE). Finally, 38 studies from 27 articles with 27,459 cases and 25,151 controls were included in our meta-analysis. Table 1 shows the characteristics and relevant data of the included studies. The detailed NOS scores for every included study are shown in Table S1. In summary, five SNPs of the XPO5 or RAN genes from 38 eligible case-control studies were investigated in the final analysis. In XPO5, the analyzed SNPs were rs11077 A/C and rs2257082 G/A, while in RAN, the analyzed SNPs were rs3803012 A/G, rs14035 C/T and rs3809142 C/T.

Quantitative synthesis
The main results of Bayesian hierarchical meta-analysis were calculated as the median of the marginal posterior distribution of the Log ORs and τ parameters. On the basis of the Bayesian hierarchical meta-analysis, XPO5 rs11077 and RAN rs3803012 SNPs were significantly associated with the risk of cancer (Table 2).
For the XPO5 gene rs11077 SNP (Figure 2A), the minor C allele significantly increased the risk of cancer (Log OR = 0.120, 95% CrI = 0.013, 0.241). A strong association of rs11077 with cancer risk was also found in the dominant model (CC + AC vs. AA: Log OR = 0.132, 95% CrI = 0.009, 0.275) (Figure 2B). In addition, the homozygous GG genotype of the RAN gene rs3803012 SNP (Figure 2C) significantly increased cancer risk (Log OR = 0.707, 95% CrI = 0.059, 1.385). A statistically significant association between rs3803012 A/G and cancer risk was also observed in the recessive model (GG vs. AG + AA: Log OR = 0.708, 95% CrI = 0.059, 1.359) (Figure 2D). However, the alleles and genotypes of the other polymorphisms of the XPO5 and RAN genes were not significantly associated with cancer susceptibility (Table 2).
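For readers more used to odds ratios, the reported Log ORs can be back-transformed by exponentiation. The worked values below are derived here from the figures quoted above (rounded to two decimals) and are not reported in this form in the text; a CrI that excludes 0 on the log scale excludes 1 on the OR scale.

```latex
\begin{aligned}
\mathrm{OR} &= \exp(\mathrm{Log\,OR}) \\
\text{rs11077, C allele:} \quad & \exp(0.120) \approx 1.13, \quad 95\%\ \mathrm{CrI} \approx (e^{0.013},\, e^{0.241}) \approx (1.01,\ 1.27) \\
\text{rs11077, dominant:} \quad & \exp(0.132) \approx 1.14, \quad 95\%\ \mathrm{CrI} \approx (e^{0.009},\, e^{0.275}) \approx (1.01,\ 1.32) \\
\text{rs3803012, GG genotype:} \quad & \exp(0.707) \approx 2.03, \quad 95\%\ \mathrm{CrI} \approx (e^{0.059},\, e^{1.385}) \approx (1.06,\ 3.99) \\
\text{rs3803012, recessive:} \quad & \exp(0.708) \approx 2.03, \quad 95\%\ \mathrm{CrI} \approx (e^{0.059},\, e^{1.359}) \approx (1.06,\ 3.89)
\end{aligned}
```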

Heterogeneity and publication bias
Heterogeneity among the studies was evaluated using τ² and I². An I² > 0.50 was considered to indicate high heterogeneity. On the basis of the heterogeneity plots and I² values, the total heterogeneity and the between-study heterogeneity were not high in most of the meta-analyses, for example those of rs11077 and rs3803012 (Figure 3, Table 2). However, heterogeneity for rs14035 was high (Table 2).
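One way to relate τ to I² is the Higgins-Thompson definition, in which I² is the share of total variance due to between-study heterogeneity relative to a "typical" within-study variance. The sketch below follows that definition as an assumption; the text does not state the exact formula applied, and the function name and inputs are illustrative.

```python
def i_squared(tau, se):
    """I^2 = tau^2 / (tau^2 + typical within-study variance).

    tau: between-study standard deviation; se: study standard errors.
    The typical variance uses the Higgins-Thompson weighting (assumption).
    """
    k = len(se)
    w = [1.0 / (s * s) for s in se]
    typical_var = (k - 1) * sum(w) / (sum(w) ** 2 - sum(wi * wi for wi in w))
    return tau * tau / (tau * tau + typical_var)


# Made-up inputs: equal standard errors of 0.2 and tau = 0.2 give I^2 = 0.5.
value = i_squared(0.2, [0.2, 0.2, 0.2, 0.2])
```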
Begg's and Egger's tests were performed to evaluate the potential publication bias. As shown in Table 2, no publication bias was observed in our meta-analysis.
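Egger's test regresses the standardized effect (Log OR divided by its standard error) on precision (the reciprocal of the standard error); an intercept far from zero suggests funnel-plot asymmetry. The sketch below computes the intercept and its standard error with ordinary least squares. It is an illustrative version with a hypothetical function name and made-up data, not the implementation used in the study.

```python
import math


def egger_intercept(y, se):
    """Egger regression: returns the intercept and its standard error.

    The intercept divided by its standard error is referred to a
    t distribution with k - 2 degrees of freedom.
    """
    z = [yi / si for yi, si in zip(y, se)]     # standardized effects
    prec = [1.0 / si for si in se]             # precisions
    k = len(z)
    mx = sum(prec) / k
    mz = sum(z) / k
    sxx = sum((x - mx) ** 2 for x in prec)
    sxz = sum((x - mx) * (v - mz) for x, v in zip(prec, z))
    slope = sxz / sxx
    intercept = mz - slope * mx
    rss = sum((v - intercept - slope * x) ** 2 for x, v in zip(prec, z))
    se_intercept = math.sqrt(rss / (k - 2) * (1.0 / k + mx * mx / sxx))
    return intercept, se_intercept


# Made-up data with a built-in small-study effect (effect grows with se).
ses = [0.1, 0.2, 0.3, 0.4]
effects = [0.1 + 0.5 * s for s in ses]
intercept, se_intercept = egger_intercept(effects, ses)
```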

Functional validation by eQTL analysis
To substantiate the associations between the identified SNPs (XPO5 rs11077 and RAN rs3803012) and cancer risk, we performed eQTL analysis to assess the associations between the SNPs and the corresponding mRNA expression levels. The eQTL results for lymphoblastoid cell lines from 373 Europeans were visualized through the Geuvadis Data Browser (https://www.ebi.ac.uk/Tools/geuvadis-das), and we found that XPO5 rs11077 was significantly associated with XPO5 mRNA expression levels (P = 5.83E-07). Similarly, in the HapMap3 East Asian samples (81 Japanese samples) from Stranger et al. [27], rs11077 was significantly associated with the expression of the XPO5 gene (P = 0.016, Figure 4A), with the risk C allele predicting higher mRNA levels of XPO5. In the genotyping and expression data of 17 hepatocellular carcinoma cases obtained from the GEO database (GSE65373), we also found that the rs11077 C allele was significantly associated with increased XPO5 mRNA expression levels in the recessive model (P = 0.026, Figure 4B). However, no significant associations between rs3803012 and RAN mRNA expression levels were found in the above datasets. In addition, we compared mRNA expression levels of XPO5 in 154 paired cancer tissue samples and normal adjacent tissue samples from two TCGA projects (58 paired samples in TCGA-LIHC and 96 paired samples in TCGA-BRCA). We found that XPO5 mRNA expression levels were significantly increased in the tumor tissues compared to the normal tissues (P = 1.50E-20 and P = 5.27E-11, respectively) (Figure 5).

Discussion
In the present study, a total of five SNPs in the XPO5 and RAN genes were comprehensively reviewed and analyzed by Bayesian hierarchical meta-analysis to estimate their associations with the risk of overall cancer. Of these five SNPs, two (rs2257082, rs3809142) were analyzed for the first time. In contrast to the classical meta-analysis previously performed with fewer articles [13], the Bayesian hierarchical meta-analysis applied here indicated that the rs11077 SNP of XPO5 and the rs3803012 SNP of RAN might facilitate carcinogenesis. We also performed a classical meta-analysis of the current data (results not shown), which demonstrated an association for most of the genetic models of rs11077 and for the homozygote and recessive models of rs3803012. Since Bayesian hierarchical meta-analysis is much more sensitive and confers more precise estimation compared with classical meta-analysis, it is strongly suggested that the rs11077 SNP of XPO5 and the rs3803012 SNP of RAN are associated with cancer risk. Furthermore, eQTL analysis demonstrated that rs11077 may influence the mRNA expression levels of XPO5. However, no associations were found for the other SNPs studied in our meta-analysis; therefore, future studies with larger sample sizes are needed to determine their relationships with cancer risk.
In classical meta-analysis, when extreme values or a small number of studies are involved, the accuracy of the results cannot be guaranteed and the correctness of the conclusions is questionable [14]. However, with the development of Markov Chain Monte Carlo (MCMC) methods, Bayesian hierarchical meta-analysis can avoid these defects and address the actual research question more directly [55-57]. Chen et al. [14] compared fully Bayesian hierarchical meta-analysis with classical meta-analysis and found that, if a fixed-effect model is used to determine the real effect, both types of meta-analysis can be used. When a random-effects model is adopted and the number of studies is < 20, Bayesian hierarchical meta-analysis should be the analysis of choice. The number of included articles for each SNP studied was < 20 in our meta-analysis; therefore, Bayesian hierarchical meta-analysis was used. From a statistical point of view, the number of included studies was not large enough for a classical meta-analysis, so the results of such an analysis should be interpreted with caution. In Bayesian hierarchical meta-analysis, however, the credible interval is slightly wider than the interval from classical meta-analysis and the results tend to be more consistent [14,15]. Hence, a significant result from Bayesian hierarchical meta-analysis is conservative and more reliable in comparison with classical meta-analysis.
XPO5 is a member of the karyopherin β family related to the human export receptor CRM1, and is responsible for the nuclear export and stabilization of pre-miRNAs, which then mature into miRNAs that produce physiological effects [58,59]. As XPO5 is a key factor in the transport of miRNA precursors out of the nucleus, it has been postulated to be a rate-limiting step in the production of miRNAs, so its impairment could lead to pre-miRNA trapping in the nucleus, influencing the risk of cancer [60,61]. Current studies have indicated a role for XPO5 in the development of several types of cancer, such as hepatocellular carcinoma, thyroid cancer and lung cancer [62-64]. These studies are consistent with the results of the present study, in which XPO5 mRNA expression levels were significantly increased in tumor tissues compared to normal tissues in 154 paired cancer tissue samples from the TCGA database. In addition, an increasing number of studies have focused on the correlations of XPO5 polymorphisms with cancer risk. The previous classical meta-analysis of the XPO5 gene rs11077 SNP performed by He et al. [13], which included 7 case-control studies, showed no significant correlation with cancer risk. However, our analysis, which included 14 case-control studies, indicated that the minor C allele of rs11077 significantly increased the risk of cancer, and a strong association with cancer risk was also found in the dominant model. This association was further supported by the significant correlation between the rs11077 C allele and an increased XPO5 mRNA expression level in the eQTL analysis. These findings suggest that rs11077 is significantly associated with cancer risk, possibly by increasing the mRNA expression levels of XPO5. Located in the 3'-UTR of XPO5, the A to C substitution of rs11077 may affect mRNA stability, alter the expression of XPO5 and, consequently, affect the expression of miRNAs, resulting in aberrant expression of miRNA target genes at the post-transcriptional level [12,65].
RAN is a key member of the Ras superfamily of GTPases and is essential for the translocation of pre-miRNAs from the nucleus to the cytoplasm through the nuclear pore complex in a GTP-dependent manner [66]. Studies have revealed up-regulation of RAN expression in various malignancies, which supports its role in cancer development [67-69]. No significant association was observed for the RAN gene rs3803012 SNP in the previous classical meta-analysis [13], which included 5 case-control studies. In contrast, our Bayesian hierarchical meta-analysis, which included 6,514 cases and 8,707 healthy subjects for the RAN gene rs3803012 SNP from 7 studies, demonstrated a significant association between rs3803012 (homozygote or recessive model) and overall cancer risk. Our classical meta-analysis (results not shown) also demonstrated a significantly increased cancer risk associated with the RAN gene rs3803012 SNP. Studies have hypothesized that the RAN rs3803012 G allele might affect the targeting of hsa-miR-199a-3p and result in decreased expression of RAN mRNA in tumor cells, which may affect the biosynthesis of various miRNAs [43]. Unfortunately, we failed to obtain significant eQTL results for rs3803012 because its minor allele frequency (MAF) was low in the included datasets (MAF ≤ 0.05). Population-specific eQTL analyses are warranted to validate our findings. As for the RAN gene rs14035 and rs3809142 polymorphisms, neither type of meta-analysis supported a significant association with cancer risk. The total heterogeneity as well as the between-study heterogeneity was relatively high. Thus, further investigations are required to clarify whether these are cancer susceptibility loci.
Despite these results, our meta-analysis has some limitations. Firstly, because the number of studies was limited, we could not perform subgroup analyses with respect to ethnicity, source of control groups (population-based or hospital-based) or cancer type. Heterogeneity among different cancers may cause real effects to be hidden when all cancer types are pooled. Secondly, gene-environment interactions, which may alter cancer risk, were not evaluated due to the lack of relevant data across the included studies. Thirdly, studies of XPO5 and RAN SNPs in cancer predisposition are still emerging, so the number of relevant investigations available for pooling remains limited.
In view of all this, Bayesian hierarchical meta-analysis suggests a potential role of the miRNA biogenesis gene SNPs XPO5 rs11077 A/C and RAN rs3803012 A/G in cancer risk, supplying novel clues for identifying new biomarkers for early warning of cancer. Although we used publicly available genotyping and expression data to confirm the biological significance of the variant and suggest that XPO5 rs11077 may be a functional susceptibility locus for cancer risk, further high-quality research and functional evaluations are still warranted to validate our findings due to the limitations mentioned above.

Author Contributions
Fen Liu and Yi Shao designed research; Yi Shao, Yi Shen, Xudong Guo and Chen Niu collected and analyzed the data; Yi Shao and Lei Zhao drafted the manuscript; Fen Liu revised and made the decision to submit for publication. All authors contributed to manuscript revision, read and approved the submitted version.