Bioinformatics analysis of lncRNA‑associated ceRNA network in melanoma

Melanoma is an extremely malignant tumor with early metastasis and high mortality. Little is known about the process of by which melanoma occurs, as its mechanism is very complex and only limited data are available on its long non-coding RNA (lncRNA)-associated competing endogenous RNAs (ceRNAs). The purpose of this study was to screen out potential prognostic molecules and identify a ceRNA network related to the occurrence of melanoma. We screened 169 differentially expressed mRNAs (DEmRNAs) from E-MTAB-1862 and GSE3189; gene ontology (GO) enrichment analysis showed that these genes were closely related to the development of skin. In the protein-protein interaction network, we screened out a total of 19 hub genes. Furthermore, we predicted the microRNAs (miRNAs) that regulate hub genes using the miRWalk database and then intersected these with GSE35579, resulting in nine DEmiRNAs. We also predicted the lncRNAs that regulate the miRNAs using the LncBasev.2 database. According to the ceRNA hypothesis, and based on the intersection of the DElncRNAs with merged GTEx and TCGA data, we obtained 20 DElncRNAs. A total of four DEmRNAs, nine DEmiRNAs, and 20 DElncRNAs were included in the ceRNA network. Based on Cox stepwise regression and survival analysis, we identified five biomarkers, ZSCAN16-AS1, LINC00520, XIST, DTL, and let-7a-5p, and obtained risk scores. The results showed that most of the differentially expressed genes were related to epithelial-mesenchymal transition (EMT) in melanoma. Finally, we obtained a LINC00520/let-7a-5p/DTL molecular regulatory network. These results suggest that ceRNA networks have an important role in evaluating the prognosis of patients with melanoma and provide a new experimental basis for exploring the EMT process in the development of melanoma.


Introduction
Melanoma is the most frequently occurring malignancy in the elderly and has high mortality rates [1]. In recent decades, the number of patients with melanoma has increased with the aging of the population [2]. The development of melanoma is caused by the progression of a benign nevus, sometimes accompanied by sun exposure, physical stimulation, genetic variation, and other factors [3]. Nevi are benign products of normal melanocytes in the skin. In general, nevi will not develop into melanoma [4]. However, constant rubbing or physical stimulation of the skin in the affected area increases the probability of malignant transformation. At present, treatments for melanoma include surgery, radiotherapy, chemotherapy, and immunotherapy, but their efficacy is not optimal [5][6][7]. The transformation of a normal nevocytic nevus into melanoma is regulated by a complex molecular network. Therefore, it is of great importance to clarify the molecular mechanism of melanoma formation and find new molecular therapeutic targets. This study aimed to unravel the pathogenesis, prognostic factors, and potential molecular therapeutic targets of melanoma. First, we used multiple public databases to Ivyspring International Publisher screen out differentially expressed genes (DEGs), mRNAs (DEmRNAs), microRNAs (DEmiRNAs), and long non-coding RNAs (DElncRNAs). Then, we constructed a competing endogenous RNA (ceRNA) network based on gene enrichment, target gene prediction, survival analysis, and multivariate Cox regression analysis. Finally, five predictive genes, ZSCAN16-AS1, LINC00520, XIST, DTL, and let-7a-5p were identified; these genes are closely related to the occurrence of melanoma. This study provides new molecular therapeutic targets and lays the foundation for a better understanding of the mechanism of melanoma formation.

Identification of DEGs in melanoma
After downloading the mRNA expression profiles of E-MTAB-1862, we used the EdgeR package in R version 4.0.0 to identify DEmRNAs [11]. Then, we used the ggplot2 R package to draw a heat map and volcano plot of the DEmRNAs. Next, we filtered the DEmRNAs from GSE3189 using the GEO2R software [12]. To find overlapping DEmRNAs, we used the Venn software online to obtain the intersection of the two datasets. Moreover, we used GEO2R to screen DEmiRNAs from GSE35579 between nevi tissue and melanoma tissue. Finally, after merging the GTEx and TCGA data using R, we filtered out DElncRNAs with the EdgeR package, and generated the corresponding heat map and volcano map using ggplot2. The DEmRNAs and DEmiRNAs were then filtered using adjusted p<0.05 and |log fold change (FC)|>1, and the DElncRNAs with thresholds of |log FC|>1, p <0.05, and false discovery rate (FDR) < 0.05.

Functional enrichment analysis
GO annotation analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis were performed using the STRING (Search Tool from the Retrieval of Interacting Genes; https://string-db.org/) database (version 11.0) [13]. STRING can be used to predict protein-protein interactions (PPI) directly or indirectly. (p < 0.05 and FDR<0.05).

PPI analysis
We used the STRING database to visually evaluate DEmRNAs with a confidence score >0.4. Furthermore, we used the Cytoscape software (version 3.7.2) to visualize the PPI [14]. We used the Molecular Complex Detection (MCODE) application in Cytoscape to screen hub genes of the PPI network with degree >8 (degree of cutoff = 2, node score cutoff = 0.2, k-core = 2, maximal depth = 100).

Construction of ceRNA regulation network in melanoma
We used the miRWalk and miRDB databases to predict the miRNAs that regulate DEmRNAs with mini p-value >0.9 [15]. To find overlapping DEmiRNAs, we used Venn software online to identify the intersection between the predicted DEmiRNAs and those from the GSE35579 dataset. Furthermore, we predicted lncRNAs that regulated the DEmiRNAs using the LncBasev.2 database with threshold >0.9 [16]. Then, we identified the intersection of the DElncRNAs obtained after merging the GTEx and TCGA databases with the predicted lncRNAs. Finally, based on the ceRNA hypothesis, we constructed a ceRNA network and visualized the results using Cytoscape.

Multivariate Cox regression model analysis
Only DEGs identified as meaningful in the survival analysis were considered for further analysis. Based on the clinical information of patients with survival status and duration in the TCGA database. First of all, we rule out those patients who have no survival status. Furthermore, we used the survival R package to perform multivariate Cox regression analysis using the six meaningful DEGs (LINC00520, ZNF561-AS1, XIST, MMP9, DTL, and let-7a-5p). We also performed a stepwise regression analysis of these six variables and obtained risk scores. To evaluate the prognostic value of the final gene combination in patients with melanoma, patients were divided into high-and low-risk groups based on the median risk score. Moreover, survival analysis and receiver operating characteristic (ROC) curve analysis of the final gene combination were carried out using the survival and survivalROC R packages, respectively. Finally, we calculated the area under the curve (AUC). The heat map of the six DEGs was constructed using ggplot2 (p<0.05). To better predict the prognosis of patients, we used melanoma patients with gender, stage, and TNM stages in the TCGA database as inclusion criteria. At the same time, we used a survival package combined with clinical information of patients and the risk model constructed in this experiment for univariate and multivariate COX analyses and obtained the related forest plots. Furthermore, the ROC curve related to clinical information was drawn to evaluate the prognostic value of the model. P-value < 0.05.

Identification of DEGs in melanoma
Through analysis of gene expression profiles in E-MTAB-1862, we screened a total of 367 DEGs, of which 210 were upregulated and 156 were downregulated in melanoma. Then, we used a volcano plot and heatmap to map the top 100 DEmRNAs (Figure 1). We also screened 3090 DEmRNAs from the GSE3189 dataset using GEO2R, of which 1215 were upregulated and 1875 were downregulated in melanoma tissues. Finally, we took the intersection of the two datasets using the Venn software online and found that a total of 169 DEmRNAs appeared in both datasets, of which 74 were upregulated and 95 were downregulated in melanoma samples (Figure 2a). In addition, we analyzed the miRNA sequencing data from the GSE35579 dataset; 91 DEmiRNAs were screened, of which 31 were upregulated and 60 were downregulated. We identified nine miRNAs in the intersection of GSE35579 with the DEmiRNAs predicted from the miRWalk database (Figure 2b). We also screened DElncRNAs by combining the melanoma GTEx and TCGA data, and obtained 684 upregulated and 1160 downregulated DElncRNAs in melanoma. The ggplot2 R package was used to plot the heat map of DElncRNAs ( Figure 3).

GO term and KEGG pathway analysis
Analysis of the top 10 GO enrichment results (Table 1, Figure 4a) showed that the DEmRNAs were mainly involved in tissue development, epidermis development, skin development, and cell differentiation with respect to biological processes. Regarding cellular components, the DEmRNAs were mainly enriched in the extracellular matrix, intercellular bridge, cornified envelope, and cell surface. Regarding molecular functions, the DEmRNAs were significantly involved in proteoglycan binding, structural constituent of the cytoskeleton, carboxylic acid binding, and DNA-binding transcription activator activity. Furthermore, KEGG pathway analysis showed that the DEmRNAs were involved in bladder cancer, ECM-receptor interaction, PPAR signaling pathway, pathways in cancer, IL-17 signaling pathway, and microRNAs in cancer (Table 1, Figure 4b).

PPI analysis
The PPI network contained 169 nodes and 407 edges, including 74 upregulated genes and 95 downregulated genes (Figure 5a). We used Cytoscape to visualize the PPI network. Furthermore, we used MCODE to identify two modules comprising 37 genes. Then, taking degree >8 as the screening index (Figure 5b), we selected 19 genes, of which the CD44 gene had low expression and the other genes had high expression in melanoma.

Constructing the ceRNA regulation network in melanoma
We used the miRWalk database to predict the miRNAs regulating 19 mRNAs (Table 2); these miRNAs were also found in the miRDB database. Then, we took the intersection of the miRNAs in the predicted miRNA database and GSE35579 dataset, resulting in nine miRNAs (Table 3). Furthermore, we used the LncBasev.2 database to predict the lncRNAs regulating the miRNAs (threshold >0.8), and used the Venn software online to intersect the different lncRNAs from the merged GTEx and TCGA data with the predicted lncRNAs. According to the ceRNA hypothesis, there is a positive regulatory relationship between lncRNAs and mRNAs, and a negative regulatory relationship between miRNAs and mRNAs. Finally, we obtained 20 lncRNAs with significant expression differences in melanoma (Table  4). We constructed the ceRNA network using Cytoscape ( Figure 6).

Survival analysis of DEGs
We used GEPIA to analyze the expression of specific DEGs (Figure 7a). The results showed that compared with normal nevus tissues, LINC00520, ZSCAN16-AS1, MMP9, and DTL had higher expression in melanoma, whereas XIST had lower expression. Survival analysis using the oncolnchttp website showed that LINC00520, ZSCAN16-AS1, XIST, MMP9, DTL, and let-7a-5p were significantly correlated with overall survival (OS) (Figure 7b). Kaplan-Meier survival analysis showed that the survival time of patients with high expression of LINC00520, ZSCAN16-AS1, MMP9, and DTL was significantly shortened, whereas that of patients with high expression of XIST and let-7a-5p was significantly prolonged (log-rank p<0.05).

Multivariate Cox regression analysis
As different marker combinations appeared very promising as prognostic indicators, we further evaluated the prognostic value of six gene combinations in patients with melanoma. Using stepwise regression analysis of six variables, we obtained five variables by Cox regression. Risk scores were calculated as follows, risk score=0.68192*ZSCAN16-AS1+0.14792*LINC00520+0. 74931*DTL-3.07155*let-7a-5p-0.12864*XIST. To further evaluate the prognostic value of the five gene combinations in patients with melanoma, patients were divided into high-and low-risk groups according to the median of the Cox regression model. Kaplan-Meier survival analysis showed that the survival time of patients in the high-risk group was significantly shorter than that in the low-risk group (p<0.05). ROC curve analysis showed that a combination of four genes had high accuracy and specificity in evaluating the prognosis of patients with melanoma, with AUC=0.716. The expression of five genes in the high and low-risk groups was visualized using a heat map (Figure 8). Also, univariate COX analysis of the risk model and clinical information showed that both stage and TNM stages could predict the prognosis of patients with melanoma (Supplemental Figure S1A). The forest plot results of multivariate COX analysis showed that T and N could be used as independent prognostic factors (Supplemental Figure S1B). However, the ROC results showed that the AUC of the risk model, T and N was 0.705, 0.679, and 0.667, respectively (Supplemental Figure S1C). Therefore, compared with TNM staging, the risk model we constructed is more accurate in predicting patient survival.

Discussion
Epithelial-mesenchymal transformation (EMT) is the process by which epithelial cells transform into mesenchymal cells. Under normal physiological conditions, EMT is related to human embryonic development, wound healing, and tissue regeneration [19]. In the pathological state, EMT is associated with cancer progression and fibrosis, and is significantly related to the occurrence and development of cancer. During the development of cancer, because of gene mutations, adhesion molecules and connections between cells are reduced, the extracellular matrix is also reduced, and the cytoskeleton is reconstructed [20]. Therefore, cancer cells can migrate and invade, and eventually spread to other organs. Moreover, cancer cells further promote metastasis by changing the tumor microenvironment and secreting enzymes that degrade the extracellular matrix. EMT-treated cells also gain the ability to resist apoptosis. In this study, we screened 169 DEmRNAs, 91 DEmiRNAs, and 1844 DElncRNAs. The results of the GO analysis showed that the function of DEmRNAs was mainly related to epidermis development, proteoglycan binding, and structural constituents of the cytoskeleton; these factors are closely related to the occurrence of melanoma. Notably, the top 10 biological process terms were all related to skin development. With respect to molecular functions, most of the DEmRNAs were related to protein binding and structural constituents of the cytoskeleton. Therefore, we speculate that the formation of melanoma may be closely related to changes in the cytoskeleton structure of normal nevus cells.
In addition, the KEGG pathway analysis showed that the DEmRNAs were mainly enriched in bladder cancer, ECM-receptor interactions, the PPAR signaling pathway, and the IL-17 signaling pathway. Surprisingly, we found that most of the DEmRNAs were related to ECM-receptor interaction, which is closely related to cell adhesion and migration. Generally, EMT is related to the metastatic and invasive ability of cancer cells. During EMT [21], the integrity of normal epithelial cells is lost, the adhesion and connections between cells are reduced, and the cytoskeleton is reconstructed, which leads to greater movement, migration, and invasion of cancer cells. Furthermore, we found that most of the genes were enriched in the bladder cancer pathway, suggesting a significant relationship between bladder cancer and melanoma. After using the PPI network to cluster DEmRNAs, we found that among the 19 hub genes, only CD44 showed low expression in melanoma. CD44 can interact with moesin protein, which has been shown to promote the migration and infiltration of breast cancer cells by remodeling the cytoskeleton [22], thereby further promoting the metastatic ability of cancer cells [23,24]. CD44 is a cell surface glycoprotein that is highly expressed in some cancer stem cells and is closely related to cell motility and cancer metastasis. In normal nevocytic nevus, CD44 shows strong staining. In melanoma, the expression of CD44 decreases gradually with increasing invasiveness [25]. DTL regulates the cell cycle and maintains the stability of the genome. DTL is highly expressed in invasive hepatocellular carcinoma and is closely related to tumor grade and prognosis [26].
Chen et al. found that knocking down the expression of DTL in liver cancer inhibited the growth and invasion of liver cancer cells, accelerated the apoptosis of cancer cells, and inhibited tumorigenesis [27]. The same effects were found in gastric cancer, breast cancer, and cervical cancer [28]. Through bioinformatics analysis, Zhou et al. identified DTL as a hub gene in hepatocellular carcinoma cells and found that it was highly expressed [26]. Through survival analysis, a close relationship was identified between DTL and the prognosis of patients with liver cancer. In addition, previous studies have reported that DTL is highly expressed in melanoma compared with benign melanocytic nevi and significantly correlated with OS [29]. However, the effects of DTL on the migration and invasion of melanoma have not been reported. MMP9 is a matrix metalloproteinase that promotes the migration and invasion of cancer cells by degrading the extracellular matrix [30]. MMP9 has been reported to play an important part in the EMT process of cancer cells. Its overexpression can increase the invasive ability of hepatocellular carcinoma cells and promote metastasis [31], leading to higher TNM staging and poor prognosis. High expression of MMP9 is often found in breast cancer [32]; in malignant breast cancer tissues, MMP9 shows strong positive expression. Overexpression of MMP9 can enhance the colony formation and migration ability of breast cancer cells, whereas inhibition of MMP9 leads to a significant decrease in invasive ability. Furthermore, survival analysis showed that MMP9 has a significant relationship with prognosis in breast cancer patients and could thus represent a prognostic biomarker for breast cancer [33]. In addition, MMP9 can promote angiogenesis; Li et al. showed that the angiogenic ability of melanoma cells could be inhibited by knocking down the expression of MMP9 [34]. There is a well known positive correlation between angiogenesis and invasion of cancer cells. Bakos et al. showed that the expression of MMP9 was low in normal melanocytes [35]. Therefore, we infer that MMP9 plays an important part in the transformation of normal melanocytes into melanoma.
In this study, we also constructed a lncRNA-related ceRNA network to identify related miRNAs in melanoma. The functions of lncRNAs have been implicated in a variety of cancers. LINC00520 has been widely reported in breast cancer, colorectal cancer, skin squamous cell carcinoma, melanoma, and other cancer cells, and shows increased expression in breast cancer tissue compared with normal breast tissue. Henry et al. showed that the migration ability of breast cancer cells could be blocked by knocking down the expression of LINC00520 [36]. LINC00520 can also promote the proliferation of cutaneous squamous cell carcinoma, as well as enhancing its migration and invasiveness via upregulating the expression of MMP9 [37]. Wen et al. found that LINC00520 had a similar function in melanoma. Compared with normal melanocytes, LINC00520 was highly expressed in melanoma [38], and overexpression of LINC00520 could promote the proliferation, migration, and invasion of melanoma cells. In addition, it was significantly associated with a shorter OS in patients with melanoma. The role of XIST in cancer has also been widely reported. Tian et al. showed that XIST was highly expressed in melanoma cells [39]. Overexpression of XIST can promote the proliferation, migration, and invasion of melanoma cells. Moreover, XIST is associated with TNM stage and lymphatic metastasis in patients with melanoma. We showed here for the first time that lncRNA ZSCAN16-AS1 was highly expressed in patients with melanoma and significantly correlated with poor OS. To the best of our knowledge, no role of ZSCAN16-AS1 has previously been reported in any disease. Let-7a-5p is a gene that has been studied extensively and is been found in a variety of cancer cells. In the current study, it was found to regulate downstream target genes. Functional experiments with let-7a-5p showed that its overexpression could inhibit the proliferation, migration, and invasion of lung cancer cells [40], as well as promoting autophagy and apoptosis. Li et al. found that let-7a-5p inhibited the proliferation of lung cancer cells by inhibiting the G1/S phase process [41]. Moreover, let-7a-5p was associated with shorter OS in patients with lung cancer. The miRNA sequencing analysis by Babapoor et al. showed that the expression of let-7a-5p in invasive melanoma was lower than that in normal melanocytes [42].
Different from previous studies, our study focused more on the relationship between the ceRNA network and melanoma epithelial-mesenchymal transformation (EMT). Also, the data sources of this study were more abundant. For example, mRNA and miRNA data come from GEO and ArrayExpress databases, and lncRNA data comes from TCGA and GTEx databases. Moreover, we not only constructed a melanoma ceRNA network but also carried out multivariate COX analysis of the key molecules in the ceRNA network. Finally, a risk model was constructed. To evaluate the clinical value of this risk model, through multivariate COX analysis, combined with the clinical data of melanoma patients in the TCGA database, we found that this risk model can predict the survival of melanoma patients more accurately than traditional TNM staging. To independently verify our risk model, we query the melanoma data from online databases such as the GEO database. Unfortunately, most data sets only sequence mRNA or miRNA and lack relevant lncRNA data, and some melanoma data lack patient survival data. Therefore, this study has not been able to independently verify this risk model. However, we will still strive to collect tissue samples and clinical information of melanoma patients in the future, and further, verify our risk model through gene chip sequencing data.
In the current study, we found that DTL, MMP9, LINC00520, and ZSCAN16-AS1 were highly expressed in melanoma compared with normal melanocytes, whereas XIST and let-7a-5p showed the opposite trend. The results of the survival analysis showed that these six genes were closely related to the prognosis of patients with melanoma. Furthermore, the results of our multivariate Cox analysis suggested that the combination of DTL, LINC00520, ZSCAN16-AS1, XIST, and let-7a-5p could identify high-risk melanoma patients. According to the ceRNA network, both LINC00520 and XIST can bind to let-7a-5p, and let-7a-5p can interact with its target gene DTL. However, previous studies showed that XIST was highly expressed in melanoma, contrary to our results. The role of XIST in melanoma has not been fully elucidated; we will investigate this in future research. Finally, we constructed a ceRNA network for LIINC00520/let-7a-5p/DTL.
In summary, we constructed a ceRNA network using miRNA, lncRNA, and mRNA expression profiles in melanoma and normal nevus. This will provide a basis for subsequent studies of the regulation mechanism of melanoma. In addition, we identified a combination of five genes that could represent a new signature for the diagnosis and prognostic analysis of patients with melanoma. In the future, we will further expand the collection of clinical data based on these data, and conduct a more detailed analysis of the pathogenesis of melanoma to clarify the role of these genes in the ceRNA network.