Risk factors of secondary cancer in nasopharyngeal carcinoma patients after radiotherapy

Purpose: To identify risk factors of secondary cancer in nasopharyngeal carcinoma (NPC) patients after radiotherapy. Materials and methods: Data of NPC patients were extracted from the Surveillance, Epidemiology, and End Results database from 2004 to 2016. Univariate and multivariable logistic regression analyses were performed to identify risk factors of secondary cancer. Risk factors selected from the multivariable logistic regression analysis were used to build a predictive model. Results: A total of 3931 patients were included: 329 (8.37%) patients developed secondary cancers and 3602 (91.63%) patients did not. Univariate logistic regression analysis revealed that age, race, and the American Joint Committee on Cancer (AJCC) stage were risk factors of secondary cancer. Multivariable analysis demonstrated that age [odds ratio (OR) = 1.03, P < 0.001], race (OR = 1.17, P = 0.010), AJCC stage (OR = 0.82, P = 0.002), and chemotherapy (OR = 1.55, P = 0.028) were independent risk factors of secondary cancer. Age, race, AJCC stage, and chemotherapy were entered into a nomogram for predicting secondary cancer. The area under the ROC curve of the nomogram was 0.645 [95% confidence interval (CI): 0.617-0.673]. The decision curve showed that if the threshold probability is between 4% and 25%, using the nomogram added more benefit than either the treat-all-patients scheme or the treat-none scheme. Conclusion: Age, race, AJCC stage, and chemotherapy were independent risk factors of secondary cancer in nasopharyngeal carcinoma patients after radiotherapy.


Introduction
Nasopharyngeal carcinoma (NPC) is a radiosensitive cancer with a distinctive epidemiologic pattern [1,2]. Radiotherapy is the primary treatment for NPC [3,4]. With improvements in diagnosis and treatment, the long-term survival of NPC patients is increasing. As a result, secondary cancer after radiotherapy has become a serious complication among these long-term survivors [5,6]. Although secondary cancer is rare [7][8][9][10], it can decrease patients' survival [11]. The low frequency of secondary cancer makes it difficult to identify potential predictive factors. This retrospective study was conducted to identify risk factors of secondary cancer in NPC patients after radiotherapy using data from the Surveillance, Epidemiology, and End Results (SEER) database.

Data source and patients
This retrospective study searched the SEER database to extract data of NPC patients from 2004 to 2016. The inclusion criteria were as follows: (1) pathologically confirmed NPC, (2) definite TNM stage according to the American Joint Committee on Cancer (AJCC), (3) NPC as the first cancer, (4) no stage M1 disease, and (5) receipt of radiotherapy. The patient characteristics of age, sex, race, tumor grade, World Health Organization (WHO) classification, AJCC stage, chemotherapy, and secondary cancer were extracted.

Identifying risk factors
Univariate logistic regression analysis was performed to identify potential risk factors of secondary cancer. All patient characteristics (age, sex, race, tumor grade, WHO classification, AJCC stage, and chemotherapy) were included in the univariate logistic regression analysis. All the factors were also included in the multivariable logistic regression analysis to identify independent risk factors. Factors with P < 0.05 in the multivariable logistic regression analysis were considered independent risk factors of secondary cancer. The results of the logistic regression analyses were reported as odds ratios (ORs) with 95% confidence intervals (CIs).
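The link between a logistic regression coefficient and the reported OR, 95% CI, and P value can be sketched as follows. This is a minimal illustration using the standard Wald approach, not the authors' SPSS or R code; the function names and the example coefficient and standard error are hypothetical.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def wald_odds_ratio(beta: float, se: float, z_crit: float = 1.96):
    """Return (OR, lower, upper, two-sided P) for one logistic coefficient.

    beta is the log-odds coefficient and se its standard error; the
    interval is the usual Wald interval exp(beta +/- z_crit * se).
    """
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - z_crit * se)
    upper = math.exp(beta + z_crit * se)
    z = beta / se
    p_value = 2.0 * (1.0 - normal_cdf(abs(z)))
    return odds_ratio, lower, upper, p_value


# Hypothetical coefficient and standard error, chosen only for illustration.
beta, se = 0.30, 0.135
or_, lo, hi, p = wald_odds_ratio(beta, se)
print(f"OR = {or_:.2f}, 95% CI: {lo:.2f}-{hi:.2f}, P = {p:.3f}")
```

A factor would meet the P < 0.05 criterion above whenever its 95% CI for the OR excludes 1, since both statements derive from the same Wald statistic.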

Nomogram development
The independent risk factors of secondary cancer identified in the multivariable logistic regression analysis were used to develop a predictive nomogram. Receiver operating characteristic (ROC) curve analysis was used to assess the discriminative capacity of the nomogram, quantified as the area under the ROC curve (AUC). Calibration of the nomogram was assessed internally with a calibration plot. Decision curve analysis (DCA) was used to evaluate the clinical usefulness of the nomogram by analyzing the net benefit at different risk thresholds.
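As a rough illustration of what the AUC measures, it equals the probability that a randomly chosen patient with secondary cancer receives a higher predicted risk than a randomly chosen patient without it. The sketch below computes this directly by pairwise comparison; the function name and the toy data are hypothetical and are not taken from the study cohort.

```python
def auc_by_pairs(scores, labels):
    """AUC as the fraction of (case, non-case) pairs ranked correctly.

    Tied scores count as one half, which makes the result equal to the
    Mann-Whitney U statistic divided by (cases * non-cases).
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy predicted risks and outcomes (1 = secondary cancer), for illustration only.
toy_scores = [0.05, 0.10, 0.10, 0.20, 0.30, 0.15]
toy_labels = [0, 0, 1, 0, 1, 1]
toy_auc = auc_by_pairs(toy_scores, toy_labels)
print(f"AUC = {toy_auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the sample size, so a rank-based implementation would be preferable for a cohort of several thousand patients.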

Statistical analysis
The continuous variable of age was compared between the secondary cancer group and the non-secondary cancer group using the Wilcoxon rank sum test. Categorical variables, including sex, race, tumor grade, WHO classification, AJCC stage, and chemotherapy, were analyzed using the χ2 test or Fisher's exact test. Overall survival of the secondary cancer group and the non-secondary cancer group was estimated using the Kaplan-Meier method and compared with the log-rank test. Multivariable proportional hazards models adjusted for age, sex, race, tumor grade, WHO classification, AJCC stage, and chemotherapy were implemented to assess independent prognostic factors.
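The Kaplan-Meier estimate mentioned above can be sketched in a few lines. This is a minimal product-limit implementation for illustration, not the SPSS or R routine used in the study; the function names and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one death.

    times: follow-up in months; events: 1 = death, 0 = censored.
    Patients censored at time t are still counted as at risk at t.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at this time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def survival_at(curve, t):
    """Survival probability at time t (step function, 1.0 before the first death)."""
    value = 1.0
    for time, surv in curve:
        if time <= t:
            value = surv
    return value


# Hypothetical follow-up data for seven patients, for illustration only.
toy_times = [12, 20, 20, 36, 60, 60, 84]
toy_events = [1, 0, 1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
print(f"5-year overall survival = {survival_at(toy_curve, 60):.3f}")
```

A 5-year overall survival rate is read from such a curve at 60 months. Comparing two groups would additionally require the log-rank test, which this sketch does not implement.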
Statistical analyses were performed using SPSS Statistics Version 26.0 software (IBM Co., Armonk, NY, USA) and R software (version 4.0.2). Two-tailed P values < 0.05 were considered statistically significant.
Figure 1 shows the process of patient selection. A total of 3931 patients were included. The secondary cancer group included 329 (8.37%) patients. The non-secondary cancer group included 3602 (91.63%) patients. Table 1 summarizes the patient characteristics. The median follow-up times were 46 [interquartile range (IQR): 20-85] months in the non-secondary cancer group and 69 (IQR: 39-102) months in the secondary cancer group.

Survival between the secondary cancer and non-secondary cancer groups
The 5-year overall survival did not differ between the secondary cancer and non-secondary cancer groups (71.2% vs. 67.2%; P = 0.230, Figure 2). The 7-year overall survival of the secondary cancer and non-secondary cancer groups was 63.4% and 63.6%, respectively. Overall survival of the secondary cancer group was worse than that of the non-secondary cancer group in the 7th year after radiotherapy. Patient characteristics after propensity score matching are shown in Table 2. After propensity score matching, the 5-year overall survival did not differ significantly between the secondary cancer and non-secondary cancer groups (59.3% vs. 71.1%; P = 0.120, Figure 3).

Independent risk factors of secondary cancer
Univariate logistic regression analysis revealed that age, race, and AJCC stage were risk factors of secondary cancer (Figure 6). Multivariable logistic regression analysis demonstrated that age (OR = 1.03, P < 0.001), race (OR = 1.17, P = 0.010), AJCC stage (OR = 0.82, P = 0.002), and chemotherapy (OR = 1.55, P = 0.028) were independent risk factors of secondary cancer (Figure 7). Chemotherapy was not a risk factor of secondary cancer in the univariate logistic regression analysis, but it was an independent risk factor in the multivariable logistic regression analysis.

Development of a prediction nomogram
A prediction nomogram that incorporated the factors selected in the multivariable logistic regression analysis was developed (Figure 8). The score for each independent risk factor was determined by drawing a line from the factor to the points axis. The sum of the points was located on the total points axis. The probability of developing secondary cancer was then read by drawing a line straight down from the total points axis to the risk of secondary cancer axis.
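The point-scoring procedure described above can be illustrated with a small sketch of how a nomogram is conventionally derived from a logistic model: each predictor's log-odds contribution is rescaled so that the widest-ranging predictor spans 0 to 100 points, and the total is mapped back to a probability. The log-odds values are back-calculated from rounded odds ratios quoted in this article, while the intercept, the age axis limits, and all names are hypothetical, so the output does not reproduce Figure 8.

```python
import math

# Log-odds contributions back-calculated from rounded odds ratios in the text
# (age per year 1.03; White vs. Asian 1.34; stage II, III, IV vs. I 0.62, 0.53,
# 0.47; chemotherapy 1.55). Age limits of 20 and 90 years are hypothetical.
B_AGE = math.log(1.03)
CONTRIBUTIONS = {
    "age": {20: 20 * B_AGE, 60: 60 * B_AGE, 90: 90 * B_AGE},
    "race": {"Asian": 0.0, "White": math.log(1.34)},
    "stage": {"I": 0.0, "II": math.log(0.62), "III": math.log(0.53), "IV": math.log(0.47)},
    "chemotherapy": {"no": 0.0, "yes": math.log(1.55)},
}
INTERCEPT = -4.0  # hypothetical; the model intercept is not reported in the text


def build_nomogram(contributions, intercept):
    """Rescale contributions so the widest predictor spans 0 to 100 points."""
    minima = {name: min(levels.values()) for name, levels in contributions.items()}
    spans = {name: max(levels.values()) - minima[name] for name, levels in contributions.items()}
    scale = 100.0 / max(spans.values())  # points per unit of log-odds
    points = {
        name: {level: (value - minima[name]) * scale for level, value in levels.items()}
        for name, levels in contributions.items()
    }
    baseline = intercept + sum(minima.values())  # log-odds at zero total points
    return points, scale, baseline


def risk_from_points(total_points, scale, baseline):
    """Map total points back to a predicted probability via the logistic function."""
    linear_predictor = baseline + total_points / scale
    return 1.0 / (1.0 + math.exp(-linear_predictor))


points, scale, baseline = build_nomogram(CONTRIBUTIONS, INTERCEPT)
patient = {"age": 60, "race": "White", "stage": "II", "chemotherapy": "yes"}
total = sum(points[name][level] for name, level in patient.items())
risk = risk_from_points(total, scale, baseline)
print(f"total points = {total:.1f}, predicted risk = {risk:.3f}")
```

Because stage IV has the lowest odds ratio, it receives zero points and stage I receives the most points on the stage axis. The predicted risk is meaningful only relative to the assumed intercept.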

Prediction of nomogram performance
An ROC curve was constructed to assess the discrimination of the nomogram (Figure 9). The AUC of the nomogram was 0.645 (95% CI: 0.617-0.673). The nomogram was internally validated by computing the bootstrap-corrected Harrell concordance index and by the calibration plot (Figure 10). The calibration plot showed that the probability of secondary cancer predicted by the nomogram was in relative agreement with the observed probability.

Clinical Use
The decision curve analysis for the nomogram is presented in Figure 11. The decision curve showed that if the threshold probability is between 4% and 25%, using the nomogram added more benefit than either the treat-all-patients scheme or the treat-none scheme. The clinical impact curve for the nomogram is shown in Figure 12.
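The net benefit plotted on a decision curve follows the standard definition: true positives per patient minus false positives per patient, weighted by the odds of the threshold probability. The sketch below is a minimal illustration with hypothetical function names and toy predictions; only the cohort prevalence (329 of 3931) is taken from this study.

```python
def net_benefit(probabilities, outcomes, threshold):
    """Net benefit of acting on patients whose predicted risk meets the threshold."""
    n = len(outcomes)
    true_pos = sum(1 for p, y in zip(probabilities, outcomes) if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in zip(probabilities, outcomes) if p >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)  # harm of a false positive relative to a true positive
    return true_pos / n - false_pos / n * weight


def net_benefit_treat_all(prevalence, threshold):
    """Net benefit of the treat-all-patients scheme; treat-none is always 0."""
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Prevalence of secondary cancer in the study cohort.
prevalence = 329 / 3931
for pt in (0.04, prevalence, 0.25):
    print(f"threshold {pt:.3f}: treat-all net benefit = {net_benefit_treat_all(prevalence, pt):+.4f}")

# Toy predictions and outcomes (1 = secondary cancer), for illustration only.
toy_probs = [0.02, 0.05, 0.10, 0.30]
toy_outcomes = [0, 0, 1, 1]
toy_nb = net_benefit(toy_probs, toy_outcomes, 0.08)
```

The treat-all scheme has positive net benefit only for thresholds below the cohort prevalence of about 8.4%, and negative net benefit above it. A model adds value at a given threshold when its net benefit exceeds both the treat-all value and zero.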

Discussion
This retrospective study identified several independent risk factors associated with secondary cancer of NPC after radiotherapy. Although some studies have investigated the risk factors of secondary cancer, these factors need further assessment because of the low incidence of secondary cancer [12,13]. The current study investigated the potential risk factors based on a large sample size. Moreover, we established and internally validated a nomogram based on age, race, AJCC stage, and chemotherapy for predicting secondary cancer. This predictive nomogram could provide personalized estimates of secondary cancer development to guide the follow-up strategy for NPC patients. Patients might benefit from this nomogram.
The mechanism of secondary cancer after radiotherapy is not yet clear. Previous studies have shown that the risk factors for secondary cancer include hereditary susceptibility, age at initial irradiation, the type of primary tumor, the tolerance of the irradiated tissues, the dose and area of irradiation, and combination with chemotherapy [14,15]. Our study revealed similar results. With Asian patients as the reference, White patients were more likely to develop secondary cancer (OR = 1.34, 95% CI: 1.03-1.75; P = 0.032). Moreover, older patients were more likely to develop secondary cancer (OR = 1.03, 95% CI: 1.02-1.04; P < 0.001).
The multivariable logistic regression analysis revealed that patients with stage IV (OR = 0.47, 95% CI: 0.31-0.74; P < 0.001), stage III (OR = 0.53, 95% CI: 0.34-0.82; P = 0.004), and stage II (OR = 0.62, 95% CI: 0.41-0.96; P = 0.030) disease were less likely to have secondary cancer. This was an unexpected finding. In clinical practice, patients with locoregionally advanced disease receive more chemotherapy than patients with early-stage disease. Because chemotherapy was an independent risk factor in the multivariable logistic regression of our study, patients with locoregionally advanced disease would be expected to be more likely to develop secondary cancers. A possible explanation is that patients with locoregionally advanced disease had worse survival than patients with early-stage disease, so their survival time might have been insufficient for secondary cancers to develop.
In our study, chemotherapy was not a risk factor of secondary cancer in the univariate logistic regression analysis. However, chemotherapy was an independent risk factor of secondary cancer in the multivariable logistic regression analysis. This was another unexpected finding. It was reported that chemotherapy could increase the incidence of secondary cancer and reduce the latency between radiotherapy and secondary cancer occurrence [13,16]. However, recently published studies revealed that chemotherapy was not an independent risk factor of secondary cancer [17,18]. The effect of chemotherapy on secondary cancer therefore remains unclear because of the limited number of studies. Our study, with a large sample size, found that chemotherapy was associated with the incidence of secondary cancer. This result needs to be verified in prospective studies with longer follow-up time.
The latency of secondary cancer remains unclear. It was reported that the latency period for development of secondary cancer was between 3 and 36 years (median: 8.5 years) after radiotherapy [17]. On the other hand, the latency of secondary cancer was shorter for patients who received intensity-modulated radiotherapy than for patients who received conventional radiotherapy (median: 4.0 vs. 11.0 years, P = 0.013) [11]. Because of the limitations of the SEER database, the latency of secondary cancer after radiotherapy could not be extracted and thus could not be calculated. However, the overall survival of the secondary cancer group was worse than that of the non-secondary cancer group in the 7th year after radiotherapy. This result might indicate that the latency of secondary cancer was less than 7 years.
The AUC of the nomogram was 0.645 (95% CI: 0.617-0.673), which suggested that the discriminatory capacity of the nomogram was relatively weak. Moreover, the calibration curve, used to quantify how close predictions were to the actual outcome, showed that the prediction was also not well calibrated. The possible explanations are as follows. (1) Secondary cancer of NPC after radiotherapy is rare. Several studies reported that the incidence of secondary cancer ranged from 0.8% to 5.6% [7][8][9][10]. Although our study found a proportion of 8.37% for secondary cancer after radiotherapy, the number of patients with secondary cancer was still small and might have been insufficient for establishing a nomogram for prediction of secondary cancer in NPC patients after radiotherapy. (2) Only 4 risk factors of secondary cancer were identified in the multivariable logistic regression analysis. The nomogram was established based on age, race, AJCC stage, and chemotherapy. Important factors such as radiation therapy technique, radiation dose, and dose distribution were not included because of the limitations of the SEER database [19][20][21]. Thus, the nomogram could not discriminate well. This nomogram should be modified with more independent risk factors to improve its performance.
Limitations of this study should be considered. First, considering the low incidence of secondary cancer, the nomogram was only internally validated by computing the bootstrap-corrected Harrell concordance index and by the calibration plot. The nomogram was not externally verified in a validation cohort, so its clinical utility should be treated with caution. Second, the locations of secondary cancer were not provided in the SEER database. It was therefore unclear whether secondary cancer was more likely to occur within the radiotherapy fields.
In conclusion, age, race, AJCC stage, and chemotherapy were independent risk factors of secondary cancer in nasopharyngeal carcinoma patients after radiotherapy. Multicenter studies with large sample sizes and longer follow-up time are needed to verify the nomogram of this study.