Nomograms for the Prediction of Survival for Patients with Pediatric Adrenal Cancer after Surgery

Purpose: To develop and validate nomograms to postoperatively evaluate overall survival (OS) and cancer-specific survival (CSS) in patients with pediatric adrenal cancer. Methods: In total, 847 eligible patients diagnosed between 1988 and 2015 from the Surveillance, Epidemiology, and End Results (SEER) database were enrolled in this study according to the specified inclusion and exclusion criteria. They were divided into a training set (n = 661) and a validation set (n = 186). A multivariate Cox proportional hazards regression algorithm was used to identify the independent predictors of OS and CSS in the training set and to develop the prediction models, which were presented as two nomograms. The performance of the nomograms (discrimination, calibration and clinical usefulness) was assessed in the training set and validated in the validation set. Results: Based on the multivariate Cox proportional hazards regression analyses, three independent predictors, namely age at diagnosis, tumor size and M stage, were identified for both OS and CSS. Then, an OS nomogram and a CSS nomogram were developed incorporating these three predictors, respectively. The OS nomogram showed good calibration and discrimination in the training set (C-index [95% CI], 0.744 [0.711-0.777]), which was confirmed in the validation set (C-index [95% CI], 0.746 [0.656-0.836]). Favorable calibration and discrimination of the CSS nomogram were also observed in the training set (C-index [95% CI], 0.749 [0.715-0.783]) and validation set (C-index [95% CI], 0.789 [0.710-0.868]). Moreover, the nomograms successfully distinguished patients at high risk of all-cause and cancer-specific mortality in all patients and in the stratified analyses. Decision curve analysis demonstrated the usefulness of the nomograms. Conclusion: The presented nomograms show favorable predictive accuracy for OS and CSS in patients with pediatric adrenal cancer after surgery. 
Further validation is warranted prior to clinical implementation.


Introduction
Adrenal cancers are mainly represented by adrenocortical cancer (ACC), neuroblastoma (NB), ganglioneuroblastoma (GNB) and malignant adrenal pheochromocytoma (PCC). Apart from NB, adrenal cancer is rare in the pediatric population [1,2]. NB is a childhood cancer arising from neural crest progenitor cells, accounting for nearly 10% of all childhood cancers [3][4][5]. The most common site of origin of NB is the adrenal medulla, accounting for 35% of cases [6], and NB arising in the adrenal gland is associated with inferior survival [7].
Adrenal cancer is usually aggressive with a poor prognosis, since it has often invaded nearby tissues or metastasized to distant organs at the time of diagnosis [3,8,9]. Complete surgical resection of the tumor is the mainstay of treatment for patients with adrenal cancer and carries the best hope for prolonged survival and potential cure [10][11][12]. However, clinical outcome varies even in homogeneously treated adrenal cancer patients with the same tumor stage because of the heterogeneous nature of adrenal cancer [9,13,14]. Indeed, if clinicians can identify patients at high risk after surgery, then systemic therapy can be implemented in time. Therefore, it is of great significance to construct a prognostic evaluation tool to aid in clinical decision making, facilitating the personalized and precision management of patients with adrenal cancer.
It is increasingly recognized that the clinical manifestations and biologic behavior of pediatric adrenal cancer differ from those of adult adrenal cancer [2][15][16][17]. For example, NB is unique to the pediatric age group and does not have adult counterparts [3,4]. Pediatric patients with ACC seem to present more endocrine dysfunction features and TP53 mutations than adult patients [16]. Hypertension may be continuous rather than paroxysmal in pediatric PCC [2]. The genomic characteristics of PCC also differ between children and adults [18,19]. Therefore, the prognostic predictors of adrenal cancer are likely to differ between pediatric and adult patients. However, to our knowledge, a prognostic prediction tool has not been proposed specifically for pediatric adrenal cancer patients.
Hence, this study aimed to develop and validate nomograms for postoperative survival prediction in individual patients with pediatric adrenal cancer using the Surveillance, Epidemiology, and End Results (SEER) database.

Patients
In total, 847 adrenal cancer patients diagnosed between 1988 and 2015 from the SEER database were enrolled in this study according to the specified inclusion and exclusion criteria. Inclusion criteria consisted of the following: (a) adrenal cancer confirmed by pathology; (b) surgery of the primary site; (c) age at diagnosis less than 20 years; and (d) clinicopathological data and follow-up information available. Exclusion criteria included the following: (a) patients with another malignancy; (b) patients with bilateral adrenal cancer. The pathway of patient selection is shown in Supplementary Figure S1. All enrolled patients were divided into two cohorts: 661 patients diagnosed between 2005 and 2015 were allocated to the training set, while 186 patients diagnosed between 1988 and 2004 were allocated to the validation set.
Clinicopathological data extracted for each case included age at diagnosis, sex, tumor laterality, tumor size, tumor invasion, N stage and M stage. Follow-up data extracted for each case included survival status, survival time, and cause of death. Overall survival (OS) duration was defined as time from diagnosis until death or last follow-up. Cancer-specific survival (CSS) duration was defined as time from diagnosis until death because of adrenal cancer or last follow-up.
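The two endpoint definitions above differ only in which deaths count as events. A minimal Python sketch of that coding is given here for illustration; the record type and field names are hypothetical and are not taken from the SEER data dictionary or from the study's own code.

```python
from dataclasses import dataclass


@dataclass
class FollowUp:
    """Hypothetical follow-up record; field names are illustrative only."""
    time: float          # time from diagnosis to death or last follow-up
    dead: bool           # True if the patient died during follow-up
    cancer_death: bool   # True if the death was attributed to adrenal cancer


def os_endpoint(rec: FollowUp) -> tuple[float, int]:
    """OS: death from any cause is the event; survivors are censored."""
    return rec.time, 1 if rec.dead else 0


def css_endpoint(rec: FollowUp) -> tuple[float, int]:
    """CSS: only death due to adrenal cancer is the event.

    Deaths from other causes are treated as censored at the time of death.
    """
    return rec.time, 1 if (rec.dead and rec.cancer_death) else 0
```

Under this coding, a patient who died of an unrelated cause contributes an event to the OS analysis but a censored observation to the CSS analysis.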

Nomograms Construction and Performance Assessment
Clinicopathological candidate predictors, including histological type, were tested using a multivariate Cox proportional hazards regression algorithm in the training set. Backward stepwise selection using Akaike's Information Criterion (AIC) was applied to select the significant predictors of OS and CSS [20]. Then, an OS nomogram and a CSS nomogram were constructed based on the multivariate Cox proportional hazards regression models, respectively.
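For reference, the standard forms of the Cox proportional hazards model and of AIC are written out below. The notation (h_0 for the baseline hazard, x_1..x_p for the retained predictors, \ell for the maximized log partial likelihood) is generic and introduced here for illustration; it is not the authors' own notation.

```latex
% Cox proportional hazards model with p predictors
h(t \mid x) = h_0(t)\,\exp\!\left(\beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p\right)

% Akaike's Information Criterion for a fitted model with p coefficients
\mathrm{AIC} = -2\,\ell(\hat{\beta}) + 2p
```

In backward stepwise selection, a predictor is dropped when its removal lowers the AIC, and the procedure stops when no single removal lowers it further.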
The performance of the nomograms was evaluated with respect to their discrimination and calibration in the training set. The Harrell's C-index was applied to quantitatively evaluate the discriminative ability, which is commonly used to assess the discrimination of prognostic models [21]. Note that bootstrapping using 1000 resampling procedures was used to obtain the C-index that was corrected for potential overfitting. The calibration of the nomograms was evaluated by plotting the calibration curves, which compared the nomogram-predicted survival probability with the observed survival probability.
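To make the discrimination measure concrete, the following is a minimal pure-Python sketch of Harrell's C-index for right-censored data. It is a simplified illustration, not the implementation used in the study: pairs with tied survival times are skipped, and the bootstrap correction described above is not included.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable patient pairs ranked correctly by risk score.

    A pair (i, j) is comparable when patient i had an observed event and
    patient j was still under observation after that time. The pair is
    concordant when the patient who failed earlier has the higher risk
    score; tied risk scores count as half. Tied times are skipped here
    (a simplifying assumption).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[j] > times[i]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported C-indices should be read.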

Validation of the Nomograms
The performance of the two nomograms was validated in the validation set, respectively. The multivariate Cox proportional hazards regression formulas constructed using the training set were applied to all patients of the validation set, with risk scores calculated for each patient to reflect the risk of all-cause and cancer-specific mortality. Cox proportional hazards regression was then performed by using the risk score as a factor in the validation set.
Finally, based on the regression analyses, the C-indices were calculated and the calibration curves were plotted to validate the performance of the nomograms.

Categorization of Patients into High- or Low-risk Groups
A risk score for each patient was calculated based on the multivariate Cox proportional hazards regression formula. Then all patients were divided into high-risk and low-risk groups based on the optimal risk score cutoff value, which was identified by using X-tile plots in the training set [22]. The difference in the survival curves of the high-risk and low-risk groups was assessed by using the log-rank test. Moreover, stratified analyses were also performed within various subgroups in the combined training and validation set.
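The idea of choosing a cutoff by the largest log-rank statistic can be sketched as follows. This is an illustrative pure-Python approximation of the principle only; it is not the X-tile software, and the function names are hypothetical.

```python
def logrank_chi2(times, events, groups):
    """Two-group log-rank chi-square statistic (1 degree of freedom).

    groups holds 0 (low risk) or 1 (high risk) for each patient.
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for ti in times if ti >= t)  # at risk, both groups
        n1 = sum(1 for ti, g in zip(times, groups) if ti >= t and g == 1)
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)
        d1 = sum(1 for ti, e, g in zip(times, events, groups)
                 if ti == t and e and g == 1)
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0
    return observed_minus_expected ** 2 / variance


def best_cutoff(times, events, scores):
    """Scan observed scores and return (cutoff, chi2) with the largest chi2.

    Patients with score > cutoff form the high-risk group (an assumption).
    Returns None when all scores are identical.
    """
    best = None
    for c in sorted(set(scores))[:-1]:  # drop the max so both groups are non-empty
        groups = [1 if s > c else 0 for s in scores]
        chi2 = logrank_chi2(times, events, groups)
        if best is None or chi2 > best[1]:
            best = (c, chi2)
    return best
```

In keeping with the procedure described above, such a scan would be run on the training set only, and the resulting cutoff then applied unchanged to the validation set.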

Clinical Usefulness of the Nomograms
Decision curve analysis (DCA) was used to estimate the clinical usefulness of the proposed nomograms by calculating the net benefits at different threshold probabilities. DCA can serve as a comprehensive method for assessing and comparing different diagnostic and prognostic models [23].
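The net benefit at a given threshold probability can be illustrated with a short Python sketch. This version assumes a binary outcome known for every patient; the survival-data analysis used in the study must additionally account for censoring, which this sketch does not attempt. Function names are hypothetical.

```python
def net_benefit(pred_probs, outcomes, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold.

    net benefit = TP/n - (FP/n) * threshold / (1 - threshold)
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcomes)
    tp = sum(1 for p, y in zip(pred_probs, outcomes) if p >= threshold and y)
    fp = sum(1 for p, y in zip(pred_probs, outcomes) if p >= threshold and not y)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(outcomes, threshold):
    """Net benefit of the 'treat-all' strategy; 'treat-none' is always 0."""
    prevalence = sum(1 for y in outcomes if y) / len(outcomes)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)
```

A model is considered useful at a threshold when its net benefit exceeds both the treat-all value and zero (the treat-none value) at that threshold.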

Incremental predictive value of histologic grade
Since histologic grade has been reported as a factor associated with prognosis in ACC and NB patients, we performed additional analyses to explore whether it adds to the value of the presented nomograms [24,25]. Of the 847 patients, histologic grade was recorded for only 491. Therefore, the incremental value of histologic grade as an additional candidate predictor was assessed in this subset, with C-indices calculated, calibration curves plotted and DCA performed.

Statistical Analyses
The X-tile software version 3.6.1 (Yale University School of Medicine, New Haven, CT, USA) was used to create the X-tile plots. X-tile plots provide a single method to automatically select the optimum cutoff based on the highest χ² value (i.e., minimum P value) defined using a Kaplan-Meier survival analysis and the log-rank test [22]. All other statistical analyses were performed using R statistical software version 3.5.1 (https://www.r-project.org/). The "survival" package and "MASS" package were used to perform the Cox proportional hazards regression model analysis. The nomograms and calibration plots were produced using the "rms" package. DCA was performed using the function "stdca.R." All statistical tests were two-tailed, and P < 0.05 was deemed significant.

Patient Clinicopathological Characteristics
The clinicopathological characteristics of the patients in the training and validation sets are presented in Table 1 and Supplementary Table S1. In total, 77.2% and 13.0% of patients were diagnosed with NB and GNB, respectively, and ACC accounted for 7.6% of all patients (Supplementary Figure S2). As for the distribution of age at diagnosis, more than 73% of patients were diagnosed at no more than 3 years of age. A single peak was seen at 1 year of age, and similar findings were observed in the male and female subgroups (Supplementary Figure S3A). However, the distribution of age at diagnosis varied among tumor types (Supplementary Figure S3B). Among all enrolled patients, 193 patients (22.8%) died during follow-up, and 174 patients (20.5%) died of adrenal cancer. Median follow-up was 4.3 years (interquartile range, 1.8-8.5). There was no significant difference in OS (P = 0.310) or CSS (P = 0.260) among patients with different histological types (Supplementary Figure S4).

Nomograms Construction and Performance Assessment
Age at diagnosis, tumor size and M stage were identified as independent predictors of OS based on the multivariate Cox proportional hazards regression algorithm (Table 2). Then, the OS nomogram was constructed by incorporating these three predictors based on the multivariate Cox proportional hazards regression model (Figure 1A). The OS nomogram showed favorable discrimination with a C-index of 0.744 (95% CI, 0.711-0.777) in the training set. The calibration curves for the 1-, 3- and 5-year OS showed favorable agreement between the nomogram-predicted OS probability and actual OS probability, indicating good calibration of the OS nomogram in the training set (Figure 1B).
The three variables, including age at diagnosis, tumor size and M stage, were also found to be independent predictors of CSS based on the multivariate Cox proportional hazards regression algorithm (Table 3). The CSS nomogram was developed by incorporating these predictors (Figure 2A). The CSS nomogram yielded a C-index of 0.749 (95% CI, 0.715-0.783). The calibration curves for the 1-, 3- and 5-year CSS also showed favorable calibration of the CSS nomogram in the training set (Figure 2B).

Validation of the Nomograms
The favorable discrimination of the OS nomogram was confirmed using the validation set (C-index [95% CI], 0.746 [0.656-0.836]). Good calibration of the OS nomogram was also observed in the validation set (Figure 1C). As for the CSS nomogram, the C-index was 0.789 (95% CI, 0.710-0.868). The calibration curves for the 1-, 3- and 5-year CSS in the validation set also confirmed the good calibration of the CSS nomogram (Figure 2C).

Categorization of Patients into High- or Low-risk Groups
The risk score was calculated for OS and CSS by using the following formulas: OS risk score = 0.074 × age at diagnosis + 0.037 × tumor size + 2.192 × M (with distant metastasis).
Note that the indicator function (M) is equal to 1 if the statement in the parentheses is true and is equal to 0 otherwise.
The optimal OS risk score cutoff generated by the X-tile plots was 2.41 (Supplementary Figure S5A-C). All patients were classified into high-risk and low-risk groups according to the optimal cutoff value. We assessed the distributions of the OS risk score and OS status in the combined training and validation set, and found that patients with higher risk scores were more likely to die (Figure 3A). There was a significant difference between the OS of the high-risk and low-risk patients in the training set (Figure 4A), which was confirmed in the validation set (Figure 4B). The OS risk score was also associated with OS in the combined training and validation set (P < 0.001, Figure 4C) and in the stratified analyses (Supplementary Figure S6). Thus, the OS nomogram can successfully distinguish patients at high risk of all-cause mortality.
As for the CSS risk score, we defined an optimal cutoff value of 2.48 based on the X-tile plots (Supplementary Figure S5D-F). Accordingly, the patients were categorized into high-risk and low-risk groups for CSS. The distributions of the CSS risk score and CSS status in all patients are shown in Figure 3B. Patients with higher risk scores were more likely to die of adrenal cancer. A significant difference between the CSS of the high-risk and low-risk patients was observed in both the training and validation sets (Figure 4D and 4E, respectively). The CSS risk score was also associated with CSS in the combined training and validation set (P < 0.001, Figure 4F) and in the stratified analyses (Supplementary Figure S7). Therefore, patients at high risk of cancer-specific mortality can be identified by using our CSS nomogram.
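The OS risk score formula and cutoff reported above can be applied as in the following minimal Python sketch. The coefficients and the cutoff of 2.41 are taken from the text; the function names are hypothetical, the units of age and tumor size are not stated in this section and must match those used to fit the model, and the choice of a strict "greater than" comparison at the cutoff is an assumption.

```python
# Coefficients and cutoff as reported in the text for the OS model.
OS_COEF_AGE = 0.074
OS_COEF_TUMOR_SIZE = 0.037
OS_COEF_METASTASIS = 2.192
OS_CUTOFF = 2.41


def os_risk_score(age_at_diagnosis, tumor_size, distant_metastasis):
    """OS risk score; inputs must use the same units as the fitted model.

    distant_metastasis is the indicator M: True contributes the full
    coefficient, False contributes nothing.
    """
    score = OS_COEF_AGE * age_at_diagnosis + OS_COEF_TUMOR_SIZE * tumor_size
    if distant_metastasis:
        score += OS_COEF_METASTASIS
    return score


def os_risk_group(score, cutoff=OS_CUTOFF):
    """Assign 'high' when score > cutoff (strictness is an assumption)."""
    return "high" if score > cutoff else "low"
```

Because the coefficient for distant metastasis (2.192) is close to the cutoff itself, the presence of metastasis places a patient near the high-risk boundary before age and tumor size are added.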

Clinical Usefulness of the Nomograms
The DCAs of the OS nomogram and CSS nomogram are presented in Figure 5. At 5 years, the OS nomogram and the CSS nomogram offered a net benefit over the "treat-all" or "treat-none" strategy at threshold probabilities below 71% and 76%, respectively. Similar DCA findings were observed in both the training and validation sets (Supplementary Figure S8). Therefore, the presented nomograms are clinically useful.

Incremental predictive value of histologic grade
The new models after the addition of histologic grade are presented in Supplementary Figure S9A and S10A (defined as OS nomogram II and CSS nomogram II, respectively). The calibration curves demonstrated good calibration for the OS nomogram II and CSS nomogram II (Supplementary Figure S9B and S10B, respectively). However, C-indices indicated that incorporating histologic grade did not improve prediction performance for either the OS nomogram II

Discussion
Our study demonstrates that age at diagnosis, tumor size and M stage were independent predictors of OS and CSS in patients with pediatric adrenal cancer after surgery. To provide easy-to-use tools for postoperative survival prediction in individual patients, an OS nomogram and a CSS nomogram were developed incorporating these predictors, respectively. The nomograms showed good discrimination and calibration in the training and validation sets, which may aid in clinical decision-making.
Over the past several decades, the outcome for childhood cancer has dramatically improved. However, the long-term outcome of pediatric adrenal cancer patients with high risk remains poor [9,17,26]. Pediatric adrenal cancer is a heterogeneous malignant neoplasm with prognosis ranging from near uniform survival to high risk for fatal demise. The heterogeneity of the outcome makes it difficult to provide an assessment of the prognosis after surgery of the pediatric adrenal cancer [9,13,14]. Indeed, if clinicians can identify patients at high risk after surgery, then systemic therapy can be implemented in time. Therefore, accurately estimating the prognosis can optimize disease management and may improve patient outcome.
As far as we know, research that specifically focuses on risk factors for outcomes of pediatric NB located in the adrenal gland has not been reported [27]. In addition, perhaps because of the low incidence of ACC, only a few prognostic prediction models have previously been reported for ACC patients [24,28,29]. However, these models have limitations that warrant further study. For example, cases with insufficient data were not excluded from analysis, and "unknown" was treated as a category in some variables [24,28]; age at diagnosis was used as a categorical variable rather than a continuous variable [24]; and a clinically uninformative variable, "year of diagnosis," was even identified as an independent risk factor for survival and incorporated in the final model [24]. More importantly, these models were developed in ACC patients of all ages, which neglected that the prognostic predictors of ACC differ between pediatric and adult patients [24,28,29]. Because of the small number of cases, it is hard to develop reliable models for rare tumors such as pediatric ACC. In view of the above, the proposed nomograms were developed in pediatric adrenal cancer patients with various histological types. Note that histological type was used as a candidate predictor for regression analyses in our study. However, it was not associated with OS or CSS among pediatric adrenal tumor patients based on the regression analyses. Stratified analyses were then performed in the histological type subgroups to confirm whether the models were applicable to different tumor types. Encouragingly, our nomograms performed well in the different histological type subgroups as well. Therefore, although some prognostic prediction models for adrenal cancer have been reported as mentioned above, our study offers several improvements over previous studies.
In this study, age at diagnosis, tumor size and M stage were identified as independent predictors of OS and CSS in patients with pediatric adrenal cancer after surgery. Older age at diagnosis was an adverse prognostic factor in our study, which was consistent with the results reported in previous studies on ACC or pediatric NB [27,29]. Tumor size also played an important role in predicting the prognosis of pediatric patients with adrenal cancer after surgery. In our study, tumor size was used as a continuous variable rather than a categorical variable, which could provide more detailed information, thus improving the model performance. For many types of pediatric adrenal cancer, staging is based on information and data primarily from adult populations. A tumor size of 5 cm is often used as a cutoff for grouping patients in terms of tumor stage for some types of adrenal cancer. However, it is reasonable to suspect that this tumor size cutoff may be inappropriate for pediatric patients because of the different body size and different characteristics of adrenal cancer between pediatric and adult patients. We tried to explore the ideal cutoff for tumor size using X-tile plots in all enrolled patients. As a result, an optimal cutoff value of 10.0 cm was defined, which is larger than 5 cm. This result also indicates that a staging system should be established specifically for pediatric adrenal cancer patients for better disease management. M stage is another well-established prognostic variable in pediatric adrenal cancer [9][30][31][32]. Our study further showed that, even after surgery, patients with distant metastatic disease had a significantly worse prognosis than those without.
To our knowledge, this is the first attempt to develop and validate nomograms for survival prediction in individual patients with pediatric adrenal cancer after surgery. Our study has several strengths. First, the presented prognostic models are specifically for pediatric patients, which can better reflect the characteristics of this population. Since the clinical manifestations and biologic behavior of pediatric adrenal cancer differ from those of adult adrenal cancer, the prognostic predictors of adrenal cancer are not the same between pediatric and adult patients. Therefore, prognostic models need to be developed separately for these two populations. Second, our nomograms are applicable to different histological types of pediatric adrenal cancer, which provides user-friendly tools for clinicians and patients, especially for those with rare tumor types. Third, our study indicated that tumor size is an independent predictor of OS and CSS in patients with pediatric adrenal cancer after surgery, and the optimal cutoff for tumor size we identified is quite different from the tumor size cutoff defined by the current staging system for some types of adrenal cancer. This suggests that a staging system should be established specifically for pediatric adrenal cancer patients rather than relying on the current staging system, which is derived from data on adult populations.
Despite these strengths, some unavoidable limitations of our study should be considered. First, because of the retrospective nature of the study and the strict inclusion and exclusion criteria used, selection bias may have occurred. For instance, patients with another malignancy or with bilateral adrenal cancer were excluded from our study, which limits the application of our models to these patients. These criteria introduced selection bias by removing patients with a worse prognosis (i.e., patients with another malignancy), so our models may be accurate only in the specific patient population studied. Second, data from the SEER database also suffer from a lack of detail. The selected candidate factors were based on our clinical experience, previously published studies and the available data from the SEER database. Unrecorded clinical characteristics, such as clinical manifestation, comorbidity and mitotic index, might also be associated with patient outcome. In addition, the presented nomograms do not include data on molecular markers, which may serve as promising predictors. Thus, further studies are warranted to address this issue. Third, although a validation set was used for model validation in our study, further external validation in other datasets is warranted to confirm the generalizability of our nomograms before clinical application.
In conclusion, we proposed two nomograms for the OS and CSS prediction in individual patients with pediatric adrenal cancer after surgery, respectively. The nomograms showed favorable prediction efficiency, which was validated in the validation set.