The association of HLA/KIR genes with non-small cell lung cancer (adenocarcinoma) in a Han Chinese population

The host immune system plays a crucial role in the surveillance, recognition and elimination of tumor cells. Recent studies found that human leukocyte antigen class I (HLA I) genes, killer cell immunoglobulin-like receptor (KIR) genes and HLA/KIR combinations play a role in the defense against tumor cells. To evaluate the associations of HLA I genes, KIR genes and HLA/KIR combinations with non-small cell lung cancer (NSCLC) in a Chinese Han population, a total of 229 patients with NSCLC (adenocarcinoma) and 217 healthy individuals were studied. Our results showed that the HLA-C*08:01 allele occurred at a significantly higher frequency in the NSCLC patients than in the controls (P=0.034). The frequencies of HLA haplotypes bearing the HLA-A, -B, and -C loci did not differ between the NSCLC and control groups (P>0.05). There were also no differences in the KIR gene, genotype and haplotype frequencies between the NSCLC and control groups (P>0.05). Likewise, the frequencies of the HLA/KIR combinations (KIR3D genes with HLA-A3/A11 and HLA-Bw4 ligands, and KIR2D genes with HLA-C1/C2 ligands) did not differ between the NSCLC and control groups (P>0.05). Our results indicate that the HLA-C*08:01 allele could be a risk factor for NSCLC (adenocarcinoma) in the Chinese Han population (OR=2.395; 95% CI: 1.359-4.221).


Introduction
Lung cancer causes a large number of cancer-related deaths around the world, and its 5-year survival rate is approximately 15% [1]. Non-small cell lung cancer (NSCLC) accounts for approximately 80% of lung cancer cases [2]. A growing number of studies indicate that host genes, especially those of the immune system, play a key role in the susceptibility to and development of NSCLC.
The host immune system can recognize and eliminate tumor cells [3]; this is mediated by adaptive CD8+ cytotoxic T cell responses. Human leukocyte antigen class I (HLA-I) molecules play a key role in presenting tumor antigens for recognition by CD8+ cytotoxic T cells [4]. HLA-I molecules are encoded by HLA I genes; the three classical transplantation HLA genes (HLA-A, HLA-B and HLA-C) have been observed to have multiple roles in immune regulation [5].
HLA-I molecules and killer cell immunoglobulin-like receptors (KIR) serve as ligands and receptors, respectively, and their interplay is important for transmitting activating or inhibitory signals that regulate the function of NK cells [11]. Their interactions are also a component of the innate immune system, which plays a role in the defense against tumor cells. Thus, HLA I genes, KIR genes and their combinations are valuable candidates for investigating associations with susceptibility or worse clinical outcomes in different types of cancer, such as kidney cancer, breast cancer, colorectal cancer and lung cancer [12][13][14][15][16][17][18].
To investigate the role of genetic variations in HLA I genes, KIR genes and HLA/KIR combinations in the susceptibility or development of NSCLC, we evaluated the association of HLA I genes, KIR genes and HLA/KIR combinations with NSCLC (adenocarcinoma) in a Chinese Han population in this study.

Ethical approval and informed consent
All procedures were in accordance with the ethical standards of the responsible committee on human experimentation and with the Helsinki Declaration of 1964, as revised in 2013. All experimental protocols used in this study were approved by the Institutional Review Boards of the No. 3 Affiliated Hospitals of Kunming Medical University. All participants provided written informed consent.
[...] [19]. The inclusion criteria for NSCLC patients were: 1) age ≥18 years with histologically and pathologically diagnosed NSCLC; 2) a diagnosis of adenocarcinoma; 3) no preoperative neoadjuvant therapy (including chemotherapy and radiotherapy). The exclusion criteria were: 1) a prior history of primary cancer other than lung cancer; 2) small cell lung cancer; 3) malignant tumors other than lung cancer; 4) prior radiotherapy or chemotherapy, or an unclear pathological diagnosis. Clinical characteristics and data, such as sex, age, family history of cancer, and histological type of cancer, were obtained. The 217 healthy controls (143 males and 74 females) were recruited from a population undergoing routine examinations at the No. 3 Affiliated Hospitals of Kunming Medical University. The controls underwent clinical examinations, had no cancer, family history of NSCLC, or respiratory disease, and were matched to the cases by age and gender. All participants (NSCLC patients and healthy controls) were unrelated Chinese Han individuals.

HLA I genes (HLA-A, -B and -C) genotyping
Blood samples were collected and genomic DNA was extracted from peripheral lymphocytes using the QIAamp Blood Kit (Qiagen, Hilden, Germany) according to the manufacturer's protocol. The HLA-A, -B, and -C genes were genotyped by next-generation sequencing (NGS)-based HLA typing using the NGSgo® Illumina MiSeq workflow (Illumina, Inc. San Diego, CA, USA) according to the manufacturer's instructions. Briefly, the NGSgo-AmpX kit was first used to perform HLA locus-specific (HLA-A, -B, and -C) amplification. The large amplicons were then randomly fragmented to 300-800 bp to ensure that the whole gene was evenly covered by numerous overlapping reads; end-repair and dA-tailing were completed in the same step. After fragmentation, adapter ligation was performed, followed by a second clonal amplification and then sequencing. During sequencing, millions of reads were generated from multiple samples and multiple loci simultaneously. The NGS engine software package (Illumina) was used to determine the HLA genotypes for the individual samples and loci from the NGS data in three steps: (1) alignment, (2) haplotype phasing, and (3) genotype determination.

KIR gene genotyping
In the current study, sixteen KIR genes (excluding 3DP1) [20] were genotyped using multiplex PCR sequence-specific priming (PCR-SSP) and identified by agarose gel electrophoresis as described previously [21]. Two sets of primers were designed for each of the sixteen KIR genes, and a gene was scored as present only when both primer sets produced an amplicon.
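The two-primer-set rule described above can be expressed as a small sketch. The function names and the gel readout format (a pair of booleans per gene) are illustrative assumptions and not part of the published protocol; a discordant result is scored as absent here only because the text requires both amplicons for a positive call.

```python
def call_kir_gene(primer_set_1_amplified, primer_set_2_amplified):
    """Score a KIR gene as present only when both primer sets amplified."""
    return bool(primer_set_1_amplified) and bool(primer_set_2_amplified)


def call_kir_profile(gel_results):
    """Turn per-gene band readouts into a presence/absence profile.

    gel_results maps a gene name to a (set_1, set_2) pair of booleans.
    This input layout is a hypothetical convenience, not a lab format.
    """
    return {gene: call_kir_gene(*bands) for gene, bands in gel_results.items()}


# Hypothetical readout for one sample (not data from the study).
example = {
    "KIR2DL1": (True, True),
    "KIR2DS2": (True, False),   # discordant primer sets: scored absent
    "KIR3DS1": (False, False),
}
profile = call_kir_profile(example)
```

In practice a discordant pair would usually prompt a repeat of the reaction; the sketch only encodes the scoring rule stated in the text.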

Statistical analysis
The HLA-A, -B and -C allele frequencies were calculated with the PyPop and PyHLA software based on the genotyping results [22][23][24]. The χ² test was used to determine differences in the allele frequencies between the NSCLC and healthy control groups. The odds ratios (OR) and associated 95% confidence intervals (CIs) were also calculated for allele-specific risks. Hardy-Weinberg equilibrium was assessed using the Guo and Thompson method [25]. The HLA-A, -B and -C haplotypes were inferred, and their frequencies estimated, using the expectation-maximization algorithm based on the genotyping results.
For KIR loci, the frequency of each KIR gene was determined by direct counting. The genotypes and haplotypes were defined by referring to the Allele Frequencies website (http://www.allelefrequencies.net). The χ² test was used to determine differences in gene, genotype and haplotype frequencies between the NSCLC and healthy control groups.
The false discovery rate (FDR) correction was used for multiple comparisons [22,31], and an adjusted P value of less than 0.05 was considered statistically significant.
Table 1 lists the characteristics of the enrolled subjects. There were no age or gender differences between the NSCLC and healthy control groups (P>0.05).
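As an illustration of the allele-level comparison described above, the following minimal sketch computes a Pearson χ² statistic (1 degree of freedom, no continuity correction) and an odds ratio with a Wald (log-scale) 95% CI from a 2x2 table of allele counts. The source does not state which CI method or continuity correction was used, so both choices are assumptions; the counts in the example are hypothetical and are not the study data.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for a 2x2 table [[a, b], [c, d]], 1 df.

    Returns (statistic, p_value). For 1 df the upper-tail probability
    equals erfc(sqrt(x / 2)), so no external library is needed.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return stat, math.erfc(math.sqrt(stat / 2.0))


def odds_ratio_ci(a, b, c, d, z=1.959964):
    """Odds ratio with a Wald interval computed on the log scale.

    a, b: copies of the allele / all other alleles in cases
    c, d: copies of the allele / all other alleles in controls
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts: 229 cases give 458 allele copies, 217 controls give 434.
stat, p_value = chi_square_2x2(40, 418, 18, 416)
odds_ratio, lower, upper = odds_ratio_ci(40, 418, 18, 416)
```

A Wald interval is symmetric around the OR on the log scale, so the geometric mean of its bounds equals the OR. The interval reported in this study has that property (the square root of 1.359 × 4.221 is approximately 2.395), which is consistent with, though not proof of, a log-scale interval having been used.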
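Direct counting of KIR gene frequencies, as described above, amounts to dividing the number of carriers of each gene by the number of subjects in the group. The sketch below assumes each subject is represented by a presence/absence dictionary; that data layout, the function name, and the three example subjects are hypothetical.

```python
from collections import Counter


def kir_gene_frequencies(profiles):
    """Carrier frequency of each KIR gene by direct counting.

    profiles is a list with one dict per subject, mapping gene name to
    True (present) or False (absent). A gene missing from a subject's
    dict is treated as absent.
    """
    if not profiles:
        raise ValueError("at least one subject is required")
    carriers = Counter()
    for profile in profiles:
        for gene, present in profile.items():
            if present:
                carriers[gene] += 1
    genes = sorted({gene for profile in profiles for gene in profile})
    return {gene: carriers[gene] / len(profiles) for gene in genes}


# Three hypothetical subjects (not data from the study).
group = [
    {"KIR2DL1": True, "KIR2DS2": True},
    {"KIR2DL1": True, "KIR2DS2": False},
    {"KIR2DL1": True, "KIR2DS2": False},
]
frequencies = kir_gene_frequencies(group)
```

These are carrier (phenotype) frequencies; because PCR-SSP detects only presence or absence, gene copy number is not available from this kind of data.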
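The FDR correction mentioned above is most commonly implemented with the Benjamini-Hochberg step-up procedure. The text does not name the specific procedure, so treating it as Benjamini-Hochberg is an assumption, and the P values in the example are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order.

    Each raw P value is multiplied by m / rank (rank 1 = smallest), and
    monotonicity is enforced by taking a running minimum from the
    largest P value downwards.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw P values for four alleles (not results from the study).
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adjusted]
```

An allele is then called significant when its adjusted P value is below 0.05, which matches the threshold stated in the text.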

Association of HLA I genes and their haplotypes with NSCLC
The allele frequencies for HLA-A, -B and -C in the NSCLC and healthy control groups are listed in Table 2. The genotype frequencies for HLA-A, -B and -C were in Hardy-Weinberg equilibrium in both the NSCLC and healthy control groups (P>0.05). The frequencies of the HLA-A*02:06, B*13:01, B*40:01, and C*08:01 alleles differed between the NSCLC and healthy control groups (P<0.05). However, after FDR correction, only HLA-C*08:01 occurred at a significantly higher frequency in the NSCLC group than in the healthy control group (P=0.034). [...] were different between the NSCLC and healthy control groups (Table 3); however, the significance of these differences faded after FDR correction (Table 3). [...] However, after FDR correction, there were no differences in the HLA alleles and haplotypes between pathologic stages I+II and III+IV (data not shown).
[...] However, there were no significant differences in the KIR genes and genotypes between the NSCLC and healthy control groups after FDR correction (Tables 4 and 5). KIR haplotypes A and B showed no significant differences between the NSCLC and healthy control groups, with frequencies of 0.459 vs. 0.525 for haplotype A and 0.541 vs. 0.475 for haplotype B, respectively (P>0.05) (Supplementary Table 1). In the subgroup analysis, only KIR genotype 9 showed differences between pathologic stages I+II and III+IV (data not shown). However, after FDR correction, there were no differences in the KIR gene, genotype and haplotype frequencies between pathologic stages I+II and III+IV (data not shown).

Associations of HLA/KIR combinations with NSCLC
The distribution of the two KIR3D genes and their HLA-A3/A11 or HLA-B Bw4 ligands is given in Supplementary Table 2. There were no significant differences (P>0.05) in their combination frequencies between the NSCLC and healthy control groups. The distribution of the KIR2D genes and their HLA-C1/C2 ligands is given in Supplementary Table 2. The results for the KIR2D genes and their HLA-C1/C2 ligands showed that there were no significant differences in the frequencies between the NSCLC and healthy control groups (P>0.05).
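A receptor-ligand combination such as those above is typically tallied per subject: a subject counts as a carrier only when both the KIR gene and its HLA ligand group are present. The sketch below makes that rule explicit. The subject records, field names, and the example pair are hypothetical, and the ligand groups (for example C1 or C2) are assumed to have been assigned from the HLA typing beforehand.

```python
def combination_carriers(subjects, kir_gene, ligand):
    """Count subjects carrying both a KIR gene and its HLA ligand group.

    Each subject is a dict with two sets: "kir" (KIR genes present) and
    "ligands" (HLA ligand groups present, e.g. "C1", "C2", "Bw4").
    """
    return sum(
        1
        for subject in subjects
        if kir_gene in subject["kir"] and ligand in subject["ligands"]
    )


# Hypothetical subjects (not data from the study).
cases = [
    {"kir": {"KIR2DL1", "KIR2DL3"}, "ligands": {"C1", "C2"}},
    {"kir": {"KIR2DL1"}, "ligands": {"C1"}},
    {"kir": {"KIR2DL3"}, "ligands": {"C2"}},
]
n_with = combination_carriers(cases, "KIR2DL1", "C2")
n_without = len(cases) - n_with
```

The carrier and non-carrier counts from the case and control groups form the 2x2 table that is then compared between groups.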
NSCLC is one of the most immunogenic tumors, and its tumor antigens can potentially be recognized by CD8+ cytotoxic T lymphocytes (CTLs), which mediate antitumor responses [38]. In 2010, Yang et al [39] [...] (Table 2), though there were no differences between the two groups after FDR correction. Nevertheless, our results showed that HLA-C*08:01 occurred at a significantly higher frequency in the NSCLC group than in the healthy control group, and the difference remained significant after FDR correction (OR=2.395; 95% CI: 1.359-4.221). One reason for the discrepancy between the Yang study and our study could be differences in sample size and pathomorphological type. In the current study, we enrolled 229 NSCLC patients and 217 healthy individuals, while only 100 lung cancer patients were included in Yang's study (81 NSCLC and 19 small cell lung cancer patients). Additionally, our NSCLC group included only adenocarcinoma patients, while Yang's study included lung cancer patients who were not further classified by pathomorphological type. Another reason might be genetic differences between the Northern Han and Yunnan Han populations. In 2015, Zhou et al. investigated the distribution of HLA allele and haplotype frequencies in the Chinese population and found regional differences in HLA diversity in China [40]. Moreover, our previous study showed that Han Chinese populations were divided into northern and southern clusters based on HLA-A, -B, and -DRB1 allele frequencies using phylogenetic tree and multidimensional scaling analyses [41]. The Han population from Yunnan is situated between the northern and southern clusters and showed genetic characteristics different from those of the Northern Han population [41]. A third reason could be that there was no correction for multiple comparisons in Yang's study, whereas we applied FDR correction in the current study.
In 2009, Nagata et al [42] reported that HLA-A*02 and HLA-A*24 could be prognostic factors for Japanese patients with NSCLC. Subsequently, Bulut et al [43] reported that HLA-A*02 was an independent risk factor for lymph node and distant metastases in Turkish patients with NSCLC. They also found that HLA-A*26 appeared to be a protective allele against metastases. However, we did not observe such differences in the HLA I gene allele and haplotype frequencies between pathologic stages I+II and III+IV after multiple comparison correction.
To escape CD8+ cytotoxic T cell recognition, tumor cells downregulate the expression of HLA I molecules at the cell surface [44]. However, if HLA expression is completely lacking, NK cells will attack and kill the tumor cells. Thus, variations in KIR genes have been thought to be a major risk factor for developing cancer [45,46]. In 2010, Al Omar et al [17] reported that there were no significant differences in the proportion of patients with different KIR genes, genotypes and haplotypes among patients with solid tumors (NSCLC, small-cell lung cancer, colon cancer and kidney cancer). Then, Yu et al reported that the KIR gene frequencies showed no differences between NSCLC patients and healthy controls in a Chinese Han population [14]. Wiśniewski et al [13] also did not find such differences in Polish Caucasians. Our results were similar to those of Al Omar, Yu and Wiśniewski in that we also found no differences in the KIR gene, genotype and haplotype frequencies. We also found no differences in the KIR gene, genotype and haplotype frequencies between pathologic stages I+II and III+IV after multiple comparison correction.

Discussion
Several studies have reported that certain HLA/KIR combinations are associated with susceptibility, worse clinical outcomes and response to treatment in several types of cancer, including breast cancer, colorectal cancer and lung cancer [12][13][14][15][16][17][18]. In 2010, Al Omar et al performed an association study of HLA/KIR combinations with NSCLC; they found that NSCLC patients showed a significant increase in the frequency of KIR2DL1-C2 and a decrease in the frequency of KIR2DL3-C1 in homozygotes [17]. However, in the current study, our results did not show any differences in the HLA/KIR frequencies between the NSCLC patients and healthy controls in the Chinese Han population, which was similar to a report by Wiśniewski for Polish Caucasians [13]. These authors observed only that HLA-C heterozygous genotypes encoding both C1 and C2 ligand epitopes for the KIR2DL2/3 and KIR2DL1 inhibitory receptors, respectively, were less frequent in patients than in controls, whereas the opposite was true for C1C1 and C2C2 homozygotes. However, this association of KIR ligands was independent of their respective receptors or any other KIR genes [13]. In the present report, even when subgroup analysis was performed, there were no differences in the HLA/KIR combinations between pathologic stages I+II and III+IV.
Nevertheless, Wiśniewski et al found that NSCLC patients positive for the KIR2DL2 and KIR2DS2 genes and homozygous for C1 were 6 times more likely to respond to treatment than those with other genotypes (P=0.034). In accordance with these results, patients with the KIR2DL2+/KIR2DS2+-C1C1 genotype survived longer than others (P=0.009) [13]. However, Yu et al did not observe any relationship between HLA/KIR combinations and response to treatment in NSCLC in a Chinese Han population [14]. We did not examine the response to NSCLC treatment, which is one of the limitations of the current study; thus, we could not compare our results with those of others. However, we did not observe any differences in the HLA/KIR combinations between pathologic stages I+II and III+IV.
We used an association study design to investigate the relationship between HLA/KIR and NSCLC. One disadvantage of association studies is the high probability of false positives. Several factors influence the false-positive rate, such as population heterogeneity and stratification, and multiple testing. To avoid population heterogeneity and stratification, we recruited Han Chinese from a single region only; in addition, we restricted the pathomorphological type to adenocarcinoma and applied a series of inclusion and exclusion criteria to the case and control groups, which were matched in gender and age. To address multiple testing, we used FDR correction for multiple comparisons and considered an adjusted P value of less than 0.05 statistically significant. FDR is a method of conceptualizing the rate of type I error in null hypothesis testing when conducting multiple comparisons. FDR-controlling procedures provide less stringent control of type I errors than family-wise error rate (FWER) controlling procedures and have greater power at the cost of an increased number of type I errors [31].
Another limitation of the current study concerns smoking history, a key risk factor for NSCLC, which was not included in the data collected from the healthy control individuals. Although about 85-90% of lung cancer cases occur in smokers, the frequency of this malignancy among non-smokers seems to be rising [47]. Several genetic polymorphisms have recently been found to be associated with lung cancer among non-smoking Chinese women [48,49], including those not exposed to cooking oil fumes [50,51]. In Brazilian lung cancer patients (mostly with adenocarcinomas), different rates of EGFR and KRAS mutations were found in smokers and non-smokers [52]. The lack of smoking status for our healthy control individuals and patients makes it difficult to perform analyses that include such exposure variables or a gene-smoking interaction analysis. Therefore, we are planning to recruit sufficient numbers of non-smoker NSCLC patients and a comparable number of non-smoker controls to examine genetic risk factors independent of smoking.
Our finding that the HLA-C*08:01 allele is associated with NSCLC in Yunnan Han Chinese is novel; to our knowledge, no comparably detailed study has so far been performed in this or other populations (PubMed search, 3 January 2019). In addition, this association could hardly be detected in, for example, Caucasians, as HLA-C*08:01 is very rare there, in contrast to HLA-C*08:02, which is more frequent in Europeans but extremely rare in Chinese (www.allelefrequencies.net). In 2016, Tran et al reported that they identified a polyclonal CD8+ T-cell response against mutant KRAS G12D in tumor-infiltrating lymphocytes, and observed regression of lung metastases after the infusion of HLA-C*08:02-restricted tumor-infiltrating lymphocytes that were composed of four different T-cell clonotypes specifically targeting KRAS G12D [53]. The molecules encoded by these two alleles differ to some extent in their peptide-binding motifs and therefore might also differ in their roles in the immune surveillance of cancer [54].

Conclusion
In the current study, we performed an association study of HLA I genes, KIR genes and HLA/KIR combinations with NSCLC in a Chinese Han population and found that HLA-C*08:01 could be a risk factor for NSCLC (adenocarcinoma). Larger-scale studies are needed to clarify the association of HLA I genes, KIR genes and HLA/KIR combinations with NSCLC susceptibility, resistance and disease progression.