Wei Ping1,2, Li Hui2, Zhang Xiao-Min3, Li Jie-Wei2, Yang Siye1 and Li Junli2*
1Sichuan Academy of Traditional Chinese Medicine/Sichuan Institute for Translational Chinese Medicine/Translational Chinese Medicine Key Laboratory of Sichuan Province, China
2College of Computer Science, Sichuan Normal University, China
3School of Life Science and Engineering, Southwest University of Science and Technology, China
*Corresponding author: Li Junli, College of Computer Science, Sichuan Normal University, China
Submission: July 15, 2024; Published: July 26, 2024
ISSN: 2576-8816 Volume 11 Issue 2
The occurrence and development of tumors are closely related to the immune system’s response. Identifying indicators that effectively quantify and characterize the immune status of tumors is crucial for cancer diagnosis, staging, treatment, and prognosis. T-cell receptor (TCR) diversity and specificity have been identified as potential tumor biomarkers. However, neither TCR diversity nor TCR specificity, analyzed in isolation, consistently correlates with an effective anti-tumor immune response. We define TCR effective diversity as the diversity of specific TCRs that can recognize and bind to specific antigenic peptides, thereby eliciting a positive immune response. Using a specificity clustering model, we constructed an effective diversity index and evaluated its performance in cancer classification on two datasets. The results show that the TCR effective diversity index, which integrates both diversity and specificity, can more accurately quantify the role of TCRs in tumor immunity.
Keywords: TCR; Effective diversity; Tumor immunity
Early diagnosis of cancer is crucial, but tumors are difficult to prevent and treat. As a key immune molecule, TCR can recognize tumor neoantigens and trigger immune responses [1,2], making it an ideal tumor biomarker. However, the human TCR repertoire exhibits significant individual variability and extreme diversity, with a vast number of unique sequences, low inter-individual TCR overlap, and specificity for HLA and antigens. These characteristics complicate the quantitative characterization of tumor immunity.
The diversity of the TCR repertoire is highly correlated with pathological states, and tumor-associated TCR diversity has been studied extensively. However, diversity alone does not consistently correlate with effective tumor immunity: higher TCR diversity generally indicates a robust anti-tumor immune capacity and predicts better therapeutic responses and outcomes [3-10], but exceptions exist [11-13]. The specific binding of TCRs to antigen-MHC complexes is crucial for monitoring immune-tumor interactions. Recent studies have begun to explore experimental techniques and predictive models for TCR specificity, such as GLIPH [14], PanPep [15], and pMTnet [16]. To date, research efforts have focused on analyzing either TCR diversity or TCR specificity alone.
It is hypothesized that an effective diversity metric combining diversity and specificity can more accurately quantify the role of TCRs in tumor immunity (Figure 1). Effective diversity of TCRs is defined as the diversity of specific TCRs that can successfully recognize a specific antigen and elicit a positive immune response [17]. Based on the hypothesis that structural similarity determines functional similarity, a TCR structural similarity clustering algorithm was utilized to construct the TCR effective diversity index. The predictive performance of this index in distinguishing between cancer patients and healthy individuals was analyzed. The results validated that the effective diversity index can more accurately measure the body’s anti-tumor immune levels.
Figure 1: Schematic Diagram of TCR Diversity, Specificity, Efficacy, and Effective Diversity.
Note: A represents TCR diversity; B represents TCR specificity; C represents TCR efficacy; D represents TCR effective diversity. Green cells - T cells; yellow cells - antigen-presenting cells; TCR - different colors represent different TCRs; antigen peptide - different colors represent different antigen peptides; MHC - different colors represent different MHCs; × indicates antigen not recognized by TCR.
The TCR diversity index is calculated at the amino acid sequence level using commonly employed metrics such as Shannon entropy, Gini-Simpson index, D50, and Pielou, as shown in formulas 1-4.
Shannon Entropy:
In the formula, H represents Shannon entropy, N denotes the total observed number of clonotypes, also referred to as richness, and pi represents the relative frequency of the i-th TCR clonotype.
Gini-Simpson Index:
In the formula, D represents the Gini-Simpson diversity index, N denotes the total observed number of clonotypes, also referred to as richness, and pi represents the relative frequency of the i-th TCR clonotype.
D50:
In the formula, C represents the number of dominant clonotypes that account for 50% of the total sequences, and N denotes the total observed number of clonotypes.
Pielou:
In the formula, H represents Shannon entropy, and N denotes the total observed number of clonotypes.
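As a minimal sketch of how the four indices can be computed from clonotype read counts, the Python below assumes their standard textbook forms, which match the symbol definitions above: H = -Σ p_i ln p_i, D = 1 - Σ p_i², D50 = C/N, and Pielou evenness J = H / ln N. The natural logarithm, the unscaled D50 ratio (some reports express it as a percentage), and all function names are assumptions made for illustration, not details confirmed by the text.

```python
import math


def clonotype_frequencies(counts):
    """Relative frequency p_i of each clonotype with a non-zero read count."""
    total = sum(counts)
    if total == 0:
        return []
    return [c / total for c in counts if c > 0]


def shannon_entropy(counts):
    """H = -sum(p_i * ln p_i); natural log is an assumption."""
    return -sum(p * math.log(p) for p in clonotype_frequencies(counts))


def gini_simpson(counts):
    """D = 1 - sum(p_i ** 2)."""
    return 1.0 - sum(p * p for p in clonotype_frequencies(counts))


def d50(counts):
    """C / N, where C is the smallest number of top-ranked clonotypes
    whose cumulative reads reach 50% of all reads (scaling assumed)."""
    ordered = sorted((c for c in counts if c > 0), reverse=True)
    total = sum(ordered)
    cumulative = 0
    for rank, c in enumerate(ordered, start=1):
        cumulative += c
        if 2 * cumulative >= total:
            return rank / len(ordered)
    return 0.0  # empty repertoire


def pielou(counts):
    """J = H / ln N; defined as 0 here when fewer than two clonotypes exist."""
    n = sum(1 for c in counts if c > 0)
    if n < 2:
        return 0.0
    return shannon_entropy(counts) / math.log(n)


if __name__ == "__main__":
    toy_counts = [50, 30, 10, 10]  # made-up read counts, not study data
    for name, fn in [("Shannon", shannon_entropy), ("Gini-Simpson", gini_simpson),
                     ("D50", d50), ("Pielou", pielou)]:
        print(name, round(fn(toy_counts), 4))
```

Each function takes a plain list of per-clonotype read counts, so the same input can be reused for all four indices.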
Based on the reasonable hypothesis that TCR structural similarity largely determines functional similarity, we constructed the TCR effective diversity index (EDI) using a specificity clustering model:
where n represents the number of clusters; Si represents the proportion of TCR sequences in the i-th cluster relative to the total number of TCR sequences; Ti represents the proportion of observed TCR clonotypes in the i-th cluster relative to the total number of clonotypes.
The clusters in Formula 5 are produced by GLIPH [14], an algorithm for identifying groups of TCRs with similar functional properties. Unlike the diversity indices, the effective diversity index reflects both specificity and diversity.
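To make the two per-cluster quantities concrete, the sketch below computes Si and Ti from a list of clonotypes that already carry cluster labels. It assumes that "TCR sequences" means sequencing reads, that each clonotype belongs to at most one cluster, and that unclustered clonotypes still count toward both totals. The record layout, the function name, and the CDR3 strings are hypothetical, and the sketch deliberately stops short of combining Si and Ti into the index itself.

```python
from collections import defaultdict


def cluster_proportions(records):
    """Return {cluster_id: (S_i, T_i)}.

    records: iterable of (clonotype, read_count, cluster_id) tuples, where
    cluster_id is None for clonotypes left unclustered (hypothetical layout).
    S_i = reads in cluster i / total reads
    T_i = clonotypes in cluster i / total clonotypes
    """
    records = list(records)
    total_reads = sum(count for _, count, _ in records)
    total_clonotypes = len({clonotype for clonotype, _, _ in records})
    if total_reads == 0 or total_clonotypes == 0:
        return {}

    reads_per_cluster = defaultdict(int)
    clonotypes_per_cluster = defaultdict(set)
    for clonotype, count, cluster_id in records:
        if cluster_id is None:
            continue  # unclustered: contributes to the totals only
        reads_per_cluster[cluster_id] += count
        clonotypes_per_cluster[cluster_id].add(clonotype)

    return {
        cluster_id: (
            reads_per_cluster[cluster_id] / total_reads,
            len(clonotypes_per_cluster[cluster_id]) / total_clonotypes,
        )
        for cluster_id in reads_per_cluster
    }


if __name__ == "__main__":
    # Made-up CDR3 sequences and cluster labels, for illustration only.
    toy = [
        ("CASSLGQAYEQYF", 40, "c1"),
        ("CASSLGQSYEQYF", 10, "c1"),
        ("CASSPDRGNTEAFF", 30, "c2"),
        ("CSARDRTGNGYTF", 20, None),
    ]
    for cluster_id, (s_i, t_i) in sorted(cluster_proportions(toy).items()):
        print(cluster_id, s_i, t_i)
```

In practice the cluster labels would come from the clustering tool's output; how that output is parsed is outside the scope of this sketch.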
We compared the classification performance of the individual diversity indices and the effective diversity index in distinguishing healthy individuals from cancer patients. Data 1 [1] comprised 83 healthy individuals and 87 cancer patients, and Data 2 [4] comprised 20 healthy individuals and 16 colorectal cancer patients. Among the diversity indices, Shannon entropy showed the best classification performance and D50 the weakest. The AUC values of the effective diversity index were higher than those of the diversity indices (Table 1). This indicates that the effective diversity index on its own is effective for cancer classification and outperforms the individual diversity indices.
Table 1: AUC Values for TCR Diversity Indices and Effective Diversity Index Classification in Data 1 and Data 2.
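The AUC of a single index used directly as a classification score can be computed without fitting any model. The sketch below uses the rank-based (Mann-Whitney) formulation: the AUC is the fraction of positive/negative pairs in which the positive sample scores higher, with ties counted as one half. Which group is labelled 1 and the toy values are assumptions for illustration, not data from either cohort.

```python
def single_index_auc(scores, labels):
    """AUC of one index used as a score.

    labels: 1 for the group expected to score higher, 0 for the other group.
    Equals the probability that a randomly chosen label-1 sample scores
    above a randomly chosen label-0 sample (ties count 0.5).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy index values for four individuals; labels are made up.
    toy_scores = [0.9, 0.4, 0.8, 0.3]
    toy_labels = [1, 1, 0, 0]
    print(single_index_auc(toy_scores, toy_labels))
```

An AUC below 0.5 under this convention means the index separates the groups in the opposite direction, so the labelling convention should be fixed before indices are compared.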
We constructed Random Forest (RF) classifiers based on the diversity indices and on a combination of both types of indices. The dataset was divided into a training set (80%) and a test set (20%); the RF classifier was trained on the training set and evaluated on the test set. The classifier built on the combined indices showed the best classification performance (Table 2), indicating that including the effective diversity indices improves cancer classification. Furthermore, the feature importance rankings (Figure 2) show that the effective diversity indices are the most important classification factors in the constructed RF model and significantly improve the classifier’s performance.
Table 2: Comparison of RF model classification performance based on different indices.
Figure 2: RF Feature Importance Analysis.
a) Ranking of RF feature importance scores for Dataset 1.
b) Ranking of RF feature importance scores for Dataset 2.
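The workflow described above (an 80/20 split, an RF classifier, and a feature importance ranking) can be sketched with scikit-learn as follows. The data are synthetic and generated by `make_classification`, the column names are placeholders, and the hyperparameters and random seeds are arbitrary assumptions. The resulting scores and rankings therefore say nothing about the real indices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Placeholder column names; the synthetic columns are NOT real index values.
feature_names = ["shannon", "gini_simpson", "d50", "pielou", "edi"]

# Synthetic two-class data with five features (assumption: stands in for
# a per-individual table of index values plus a healthy/cancer label).
X, y = make_classification(
    n_samples=200, n_features=5, n_informative=3, n_redundant=2, random_state=0
)

# 80% training, 20% test, as described in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hyperparameters are arbitrary choices for this sketch.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)

# Evaluate on the held-out test set using the predicted class-1 probability.
test_auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])

# Impurity-based importances, ranked from most to least important.
ranking = sorted(
    zip(feature_names, clf.feature_importances_),
    key=lambda item: item[1],
    reverse=True,
)

print("test AUC:", round(test_auc, 3))
for name, importance in ranking:
    print(name, round(importance, 3))
```

To compare feature sets as in Table 2, the same split can be reused while fitting one classifier per column subset, so that differences in test performance come from the features rather than from the partition.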
Numerous studies on TCR diversity and specificity have demonstrated their significant roles in tumor immunity research. However, most of these studies analyze diversity or specificity in isolation. In tumor immunity, diversity and specificity are often interrelated, and their combined effect needs to be validated through efficacy. TCR diversity does not distinguish between specific and non-specific TCRs, and the role of non-specific TCRs in diversity is minimal. While some tumor therapy studies show a correlation between high diversity and better prognosis, and between low diversity and poorer prognosis, there are exceptions. Therefore, the diversity that exerts positive immune effects must be effective diversity; diversity that lacks efficacy fails to achieve its purpose.
A comprehensive analysis of TCR diversity, specificity, and efficacy can provide a more accurate understanding of the immune system’s function and improve strategies for the prevention, diagnosis, and treatment of immune-related diseases. The TCR effective diversity index, constructed by combining a structural-similarity-based representation of specificity with diversity quantification, reflects effective diversity to some extent. If TCR effective diversity is sufficiently high, T cells can recognize a sufficient number of tumor neoantigens, thereby increasing the coverage and efficiency of the immune system and inhibiting tumor occurrence and development. The TCR effective diversity index is effective for cancer classification and outperforms the diversity indices. This preliminary validation suggests that the effective diversity index is a better quantitative measure of the body’s anti-tumor immune level.
However, inferring TCR antigen specificity based solely on TCR sequences remains challenging. It may be feasible to construct a TCR effective diversity index based on receptor-ligand binding by incorporating TCR-antigen affinity prediction models. Nevertheless, the high false positive rate of current TCR-antigen affinity models is a significant issue that needs to be addressed. Additionally, a more refined effective diversity calculation model must consider the impact of non-linear factors such as immune evasion and the synergistic effects of various cell types. These issues require further research and exploration.
With the ongoing development and refinement of single-cell TCR sequencing and TCR-pMHC structure prediction technologies [18,19], TCR effective diversity is expected to become an important tool for cancer immune monitoring and prognosis prediction in the future.
© 2024 Li Junli. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and build upon your work non-commercially.