ORIGINAL RESEARCH

The possibility of evaluation mRNA expression profiling to predict progression of local stage colorectal cancer

Goncharov SV, Bozhenko VK, Zakharenko MV, Chaptykov AA, Kulinich TM, Solodkiy VA
About authors

Russian Scientific Center for Roentgenoradiology, Moscow, Russia

Correspondence should be addressed: Sergey V. Goncharov
Profsoyuznaya, 86, 117997, Moscow, Russia; ur.liam@5109

About paper

Funding: the study was supported through the grant by RSF 22-15-00448.

Author contribution: Goncharov SV, Bozhenko VK, Solodkiy VA — study concept and design; Bozhenko VK, Goncharov SV, Kulinich TM, Zakharenko MV, Solodkiy VA — data acquisition and processing; Goncharov SV, Chaptykov AA, Bozhenko VK, Kulinich TM — manuscript writing; Goncharov SV, Chaptykov AA, Bozhenko VK, Kulinich TM, Zakharenko MV, Solodkiy VA — editing.

Received: 2023-09-13 Accepted: 2023-12-01 Published online: 2023-12-23

Colorectal cancer (CRC) is one of the leading cancers by incidence in Russia and worldwide. Moreover, over the last decade the prevalence of colon cancer in the Russian Federation increased from 116.7 to 161.0 cases per 100,000 population, while the prevalence of rectal cancer increased from 90.4 to 121.1 cases per 100,000 [1].

Today, the TNM stage and tumor grade are the main parameters determining CRC prognosis. However, even groups of patients with CRC that are clinically homogeneous in terms of stage and grade show a highly heterogeneous disease course and an uncertain prognosis. This diversity arises because a single morphological type of cancer can conceal several tumor variants with different molecular pathogenesis, which together form the biological heterogeneity of the tumor.

Clinical prediction tools, traditionally based on statistical regression models such as binary logistic regression, are one way to combine all the available prognostic information without further subdividing the intermediate TNM stages [2]. With appropriate development and testing, these tools will be able to integrate and personalize prognostic information for a particular patient and provide a refined assessment of progression risk for clinical use.

To date, several test systems have been created that determine the disease prognosis and therapy efficacy by assessing gene expression in the tumor tissue. Test systems such as OncotypeDX, ColoPrint, and ColDx estimate the likelihood of cancer progression from the expression of a number of genes in the tumor [3]. Some papers criticize the efficiency of these commercially available systems. For example, one report describes a system that predicts CRC progression and the response of stage I–II cancer to treatment more effectively than OncotypeDX and ColoPrint [4]. Its authors create a patient-specific treatment plan for early stage CRC and suggest including adjuvant chemotherapy in the treatment regimen, which is usually not done under the current international and national guidelines.

There are also specific biological predictors of CRC metastasis to the lymph nodes, such as heat shock protein 47 (HSP47) [5]. According to the authors, detection of this protein will also help personalize the treatment approach in patients at high risk of lymph node metastasis.

Currently, new prognostic models are constantly being proposed. They rely both on advanced mathematical methods (neural network models, artificial intelligence models, binary classification trees (BCT), peer reviews, etc.) and on an expanded set of explanatory variables (point mutations, microsatellite instability, tumor microenvironment, and expression profiles). However, there is still a need for prognostic model improvement. These issues prompted us to assess whether expression profiling of mRNAs from tumor specimens can be used to evaluate the colorectal cancer prognosis.

METHODS

The study included mRNA expression profiles of 63 genes (tab. 1) that potentially contribute to various carcinogenesis pathways, determined in 217 specimens of colorectal adenocarcinoma of different localization. Adenocarcinoma specimens from the right half of the colon constituted 23% (50 specimens), and those from the left half of the colon constituted 39.6% (86 specimens). Rectal tumor specimens made up 37.4% (81 specimens). There were 97 males (44.7%) and 120 females (55.3%). Specimens were collected during pathomorphological examination of surgical material. Inclusion criteria: morphologically verified colorectal adenocarcinoma, local stages T1-3, N0-2, M0. No specific anticancer treatment was performed before surgery. Long-term outcomes were monitored in all patients for at least 36 months; the median follow-up period was 42 months. Exclusion criteria: multiple primary colorectal cancer, a history of another type of cancer, or another type of cancer present at the time of inclusion in the study.

We have previously reported the RNA extraction method and the real-time PCR settings [6]. Using this approach, the relative mRNA expression of the studied panel of genes, which belong to different functional groups, was determined in each adenocarcinoma specimen regardless of its embryonic-anatomical localization.

Statistical data processing was performed using the Jamovi open source statistical software package (The Jamovi project; Australia). The logistic regression models constructed were evaluated based on R2 and the Durbin–Watson test for autocorrelation (DW). The model quality was considered acceptable at R2 > 0.3 and DW > 1.5.
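The acceptance rule above can be illustrated with a minimal sketch. It assumes the residuals are available as a plain list of numbers; which residuals the statistical package uses for a logistic model is not stated in the text, and the function names below are hypothetical, not part of Jamovi.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    of the residuals divided by the sum of squared residuals."""
    if len(residuals) < 2:
        raise ValueError("at least two residuals are required")
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2
        for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    if denominator == 0:
        raise ValueError("all residuals are zero")
    return numerator / denominator


def model_is_acceptable(r2, dw, r2_min=0.3, dw_min=1.5):
    """Apply the thresholds stated in the text (R2 > 0.3 and DW > 1.5)."""
    return r2 > r2_min and dw > dw_min
```

The statistic lies between 0 and 4, and values near 2 indicate no first-order autocorrelation. For the second model reported in RESULTS, `model_is_acceptable(0.4, 1.64)` evaluates to `True`.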

We set out to construct a binary logistic regression model predicting 3-year disease-free survival in patients with CRC from the mRNA expression profiles obtained, taking the pathomorphological report data into account. To do this, we formed a set of explanatory variables conditionally divided into two categories. The first category included the data from pathomorphological examination of the surgical specimen: T and N, grade, lymphovascular and angiolymphatic invasion, and the ratio of the total number of resected lymph nodes to the number of metastatic ones. The second category included mRNA expression profiles of 63 genes in the tumor specimens. To test the predictive ability of the model, the initial sample of patients was randomized into two subsamples: an index subsample (90% of observations) used to construct the model and a control subsample (10% of observations) used to assess the reliability of predictions made with the constructed model. These proportions are in line with common practice in the current scientific literature. Calculations were performed in the EViews v. 7.0 software package (IHS Global Inc.; USA). Assessment and comparison of the competing probit and logit models based on McFadden's coefficient of determination and the Akaike and Schwarz information criteria showed that the logit specification was the most successful.
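The randomization and the model comparison criteria can be sketched as follows. This is an illustration under stated assumptions, not the EViews procedure: the function names and the seed are hypothetical, the criteria are written in their common unnormalized forms (some packages report them divided by the number of observations), and the log-likelihoods are assumed to come from already fitted models.

```python
import math
import random


def split_index_control(n_obs, control_fraction=0.10, seed=0):
    """Randomly split observation indices into index and control subsamples."""
    ids = list(range(n_obs))
    random.Random(seed).shuffle(ids)  # seeded for reproducibility (assumption)
    n_control = round(n_obs * control_fraction)
    return sorted(ids[n_control:]), sorted(ids[:n_control])


def mcfadden_r2(loglik_model, loglik_null):
    """McFadden's coefficient of determination: 1 - lnL(model) / lnL(null)."""
    return 1.0 - loglik_model / loglik_null


def akaike(loglik, n_params):
    """Akaike information criterion: 2k - 2 lnL (lower is better)."""
    return 2 * n_params - 2 * loglik


def schwarz(loglik, n_params, n_obs):
    """Schwarz (Bayesian) information criterion: k ln(n) - 2 lnL."""
    return n_params * math.log(n_obs) - 2 * loglik


# 217 observations, as in the study; the split itself is illustrative.
index_ids, control_ids = split_index_control(217)
```

Between two specifications fitted to the same index subsample, the one with the higher McFadden's coefficient and the lower Akaike and Schwarz values would be preferred.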

RESULTS

During the first phase, a logit model for prediction of CRC progression within 36 months after the diagnosis was constructed using conventional clinical and morphological criteria for CRC progression risk only as explanatory variables (tab. 2).

R2 was 0.29, and the Durbin–Watson statistic was 1.51. These characteristics of the binary logistic regression model suggest that analysis of conventional morphological risk factors of progression, such as adenocarcinoma grade, tumor localization, the total number of resected lymph nodes, and the number of metastatic ones, predicts the probability of CRC progression with minimally satisfactory accuracy. Tab. 3 provides the classification matrix of this logit model.

Fig. 1 shows the accuracy of the model developed with inclusion of conventional prognostic factors.

As we have pointed out before, the overall prediction accuracy of this model (56.62%) was not high, while prediction accuracy of 37% in patients with no progression was considered to be unsatisfactory. We used mRNA expression profiles of a panel of 63 genes from tumor specimens as supplementary explanatory variables during the next phase of the study.

The research resulted in construction of the second logit model, in which mRNA expression profiles from tumor specimens were added to morphological characteristics as explanatory variables. A total of 12 characteristics (variables) turned out to be significant in the mathematical model (tab. 4).

R2 coefficient was 0.4 in this model, while Durbin–Watson statistic was 1.64. A significant increase in the accuracy of the model constructed was achieved by including the expression levels of genes CCNB1, Ki67, GRB7, IGF1, IL2, IL6, IL8, GATA3 from tumor specimens in the regression equation. Classification matrix is provided in tab. 5.

The overall classification accuracy was 80.6% (fig. 3). We would like to emphasize that prediction accuracy in patients with no progression increased from 37% to 70.5% relative to the first model.
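The accuracy figures quoted above are derived from a classification matrix. A minimal sketch of that calculation follows; the function name is hypothetical and the counts in the example are made up for illustration, they are not the values from tab. 3 or tab. 5.

```python
def classification_summary(tn, fp, fn, tp):
    """Accuracy measures from a 2x2 classification matrix.

    tn: no progression, predicted no progression
    fp: no progression, predicted progression
    fn: progression, predicted no progression
    tp: progression, predicted progression
    """
    total = tn + fp + fn + tp
    if total == 0:
        raise ValueError("the classification matrix is empty")
    no_progression = tn + fp
    progression = fn + tp
    return {
        # share of all patients classified correctly
        "overall": (tn + tp) / total,
        # share of patients without progression classified correctly
        "no_progression": tn / no_progression if no_progression else float("nan"),
        # share of patients with progression classified correctly
        "progression": tp / progression if progression else float("nan"),
    }


# Hypothetical counts, for illustration only.
example = classification_summary(tn=30, fp=10, fn=5, tp=55)
```

Reporting the per-group accuracies alongside the overall figure matters here because a model can reach a moderate overall accuracy while misclassifying most patients in one of the groups.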

This model was used to calculate a personalized prognosis for each patient in our sample. Fig. 3 shows the distribution of personal risk according to whether progression was detected. The median risk indicator was 57.1% [38.2; 70.7] in the group with no progression detected and 79.2% [68.3; 96.4] in the group with progression. The difference in risk indicators was significant (Kruskal–Wallis test: p < 0.05).
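How a logit model turns explanatory variables into a personal risk, and how such risks can be summarized per group, can be sketched as below. The coefficient names and values are hypothetical placeholders, not the coefficients from tab. 4, and no sign convention of the published model is implied. The sketch also assumes that the bracketed values are the first and third quartiles, which the text does not state explicitly.

```python
import math
import statistics


def logistic(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predicted_probability(intercept, coefficients, values):
    """Logit model output: logistic(intercept + sum of coefficient * value)."""
    z = intercept + sum(coefficients[name] * values[name] for name in coefficients)
    return logistic(z)


def summarize_risk(risks):
    """Return (median, first quartile, third quartile) of a list of risks."""
    q1, median, q3 = statistics.quantiles(risks, n=4, method="inclusive")
    return median, q1, q3


# Hypothetical coefficients and patient values, for illustration only.
coefficients = {"marker_a": 0.8, "marker_b": -0.5}
patient = {"marker_a": 1.2, "marker_b": 0.4}
risk = predicted_probability(intercept=-0.3, coefficients=coefficients, values=patient)
```

Applying `summarize_risk` separately to the patients with and without detected progression would give group summaries in the same "median [Q1; Q3]" form used above.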

The risk factors of colorectal cancer progression assessed by pathomorphologists during routine examination allow construction of a prognostic model that is only minimally satisfactory in terms of accuracy. The accuracy of the prognostic model can be increased by analyzing information beyond the pathomorphological stage. Assessment of mRNA expression profiles of the genes CCNB1, Ki67, GRB7, IGF1, IL2, IL6, IL8, GATA3 in tumor specimens makes it possible to increase accuracy from 56.62% to 80.6%. The changes in expression of other genes of the panel also seem to be important; however, including them in the model does not improve accuracy because of multicollinearity, which may further indicate that the changes in the large intestinal mucosa associated with colorectal cancer are systemic.

We have noted that classical pathomorphological signs of high risk of CRC progression, such as lymphovascular and angiolymphatic invasion, grade, type of lymph node involvement, have negative regression coefficients, while GATA3 tumor suppressor has a positive coefficient. This pattern seems to be consistent: increased activity of the GATA3 cancer suppressor is typical for the less aggressive CRC course [7], while the presence of metastatic lymph nodes in the specimen, low grade, and angiolymphatic invasion indicate high risk of progression.

DISCUSSION

The urgent task of improving the outcomes of local stage colorectal cancer tertiary prevention is inextricably linked with objective stratification of cancer prognosis aimed at treatment personalization and, which is especially important, assessing the effectiveness of the existing and prospective treatment regimens. Despite the fact that to date pathomorphological CRC stage is the basis for the disease progression prognosis, it is the study of information about pathological progression without any reference to CRC stage that can become the key to overcoming the challenge of progression risk assessment. This information can be represented primarily by molecular genetic data obtained by analysis of tissues of the affected organ. We used 12 indicators obtained during pathomorphological examination and molecular genetic testing of the tumor to develop a prognostic logit model of progression. These include both generally accepted risk factors, such as tumor grade, angiolymphatic and perineural invasion, as well as nature of changes in the lymph nodes resected during surgery, and mRNA expression of eight genes: CCNB1, Ki67, GRB7, IGF1, IL2, IL6, IL8, GATA3.

The role of these genes in carcinogenesis has been repeatedly discussed in the literature [8–12]. These genes belong to the functional groups of cell cycle regulators (CCNB1), proliferation regulators (Ki-67, GRB7), growth factors (IGF-1), and cytokines (IL2, IL6, IL8) involved in colorectal cancer invasion and metastasis [12].

When comparing our findings with the earlier reported data, including the reports of large-scale studies, we noticed that it was difficult to clearly interpret the characteristics of tumor grade, angiolymphatic and lymphovascular invasion due to the lack of a common classification system and assessment standards [13–15]. As a result, these characteristics vary considerably across different clinics [16, 17]. For example, it is believed that the PNI detection rate is usually underestimated, and the reported detection rates vary between 9 and 42% [18]. The role of the resected to metastatic lymph node ratio in the specimen in CRC was first explored in 2005 [19]. This indicator was defined as a negative independent prognostic factor in stage III disease associated with overall and disease-free survival of patients with CRC. The indicator has a stronger influence on the prognosis of rectal cancer than on that of colon cancer. Its prognostic value increases when more than 12 lymph nodes are assessed. Critical values of this indicator vary between 0.125 and 0.3 in different studies. There is still no consensus about the minimum number of lymph nodes that must be harvested for appropriate estimation of this parameter.

It should be noted that, thanks to the efforts of medical associations, the clinical genomic databases have become available in the recent years. The analysis of such datasets allows one to better understand the CRC genomic landscape and assess treatment efficacy and safety in the subgroups of patients with different genomic profiles. It is noted in the literature that the differences between the databases on demographic, clinical characteristics, treatment regimens and overall survival should be considered when developing research and interpreting the results acquired from the clinical genomic databases [20].

Overall, our findings confirm the trend: analysis of additional information beyond the pathomorphological stage, primarily molecular genetic data, significantly increases the accuracy of predicting the likelihood of progression in individuals with colorectal cancer. At the same time, the search for new predictors and, just as important, extensive validation of prognostic systems should be continued.

CONCLUSIONS

We have found that the risk factors of CRC progression identified during standard pathomorphological examination ensure prediction accuracy of 56.62% when using a binary prognostic logit model in our sample of patients. Moreover, classification errors occur primarily among patients showing no progression throughout the 36-month follow-up period. Inclusion of mRNA expression levels of the genes CCNB1, Ki67, GRB7, IGF1, IL2, IL6, IL8, GATA3 from tumor specimens in the model as explanatory variables increases prediction accuracy to 80.6%. This suggests that expanding the search for outcome predictors beyond the TNM pathomorphological stage is a promising way to increase accuracy in order to implement effective CRC tertiary prevention measures.
