ORIGINAL RESEARCH

Analysis of 13 TP53 and WRAP53 polymorphism frequencies in Russian populations

About authors

1 Research Centre of Medical Genetics (RCMG), Moscow, Russia

2 Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia

Correspondence should be addressed: Marina V. Olkova
Gubkina 3, Moscow, 119991; genetics@inbox.ru

About paper

Funding: the study was carried out as part of the public contract between the Ministry of Science and Higher Education of the Russian Federation and the Research Centre of Medical Genetics (phenotyping of samples, database construction, data analysis).

Acknowledgement: we thank Oleg Balanovsky, head of the Genome Geography Laboratory of the Vavilov Institute of General Genetics, for study management and manuscript editing; all DNA donors and the Biobank of North Eurasia for providing the sample collection; and the Center for Precision Genome Editing and Genetic Technologies for Biomedicine of the Pirogov Russian National Research Medical University (Moscow, Russia) for the opportunity to use its molecular biology technologies.

Author contribution: Olkova MV — study design, statistical analysis, manuscript writing; Petrushenko VS — bioinformatics analysis, Ponomarev GYu — experiments.

Compliance with ethical standards: the study was carried out in accordance with the World Medical Association Declaration of Helsinki. All samples were obtained from Biobank of North Eurasia. The informed consent was obtained from all donors.

Received: 2020-11-26 Accepted: 2020-12-12 Published online: 2021-01-12

The TP53 gene encodes p53, one of the best-known tumor suppressors, which plays a vital part in maintaining the genetic stability of the cell and in preventing cancer. Once activated by cell damage, p53 triggers a number of cellular responses aimed at cell recovery and survival or, if recovery is impossible, at programmed cell death. These diverse pleiotropic tissue effects of p53 result from the combined effect of co-expressed p53 isoforms. To date, at least 12 p53 isoforms have been reported, which are produced through alternative initiation of translation, alternative promoter usage and alternative splicing [1]. All p53 isoforms share a common DNA-binding domain but contain distinct transactivation and inhibitory domains, enabling differential regulation of gene expression [2].

TP53 gene mutations show an autosomal dominant pattern of inheritance and are associated with the risk of Li–Fraumeni syndrome and other hereditary cancer syndromes. Altered sensitivity to certain medications has been confirmed in carriers of a number of TP53 gene polymorphisms (tab. 1).

The WRAP53 gene, which has at least three alternative promoters, is located in region 13.1 of the short arm of chromosome 17. It partially overlaps the 5'-region of the TP53 gene, which lies on the opposite strand in a head-to-head orientation to WRAP53 [3]. The WRAP53 gene plays a dual role. First, it encodes the antisense RNA (WRAP53α), which regulates the levels of p53 mRNA through interaction with the first exon of TP53 and is also involved in stimulating p53 protein production through its effect on the 5'-untranslated region of p53 mRNA [4, 5]. Second, WRAP53 is responsible for synthesis of the WRAP53β protein (also called WDR79 and TCAB1), which belongs to the WD40 protein family. This protein contributes to maintaining the integrity and normal function of the Cajal bodies, which are essential for maturation of the splicing machinery and for telomere maintenance [6–8]. WRAP53β also promotes accumulation of the repair factor 53BP1 at DNA double-strand breaks, thus stimulating DNA repair [9]. The WRAP53β protein may possess oncogenic properties, as suggested by its overexpression in various cancer cell lines compared to normal cells [7, 8]. However, the involvement of this protein in carcinogenesis remains uncertain: according to one hypothesis, the overexpression may reflect the involvement of WRAP53β in the repair of multiple DNA double-strand breaks that arise when cancer develops in a given tissue [7].

WRAP53 gene mutations show an autosomal recessive pattern of inheritance. Homozygous mutations of this gene may result in dyskeratosis congenita and Li–Fraumeni syndrome.

The clinical significance of the TP53 and WRAP53 genes, as well as the high prevalence of their germline pathogenic variants in various cancer types [10], explains the need to study the frequencies of variants of these genes in populations of different countries. The frequencies of polymorphisms of these genes have already been studied in some European countries and the United States: details on the frequencies of both clinically significant polymorphisms and markers of uncertain significance may be found on the websites of such projects as ClinVar [11] of the National Center for Biotechnology Information of the USA, Ensembl (a joint scientific project of the European Bioinformatics Institute and the Sanger Institute) [12], and the Genome Aggregation Database (gnomAD) [13]. In Russia, “Genokarta”, a genetic encyclopedia website created by researchers from Novosibirsk State University, is being actively developed [14]. Our study explored the distribution and frequencies of 13 polymorphisms of the TP53 and WRAP53 genes in Russian populations in order to expand the scientific knowledge in this area with regard to populations living in our country.

METHODS

DNA sampling

DNA samples were provided by the Biobank of North Eurasia [15]. DNA was extracted from blood and saliva using the standard phenol-chloroform extraction method. The study included 1,785 DNA samples from volunteers who belonged to 28 Russian populations, which, based on their places of residence, covered the main regions of Russia (tab. 2). Inclusion criteria: volunteers belonging to a given ethnic group (based on self-identified ethnicity over four or more generations). Exclusion criteria: samples failing to meet the criteria for belonging to a given ethnic group. Since the study was aimed at investigating autosomal markers, gender distribution was not taken into account during DNA sampling.

The size of each population was 30–87 people. The composition of the studied populations is presented in tab. 2. Since the studied genes TP53 and WRAP53 are autosomal, each individual carries two alleles, so the actual number of studied alleles was twice as large: 60–174 alleles per population.

Selection of polymorphisms

The list of TP53 and WRAP53 polymorphisms was compiled from genetic variants with proven clinical significance submitted to the ClinVar database (except for the TP53 intronic variant rs17881850). The intronic variant rs17881850 was included in the study in order to compare the allele frequencies of a neutral polymorphism with those of genetic variants with confirmed clinical significance. After genotyping, many of the polymorphisms on the original list had to be excluded from the analysis: population frequencies were calculated only for markers genotyped successfully in all populations.

Genotyping

All individuals were genotyped for nine TP53 exon polymorphisms (rs587781663, rs17882252, rs150293825, rs112431538, rs149633775, rs144340710, rs1042522, rs1800371, rs201753350) and one intronic variant (rs17881850), as well as for three WRAP53 polymorphisms (rs17880282, rs2287499, rs34067256). Genotyping was carried out using the Illumina (Illumina Inc.; USA) genomic analysis microarray technology. The standard GenCall score cutoff of 0.15 was used to discard poorly typed samples.

Basic data on studied polymorphisms

Full information on the studied polymorphisms was obtained from the website of the National Center for Biotechnology Information of the USA [16], in particular from the ClinVar archive [17], and from the Genome Aggregation Database (gnomAD) [18]. The position of each polymorphism in the human genome was specified according to version GRCh38.p12 of the human reference genome assembly (tab. 1).

Since the publicly available information about some polymorphisms was incomplete, all markers were also subjected to functional analysis with hidden Markov models designed for prediction of the effects of missense protein variants, using the FATHMM website [19]. To minimize the number of false positives, a conservative threshold of –3.0 was chosen for the analysis. The data obtained were included in tab. 1.
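The thresholding step can be illustrated with a minimal sketch. It assumes that lower (more negative) FATHMM scores indicate a more damaging prediction and that a variant is flagged only when its score falls strictly below the cutoff; the function name, labels and example scores are hypothetical and are not taken from tab. 1.

```python
FATHMM_THRESHOLD = -3.0  # conservative cutoff quoted in the text


def classify_fathmm(score, threshold=FATHMM_THRESHOLD):
    """Label a missense variant from its FATHMM score.

    Assumption: scores strictly below the threshold are treated as
    predicted damaging, everything else as tolerated.
    """
    return "damaging" if score < threshold else "tolerated"


# Hypothetical scores for illustration only (not values from tab. 1).
example_scores = {"variant_A": -4.2, "variant_B": -1.7, "variant_C": 0.3}
labels = {name: classify_fathmm(s) for name, s in example_scores.items()}
```

Under these assumptions a stricter (more negative) cutoff flags fewer variants, which is how it reduces the number of false positives at the cost of sensitivity.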

Mathematical and statistical methods

Population alternative allele frequencies for the studied polymorphisms, the χ2 criterion and p-value for testing genotype frequencies against Hardy–Weinberg expectations, and the normality of the alternative allele frequency distribution across the studied populations were calculated with R, version 4.0.2, in RStudio (RStudio; USA) and with Microsoft Excel (Microsoft Corp.; USA). Differences were considered significant at p < 0.01.
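As an illustration of these calculations (the study itself used R and Excel), the following Python sketch computes the alternative allele frequency and a χ2 goodness-of-fit test for Hardy–Weinberg proportions from the genotype counts of one biallelic marker in one population. The function name and its interface are assumptions made for this example, and the test uses one degree of freedom, as is conventional for a biallelic locus.

```python
import math


def hwe_chi_square(n_ref_hom, n_het, n_alt_hom):
    """Return (alt_allele_freq, chi2, p_value) for one biallelic marker.

    Arguments are observed genotype counts: reference homozygotes,
    heterozygotes and alternative homozygotes.
    """
    n = n_ref_hom + n_het + n_alt_hom
    if n == 0:
        raise ValueError("no genotypes supplied")
    # Each individual carries two alleles of an autosomal marker.
    q = (2 * n_alt_hom + n_het) / (2 * n)
    p = 1.0 - q
    if q == 0.0 or q == 1.0:
        # Monomorphic marker: expected counts contain zeros, test undefined.
        return q, float("nan"), float("nan")
    observed = (n_ref_hom, n_het, n_alt_hom)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of the chi-square distribution with 1 degree of freedom:
    # P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return q, chi2, p_value
```

For example, `hwe_chi_square(40, 20, 40)` describes a sample with a strong heterozygote deficit; with the significance level of 0.01 used in this study, a marker would be reported as deviating from equilibrium when the returned p-value is below 0.01. For rare alleles with small expected counts, the χ2 approximation is unreliable and an exact test may be preferable.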

Multidimensional scaling

A two-dimensional representation of the relative positions of the populations, based on the alternative allele frequencies calculated for the studied TP53 and WRAP53 markers, was obtained with the STATISTICA 10 software package (StatSoft; USA) via multidimensional scaling of Nei's genetic distances calculated with the DJ genetic software (RCMG; Russia).
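The distance step can be sketched as follows. The text does not state which of Nei's distances was used, so this example assumes Nei's standard genetic distance (1972) computed from alternative allele frequencies at biallelic markers; the function names and data layout are hypothetical, and the sketch is not the DJ genetic implementation.

```python
import math


def nei_standard_distance(freqs_x, freqs_y):
    """Nei's standard genetic distance between two populations.

    freqs_x, freqs_y: alternative allele frequencies at the same
    biallelic loci, in the same order.
    """
    if len(freqs_x) != len(freqs_y) or not freqs_x:
        raise ValueError("need equal-length, non-empty frequency lists")
    n = len(freqs_x)
    # Mean per-locus homozygosity within each population.
    j_x = sum(q * q + (1 - q) ** 2 for q in freqs_x) / n
    j_y = sum(q * q + (1 - q) ** 2 for q in freqs_y) / n
    # Mean per-locus probability of drawing identical alleles
    # from the two populations.
    j_xy = sum(x * y + (1 - x) * (1 - y) for x, y in zip(freqs_x, freqs_y)) / n
    if j_xy == 0.0:
        return math.inf  # no shared alleles at any locus
    identity = j_xy / math.sqrt(j_x * j_y)
    return -math.log(identity)


def distance_matrix(pop_freqs):
    """Pairwise distances for a dict {population name: frequency list}."""
    names = sorted(pop_freqs)
    matrix = [[nei_standard_distance(pop_freqs[a], pop_freqs[b]) for b in names]
              for a in names]
    return names, matrix
```

The resulting symmetric matrix can then be passed to any multidimensional scaling routine that accepts precomputed dissimilarities to obtain two-dimensional coordinates for the populations.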

RESULTS

Calculation of alternative allele frequencies for studied markers in Russian populations

Based on the genotyping of 28 Russian populations, we calculated alternative allele frequencies for nine exon TP53 polymorphisms (exon 2: rs201753350; exon 3: rs1042522, rs1800371; exon 6: rs144340710; exon 7: rs112431538, rs149633775; exon 10: rs587781663, rs17882252, rs150293825), one intronic TP53 variant (rs17881850), and three polymorphisms of exon 2 of the WRAP53 gene (rs17880282, rs2287499, rs34067256). The calculated population alternative allele frequencies for the listed markers are presented in tab. 2.

Alternative allele frequencies and Hardy–Weinberg equilibrium

To test the studied TP53 and WRAP53 markers for Hardy–Weinberg equilibrium in each population, the χ2 criterion was calculated from the observed marker frequencies and the frequencies expected under the Hardy–Weinberg principle. Tab. 3 was compiled to visualize the relationship between the alternative allele frequencies and Hardy–Weinberg equilibrium. In five of the 28 studied populations (“Central Caucasus”, “Dagestan”, “northern Russians”, “Tatars” and “Transcaucasia”), most markers combined high alternative allele frequencies (compared to the reference frequencies for the world's populations of appropriate origin listed in tab. 2) with a non-equilibrium pattern in the population (orange cells). This suggests that external factors (for example, accidental inbreeding) may affect the pattern of the studied alleles in these populations. Genotyping errors may also affect the results.

In the “southeastern Russians” population, alternative alleles were identified for many markers; their frequencies were higher than in the reference European population, but the alleles showed no deviation from Hardy–Weinberg equilibrium.

In some populations (“Komi and Udmurts”, “Siberian Tatars”, “Western Caucasus”), the diversity of identified markers was higher than in reference populations of appropriate origin, but their frequencies were low (close to reference values) and satisfied the Hardy–Weinberg principle. Two of the studied markers (rs1042522, located in exon 3 of TP53, and rs2287499, located in exon 2 of WRAP53) were characterized by high frequencies and satisfied the Hardy–Weinberg principle in all populations.

Assessment of normality for distribution of alternative allele frequencies in the populations

Since neutral alleles are, in theory, not affected by natural selection, and their population frequencies may follow a normal distribution, we assessed the normality of the distribution of each marker's frequencies across populations using the Shapiro–Wilk test. The null hypothesis of normality was not rejected for two markers: rs1042522 (W = 0.95, p = 0.18) and rs2287499 (W = 0.97, p = 0.46). For the other markers, a normal distribution was not confirmed.

Analysis of marker population frequencies by multidimensional scaling (MDS)

In our study, multidimensional scaling was the most effective method for positioning the studied populations in low-dimensional space, allowing us to evaluate the genetic distances between populations. MDS was performed for 29 populations (the Tatar and African populations were excluded because their marker frequencies contrasted sharply with the data for the main population pool) (see figure). The stress value of the resulting MDS solution was 0.068, and the alienation coefficient was 0.058.

The populations were pre-labeled as belonging to one of three groups: Asian, European and Caucasian. Grouping the populations by origin (see figure) allowed us to delineate three corresponding clusters. The Asian and European clusters overlap over a sizeable area, and within them the populations are close to each other in polymorphism frequencies; in the Caucasian cluster, frequencies vary widely between populations.

The boundaries of the Asian cluster are quite clear. The only exception is the joint population that includes Kazakhs, Karakalpaks, Uighurs and Nogais; this population lies outside the cluster because some markers showing the equilibrium pattern have higher frequencies in it than in other Asian populations (tab. 3).

The European cluster has a more compact shape, with high population density around the central reference European population, whose marker frequencies were obtained from public sources. Three European populations lie far outside the cluster: “northern Russians”, “southeastern Russians” and the joint population of Mari and Chuvash. “Northern Russians” is the only European population showing elevated frequencies of many markers that deviate from equilibrium and are absent in other European populations (tab. 3). The “southeastern Russians” population, characterized by high frequencies of some polymorphisms unusual for European populations, most of which are in Hardy–Weinberg equilibrium (tab. 3), extends beyond the European cluster in the direction opposite to that of “northern Russians” on the plot. The “Mari and Chuvash” population lies deep inside the Asian cluster, which may be due to the anthropological composition of the Chuvash, which includes both Caucasian individuals and a significant proportion of Mongoloid individuals together with mixed forms.

All populations of the Caucasian cluster, except the “Western Caucasus” population, which falls in the overlap area with the Asian cluster and lies close to the European cluster, are characterized by high frequencies of all analyzed polymorphisms and by their non-equilibrium patterns (tab. 3), which brings them close to “northern Russians” on these parameters (see figure).

DISCUSSION

The study of TP53 and WRAP53 gene polymorphisms in 28 populations covering all major regions of Russia made it possible to assess the frequency and distribution of the selected markers in various Russian regions and populations.

Assessment of alternative allele frequencies for the studied markers in Russian populations revealed two major trends:

  • two of the 13 studied markers (rs1042522, located in exon 3 of TP53, and rs2287499, located in exon 2 of WRAP53) are characterized by high frequencies in all populations, a normal distribution of marker population frequencies, and Hardy–Weinberg proportions of alleles;
  • in five populations (“Central Caucasus”, “Dagestan”, “northern Russians”, “Tatars” and “Transcaucasia”), the frequencies of most markers (except the above-mentioned rs1042522 and rs2287499, as well as the benign rs144340710 located in exon 6 of TP53) were high, and the alleles of these markers showed non-equilibrium patterns. In the Tatar population, the number of alternative alleles detected and their frequencies were significantly higher than the reference global values; they also significantly exceeded the corresponding values in the major pool of Russian populations (tab. 2). This matter requires further research.

Two prevalent equilibrium markers, rs1042522 of the TP53 gene and rs2287499 of the WRAP53 gene, are listed in the ClinVar database as benign. This means that their frequency is too high for them to be pathogenic mutations, that they are found in the hetero- and homozygous state in individuals without severe disease attributable to the gene, and that they show no disease association in appropriately sized case-control studies [20]. The fact that the alternative allele of the rs1042522 polymorphism, which is part of the p53 DNA-binding domain [21], shows a high frequency in all populations may indicate that the reference genome happens to carry the minor allele. This assumption is indirectly supported by the reported greater ability of the alternative Arg72 variant of the p53 protein to induce apoptosis and prevent cancer development compared with the reference Pro72 variant [22].

Although rs1042522 and rs2287499 are listed in scientific databases as benign markers, the literature contains a large amount of data on their involvement in carcinogenesis. In particular, it has been reported that the heterozygous Arg/Pro variant of the p53 Pro72Arg polymorphism (tab. 1) is associated with a high risk of melanoma compared with the homozygous Pro/Pro variant [23]; another paper reports an association of the Pro/Pro genotype with increased risk of non-small cell lung cancer in patients from the Moscow Region [21].

There are many published reports on the survival of cancer patients homo- and heterozygous for Pro72Arg, but the reported data are contradictory. There is evidence of increased median survival time in cervical cancer patients carrying the Arg/Pro genotype compared with patients with the Arg/Arg and Pro/Pro genotypes [24]; however, an extensive study carried out by Danish researchers [25] showed no association of this polymorphism with lower mortality after cancer or with lower cancer incidence in the general population.

However, rs1042522 is better known as a marker included in the expert panel of Pharmacogenomics Knowledgebase, which is associated with altered body response to some antineoplastic drugs [26]. There is evidence of the p53 Pro allele association with toxicity due to chemotherapy [27], as well as evidence of lower response rate to fluorouracil-based chemotherapy in gastric cancer patients carrying the Pro/Pro genotype compared to patients carrying the Arg/Arg genotype [28].

The clinical significance of rs2287499 marker of gene WRAP53 is much less well understood, however, there is evidence of moderate linkage disequilibrium between studied markers rs1042522 and rs2287499. The haplotype combination CA/GC is associated with increased risk of breast cancer, and the haplotype combination GA/CC, by contrast, is assumed to be a protective factor against breast cancer [29].

The number and frequencies of the other polymorphisms detected in Russian populations vary significantly; however, the calculated frequencies of most polymorphisms correspond to the reference marker frequencies for Asian and European populations, in accordance with the origin of the surveyed Russian populations. The exceptions are the five populations listed above, which have high frequencies of the discussed markers. The clinical significance of some studied polymorphisms (for example, the intronic variant rs17881850) remains poorly understood. However, recent evidence suggests that intronic TP53 gene polymorphisms may also have some clinical significance [30].

CONCLUSION

The study allowed us to obtain data on the frequencies of germline TP53 (10 markers in five exons and one intron) and WRAP53 (three markers in exon 2) polymorphisms for 28 Russian populations. In the majority of populations, the calculated polymorphism frequencies are close to the values obtained for the reference global population (Asian or European) of appropriate origin. Six populations (“Central Caucasus”, “Dagestan”, “northern Russians”, “southeastern Russians”, “Tatars” and “Transcaucasia”) are characterized by increased marker frequencies compared to reference values; in all of these populations except “southeastern Russians”, the marker alleles with high frequencies do not satisfy the Hardy–Weinberg principle. The Tatar population is characterized by high polymorphism frequencies and allelic disequilibrium, which calls for a more detailed investigation of these polymorphisms in this population in order to reveal the cause of such differences.
