This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (CC BY).
ORIGINAL RESEARCH
NGS technology as a tool for the Wilson’s disease diagnosis and severity assessment
1 Sechenov First Moscow State Medical University (Sechenov University), Moscow, Russia
2 Petrovsky Russian Scientific Center for Surgery, Moscow, Russia
3 Moscow Scientific and Practical Center for Laboratory Research, Moscow, Russia
4 Ott Research Institute of Obstetrics, Gynecology and Reproductive Medicine, St. Petersburg, Russia
5 Federal Scientific and Clinical Center for Infectious Diseases of the Federal Medical Biological Agency, St. Petersburg, Russia
6 State Scientific Center of the Russian Federation — Federal Medical Biophysical Center named after A. I. Burnazyan, Moscow, Russia
Correspondence should be addressed: Inna G. Tuluzanovskaya
Yelansky, 2, bld. 2, 119435, Moscow, Russia; inna_t77@mail.ru
Author contribution: Balashova MS — follow-up of patients, NGS data analysis, manuscript writing; Zhuchenko NA — literature review and analysis, manuscript writing; Tuluzanovskaya IG — follow-up of patients, literature analysis and processing, manuscript writing; Glotov OS — molecular genetic testing data preparation, manuscript editing; Senina OS — data entry, manuscript writing; Ignatova TM — literature analysis, manuscript editing; Asanov AYu — literature analysis, manuscript writing and editing.
Compliance with ethical standards: the study was approved by the Ethics Committee of the Sechenov University (protocol No. 07-23 dated 27 April 2023). All subjects or their legal representatives provided informed consent for participation in the study.
Wilson’s disease (WD, OMIM #277900) is an autosomal recessive copper metabolism disorder caused by pathogenic and likely pathogenic variants in the gene ATP7B encoding the hepatocyte copper-transporting ATPase [1]. The disease is characterized by excessive accumulation of copper in the liver, brain, cornea, and other organs, resulting in the broad spectrum of clinical manifestations: from acute liver failure to neurological and psychiatric symptoms. Despite its monogenic nature, the WD phenotypic heterogeneity is extremely high: the age of onset can vary between 2 and 72 years, and clinical features can include isolated liver damage, as well as neurological symptoms without liver failure, the combination of those, and various extrahepatic manifestations [1].
The genotype-phenotype correlation is one of the key factors of such variability. To date, more than 1000 variants in ATP7B associated with WD have been reported; a number of studies demonstrate the association of certain variants with earlier onset and a more severe disease course, or with late onset and the predominantly hepatic form [2, 3].
In various populations, genotype-phenotype differences are shaped by the most common (major) variant. Thus, p.His1069Gln (H1069Q) is most common in Europe; homozygosity for the variant is associated with late disease onset and predominance of neurological symptoms [3, 4]. The p.Arg778Leu (R778L) variant, which is associated with earlier onset (often in childhood) and the predominantly hepatic form, prevails in East Asian populations [2, 5].
Region-specific frequent variants with characteristic clinical features are also found in some areas (for example, c.-436_-422del in India and p.Val1146Met in Sardinia) [6].
Meta-analyses confirm that nonsense and frameshift variants leading to complete absence of functional protein are correlated to the more severe and earlier disease, while the missense variants preserving ATP7B residual activity are more often associated with the milder phenotype and late onset [2, 7].
Thus, understanding of genotype-phenotype correlations in Wilson’s disease is of not only theoretical, but also practical significance: it allows one to predict the disease course, determine screening priority in specific ethnic groups, and, perhaps, develop personalized approaches to therapy in the future.
The study aimed to identify correlations between genetic variants in the ATP7B gene and the WD clinical manifestations using next-generation sequencing.
METHODS
The study was based on the analysis of data of 81 patients with the diagnosis of Wilson’s disease verified by next generation sequencing (NGS). Information was obtained from the clinical genetic database containing the data on 296 WD patients. The patients were followed up at the Clinical Center (Tareyev Rheumatology, Nephrology, and Occupational Pathology Clinic) of the Sechenov First Moscow State Medical University (Sechenov University) between 2015 and 2019.
The diagnosis of WD was established in accordance with the Russian and European guidelines using the Leipzig Score (Leipzig, 2001). The main criteria for establishing the diagnosis were as follows: results of the gene ATP7B molecular genetic testing; typical clinical manifestations (hepatic and neurological symptoms, Kayser–Fleischer ring); copper metabolism indicators.
The laboratory diagnosis included complete blood counts and clinical urine test; blood biochemistry panel (including assessment of the levels of liver transaminase activity, lipid profile, iron metabolism indicators); copper metabolism indicator testing (plasma ceruloplasmin, 24-hour urine copper tests). Instrumental methods included abdominal ultrasound and ophthalmological slit lamp examination.
Molecular genetic testing of biomaterial (blood) samples by NGS was performed at the Ott Research Institute of Obstetrics, Gynecology and Reproductive Medicine. A targeted NGS panel was used that included the ATP7B gene and a number of potential genetic modifiers: HFE, COMMD1, XIAP, CFTR, APOE, PRNP.
The panel was implemented on the NimbleGen SeqCap EZ Choice platform (151012_HG38_CysFib_EZ_HX3, ROCHE, Switzerland). Sequencing was performed using the MiSeq Sequencing System (Illumina, USA), which provides high-throughput sequencing of the targeted regions.
The variants identified were confirmed by Sanger sequencing.
Bioinformatics analysis
Bioinformatics analysis of the DNA sample sequencing results was conducted using the following software tools: GeneTalk (https://www.gene-talk.de/), UGENE (http://ugene.unipro.ru/), IonReporter (https://ionreporter.lifetechnologies.com/ir/), PolyPhen-2 (genetics.bwh.harvard.edu/pph2/), and PAPI (http://papi.unipv.it/).
The data obtained were interpreted in accordance with the variant interpretation guidelines [8, 9].
Pathogenicity prediction software tools were used to predict the effect of variants: SIFT (http://sift.jcvi.org/), PolyPhen-2 (genetics.bwh.harvard.edu/pph2/), ClinVar, PROVEAN, FATHMM-MKL, WilsonGen, etc.
Statistical analysis
Standard statistical analysis methods were used in the study. Statistical data processing was performed using the IBM SPSS Statistics software package (USA), as well as Microsoft Excel.
RESULTS
Biomaterial samples of patients who had undergone ATP7B testing by NGS were selected from the previously established clinical genetic database of WD patients [10]. The total number of patients was 81 (23 males and 58 females); the patients’ average age at the time of examination was 29.21 ± 6.5 years (range 8–68 years).
Clinical characteristics
Signs of liver damage prevailed in the majority of patients (60%). Of these patients, 65.72% had hepatic manifestations only, and 34.28% showed minimal neurological manifestations.
The combination of hepatic and cerebral manifestations (neurological and psychiatric) was found in 31%. A total of 9% were asymptomatic (the disease was detected during family screening, and treatment was started before the disease onset). The average age of the patients identified during family screening was 15.85 years (range 7–27 years).
The WD onset variants were segregated by severity: severe course (decompensated cirrhosis, fulminant hepatitis, liver failure) — 25.8%; relatively mild course (chronic hepatitis, cirrhosis without failure, extrahepatic manifestations) — 38.5%; extrahepatic pathology — 35.7% of cases.
The average age of WD onset was 18.21 ± 8.55 years (range 5–45 years) (figure).
Assessment of biochemical markers also showed that the patients’ average ceruloplasmin level reached 0.113 g/L. The distribution of values across groups was as follows: 38.7% had the levels below 0.1 g/L, a half of surveyed individuals (50%) had the levels between 0.1 and 0.2 g/L, and 11.3% had normal values (over 0.2 g/L).
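The three ceruloplasmin bands used above can be expressed as a simple threshold rule. The following Python sketch is only an illustration: the function name and the handling of the exact boundary values (0.1 and 0.2 g/L) are assumptions, since the text does not state to which band a boundary value belongs.

```python
def ceruloplasmin_category(level_g_per_l):
    """Assign a plasma ceruloplasmin level (g/L) to one of the three bands.

    Assumption of this sketch: values of exactly 0.1 and 0.2 g/L are placed
    in the middle band; the source text does not specify boundary handling.
    """
    if level_g_per_l < 0:
        raise ValueError("ceruloplasmin level cannot be negative")
    if level_g_per_l < 0.1:
        return "below 0.1 g/L"
    if level_g_per_l <= 0.2:
        return "0.1-0.2 g/L"
    return "normal (over 0.2 g/L)"


# The cohort average reported in the text (0.113 g/L) falls in the middle band.
print(ceruloplasmin_category(0.113))
```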
The analysis of 24-h urinary copper excretion showed that the average value was 466.75 µg/day, with the considerable variation (standard deviation 635.82). The levels did not exceed the normal value of 50 µg/day only in a small number of patients (4.6%).
The diagnosis timing analysis revealed a significant delay in the majority of cases. Wilson’s disease was detected within three months after the emergence of symptoms only in a third of patients (33%). In 29% of patients the diagnosis took from three months to a year, in 30% — from one year to 10 years, and in 8% of patients the diagnosis was established more than 10 years after the disease onset.
During therapy, 48.8% of patients received combination treatment with D-penicillamine (D-PAM) and zinc sulfate. A total of 40.2% received D-PAM monotherapy, and 8.5% took zinc sulfate only. The average duration of anti-copper therapy before clinical stabilization was 10.25 ± 4.7 months, with individual values ranging from 2 to 36 months.
The WD development dynamics was assessed no earlier than two years after the diagnosis and start of treatment (tab. 1). In the Tareyev Clinic, patients were followed up for an average of 5.8 years; the longest follow-up period was 46 years.
Range and frequency of pathogenic and likely pathogenic variants in the gene ATP7B
The study involved ATP7B gene sequencing in 81 patients with confirmed WD. The core group of patients was represented by Russians. A total of 31 pathogenic variants were identified. The list, characteristics, and abundance of those are provided in tab. 2.
NGS allowed us to identify variants on both chromosomes in 96% of patients (98% of alleles). In 4% of patients (2% of alleles), no candidate variant was found, despite the pronounced clinical features of WD.
Three variants predominated among the nucleotide sequences identified: c.3207C>A (p.His1069Gln) — the most common, found in 51.85% of alleles; c.3190G>A (p.Glu1064Lys) — found in 8.64% of alleles; c.3402delC (p.Ala1135fs) — reported in 6.17% of alleles. The majority of patients (72.5%) turned out to be compound heterozygotes; in 16.15% of cases, sporadic rare ATP7B gene variants were identified. The previously undescribed variants potentially associated with the WD development were detected: c.1870-8A>G; c.3655A>T (p.Ile1219Phe); c.3036dupC (p.Lys1013fs).
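The allele frequencies above can be checked against a denominator of 162 alleles (81 patients, two ATP7B alleles each). The Python sketch below is a consistency check under stated assumptions: the allele counts are not given in the text and are back-calculated here from the reported percentages, and the function names are illustrative.

```python
# Assumption: each of the 81 patients contributes two ATP7B alleles.
N_PATIENTS = 81
N_ALLELES = 2 * N_PATIENTS  # 162

# Percentages of alleles as reported in the text.
reported_pct = {
    "c.3207C>A (p.His1069Gln)": 51.85,
    "c.3190G>A (p.Glu1064Lys)": 8.64,
    "c.3402delC (p.Ala1135fs)": 6.17,
}


def implied_count(pct, n_alleles=N_ALLELES):
    """Nearest whole number of alleles corresponding to a percentage."""
    return round(pct * n_alleles / 100)


def allele_frequency(count, n_alleles=N_ALLELES):
    """Allele frequency in percent, rounded to two decimals."""
    return round(100 * count / n_alleles, 2)


for variant, pct in reported_pct.items():
    count = implied_count(pct)  # back-calculated, not a reported value
    print(f"{variant}: ~{count}/{N_ALLELES} alleles = {allele_frequency(count)}%")
```

Under these assumptions the three percentages correspond to whole-number counts of 84, 14, and 10 alleles out of 162.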
The distribution of variants by effects was as follows: 73.33% — missense, 14% of alleles — frameshift, 4.67% — splice site variants, 4% — nonsense, 4% — indel. Variants were found in all the ATP7B gene exons, except 1, 3, 5, 9, 10, 12, and 21.
Assessment of the relationship between nucleotide variants and WD severity
In the study, we analyzed the association of the effect of nucleotide sequences in the gene ATP7B and the WD severity. The following were considered as the disease severity criteria: age of onset (three categories: under the age of 15 years; 15–31 years; over the age of 31 years); the degree of liver damage at the time of disease onset (decompensated cirrhosis — 33%, compensated cirrhosis — 49.4%, no cirrhosis — 17.3%).
Nucleotide sequences were grouped based on their potential effects on the protein product (tab. 3):
- Severe abnormalities in both gene copies — nonsense variants, frameshift variants, splice site variants.
- Mixed effect — severe abnormalities in one gene copy (nonsense, frameshift, splicing) and relatively mild in another one (missense variants, indels).
- Mild abnormalities in both gene copies — missense variants, indels.
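The grouping rule listed above can be sketched in Python as a function of the two allele effects. The effect labels and group names below are illustrative choices for this sketch, not identifiers from the study.

```python
# Variant-effect classes as grouped in the text; the string labels are
# hypothetical names chosen for this sketch.
SEVERE = {"nonsense", "frameshift", "splice_site"}
MILD = {"missense", "indel"}


def genotype_group(effect_a, effect_b):
    """Assign a patient to one of the three groups from the two allele effects."""
    effects = (effect_a, effect_b)
    for effect in effects:
        if effect not in SEVERE | MILD:
            raise ValueError(f"unknown variant effect: {effect!r}")
    n_severe = sum(effect in SEVERE for effect in effects)
    # 2 severe alleles, 1 severe allele, or none
    return {2: "severe/severe", 1: "mixed", 0: "mild/mild"}[n_severe]


print(genotype_group("nonsense", "frameshift"))  # both copies severely affected
print(genotype_group("frameshift", "missense"))  # one severe, one mild
print(genotype_group("missense", "indel"))       # both copies mildly affected
```

Counting severe alleles rather than enumerating effect pairs keeps the rule symmetric with respect to allele order.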
Age of onset: no significant correlation between the genotype (based on the nucleotide sequence effect on protein synthesis) and the age of disease onset was revealed; the cases of late onset (≥ 31 years) were reported only in patients, who were homozygous or compound heterozygous for the major variant.
Liver damage: the lack of cirrhosis was more often reported in homozygous or compound heterozygous carriers of the c.3207C>A (p.His1069Gln) variant (15% vs. 2.3%); patients with nonsense, frameshift or splicing variants in both ATP7B gene copies more often showed decompensated liver cirrhosis at onset, and the age of onset was below 15 years.
Thus, the following correlations were reported:
- moderate correlation between the WD manifestation type and the genotype (based on the nucleotide sequence effect on protein synthesis): correlation coefficient r = −0.305 (p = 0.009);
- moderate correlation between the liver cirrhosis class (Child-Pugh score) and the potential effect of nucleotide variants: r = −0.374 (p = 0.004);
- moderate correlation between blood cholinesterase (CE) levels and the effect of nucleotide sequences: r = 0.5368 (p = 0.004).
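The text does not state which correlation coefficient was used. For ordinal codings such as genotype group and cirrhosis class, a rank-based coefficient such as Spearman's is one common choice, so the Python sketch below uses it as an assumption. The numeric codings and the sample values are made up for illustration and are not study data; the sketch does not compute p-values.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical codings (assumptions of this sketch):
# genotype group: 0 = severe/severe, 1 = mixed, 2 = mild/mild
# cirrhosis class: 0 = no cirrhosis, 1 = Child-Pugh A, 2 = B, 3 = C
genotype = [0, 0, 1, 1, 2, 2, 2, 1]   # made-up values
cirrhosis = [3, 2, 2, 1, 0, 1, 0, 3]  # made-up values
print(f"Spearman r = {spearman(genotype, cirrhosis):.3f}")
```

With these codings, a negative coefficient means that milder genotype groups go together with lower cirrhosis classes; the sign depends entirely on how the categories are coded.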
DISCUSSION
The paper reports the data of patients diagnosed with Wilson’s disease verified by NGS, as well as the results of assessing the relationship between the disease severity and types of variants in the gene ATP7B.
The analysis has shown that the data obtained are consistent with the up-to-date understanding of the WD clinical manifestations and genetic diversity. In particular, it has been proven that variants with complete loss of protein function are correlated to the more severe disease form. At the same time, we have identified some features specific for the studied cohort.
Clinical polymorphism of the disease and predominance of the abdominal form have been confirmed [12–14]. The age of onset and diagnosis timing are consistent with global trends [12, 15]. However, some differences have been noted. Thus, the share of asymptomatic cases in our cohort (30.6%) was higher than in most published studies (10–20%) [16].
Today, the NGS method is extensively used to study WD genetic features all over the world. In our study, the c.3207C>A (p.His1069Gln) variant turned out to be the most common (51.85% of alleles), which is in line with the data reported for Russia and Eastern Europe [17, 18]. The c.3190G>A (p.Glu1064Lys) variant ranked second in frequency (8.64% of alleles), and c.3402delC (p.Ala1135fs) ranked third (6.17% of alleles). Rare mutations were identified in 16.15% of alleles. Three nucleotide variants (c.1870-8A>G, c.3655A>T, c.3036dupC) were described for the first time as pathogenic in WD.
In a study by scientists from the Far East, a cohort of 100 people from Eastern Eurasia was analyzed. The major nucleotide sequence p.His1069Gln (c.3207C>A) was found in 48% of patients (homozygous form — in 30%); the p.Glu1064Lys (c.3190G>A) variant was found in 20%; the p.Met769HisfsTer26 (c.2304insC) was found in 8%; other variants accounted for 23.9% [17].
According to the Human Gene Mutation Database, the distribution of variant types in the gene ATP7B (60%: missense and nonsense variants; 26%: indels; 9%: splice site variants) was similar to that in our cohort, although the share of missense variants in our study was larger (73.33%) [19].
In addition, we compared the range of nucleotide sequences identified in the gene ATP7B in this study with the diagnostic panel for Wilson’s disease most commonly used in the RF (tab. 4). Of 12 variants included in the routine diagnostic panel, four variants were detected: c.2304insC; c.3207C>A; c.3402delC; c.3649_3654del6. We identified three variants, which were regularly reported in patients we examined, but were not included in the standard panel: c.2332C>G (p.Arg778Gly); c.4125-2A>G; c.3190G>A (p.Glu1064Lys).
Molecular genetic testing by next-generation sequencing (NGS) showed high informational value when used to identify pathogenic variants associated with WD. The standard diagnostic panel used in the RF covers only part of the range of variants typical for the Russian population of patients with WD. The NGS method allows one to detect both common and rare variants, not included in standard panels. The data obtained substantiate the need to expand the existing diagnostic panels considering the regional specifics of the patients’ genetic profile.
In the study, we assessed the relationship between genetic features (variant type, its pathogenetic effect) and WD clinical manifestations, specifically the disease course. We also revealed no significant correlation between the fact of being a homozygous major variant carrier and the disease severity or other clinical characteristics tested, as well as age and gender, like in the study conducted by Ferenci [14].
We revealed a significant correlation between the nucleotide sequences severely impairing the protein product function (nonsense, frameshift, splice site variants) and the following clinical signs: WD manifestation with severe liver damage; more severe liver cirrhosis at the time of diagnosis; decreased cholinesterase (CE) levels.
The findings are consistent with some studies. Thus, in one study a significant correlation was reported for nonsense mutations only, while the correlation for missense mutations was weak or lacking [4]. The correlation of nonsense and frameshift variants with the earlier disease onset and more severe WD course was shown [20].
A number of large-scale studies showed only weak or lacking genotype-phenotype correlation. According to the study conducted by Chinese researchers, only 38% of age of onset variability can be explained by genotype [7].
According to the EuroWilson registry, genotype explains about 27% of age of onset variability and less than 20% of disease form variability (hepatic or neurological form) [21]. With the p.His1069Gln and p.Glu1064Lys missense variants, motor impairment was reported in 53–58% of cases, brain MRI alterations in 59–69%, Kayser–Fleischer rings in 29–31%, cognitive impairment in 24–27% of cases. With the LOF variant p.Met769HisfsTer26, ultrasonography revealed changes in the liver in 60% of patients. No gender dependence was found [17].
Despite moderate correlation between the pattern of nucleotide sequences in the ATP7B gene and the Wilson’s disease course severity identified in our study, it is still difficult to clearly determine genotype-phenotype relationships. This is due to a complex of interrelated factors that can be systematized in several key directions.
Genetic factors
Variable expression and clinical polymorphism. Thus, a broad spectrum of clinical manifestations, from asymptomatic course to severe multiple organ failure, was found within the same family having the same genotype [14, 22]. This suggests the impact of additional genetic and environmental modifiers.
High genetic heterogeneity. More than 60% of patients turned out to be compound heterozygotes for two different pathogenic variants. However, functional sequelae of the majority of rare nucleotide sequences are still poorly understood, which makes it difficult to predict the phenotype [23].
Lack of phenotype homogeneity in homozygotes for the major variant. Patients with the same genotype (for example, homozygotes for p.H1069Q) demonstrate significant differences in the age of onset, liver damage severity, and neurological symptoms [4, 7].
Influence of genetic modifiers. Along with the main gene ATP7B, the phenotype is influenced by variants in other genes: MTHFR (homocysteine metabolism); COMMD1, ATOX1, XIAP (copper homeostasis); APOE (lipid metabolism); PRNP, HFE (common metabolic pathways), etc. These polymorphic variants have a significant impact on the age of onset (before 5–12 years) [24, 25].
Clinical and demographic factors
Gender differences. In females, the WD onset occurs on average 3–6 years later than in males, which is associated with protective effects of estrogens on copper metabolism. Hormone levels, puberty, and menopause can provoke aggravation of symptoms or alter their severity [26].
Environmental and epigenetic factors
Features of the diet, increased copper consumption (for example, with drinking water or food products) can accelerate the disease onset.
Comorbidities. Viral hepatitis infection, alcohol abuse, or the use of hepatotoxic drugs worsens liver damage.
Epigenetic states. DNA methylation, histone modifications, and the action of non-coding RNAs can modulate the expression of ATP7B and modifier genes, affecting the phenotype [27].
Methodological study limitations
Heterogeneity of phenotyping criteria. Various scores for assessing the severity of liver damage, neurological symptoms, and biochemical markers are used in different studies.
Insufficient family screening coverage. The lack of information about the relatives hampers segregation of genotypes and phenotypes in families.
Late diagnosis. Many patients seek care as late as in the phase of cirrhosis decompensation or neurological complications, which results in misrepresentation of the natural disease history.
Selective data loss. Patients with severe genotypes are more likely to undergo liver transplantation or die before inclusion in the study, which results in survivorship bias.
Small sample size. Small cohort size reduces statistical power to detect correlations, especially for rare mutations [13, 28].
The multifactorial nature of WD pathogenesis explains the difficulties in determining clear genotype-phenotype associations. The following is necessary to overcome these limitations: large multicenter studies involving the standardized phenotyping criteria; comprehensive analysis of not only ATP7B, but also genetic modifiers; consideration of environmental and epigenetic factors in prediction models; prospective follow-up of families with the identified genotype.
Only such an integrated approach will make it possible to increase the WD course prediction accuracy and personalize therapeutic strategies.
Despite the lack of direct correlation between the genotype and the WD age of onset, the analysis conducted has shown that the liver damage severity depends on the nature of nucleotide variants. The use of NGS methods in clinical practice significantly simplifies and accelerates the diagnosis of WD, which allows for the prompt initiation of anti-copper therapy.
CONCLUSIONS
The study involving the use of next-generation sequencing (NGS) in the cohort of Russian patients with Wilson’s disease (WD) revealed 31 pathogenic variants in the gene ATP7B, among which c.3207C>A (p.His1069Gln), c.3190G>A (p.Glu1064Lys), and c.3402delC (p.Ala1135fs) were the most common. We identified a correlation between the nucleotide sequences causing severe ATP7B protein abnormalities (nonsense, frameshift, and splice site variants) and adverse clinical manifestations of WD (severe liver damage, severe cirrhosis, decreased cholinesterase levels), while homozygosity for the c.3207C>A (p.His1069Gln) variant was not correlated with the disease severity. The data obtained confirm the high diagnostic efficacy of NGS in WD and emphasize the promise of further research focused on genotype-phenotype interactions for therapy personalization and patient management improvement.