DETECTION OF CFTR MUTATIONS IN CHILDREN WITH CYSTIC FIBROSIS

Cystic fibrosis (CF) is one of the most common monogenic disorders in humans. Knowing the population frequency of the mutant genotypes that cause a monogenic disease helps to optimize DNA testing and to reduce its cost and the time required for the procedure. This article presents the results of a retrospective study of the CFTR gene in 191 children with mixed manifestations of CF. To screen for the 24 most common mutations, we used a diagnostic PCR panel; minor mutations were detected by next-generation sequencing. The diagnostic panel allowed us to identify 18 typical CFTR mutations, including F508del (allelic frequency of 54.7%), dele2,3(21kb) (7.3%), 2143delT (3.4%), 2184insA (3.4%), 1677delTA (2.4%), N1303K (2.1%), 3849+10kbC>T (2.1%), E92K (2.1%), G542X (1.6%), W1282X (1.6%), S1196X (1.3%), R334W (1.0%), 394delTT (0.8%), 3944delGT (0.8%), 3821delT (0.5%), 2789+5G>A (0.5%), 621+1G>T (0.3%), and 2183AA>G (0.3%). Sequencing revealed the presence of 24 potentially pathogenic CFTR variants in the sample. We also discovered 8 minor CFTR variants previously unseen in Russian patients with CF, including 4 new CFTR mutations: p.Glu819Ter, p.Gln378Ter, p.Val1360Phefs, and p.Lys1365Argfs.

Cystic fibrosis (CF) is a hereditary autosomal recessive disease that affects all exocrine glands, leading to severe impairment of the respiratory and digestive systems. CF is caused by deleterious mutations in the CFTR gene (CFTR stands for cystic fibrosis transmembrane conductance regulator) [1], most commonly by F508del (rs113993960) which results in the deletion of phenylalanine at position 508 in the protein [1][2][3]. There is no known cure for CF; complex care should be provided for patients with CF throughout their lifetime.
CF is one of the most common hereditary diseases. According to the World Health Organization, the disease occurs in 1 in 2,500-3,000 newborns [3]. The Russian Cystic Fibrosis Patient Registry reported 2,916 new cases of CF in 2015 [4]. In 2016 the incidence of the disease among Russian neonates was 1 : 8,788 [5].
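As a rough illustration of what such incidence figures imply for a recessive disease, the sketch below converts an incidence into an expected carrier frequency. It assumes Hardy-Weinberg equilibrium and treats all pathogenic alleles as one class; neither assumption is stated in the text, and the function name is hypothetical.

```python
import math

def carrier_frequency(incidence: float) -> float:
    """Expected carrier frequency 2pq under Hardy-Weinberg equilibrium.

    `incidence` is the fraction of affected newborns, taken as q^2
    for an autosomal recessive disease (an assumption of this sketch).
    """
    q = math.sqrt(incidence)  # combined frequency of pathogenic alleles
    p = 1.0 - q               # frequency of the normal allele
    return 2.0 * p * q

# Incidence figures quoted in the text
for label, incidence in [("1 : 2,500", 1 / 2500), ("1 : 8,788", 1 / 8788)]:
    carriers = carrier_frequency(incidence)
    print(f"incidence {label}: about 1 carrier in {1 / carriers:.0f}")
```

The estimate is only as good as the equilibrium assumption; population structure and regional variation in allele frequencies, both discussed below, would shift it.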
It is crucial to recognize CF before it is clinically manifested; timely diagnosis reduces the risk of irreversible damage to the respiratory and digestive systems and improves the quality of life of patients and their families [6].
Neonatal screening for CF adopted by the Russian Federation in 2006 is an important tool for early diagnosis. It comprises a series of diagnostic tests run consecutively, including the immunoreactive trypsinogen (IRT) blood test, the IRT repeat test, and the sweat chloride test ordered if IRT levels are elevated above the normal range [7].
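The consecutive order of these tests can be sketched as a small decision function. This is a minimal illustration only: the class and field names are hypothetical, no numeric cut-offs are modelled because the text gives none, and the assumption that the sweat test follows an elevated repeat IRT is our reading of the description above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class NeonatalScreen:
    """Results known so far; None means the test has not been done yet."""
    irt_first_elevated: bool
    irt_repeat_elevated: Optional[bool] = None
    sweat_chloride_positive: Optional[bool] = None

def next_step(s: NeonatalScreen) -> str:
    """Return the next action in the IRT -> repeat IRT -> sweat test sequence."""
    if not s.irt_first_elevated:
        return "screen negative"
    if s.irt_repeat_elevated is None:
        return "repeat IRT test"
    if not s.irt_repeat_elevated:
        return "screen negative"
    if s.sweat_chloride_positive is None:
        return "sweat chloride test"
    return "CF suspected" if s.sweat_chloride_positive else "CF not confirmed"

print(next_step(NeonatalScreen(irt_first_elevated=True)))
print(next_step(NeonatalScreen(True, irt_repeat_elevated=True)))
```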
Molecular genetic (or DNA) screening for mutations in the CFTR gene is conducted in several steps. The first step includes screening for the most common mutations using special diagnostic panels [3,7,8]. If this test comes out negative, the whole gene is sequenced [3,9] and a search is performed for large structural CFTR variations, if necessary [3].
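The stepwise logic can be sketched as follows. The stopping rule (stop once two pathogenic alleles are found) is an assumption made for illustration, and the patient data and function names are hypothetical.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], List[str]]]

def stepwise_testing(steps: List[Step],
                     alleles_needed: int = 2) -> Tuple[List[str], List[str]]:
    """Run tests in order and stop once enough pathogenic alleles are found.

    Returns (variants found, names of the steps actually performed).
    """
    found: List[str] = []
    performed: List[str] = []
    for name, run in steps:
        performed.append(name)
        found.extend(run())
        if len(found) >= alleles_needed:
            break
    return found, performed

# Hypothetical patient: the panel finds one allele, sequencing finds the second
steps = [
    ("panel of common mutations", lambda: ["F508del"]),
    ("whole-gene sequencing", lambda: ["p.Ser466Ter"]),
    ("search for large rearrangements", lambda: []),
]
variants, performed = stepwise_testing(steps)
print(variants)
print(performed)
```

Ordering the steps from cheapest to most expensive means that the costlier tests run only for patients whose genotype remains incomplete.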
In Russia, genetic screening is not mandatory and is normally recommended if the sweat test cannot be done or its results are inconclusive. However, the CFTR genotype is one of the factors predicting the severity of the disease [3]; once it has been established, the doctor can come up with an adequate pharmacogenetic treatment plan [2,3]. One of the advantages of DNA testing is its accuracy: unlike the sweat test, it is not affected by the physiology of an individual patient.
At present, there is a need for better availability of genetic screening in the Russian Federation. Even so, extensive genetic epidemiology data on cystic fibrosis have been collected in Russia in recent years. The most common CFTR mutations have been identified [3,8], and genetic variations associated with the disease in different ethnic groups have been described, as well as regional variations in the frequency of pathogenic alleles [8,10,11]. A good example is the E92K (rs121908751) mutation typically found in the Chuvash people. A record of CFTR mutations has been kept by the Russian Cystic Fibrosis Patient Registry since 2011 [12]. A new registry of CFTR allelic variants has been created as part of the open-source international database of genetic variations LOVD v.3.0 (Leiden Open Variation Database). The registry is called SeqDB-LOVD/Consensus view on the clinical effects of genetic variants and lists CFTR allelic variants occurring in the Russian population [13]. SeqDB-LOVD provides information on the clinical relevance of CFTR variants, including rare ones that were identified only thanks to the active use of NGS in research studies.
According to SeqDB-LOVD, there are currently over 220 clinically relevant CFTR mutations occurring in the Russian population; interestingly, new, previously unknown allelic variants are still being found in relatively small samples [9]. With that in mind, one can reasonably assume that the real diversity of pathogenic CFTR mutations is much greater.
About 500 children are referred annually to the Pediatric Unit of Children's Clinical Hospital (Pirogov Russian National Medical Research University) from different regions of Russia; about 100 of them are diagnosed with CF. Between 2014 and 2017, the Pediatric Unit admitted over 200 children with clinical signs of CF whose genotype was either unknown (no molecular genetic tests had been performed) or partially known (only one known CFTR mutation had been identified). The aim of this work was to determine the spectrum of pathogenic CFTR variants in a sample of 191 patients with severe CF with mixed clinical manifestations.

METHODS
For this retrospective study we selected blood samples collected from 191 children with severe or moderately severe cystic fibrosis referred to the Children's Clinical Hospital of Pirogov Russian National Medical Research University between 2014 and early 2017. In most cases, no genetic screening had been done to confirm the diagnosis. The main group consisted of boys and girls from 57 Russian regions (Moscow and Stavropol regions were represented by 15 patients each; other regions, by 1 to 9 patients each). The study included patients with a clinically established diagnosis of severe CF with mixed manifestations (E84.8). Patients with clinically established CF with predominantly pulmonary manifestations (E84.0) or with mild or borderline symptoms were excluded from the study. The sample mainly consisted of unrelated patients; there were also 4 pairs of siblings. The study was approved by the Ethics Committee of Pirogov University (Protocol 172 dated February 2, 2018).
Peripheral blood samples were collected at the facilities of the Children's Clinical Hospital. Genomic DNA was isolated from the whole blood specimens stored in the Biobank of Kulakov National Medical Research Center of Obstetrics, Gynecology and Perinatology using the reagent kit Proba-GS-Genetika (DNA-Technology, Russia) according to the manufacturer's instructions.
Screening for rare and unknown mutant variants of CFTR was done on the Ion Torrent™ next-generation sequencing platform (Thermo Fisher Scientific, USA). We targeted the coding regions (27 exons of CFTR), intron-exon boundaries and the promoter region. Additionally, the panel included a fragment for the identification of the pathogenic intron variant 3849+10kbC>T (rs75039782) and the regions flanking the dele2,3(21kb) mutation, a common deletion of exons 2 and 3 in the CFTR gene (Table 1).
Before sequencing, the targets were enriched by PCR, for which we used at least 10 ng of input genomic DNA. The PCR products were ligated to the adapters by T4 DNA ligase (Thermo Fisher Scientific, USA) according to the manufacturer's protocol. The quality of the prepared DNA libraries was assessed using the Agilent 2100 Bioanalyzer and the Agilent High Sensitivity DNA Kit (Agilent Technologies, USA). Next-generation sequencing was carried out using the Ion PGM Next-Generation Sequencing System (Ion Torrent™, USA) and the Ion PGM™ Template OT2 400 Kit (Ion Torrent™, USA) in the Laboratory of Molecular Genetics of Kulakov National Medical Research Center of Obstetrics, Gynecology and Perinatology.
Primary data analysis was assisted by the Torrent server 4.4.3. The obtained sequences were aligned to the reference genome GRCh37/hg19 by the TMAP tool; the reference [...]

RESULTS
Mutations included in the panel were unambiguously identified or were shown to be absent in 99% of cases. In two samples (1%) the melting curves recorded for one of the mutant gene variants looked abnormal. Direct sequencing of these samples revealed the presence of "off-target" single nucleotide polymorphisms in the regions hybridized to the allele-specific probes (Fig. 1). Forty-seven PCR-sequenced samples reported to be free of CFTR mutations were additionally sequenced by NGS. In total, 300 different genotypes were identified by sequencing, of which 24 could be clinically relevant (we counted the variants described in locus-specific databases as pathogenic, nonsense, or frameshift mutations) (Table 2). Some genotypes were observed more than once, such as p.Ser466Ter (rs121908805), which occurred as part of the compound allele in 5 unrelated patients (Table 3).
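The selection of nonsense and frameshift variants can be illustrated with a simple heuristic based on the protein-level variant names used in this article. This is a sketch only: it reads the name suffix ("Ter" for a stop codon, "fs" for a frameshift) and does not query any locus-specific database; the function name is hypothetical.

```python
import re

def predicted_effect(protein_change: str) -> str:
    """Classify an HGVS-style protein change by its suffix (naming heuristic only)."""
    # Frameshifts are checked first so that names such as p.Trp1063Terfs
    # are not mistaken for plain nonsense variants.
    if re.search(r"fs(Ter\d*)?$", protein_change):
        return "frameshift"
    if protein_change.endswith("Ter"):
        return "nonsense"
    return "other"

# Variant names taken from the text
for name in ["p.Glu819Ter", "p.Gln378Ter", "p.Val1360Phefs",
             "p.Lys1365Argfs", "p.Trp1063Terfs", "p.Phe1286Ser"]:
    print(name, "->", predicted_effect(name))
```

Variants that fall into "other", such as missense changes, would still need database or functional evidence before being counted as clinically relevant.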
Of all detected mutations, 4 had not been described previously, including two frameshifts (c.4093delA/p.Lys1365Argfs and c.4078delG/p.Val1360Phefs) and two nonsense mutations (c.1132C>T/p.Gln378Ter and c.2455G>T/p.Glu819Ter) with a pathogenic potential (Table 4). These previously unknown variants were heterozygous and occurred in combination with the most frequent CFTR mutation (Table 3). We submitted these mutations to SeqDB-LOVD.
During Sanger validation, a deletion was detected in two samples in exon 24 resulting in the frameshift p.Ile1214Phefs (rs397508630).
Our extensive DNA testing revealed that 178 patients from the sample had 2 pathogenic mutations and 13 patients had [...] The proportion of patients with 2 "severe" (class I-III) CFTR mutations [19] was 69.6%. The proportion of patients with one or two "mild" (class IV-V) mutations [19] was 8.4%. Patients with one or two mutations of "uncertain clinical relevance" made up 22%.
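As a consistency check, the reported percentages can be converted back into patient counts for the sample of 191. The counts below are back-calculated for illustration and are not reported in the text; the variable and function names are hypothetical.

```python
N_PATIENTS = 191  # sample size given in the text

# Percentages as reported above
reported_percent = {
    "two severe (class I-III) mutations": 69.6,
    "one or two mild (class IV-V) mutations": 8.4,
    "mutations of uncertain clinical relevance": 22.0,
}

def implied_count(percent: float, total: int) -> int:
    """Nearest whole number of patients corresponding to a percentage."""
    return round(percent / 100 * total)

implied = {group: implied_count(p, N_PATIENTS)
           for group, p in reported_percent.items()}

for group, n in implied.items():
    print(f"{group}: {n} patients")
print("total:", sum(implied.values()))
```

The three implied counts add up to the full sample, so the percentages are mutually consistent.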

DISCUSSION
We have detected 36 different pathogenic variants of the CFTR gene in the studied group of patients. The majority of these mutations are known to be common in the Russian population [4,8]. F508del (rs113993960) prevailed in the studied sample taken as a whole, as well as in the separate subgroups of patients coming from the regions dominated by Russians. The frequency of other mutations in the sample was consistent with the reports of CF in the Russian population [4,8]. Ten mutations with the highest frequency in the sample are listed in the Russian CF Patient Registry [4]. The 1677delTA (rs121908776) mutation was the most common in children from the North Caucasus. Children from Chuvashia had the E92K (rs121908751) mutation typically associated with their ethnicity. The obtained results suggest that the study sample is representative of the Russian population afflicted with cystic fibrosis. Genotyping data obtained from the studied sample provide new information about the genetic diversity of cystic fibrosis in Russia.
Using different sequencing techniques, we detected 24 clinically relevant mutations of the CFTR gene (including 22 minor variants); of them 8 had not been previously reported by the Russian CF Patient Registry, including p.Gln39Ter (rs397508168), p.Phe1286Ser (rs121909028), p.Ile1214Phefs (rs397508630), p.Trp1063Terfs, p.Glu819Ter, p.Gln378Ter, p.Val1360Phefs, and p.Lys1365Argfs. According to in silico prediction tools, these mutations are pathogenic (belong to class I) and result in the truncated CFTR protein.
PCR-based sequencing demonstrated a detection rate of 86.1% for deleterious CFTR mutations (in 98.9% of cases one or two pathogenic variants were detected). This value meets the requirements for diagnostic panels [19]. However, considering the large body of genetic epidemiology data obtained in recent years [4,13] and the results of the additional diagnostic testing we performed on the samples, we believe that the detection rate can be improved by including the p.Ser466Ter (rs121908805), p.Trp1282Arg (rs397508616) and p.Leu15Phefs (rs397508715) mutations in the panel. The PCR-based kissing-probe method that we used to screen for known CFTR mutations has a few advantages over alternative approaches (such as MLPA or RFLP): all stages of the procedure, including the analysis of melting curves, take place in one device, and electrophoresis is not required. The results are interpreted automatically. At the same time, visual control of the melting curves is possible, facilitating detection of polymorphisms located close to the targeted mutation. Considering its relative simplicity, good optimization potential (the method can be adjusted for PCR multiplexing, and the number of testing tubes with individual samples can be cut down) and automatic control of the procedure, this method can be used for high throughput sequencing/screening for common hereditary diseases.
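The panel detection rate can be cross-checked against the per-mutation allelic frequencies reported in the abstract. The sketch below converts each frequency into an allele count and sums them. It assumes 382 chromosomes (2 per patient for 191 patients, with siblings counted independently); the allele counts are back-calculated for illustration and are not reported by the authors.

```python
N_CHROMOSOMES = 2 * 191  # assumption: two CFTR alleles per patient

# Allelic frequencies (%) of the 18 panel mutations, as listed in the abstract
panel_frequencies = {
    "F508del": 54.7, "dele2,3(21kb)": 7.3, "2143delT": 3.4, "2184insA": 3.4,
    "1677delTA": 2.4, "N1303K": 2.1, "3849+10kbC>T": 2.1, "E92K": 2.1,
    "G542X": 1.6, "W1282X": 1.6, "S1196X": 1.3, "R334W": 1.0,
    "394delTT": 0.8, "3944delGT": 0.8, "3821delT": 0.5, "2789+5G>A": 0.5,
    "621+1G>T": 0.3, "2183AA>G": 0.3,
}

# Nearest whole number of alleles implied by each reported frequency
allele_counts = {m: round(f / 100 * N_CHROMOSOMES)
                 for m, f in panel_frequencies.items()}

detected = sum(allele_counts.values())
detection_rate = 100 * detected / N_CHROMOSOMES
print(f"{detected} of {N_CHROMOSOMES} alleles, detection rate {detection_rate:.1f}%")
```

Under these assumptions the summed counts reproduce the 86.1% figure, which suggests that the detection rate is expressed per allele rather than per patient.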
The detection rate of extensive sequencing-based DNA testing was 95.4% (at least one pathogenic mutation was detected in each case). Detection rates may have been affected by the limitations of the NGS technology; as a rule, panels and analytical algorithms are optimized for better screening results [20]. Ion Torrent cannot reliably detect mutations inside homopolymer regions, such as 2184insA (rs121908786). In our study, the adenine deletion inside the region TATTT[A/-]TTTTTTCT (mutation p.Ile1214Phefs (rs397508630)) was detected only after the fragment was Sanger-sequenced. Lengthy deletions and duplications also pose a problem for Ion Torrent, as recognition of their heterozygous genotypes requires specific bioinformatic algorithms of data processing; long deletions require incorporation of additional targets into the panel to cover their boundaries [9] or even a series of additional targets corresponding to the most frequent genotypes observed in a population. So far, residents of the Russian Federation with CF have been shown to have a few lengthy deletions, of which CFTRdele2,3 is the most common.
Note: * represents the 4 previously undescribed CFTR mutations shown in bold; ** represents p.Ile1214Phefs (rs397508630), detected by Sanger sequencing; ? means that candidate variants have not been identified.
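The homopolymer problem mentioned above can be made concrete by locating runs of identical bases in the sequence context quoted for p.Ile1214Phefs. This is a minimal sketch: the function name and the minimum run length of 4 are assumptions chosen for illustration, not thresholds taken from the platform documentation.

```python
from itertools import groupby
from typing import List, Tuple

def homopolymer_runs(seq: str, min_len: int = 4) -> List[Tuple[int, str, int]]:
    """Return (start index, base, length) for each run of identical bases."""
    runs = []
    pos = 0
    for base, group in groupby(seq.upper()):
        length = len(list(group))
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs

# Context quoted in the text, with the variable adenine present and deleted
reference = "TATTTATTTTTTCT"
deleted = "TATTTTTTTTTCT"

print(homopolymer_runs(reference))
print(homopolymer_runs(deleted))
```

In this context the deletion of the single adenine joins two thymine runs into one longer run, which is the kind of region where read-length errors make variant calling unreliable.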