Clarifying the status of some mutations considered pathogenic using attributes of harmless mutations

About the authors

1 Bioinformatics Data Processing Department,
Genotek Ltd., Moscow, Russia

2 Lomonosov Moscow State University, Moscow, Russia

3 The Core Facilities Center “Genetic Polymorphism”,
Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia

Correspondence should be addressed to: Dmitry O. Korostin
ul. Gubkina, d. 3, Moscow, Russia, 119991; d.korostin@gmail.com

Received: 2016-02-10 Accepted: 2016-02-19 Published online: 2017-01-05

Predicting the pathogenicity of a mutation and its effect on the phenotype is an important task of modern bioinformatics. The task is particularly difficult for single nucleotide polymorphisms, whose effects are very hard to predict. Information on pathogenic mutations is provided by curated databases such as Online Mendelian Inheritance in Man (OMIM) and The Human Gene Mutation Database (HGMD), which include data from experimental studies. However, because different authors interpret the term “mutation pathogenicity” differently, these data need to be double-checked before use. We assessed the quality of the HGMD database using the most common bioinformatic tools, namely SnpEff, PolyPhen-2 and SIFT. Our study relied on characteristics specific to harmless mutations: high frequency in a population, a weak effect on the amino acid sequence of a protein, and low pathogenicity as computed by the tools used in the study. As a result, we identified clearly harmless variants among those listed in the mutation database, as well as ambiguous ones whose classification depends on the characteristics and tools used for the analysis.

Keywords: pathogenicity, human genetics, high-throughput sequencing, population analysis, search for mutations
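
The three criteria for harmless mutations named in the abstract (high population frequency, weak effect on the protein sequence, low predicted pathogenicity) can be illustrated with a minimal sketch. It is not the authors' pipeline: the `Variant` record, the function names and all thresholds are assumptions chosen for illustration. The PolyPhen-2 and SIFT cutoffs follow commonly quoted conventions, the impact labels are the SnpEff impact categories, and the frequency cutoff is arbitrary.

```python
from dataclasses import dataclass

# Illustrative thresholds (assumptions, not values taken from the study).
COMMON_AF = 0.01                    # allele frequency treated as "high"
POLYPHEN2_BENIGN_MAX = 0.15         # PolyPhen-2 scores at or below this read as benign
SIFT_TOLERATED_MIN = 0.05           # SIFT scores at or above this read as tolerated
MILD_IMPACTS = {"LOW", "MODIFIER"}  # SnpEff impact categories with weak protein effect


@dataclass
class Variant:
    """Hypothetical record holding pre-computed annotations for one variant."""
    name: str
    allele_frequency: float
    snpeff_impact: str      # one of HIGH, MODERATE, LOW, MODIFIER
    polyphen2_score: float  # 0 (benign) .. 1 (damaging)
    sift_score: float       # 0 (deleterious) .. 1 (tolerated)


def harmless_votes(v: Variant) -> dict:
    """Evaluate each harmlessness criterion independently."""
    return {
        "frequent": v.allele_frequency >= COMMON_AF,
        "mild_impact": v.snpeff_impact in MILD_IMPACTS,
        "polyphen2_benign": v.polyphen2_score <= POLYPHEN2_BENIGN_MAX,
        "sift_tolerated": v.sift_score >= SIFT_TOLERATED_MIN,
    }


def classify(v: Variant) -> str:
    """Label a variant by how many criteria point to harmlessness."""
    votes = harmless_votes(v)
    agreeing = sum(votes.values())
    if agreeing == len(votes):
        return "clearly harmless"
    if agreeing == 0:
        return "consistent with pathogenic"
    return "ambiguous"


if __name__ == "__main__":
    # Made-up example variants, for demonstration only.
    examples = [
        Variant("var_A", 0.20, "LOW", 0.01, 0.80),
        Variant("var_B", 0.0001, "HIGH", 0.99, 0.00),
        Variant("var_C", 0.05, "MODERATE", 0.10, 0.02),
    ]
    for variant in examples:
        print(variant.name, classify(variant))
```

Requiring all criteria to agree before calling a variant clearly harmless mirrors the distinction the abstract draws between clearly harmless variants and ambiguous ones whose status depends on the characteristics and tools used.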