Clarifying the status of some mutations considered pathogenic using the attributes of harmless mutations
Predicting the pathogenicity of a mutation and its effect on the phenotype is an important task in modern bioinformatics. The task is particularly difficult for single nucleotide polymorphisms, whose effects are very hard to predict. Information on pathogenic mutations is provided by curated databases such as Online Mendelian Inheritance in Man (OMIM) and the Human Gene Mutation Database (HGMD), which include data from experimental studies. However, because different authors interpret the term “mutation pathogenicity” differently, these data need to be double-checked before use. We assessed the quality of the HGMD database using the most common bioinformatic tools, namely SnpEff, PolyPhen-2 and SIFT. Our study relied on characteristics specific to harmless mutations: high frequency in a population, a weak effect on the amino acid sequence of a protein, and low pathogenicity as computed by the tools used in the study. As a result, we identified clearly harmless variants among those listed in the database, as well as ambiguous ones whose classification depends on the characteristics and tools used for the analysis.
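The three characteristics of harmless mutations named above can be combined into a simple screening rule. The following is a minimal sketch of one way this might be done, not the procedure used in the study. The `Variant` record, the function names and all three numeric thresholds are illustrative assumptions. The only conventions relied on are that SnpEff reports an impact category (HIGH, MODERATE, LOW or MODIFIER), that PolyPhen-2 scores run from 0 (benign) to 1 (damaging), and that SIFT scores run from 0 (deleterious) to 1 (tolerated).

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical record holding pre-computed annotations for one variant."""
    allele_frequency: float  # population allele frequency, 0..1
    snpeff_impact: str       # SnpEff impact: HIGH, MODERATE, LOW or MODIFIER
    polyphen2_score: float   # 0 (benign) .. 1 (damaging)
    sift_score: float        # 0 (deleterious) .. 1 (tolerated)


# Illustrative thresholds (assumptions, not values taken from the study).
MIN_COMMON_FREQUENCY = 0.01
MAX_BENIGN_POLYPHEN2 = 0.15
MIN_TOLERATED_SIFT = 0.05


def harmless_votes(v: Variant) -> dict:
    """Return, per criterion, whether the variant looks harmless."""
    return {
        "common_in_population": v.allele_frequency >= MIN_COMMON_FREQUENCY,
        "weak_protein_effect": v.snpeff_impact in {"LOW", "MODIFIER"},
        "polyphen2_benign": v.polyphen2_score <= MAX_BENIGN_POLYPHEN2,
        "sift_tolerated": v.sift_score >= MIN_TOLERATED_SIFT,
    }


def classify(v: Variant) -> str:
    """Label a variant by how many criteria agree that it is harmless."""
    votes = harmless_votes(v)
    agreeing = sum(votes.values())
    if agreeing == len(votes):
        return "clearly harmless"
    if agreeing == 0:
        return "consistent with pathogenic"
    return "ambiguous"
```

Under these assumptions, a variant that satisfies every criterion is labelled clearly harmless, while a variant on which the criteria disagree is labelled ambiguous, mirroring the two groups described in the abstract. Changing any threshold can move a variant between groups, which is why the ambiguous label depends on the characteristics and tools chosen.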