The approach to patient clustering based on the microchip data confined to distinct loci using the combinations of variants

Iulmetova LN, Kulemin NA, Sharova EI
About authors

Lopukhin Federal Research and Clinical Center of Physical-Chemical Medicine of the Federal Medical Biological Agency, Moscow, Russia

Correspondence should be addressed: Elena I. Sharova
Malaya Pirogovskaya, 1 str. 3, Moscow, 119435, Russia; moc.liamg@87avorahs

About paper

Funding: the study was supported by a grant of the President of the Russian Federation for young postdocs (МК-2951.2022.1.4).

Acknowledgements: the authors thank dbGaP for providing access to the phs000421.v1.p1 and phs000001.v3.p1 datasets. The dataset with the dbGaP registration number phs000421.v1.p1 was obtained from the genetic study of Fuchs' endothelial corneal dystrophy (FECD), available through dbGaP. The authors acknowledge the grants that supported the registration of the cases and controls used in this GWAS: R01EY016514 (DUEC, PI: Gordon Klintworth), R01EY016482 (CWRU, PI: Sudha Iyengar) and 1X01HG006619-01 (PI: Sudha Iyengar, Natalie Afshari). The authors express their gratitude to the FECD study participants and the FECD research team for their valuable contribution to this study. The dataset with the dbGaP registration number phs000001.v3.p1 was obtained from the Age-Related Eye Disease Study (AREDS) database, available through dbGaP. AREDS was funded by the National Eye Institute (N01-EY-0-2127). The authors thank the AREDS participants and research team for their valuable contribution to this study. The authors express their gratitude to L.O. Skorodumova, research fellow at the Lopukhin Federal Research and Clinical Center of Physical-Chemical Medicine, for valuable suggestions, comments, and support.

Author contribution: Sharova EI — concept and selection of data; Sharova EI, Iulmetova LN — planning and selection of methods; Kulemin NA — project funding and management; Iulmetova LN — design and computation; Sharova EI, Iulmetova LN, Kulemin NA — discussion, manuscript writing and editing.

Compliance with ethical standards: the study was performed according to the principles of the Declaration of Helsinki using the data of the phs000421.v1.p1 and phs000001.v3.p1 projects, the access to which was approved and provided by dbGaP in accordance with the policy of approval and access to specific datasets.

Received: 2022-12-12 Accepted: 2023-01-20 Published online: 2023-02-12

Fuchs' endothelial corneal dystrophy (FECD) is a socially significant hereditary disease. More than half of cases in the European population are caused by an increased number of trinucleotide repeats in the TCF4 gene. The study aimed to develop and test an approach for dividing patients into groups based on chip-based genotyping and genome-wide association study (GWAS) results. The analysis used the FECD Genetics Multi-center Study and AREDS project datasets, which contain data on 1721 clinical cases and 2408 controls. In the analysis of the GWAS results, the patients and the controls were divided into two groups by hierarchical clustering, suggesting that patients with an increased number of repeats in the TCF4 gene carry specific combinations of genomic variants (haplotypes). It was shown that individual variants cannot be used for molecular genetic stratification of patients with an increased number of repeats in TCF4, because the results obtained for the individual variants were inconsistent. Furthermore, the haplotype-based approach outperformed individual SNPs in terms of odds ratio. The paper proposes a method that enables further search for biologically relevant combinations of genomic variants.

Keywords: genome-wide association study, Fuchs' endothelial corneal dystrophy, trinucleotide repeat expansion, patient stratification, locus
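
The two ideas summarized in the abstract, grouping samples by their combined genotypes at a locus and then comparing groups by odds ratio, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the 0/1/2 minor-allele coding, the Manhattan distance, the average linkage, the Haldane-Anscombe 0.5 correction, the function names, and the six-sample toy data set. In practice a library routine such as SciPy's `scipy.cluster.hierarchy` would normally replace the hand-written clustering loop.

```python
from itertools import combinations


def manhattan(a, b):
    """Distance between two genotype vectors coded as 0/1/2 minor-allele counts."""
    return sum(abs(x - y) for x, y in zip(a, b))


def two_clusters(genotypes):
    """Average-linkage agglomerative clustering, stopped at two clusters.

    Returns one label (0 or 1) per sample. Assumes at least two samples.
    """
    clusters = [[i] for i in range(len(genotypes))]

    def mean_distance(pair):
        # Mean pairwise distance between the members of two clusters.
        left, right = clusters[pair[0]], clusters[pair[1]]
        total = sum(manhattan(genotypes[i], genotypes[j])
                    for i in left for j in right)
        return total / (len(left) * len(right))

    while len(clusters) > 2:
        # Merge the closest pair; a < b, so deleting b keeps index a valid.
        a, b = min(combinations(range(len(clusters)), 2), key=mean_distance)
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]

    labels = [0] * len(genotypes)
    for i in clusters[1]:
        labels[i] = 1
    return labels


def odds_ratio(labels, is_case, cluster):
    """Odds ratio of being a case inside `cluster` versus outside it.

    Adds 0.5 to every cell (Haldane-Anscombe correction, an assumption of
    this sketch) so the ratio stays finite when a cell count is zero.
    """
    pairs = list(zip(labels, is_case))
    cases_in = sum(1 for lab, case in pairs if lab == cluster and case)
    controls_in = sum(1 for lab, case in pairs if lab == cluster and not case)
    cases_out = sum(1 for lab, case in pairs if lab != cluster and case)
    controls_out = sum(1 for lab, case in pairs if lab != cluster and not case)
    return ((cases_in + 0.5) * (controls_out + 0.5)) / (
        (controls_in + 0.5) * (cases_out + 0.5))


# Invented toy data: rows are samples, columns are variants within one locus.
genotypes = [
    [2, 2, 1, 2],
    [2, 1, 2, 2],
    [2, 2, 2, 1],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
]
is_case = [True, True, True, False, True, False]

labels = two_clusters(genotypes)
cluster_or = odds_ratio(labels, is_case, cluster=labels[0])
print(labels, cluster_or)
```

Clustering on the whole genotype vector means that a sample is assigned by its combination of variants, not by any single position, which is the intuition behind comparing a haplotype-level grouping with per-SNP results.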