ORIGINAL RESEARCH
Predicting the outcomes of in vitro fertilization programs using a random forest machine learning model
1 Higher School of Economics National Research University, Moscow, Russia
2 Kulakov National Medical Scientific Centre for Obstetrics, Gynecology and Perinatal Medicine, Moscow, Russia
Correspondence should be addressed: Ayuna E. Dashieva
Akademika Oparina, 4B, Moscow, 117198, Russia, dr.dashieva@mail.ru
Author contributions: Vladimirsky GM: training of predictive models, literature analysis, choice of research methods; Zhuravleva MA: data preprocessing and analysis, literature analysis, manuscript writing; Dashieva AE: processing of source material, analysis of results; Korneeva IE, Nazarenko TA: development of the survey for the database, manuscript editing.
In vitro fertilization (IVF) with embryo transfer is currently the main treatment for all forms of infertility, yet only about a third of all cycles performed end in pregnancy. Properly evaluating IVF results requires taking many parameters into account and investigating the relationships between them. Over the past decades, a number of IVF prediction models have been developed with the aim of assessing outcomes in individual cases, but because of their generally poor prognostic capacity, only a few have proven clinically significant. This study aimed to create nonlinear models for predicting IVF outcomes and to identify the most significant factors affecting those outcomes. Using a database containing more than 700 indicators for 7004 women aged 18 to 45 years who participated in an IVF program in Russia from 2010 to 2020, we trained a random forest model that predicted pregnancy in the IVF cycle with ROC-AUC = 0.69. This paper describes the 20 most important predictors of the resulting model and interprets their contribution to the prognosis. Of these, body mass index (BMI) and the number of retrieved and fertilized oocytes have previously been covered in the scientific literature as predictors of IVF outcomes, whereas other parameters, such as anamnestic data, previous participation in an IVF program (number of attempts and their results), and serum concentration of anti-Müllerian hormone (AMH), rarely appear in foreign prognostic models.
Keywords: IVF, prognostic model, infertility, random forest
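Note on the reported metric: ROC-AUC can be read as the probability that a randomly chosen cycle that ended in pregnancy receives a higher predicted score than a randomly chosen cycle that did not, so 0.5 corresponds to chance and 1.0 to perfect ranking. The following is a minimal sketch of that definition in Python. The function name roc_auc and the labels and scores are hypothetical, made up for illustration; they are not study data and this is not the authors' code.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    A pair with tied scores counts as half a correctly ranked pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Illustrative values only: 1 = pregnancy, 0 = no pregnancy.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.8, 0.3, 0.4, 0.5, 0.2, 0.7]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

The pairwise loop is quadratic in the number of cycles, which is acceptable for a small illustration; for a dataset of several thousand cycles a rank-based computation or a library routine would be the usual choice.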