MICROBIOTA OF SEMEN SAMPLES WITH NORMOZOOSPERMIA: ANALYSIS OF REAL-TIME PCR DATA

The analysis of semen microbiota is difficult due to the lack of established criteria for interpretation of microbiological tests. The aim of the study was to determine the stable clusters of semen microbiota analyzed by real-time PCR in samples with normozoospermia. Semen samples of 227 men with normal spermiograms were included in the study. The quantity of total bacterial DNA and of at least one group of microorganisms was at least 10^3 GE/ml in 107 (47.1%) samples. Four stable microbiota clusters, each with the prevalence of a specific microorganism group, were distinguished in these samples: obligate anaerobes (OA) cluster (proportion in the centroid: 81.1%); Lactobacillus spp. cluster (proportion in the centroid: 64.3%); gram-positive facultative anaerobes (GPFA) cluster (proportion in the centroid: 92.5%); Enterobacteriaceae / Enterococcus (EE) cluster (proportion in the centroid: 80.8%). The clusters were ranked by frequency of occurrence: the OA cluster was the most prevalent (43 (40.2%) of 107), followed by the GPFA cluster (27 (25.2%)) and the Lactobacillus cluster (22 (20.6%)). The EE-dominated cluster was found in 15 (14.0%) cases.

Semen microbiota remains an underinvestigated part of the human microbiome despite the strong interest in it and the capabilities of modern molecular technologies. This biomaterial is especially significant in the context of infertility treatment [1]. The male factor is responsible for infertility in half of all couples [2]; however, the cause of infertility in men often remains unidentified [3]. Infection is behind only 6-10% of all male infertility cases [4]. It has been shown that some bacteria can directly damage spermatozoa, decreasing their motility and viability [5]. The use of molecular-based techniques, primarily next-generation sequencing (NGS), made it possible to detect complex bacterial communities both in the ejaculate of patients with infectious and inflammatory processes and in that of healthy men with normozoospermia [1,6-10]. Some of the detected microorganisms were fastidious or non-culturable (including obligate anaerobes) [8,10,11], which could explain the larger number of positive samples compared to the results of the culture method. However, the detection of microorganisms in the semen of patients with normozoospermia forced researchers to abandon the concept of bacteriospermia as a marker of an exclusively pathological condition [6,7,9]. Instead, cautious assumptions have been made about the association between the semen microbiota composition and abnormalities in the semen analysis [6,9].
The few semen microbiota studies of patients with normozoospermia were conducted on a limited number of samples, which prevented researchers from forming a clear idea of the norm for this biomaterial [1]. Moreover, the NGS used in these studies has a number of disadvantages preventing its wide implementation in routine medical practice: high cost and labor input, and the complexity of standardizing the procedure and interpreting the results.
In practice, real-time PCR, another molecular-based technique, is more promising for routine analyses of semen microbiota. The release of a registered test kit for assessing male urogenital microbiota has opened up new possibilities for detecting a wide range of pathogenic and opportunistic bacteria in semen. These microorganisms include fastidious and non-culturable bacteria, as well as Lactobacillus spp. [12,13], which are commonly considered inhabitants of the female reproductive tract. The availability of real-time PCR raises the question of how to interpret its results correctly. The presence of many bacterial groups in various combinations and quantities required the use of mathematical modeling methods to identify patterns in semen microbiota composition. Cluster analysis allowed us to reduce the entire variety of identified microorganisms to four stable types of microbial communities, characterized by the predominance of different bacterial groups [12]. Further studies of samples with normal and abnormal spermiogram parameters are required to evaluate the clinical significance of the microbiota types.
The aim of the study was to identify stable variants of microbiota analyzed by means of real-time PCR in semen samples with normozoospermia.

Patient groups
The study included 227 semen samples with normozoospermia from men (aged 20-59, mean age 33 ± 4.7) who came to the "Garmonia" Medical Center (Yekaterinburg, n = 142) and to the urological clinic of the Ivanovo State Medical Academy (Ivanovo, n = 85) seeking preconception care from January 2019 to March 2020.
Inclusion criteria: no medications that could affect the semen microbiota, such as hormonal or antibacterial drugs, received during the preceding four weeks; normozoospermia according to semen analysis results.
Exclusion criteria: hypogonadotropic and hypergonadotropic hypogonadism, type 1 and 2 diabetes, hypo- and hyperthyroidism; sexually transmitted infections (Chlamydia trachomatis, Neisseria gonorrhoeae, Mycoplasma genitalium, Trichomonas vaginalis); clinical manifestations of prostatitis such as pain and dysuria; karyotype abnormalities, mutations in the CFTR gene, microdeletions in the AZF locus of the Y chromosome.
Semen samples were collected from each patient in accordance with the following guidelines; semen analysis parameters and semen microbiota composition were evaluated.

Semen sampling
Patient preparation and sampling were conducted in compliance with the WHO guidelines for the examination and processing of human semen (p. 2.2.4 of the Manual). Ejaculatory abstinence for a period of 2-5 days was mandatory. Prior to semen collection, patients urinated and washed their external genitalia. Semen was collected through masturbation into a sterile container [14].

Semen analysis parameters
The semen analysis was carried out after a 30-60-minute liquefaction of the material; the concentration and motility of spermatozoa were determined using a Biola SCA sperm analyzer (NPF Biola; Russia). Sperm morphology was assessed in stained preparations at a microscope magnification of × 1000 using a Spermac Stain diagnostic kit (Ferti Pro; Belgium).
Obtained data were interpreted in accordance with the WHO criteria [14].

DNA extraction
The PREP-NA-PLUS kit (DNA-Technology; Russia) was used for DNA extraction. Semen samples were prepared using the following technique: 1.0 ml of semen was put into an Eppendorf tube with 1.0 ml of transport medium ("Transport media with mucolytic agent"; InterLabService Ltd., Russia), which was then vortexed until the substances mixed completely. The tube was centrifuged at 13,000 rpm for 10 minutes (Mini-Spin centrifuge; Eppendorf, Germany). After removing the supernatant, 50 μl of the precipitate was used for extraction of the DNA.

Semen microbiota analysis
The study was conducted using the Androflor reagent kit (DNA-Technology; Russia) and a DTprime detection thermal cycler (DNA-Technology; Russia) following the manufacturer's instructions. Once the amplification is over, dedicated software (DNA-Technology; Russia) automatically calculates the quantities, expressed in genome equivalents per 1 ml (GE/ml), of the total bacterial load (TBL), lactobacilli and each of the detected opportunistic microorganisms (OM) in a given sample. The kit allows detecting the following microbial groups: gram-positive facultative anaerobes (Streptococcus spp., Staphylococcus spp., Corynebacterium spp.); gram-negative facultative anaerobes (Haemophilus spp., Pseudomonas aeruginosa / Ralstonia spp. / Burkholderia spp.); Enterobacteriaceae / […]
Sterile deionized water was used as the negative control sample (NCS). Positive signals were detected in the negative control sample for some bacterial groups no earlier than in the 35th amplification cycle. In these cases, the bacterial load was less than 10^3 GE/ml. Thus, the quantity of microorganisms needed to be at least 10^3 GE/ml to be considered above threshold, which meant that a positive signal was received in real-time PCR before the 35th cycle. The exceptions were U. urealyticum, U. parvum and M. hominis, since there was no positive signal for these microorganisms in the negative control sample. If a signal was detected at any amplification cycle for these microorganisms, the real-time PCR result for them was regarded as positive. Yeast-like fungi (Candida spp.) were not included in this study.
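The threshold rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `is_positive`, the dictionary layout and the example quantities are hypothetical and are not part of the Androflor software.

```python
# Hypothetical helper illustrating the threshold rule from the text.
# Groups with no signal in the negative control: any detected signal counts.
NO_THRESHOLD_GROUPS = {
    "Ureaplasma urealyticum",
    "Ureaplasma parvum",
    "Mycoplasma hominis",
}
THRESHOLD_GE_PER_ML = 1e3  # 10^3 GE/ml, as stated in the text


def is_positive(group: str, ge_per_ml: float) -> bool:
    """Return True if a real-time PCR quantity counts as above threshold."""
    if group in NO_THRESHOLD_GROUPS:
        return ge_per_ml > 0
    return ge_per_ml >= THRESHOLD_GE_PER_ML


# Invented example quantities (GE/ml) for one sample.
sample = {
    "Corynebacterium spp.": 10 ** 3.7,
    "Streptococcus spp.": 10 ** 2.5,
    "Ureaplasma parvum": 10 ** 1.8,
}
positive_groups = sorted(g for g, q in sample.items() if is_positive(g, q))
print(positive_groups)
```

In this invented sample, Streptococcus spp. falls below 10^3 GE/ml and is treated as indistinguishable from the negative control, whereas Ureaplasma parvum is counted despite its low quantity.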

Statistical methods
Analysis of the structural characteristics of semen microbiota was carried out using the MSSC clustering model, which minimizes the sum over all clusters of intra-cluster sums of squared distances from cluster elements to their centroids [15]. The clustering problem was solved using the k-means++ algorithm [16], implemented in the scikit-learn machine learning library. The optimal clustering was selected on the basis of internal assessments of the clustering quality: the silhouette coefficient [17] and the Davies-Bouldin index (DBI) [18].
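A minimal sketch of this selection procedure with scikit-learn is given below. It is an assumption-laden illustration, not the code used in the study: the feature matrix `X` is synthetic (four well-separated groups standing in for the real 50-feature vectors), and the range of k, `n_init` and `random_state` are arbitrary choices.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import davies_bouldin_score, silhouette_score

# Synthetic stand-in for the real feature matrix (assumption, not study data):
# 4 groups of 27 points each in a 50-dimensional feature space.
rng = np.random.default_rng(0)
centers = rng.normal(scale=10.0, size=(4, 50))
X = np.vstack([c + rng.normal(scale=1.0, size=(27, 50)) for c in centers])

scores = {}
for k in range(2, 11):
    # init="k-means++" selects the seeding scheme named in the text.
    model = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=0)
    labels = model.fit_predict(X)
    scores[k] = (silhouette_score(X, labels), davies_bouldin_score(X, labels))

# Higher silhouette is better; lower Davies-Bouldin index is better.
best_by_silhouette = max(scores, key=lambda k: scores[k][0])
best_by_dbi = min(scores, key=lambda k: scores[k][1])
print(best_by_silhouette, best_by_dbi)
```

On real data the two indices need not agree on a single k, which is why a further criterion such as cluster stability may be needed to choose among candidate values.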
To run the k-means++ clustering algorithm, each of the analyzed samples was represented as a vector (p, s) ∈ R^50, consisting of a vector of primary features p ∈ R^19, taken from the data of semen microbiota analysis by real-time PCR, and a vector of secondary features s ∈ R^31, calculated from the primary features.
The primary features were the absolute quantities determined by the Androflor kit (TBL and 18 bacterial groups).
Based on the primary features, the following secondary features were calculated: corrected TBL (CTBL), equal to the total mass of the 18 determined bacterial […]
For optimal clustering, the stability of clusters to changes in the sample size was tested. For this purpose, random subsamples of 1 to 100% of the original sample were clustered and the cluster stability index was calculated using the following formula: […] where 1{·}: {true, false} → {0, 1} is the indicator function of a logical argument; A(x) and A′(x) are the cluster labels of observation x resulting from clustering based on the original dataset and on the subsample, respectively; k ∈ {1, 2, 3, 4} and l ∈ {1, 2, 3, 4} are cluster labels.

RESULTS
Bacterial DNA (TBL) was not detected or was detected in quantities lower than 10^3 GE/ml in 81 (35.7%) semen samples. TBL was detected in quantities of at least 10^3 GE/ml in 39 (17.2%) samples; however, the quantities of specific bacterial groups in these samples were below the threshold value.
Different bacterial groups were found in a variety of associations with each other. We therefore performed cluster analysis in order to identify the microbial communities typical of semen microbiota.

Semen microbiota cluster analysis
For cluster analysis, 107 samples were selected in accordance with the following criteria: TBL in the quantity of at least 10^3 GE/ml, and at least one group of bacteria in the quantity of at least 10^3 GE/ml.
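The sample counts reported above can be cross-checked with simple arithmetic. The short sketch below only recomputes shares from the counts given in the text; the variable names and dictionary keys are illustrative.

```python
# Sample counts reported in the text.
total = 227
below_threshold = 81   # TBL absent or below 10^3 GE/ml
tbl_only = 39          # TBL above threshold, every specific group below it
clustered = 107        # TBL and at least one group at or above 10^3 GE/ml

# The three categories partition the full sample.
assert below_threshold + tbl_only + clustered == total


def percent(count: int, denominator: int = total) -> float:
    """Share of the full sample in percent, rounded to one decimal place."""
    return round(100 * count / denominator, 1)


breakdown = {
    "below threshold": percent(below_threshold),
    "TBL only": percent(tbl_only),
    "eligible for clustering": percent(clustered),
    "any above-threshold DNA": percent(tbl_only + clustered),
}
print(breakdown)
```

The last entry combines the two above-threshold categories (39 + 107 = 146 samples), i.e. all samples in which bacterial DNA reached at least 10^3 GE/ml.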
The optimal number of clusters in the examined dataset was determined on the basis of the values of the silhouette coefficient and Davies-Bouldin index (Table 2). The best clustering quality corresponds to the highest silhouette coefficient and the lowest Davies-Bouldin index. In accordance with the obtained values of the indices, it was optimal to select 4, 9 or 10 clusters. However, after testing cluster stability, the clusters obtained as a result of 9- and 10-clustering were found to be less stable than the ones obtained as a result of 4-clustering. Thus, 4 main clusters of semen microbiota were identified. Each of the resulting clusters was characterized by the predominance of a particular consolidated bacterial group. The diagrams in Fig. 1 show the range of characteristics of the objects in their respective clusters.
Cluster 1: the OA-dominated variant. CTBL amounted to 10^4.3 GE/ml in the centroid. The absolute quantity of all the OA was comparable to the CTBL and amounted to 10^4.2 GE/ml in the centroid (Fig. 1A). The proportion of OA in the centroid reached 81.1% in relation to the CTBL. We were unable to determine the predominant OA group with the test; several OA groups were present simultaneously. This microbiota variant was identified in 43 (40.2%) out of 107 samples.
Cluster 2: the lactobacilli-dominated variant. It was identified in 22 (20.6%) out of 107 samples. CTBL amounted to 10^4.0 GE/ml in the centroid. The absolute quantity of all lactobacilli was lower than the CTBL in the centroid and amounted to 10^3.5 GE/ml (Fig. 1B). The proportion of lactobacilli in the centroid reached 64.3% in relation to the CTBL. OA, GPFA, and GNFA were present simultaneously with Lactobacillus spp.
Cluster 3: the GPFA-dominated variant. It was identified in 27 (25.2%) out of 107 samples. CTBL was 10^3.7 GE/ml in the centroid. The absolute quantity of all GPFA was comparable to the CTBL and amounted to 10^3.7 GE/ml in the centroid (Fig. 1C). The proportion of GPFA in the centroid reached 89.4% in relation to the CTBL. Most often this cluster was formed around Corynebacterium spp. and Streptococcus spp. in patients with normozoospermia.
Cluster 4: the EE-dominated variant. CTBL was 10^4.2 GE/ml in the centroid. The absolute quantity of all EE was less than the CTBL and amounted to 10^4.1 GE/ml in the centroid (Fig. 1D). The proportion of EE in the centroid reached 80.8% in relation to the CTBL. This microbiota variant was identified in 15 (14.0%) out of 107 samples.
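The cluster shares and their ranking follow directly from the cluster sizes given above; the short sketch below recomputes them. The short cluster labels used as dictionary keys are illustrative.

```python
# Cluster sizes reported in the text (n = 107 clustered samples).
cluster_sizes = {"OA": 43, "Lactobacillus": 22, "GPFA": 27, "EE": 15}

n = sum(cluster_sizes.values())

# Share of each cluster among the clustered samples, in percent.
shares = {name: round(100 * size / n, 1) for name, size in cluster_sizes.items()}

# Clusters ordered from most to least frequent.
ranked = sorted(cluster_sizes, key=cluster_sizes.get, reverse=True)

print(n, shares, ranked)
```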

Analysis of the microbial clusters' stability
To analyze the stability of the identified clusters, subsamples of 1-100% of the size of the original sample were generated (1000 random subsamples drawn without replacement for each subsample size).
Figure 2 shows the graphs depicting the stability of the clusters obtained by 4-clustering of semen microbiota samples with normozoospermia. The most stable are the clusters with the predominance of GPFA (cluster 3; Fig. 2C) and with the predominance of EE (cluster 4; Fig. 2D).
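The subsampling procedure can be illustrated with the sketch below. Several points are assumptions rather than the method of the study: the stability index is replaced by a best-match Jaccard overlap between each original cluster and the subsample clusters, the data are synthetic, and the function name `cluster_stability`, the subsample fraction and the number of repeats (20 instead of 1000) are chosen only to keep the example small.

```python
import numpy as np
from sklearn.cluster import KMeans


def cluster_stability(X, n_clusters=4, fraction=0.8, n_repeats=20, seed=0):
    """Mean best-match Jaccard overlap per original cluster (stand-in index)."""
    rng = np.random.default_rng(seed)
    base = KMeans(n_clusters=n_clusters, init="k-means++", n_init=10,
                  random_state=seed).fit_predict(X)
    n = len(X)
    size = max(n_clusters, int(round(fraction * n)))
    totals = np.zeros(n_clusters)
    for _ in range(n_repeats):
        # Random subsample drawn without replacement.
        idx = rng.choice(n, size=size, replace=False)
        sub = KMeans(n_clusters=n_clusters, init="k-means++", n_init=10,
                     random_state=seed).fit_predict(X[idx])
        for k in range(n_clusters):
            in_k = base[idx] == k          # members of original cluster k
            best = 0.0
            for l in range(n_clusters):
                in_l = sub == l            # members of subsample cluster l
                union = np.sum(in_k | in_l)
                if union:
                    best = max(best, np.sum(in_k & in_l) / union)
            totals[k] += best
    return totals / n_repeats


# Synthetic, well-separated data (assumption, not study data).
rng = np.random.default_rng(1)
centers = 20.0 * np.eye(4, 5)
X = np.vstack([c + rng.normal(size=(25, 5)) for c in centers])
stability = cluster_stability(X)
print(stability)
```

Values close to 1 mean that a cluster is recovered almost unchanged from subsamples; repeating the call over a grid of `fraction` values would give curves of the kind plotted against subsample size.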

DISCUSSION
In this study, microbial DNA in above-threshold values (at least 10^3 GE/ml) was found in 146 (64.3%) of 227 semen samples meeting the criteria for normozoospermia. In 81 (35.7%) samples bacterial DNA was absent or was detected in an amount of less than 10^3 GE/ml and could be kitome DNA (microbial DNA present in reagent kits) [19]. The results are consistent with the data of other researchers who noted the presence of microorganisms in the semen of men with normal semen parameters [1,6-8,20]. In 107 (47.1%) samples with the TBL of at least 10^3 GE/ml, up to 14 bacterial groups were found in above-threshold values. This is also consistent with the previously obtained data on the presence of polymicrobial associations in the seminal fluid of healthy men [1,8,20].
Bacteria of the Corynebacterium genus were identified in 17.2% of the studied samples, more often than any other bacterial group. Streptococcus spp., Peptostreptococcus spp. / Parvimonas spp., Bacteroides spp. / Porphyromonas spp. / Prevotella spp., Lactobacillus spp., and Enterobacteriaceae spp. / Enterococcus spp. were present in 10.6-13.2% of samples. The rest of the analyzed bacterial groups were found in 3.5-9.7% of the samples. The simultaneous detection of several bacterial groups in various combinations makes it impossible to interpret the obtained results without additional mathematical analysis.
The positive samples, depending on the predominant group of microorganisms, were grouped into four clusters, similar to those obtained in the study of all semen types [21]: variants with the predominance of OA, Lactobacillus spp., GPFA, and EE. The last two clusters are more stable than the first two. Although the clusters were identified exclusively mathematically, they are formed by microorganisms with similar physiological characteristics. In particular, three of the four identified clusters (with the predominance of OA, GPFA, EE) are formed by phylogenetically heterogeneous microorganisms with the same oxygen requirements, which was also noted in other studies [1]. Apparently, this is due to the presence of various ecological niches for the microorganisms colonizing semen, which is not surprising, since semen is a mixture of biomaterials from different parts of the urogenital tract [6].
Most of the positive samples (40.2%) were attributed to the cluster with the OA predominance; their amount in the centroid reached 81.1% of all detected microorganisms. Microbiota in these samples was characterized by significant heterogeneity within the OA group without dominance of any particular species. A similar cluster, consisting of obligate anaerobic bacteria, was identified in a work which studied semen microbiota by NGS [1]. However, routine culture-based analysis identified OA as the predominant group of microorganisms in only 15% of the semen samples which had been OA-prevalent when tested by means of real-time PCR [12].
A quarter of all positive samples (25.2%) were assigned to the cluster with the predominance of GPFA. This is the microbiota variant that was previously described as typical for the urogenital tract of healthy men [4]. Among other microorganisms, bacteria of the Staphylococcus, Streptococcus, and Corynebacterium genera (assigned to the GPFA group) were detected by the culture-based method in the semen of men without signs of sexually transmitted infections [22]. However, identifying GPFA in semen does not always mean that this bacterial group is predominant in this biomaterial [12]. The use of modern molecular-based techniques also makes it possible to identify fastidious and non-culturable microorganisms, which clarifies their contribution to semen microbiota composition.
A smaller number of semen samples (20.6%) were attributed to the cluster with the predominance of Lactobacillus spp. The role of these bacteria, the main representatives of the normal vaginal microbiota, in the semen microbiota composition is less obvious. Some researchers noted the presence of lactobacilli in semen samples with normozoospermia and associated this with male fertility [8,9]. Others believe that increased numbers of Lactobacillus spp. in semen are a marker of hormonal disorders and grounds for further comprehensive examination of the patient [23].
The EE-dominated cluster was the smallest in the sample pool; the predominance of this bacterial group was noted in only 14.0% of cases. Some representatives of EE, primarily Escherichia coli and Enterococcus faecalis, are considered to be a common cause of inflammatory pathology of the male urogenital tract [24]. Perhaps this is due to the high incidence of their detection by the culture-based technique. For example, during a parallel study of semen samples using the culture-based technique and real-time PCR, it was shown that in almost half of the cases when enterobacteria and enterococci were determined by culture as predominant, other predominant microorganisms were detected by real-time PCR. Most often, these were OA, which is most likely due to the reduced ability to identify anaerobes during in vitro culturing [12]. The role of E. coli and E. faecalis, as well as other representatives of the EE group, in fertility disorders and sperm quality has not been definitively identified and requires further study.
This study once again demonstrates the frequent presence of microorganisms in semen samples meeting the criteria for normozoospermia. In most of the analyzed samples, microbiota was predominantly represented by obligate anaerobic bacteria, rather than by the gram-positive facultative anaerobes detected using the culture-based method [22].

CONCLUSIONS
In about half of the cases, semen samples that met the criteria for normozoospermia contained microbiota in above-threshold values. The identified microorganisms were grouped using cluster analysis into four stable types according to the predominance of a certain group of microorganisms: obligate anaerobes, Lactobacillus spp., gram-positive facultative anaerobes, Enterobacteriaceae spp. / Enterococcus spp. The clusters were ranked by frequency of occurrence: the variant with the predominance of obligate anaerobes; the gram-positive facultative anaerobes-dominated variant; the Lactobacillus spp.-dominated variant; the Enterobacteriaceae spp. / Enterococcus spp.-dominated variant (identified in 40.2, 25.2, 20.6 and 14.0% of positive samples respectively). The use of molecular methods may lead to a rethinking of ideas about the composition of the microbiota identified in semen samples with normozoospermia. Association of certain variants of semen microbiota with inflammatory pathologies of the reproductive tract and fertility disorders remains an unresolved question. It is possible that there are informative microbiological markers associated with these conditions. The study of the microbial composition of pathological semen samples is the next necessary step in the search for such diagnostic markers.

Fig. 1. Results of cluster analysis of semen microbiota analyzed by means of real-time PCR (n = 107). The ordinate shows the values of the features in the centroid. Diagrams of the predominant groups of microorganisms are highlighted using red rectangles. Cluster 1 (n = 43) is characterized by the predominance of obligate anaerobes (A); cluster 2 (n = 22) is characterized by the predominance of Lactobacillus spp. (B); cluster 3 (n = 27) is characterized by the predominance of gram-positive facultative anaerobes (C); cluster 4 (n = 15) is characterized by the predominance of Enterobacteriaceae spp. / Enterococcus spp. (D)

Table 1. Detection rate of specific bacterial groups in quantities exceeding the threshold value (n = 227)*
Note: * for Ureaplasma urealyticum, Ureaplasma parvum and Mycoplasma hominis, threshold values are > 0; for other bacterial groups they are ≥ 10^3 GE/ml.