ORIGINAL RESEARCH
Microbiota of semen samples with normozoospermia: analysis of real-time PCR data
1 Ural State Medical University, Yekaterinburg, Russia
2 Medical Center “Garmonia”, Yekaterinburg, Russia
3 Yeltsin Ural Federal University, Yekaterinburg, Russia
4 Institute of Mathematics and Mechanics, Yekaterinburg, Russia
5 Ivanovo State Medical Academy, Ivanovo, Russia
Correspondence should be addressed: Ekaterina S. Voroshilina
Repina, 3, Yekaterinburg, 620014; voroshilina@gmail.com
Acknowledgments: the authors would like to thank VN Khayutin, director of “Garmonia” Medical Center, for allowing them to conduct the study in the clinic's laboratory department.
Author contribution: Voroshilina ES — organization of the study, data analysis, writing the article; Zornikov DL — data analysis, writing the article; Ivanov AV — statistical processing, data analysis, writing the article; Pochernikov DG — patient selection, writing the article; Panacheva EA — literature review, data analysis, patient selection, conducting semen analyses and PCR tests, writing the article.
Compliance with ethical standards: the study was approved by the Ethics Committee of Ural State Medical University, Federal State Budget Educational Institution of Higher Education under the Ministry of Health of the Russian Federation (Protocol No. 7 dated September 20, 2019). All patients gave written informed consent to participate in the study.
Semen microbiota remains an underinvestigated part of the human microbiome despite strong interest in it and the capabilities of modern molecular technologies. This biomaterial is especially significant in the context of infertility treatment [1]. The male factor is responsible for infertility in half of all couples [2]; however, the cause of infertility in men often remains unidentified [3]. Infection accounts for only 6–10% of all male infertility cases [4]. Some bacteria have been shown to damage spermatozoa directly, decreasing their motility and viability [5].
The use of molecular-based techniques, primarily next-generation sequencing (NGS), made it possible to detect complex bacterial communities both in the ejaculate of patients with infectious and inflammatory processes and in healthy men with normozoospermia [1, 6–10]. Some of the detected microorganisms were fastidious or non-culturable (including obligate anaerobes) [8, 10, 11], which could explain the larger number of positive samples compared to the results of the culture method. The detection of microorganisms in the semen of patients with normozoospermia forced researchers to abandon the concept of bacteriospermia as a marker of an exclusively pathological condition [6, 7, 9]. Instead, cautious assumptions have been made about the association between semen microbiota composition and abnormalities in the semen analysis [6, 9].
The few studies of semen microbiota in patients with normozoospermia were conducted on a limited number of samples, which prevented researchers from forming a clear idea about the norm for this biomaterial [1]. Moreover, the NGS used in these studies has a number of disadvantages preventing its wide implementation in routine medical practice: high cost and labor input, and the complexity of standardizing the procedure and interpreting the results.
In practice, real-time PCR, another molecular-based technique, is more promising for routine analyses of semen microbiota. The release of a registered test kit for assessing male urogenital microbiota has opened up new possibilities for detecting a wide range of pathogenic and opportunistic bacteria in semen. These microorganisms include fastidious and non-culturable bacteria, as well as Lactobacillus spp. [12, 13], which are commonly considered the inhabitants of the female reproductive tract. The availability of real-time PCR raises the question of correctly interpreting its results. The presence of many bacterial groups in various combinations and quantities required the use of mathematical modeling methods to identify patterns in semen microbiota composition. Cluster analysis allowed us to reduce the entire variety of identified microorganisms to four stable types of microbial communities, characterized by the predominance of different bacterial groups [12]. Further studies of samples with normal and abnormal spermiogram parameters are required to evaluate the clinical significance of the microbiota types.
The aim of the study was to identify stable variants of microbiota analyzed by means of real-time PCR in semen samples with normozoospermia.
METHODS
Patient groups
The study included 227 semen samples with normozoospermia from men (aged 20–59, mean age 33 ± 4.7) who came to the “Garmonia” Medical Center (Yekaterinburg, n = 142) and to the urological clinic of the Ivanovo State Medical Academy (Ivanovo, n = 85) seeking preconception care from January 2019 to March 2020.
Inclusion criteria: no use, within the preceding four weeks, of medications that could affect the semen microbiota, such as hormonal or antibacterial drugs; normozoospermia according to semen analysis results.
Exclusion criteria: hypogonadotropic and hypergonadotropic hypogonadism, type 1 and 2 diabetes, hypo- and hyperthyroidism; sexually transmitted infections (Chlamydia trachomatis, Neisseria gonorrhoeae, Mycoplasma genitalium, Trichomonas vaginalis); clinical manifestations of prostatitis such as pain and dysuria; karyotype abnormalities, mutations in the CFTR gene, microdeletions in the AZF locus of the Y chromosome.
Semen samples were collected from each patient in accordance with the following guidelines; semen analysis parameters and semen microbiota composition were evaluated.
Semen sampling
Patient preparation and sampling were conducted in compliance with WHO’s guidelines for the examination and processing of human semen (p. 2.2.4 of the Manual). Ejaculatory abstinence for the period of 2–5 days was mandatory. Prior to semen collection, patients urinated and washed their external genitalia. Semen was collected through masturbation into a sterile container [14].
Semen analysis parameters
The semen analysis was carried out after a 30–60-minute liquefaction of the material; the quantity (concentration) and motility of spermatozoa were calculated using a Biola SCA sperm analyzer (NPF Biola; Russia). Sperm morphology was assessed in stained preparations at a microscope magnification of × 1000 using a Spermac Stain diagnostic kit (Ferti Pro; Belgium).
Obtained data were interpreted in accordance with the WHO criteria [14].
DNA extraction
The PREP-NA-PLUS kit (DNA-Technology; Russia) was used for DNA extraction. Semen samples were prepared using the following technique: 1.0 ml of semen was put into an Eppendorf tube with 1.0 ml of transport medium (“Transport media with mucolytic agent”; InterLabService Ltd., Russia), and the tube was vortexed until the contents were mixed completely. The tube was centrifuged at 13,000 rpm for 10 minutes (MiniSpin centrifuge; Eppendorf, Germany). After removing the supernatant, 50 μl of the precipitate was used for extraction of the DNA.
Semen microbiota analysis
The study was conducted using the Androflor reagent kit (DNA-Technology; Russia) and the DTprime detection thermal cycler (DNA-Technology; Russia) following the manufacturer’s instructions.
Once the amplification is over, the dedicated software (DNA-Technology; Russia) automatically calculates the quantities (expressed in genome equivalents per 1 ml (GE/ml)) of the total bacterial load (TBL), lactobacilli and each of the detected opportunistic microorganisms (OM) in a given sample. The kit allows detecting the following microbial groups: gram-positive facultative anaerobes (Streptococcus spp., Staphylococcus spp., Corynebacterium spp.); gram-negative facultative anaerobes (Haemophilus spp., Pseudomonas aeruginosa / Ralstonia spp. / Burkholderia spp.); Enterobacteriaceae / Enterococcus spp. group; obligate anaerobes (Gardnerella vaginalis, Eubacterium spp., Sneathia spp. / Leptotrichia spp. / Fusobacterium spp., Megasphaera spp. / Veillonella spp. / Dialister spp., Bacteroides spp. / Porphyromonas spp. / Prevotella spp., Anaerococcus spp., Peptostreptococcus spp., Atopobium cluster); mycoplasmas (Mycoplasma hominis, Ureaplasma urealyticum, Ureaplasma parvum); transient microbiota (Lactobacillus spp.); yeast-like fungi (Candida spp.).
Sterile deionized water was used as the negative control sample (NCS). For some bacterial groups, positive signals were detected in the negative control sample, but no earlier than the 35th amplification cycle; in these cases, the bacterial load was less than 10^3 GE/ml. Therefore, a quantity was considered above threshold only if it was at least 10^3 GE/ml, which corresponded to a positive real-time PCR signal before the 35th cycle. The exceptions were U. urealyticum, U. parvum and M. hominis, for which no positive signal was observed in the negative control sample: a signal detected at any amplification cycle was regarded as a positive result for these microorganisms. Yeast-like fungi (Candida spp.) were not included in this study.
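The positivity rule described above can be illustrated with a minimal sketch. The function name, the dictionary layout and the example quantities are hypothetical and serve only as an illustration; this is not the logic of the kit's software.

```python
# Hypothetical helper illustrating the positivity rule; not the kit software.
THRESHOLD_GE_PER_ML = 1e3  # 10^3 GE/ml, derived from the negative-control signals

# Groups that gave no signal in the negative control: any detected signal counts.
NO_THRESHOLD_GROUPS = {
    "Mycoplasma hominis",
    "Ureaplasma urealyticum",
    "Ureaplasma parvum",
}


def is_positive(group: str, ge_per_ml: float) -> bool:
    """Return True if a real-time PCR quantity counts as a positive result."""
    if group in NO_THRESHOLD_GROUPS:
        return ge_per_ml > 0
    return ge_per_ml >= THRESHOLD_GE_PER_ML


# Made-up quantities for one sample (GE/ml), for illustration only.
sample = {
    "Corynebacterium spp.": 10 ** 3.7,
    "Streptococcus spp.": 4e2,
    "Ureaplasma parvum": 50.0,
}
positive_groups = sorted(g for g, q in sample.items() if is_positive(g, q))
print(positive_groups)
```

In this made-up sample, Streptococcus spp. falls below the threshold and is ignored, whereas Ureaplasma parvum counts as positive despite its low quantity.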
Statistical methods
Analysis of the structural characteristics of semen microbiota was carried out using the MSSC clustering model, which minimizes the sum over all clusters of intra-cluster sums of squared distances from cluster elements to their centroids [15]. The clustering problem was solved using the k-means++ algorithm [16], implemented in the scikit-learn machine learning library. The optimal clustering was selected on the basis of internal assessments of the clustering quality: the Silhouette coefficient [17] and the Davies–Bouldin index (DBI) [18].
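The selection of the number of clusters by internal quality indices can be sketched as follows. This is a minimal illustration on synthetic data, assuming scikit-learn's `KMeans` with `init="k-means++"`; the helper name `score_clusterings`, the range of k and the random seeds are assumptions, and the data are not the study data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import davies_bouldin_score, silhouette_score


def score_clusterings(X, k_values, seed=0):
    """Fit k-means++ for each k; return {k: (silhouette, davies_bouldin)}."""
    scores = {}
    for k in k_values:
        model = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=seed)
        labels = model.fit_predict(X)
        scores[k] = (silhouette_score(X, labels), davies_bouldin_score(X, labels))
    return scores


# Synthetic stand-in: four well-separated blobs in five dimensions.
rng = np.random.default_rng(0)
centers = np.array(
    [[0] * 5, [10] * 5, [0, 10, 0, 10, 0], [10, 0, 10, 0, 10]], dtype=float
)
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 5)) for c in centers])

scores = score_clusterings(X, range(2, 8))
# Higher silhouette is better; lower Davies-Bouldin index is better.
best_by_silhouette = max(scores, key=lambda k: scores[k][0])
best_by_dbi = min(scores, key=lambda k: scores[k][1])
print(best_by_silhouette, best_by_dbi)
```

When the two indices disagree or several values of k score similarly, an additional criterion such as cluster stability is needed to choose among the candidates.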
To run the k-means++ clustering algorithm, each of the analyzed samples was represented as a vector (p, s) ∈ R^50, consisting of a vector of primary features p ∈ R^19, taken from the data of semen microbiota analysis by real-time PCR, and a vector of secondary features s ∈ R^31, calculated from the primary features.
The primary features were the absolute values determined by the Androflor kit (TBL and 18 bacterial groups).
Based on the primary features, the following secondary features were calculated: corrected TBL (CTBL), equal to the total mass of the 18 determined bacterial groups; mass fractions of microorganisms in relation to CTBL; masses of bacterial groups consolidated in accordance with the Androflor kit: Lactobacillus spp., gram-positive facultative anaerobes (GPFA), obligate anaerobes (OA), gram-negative facultative anaerobes (GNFA), Enterobacteriaceae spp. / Enterococcus spp. (EE) and mycoplasmas; mass fractions of consolidated bacterial groups in relation to CTBL.
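The composition of the 50-dimensional vector can be checked with a short sketch: 19 primary values (TBL and 18 groups) plus 31 secondary values (1 CTBL + 18 group fractions + 6 consolidated masses + 6 consolidated fractions). The function name, the ordering of the components and the use of raw GE/ml values without any transformation or scaling are assumptions made for illustration.

```python
import numpy as np

# Consolidated groups and their members, following the kit's list of targets.
CONSOLIDATED = {
    "Lactobacillus": ["Lactobacillus spp."],
    "GPFA": ["Streptococcus spp.", "Staphylococcus spp.", "Corynebacterium spp."],
    "OA": [
        "Gardnerella vaginalis",
        "Eubacterium spp.",
        "Sneathia spp. / Leptotrichia spp. / Fusobacterium spp.",
        "Megasphaera spp. / Veillonella spp. / Dialister spp.",
        "Bacteroides spp. / Porphyromonas spp. / Prevotella spp.",
        "Anaerococcus spp.",
        "Peptostreptococcus spp.",
        "Atopobium cluster",
    ],
    "GNFA": [
        "Haemophilus spp.",
        "Pseudomonas aeruginosa / Ralstonia spp. / Burkholderia spp.",
    ],
    "EE": ["Enterobacteriaceae / Enterococcus spp."],
    "Mycoplasmas": [
        "Mycoplasma hominis",
        "Ureaplasma urealyticum",
        "Ureaplasma parvum",
    ],
}
GROUPS = [g for members in CONSOLIDATED.values() for g in members]  # 18 groups


def build_features(tbl, quantities):
    """Return the vector (p, s): 19 primary and 31 secondary features.

    quantities maps a group name to GE/ml; missing groups count as 0.
    """
    q = np.array([quantities.get(g, 0.0) for g in GROUPS], dtype=float)
    p = np.concatenate(([tbl], q))  # TBL + 18 groups = 19
    ctbl = q.sum()  # corrected TBL
    frac = q / ctbl if ctbl > 0 else np.zeros_like(q)
    cons = np.array(
        [sum(quantities.get(g, 0.0) for g in members)
         for members in CONSOLIDATED.values()],
        dtype=float,
    )
    cons_frac = cons / ctbl if ctbl > 0 else np.zeros_like(cons)
    s = np.concatenate(([ctbl], frac, cons, cons_frac))  # 1 + 18 + 6 + 6 = 31
    return np.concatenate((p, s))


# Made-up sample, for illustration only.
vec = build_features(1e4, {"Corynebacterium spp.": 6000.0, "Lactobacillus spp.": 2000.0})
print(vec.shape)
```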
For optimal clustering, the stability of clusters to changes in the sample size was tested. For this purpose, random subsamples of 1 to 100% of the original sample were clustered and the cluster stability index was calculated using the following formula:
where 1{·}: {true, false} → {0, 1} is the indicator function of a logical argument; A(x) and A′(x) are the cluster labels of observation x obtained by clustering the original dataset and the subsample, respectively; k, l ∈ {1, 2, 3, 4} are cluster labels.
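One possible way to compute an index of this kind with the notation above is sketched below. It is an assumed definition for illustration, not necessarily the index used in the study: for each original cluster k, it takes the largest share of its subsampled members that fall into a single subsample cluster l, which makes the result independent of how the labels are permuted. The function names and the synthetic data are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans


def cluster_stability(labels_full, labels_sub, n_clusters):
    """Per original cluster k: max over l of the share of members with A(x) = k
    that receive the subsample label A'(x) = l (1.0 means fully preserved)."""
    stability = np.full(n_clusters, np.nan)  # NaN if cluster k is absent
    for k in range(n_clusters):
        in_k = labels_full == k
        if in_k.any():
            counts = np.bincount(labels_sub[in_k], minlength=n_clusters)
            stability[k] = counts.max() / in_k.sum()
    return stability


def subsample_stability(X, fraction, n_clusters=4, seed=0):
    """Cluster the full data and one random subsample drawn without
    replacement, then compare the labels on the subsampled observations."""
    rng = np.random.default_rng(seed)
    km = dict(n_clusters=n_clusters, init="k-means++", n_init=10, random_state=seed)
    full = KMeans(**km).fit_predict(X)
    size = max(n_clusters, int(round(fraction * len(X))))
    idx = rng.choice(len(X), size=size, replace=False)
    sub = KMeans(**km).fit_predict(X[idx])
    return cluster_stability(full[idx], sub, n_clusters)


# Hand-made labels: cluster 1 is split in half by the second clustering.
demo = cluster_stability(
    np.array([0, 0, 1, 1, 2, 2]), np.array([1, 1, 0, 2, 2, 2]), n_clusters=3
)

# Synthetic stand-in data (four blobs), not the study data.
rng = np.random.default_rng(1)
centers = np.array([[0, 0], [10, 0], [0, 10], [10, 10]], dtype=float)
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])
stab = subsample_stability(X, fraction=0.8)
```

Averaging such values over many random subsamples of each size gives a stability curve per cluster. Note that this particular index does not penalize two original clusters merging into one subsample cluster, so a stricter matching-based variant may be preferable in practice.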
RESULTS
Bacterial DNA (TBL) was not detected or was detected in quantities lower than 10^3 GE/ml in 81 (35.7%) semen samples.
TBL exceeded 10^3 GE/ml in 39 (17.1%) samples; however, the quantities of specific bacterial groups were below the threshold value.
In 107 (47.1%) of the 227 samples, TBL was at least 10^3 GE/ml (median 10^3.8 GE/ml, interquartile range 10^3.5–10^4.4 GE/ml), and 1 to 14 bacterial groups were simultaneously detected in quantities exceeding the threshold value. The detection rates of specific bacterial groups are given in tab. 1.
Different bacterial groups were found in a variety of associations with each other. Thus, we have decided to perform cluster analysis in order to identify the microbial communities typical of semen microbiota.
Semen microbiota cluster analysis
For cluster analysis, 107 samples were selected in accordance with the following criteria: TBL of at least 10^3 GE/ml and at least one group of bacteria in the quantity of at least 10^3 GE/ml.
The optimal number of clusters in the examined dataset was determined on the basis of the values of the silhouette coefficient and the Davies–Bouldin index (tab. 2). The best clustering quality corresponds to the highest silhouette coefficient and the lowest Davies–Bouldin index. According to the obtained values of the indices, 4, 9 or 10 clusters were optimal. However, cluster stability testing showed that the clusters obtained by 9- and 10-clustering were less stable than those obtained by 4-clustering. Thus, 4 main clusters of semen microbiota were identified.
Each of the resulting clusters was characterized by the predominance of a particular consolidated bacterial group. The diagrams in fig. 1 show the range of characteristics of the objects in their respective clusters.
Cluster 1: the OA-dominated variant. CTBL amounted to 10^4.3 GE/ml in the centroid. The absolute quantity of all the OA was comparable to the CTBL and amounted to 10^4.2 GE/ml in the centroid (fig. 1A). The proportion of OA in the centroid reached 81.1% in relation to the CTBL. We were unable to determine the predominant OA group with the test; several OA groups were present simultaneously. This microbiota variant was identified in 43 (40.2%) out of 107 samples.
Cluster 2: the lactobacilli-dominated variant. It was identified in 22 (20.6%) out of 107 samples. CTBL amounted to 10^4.0 GE/ml in the centroid. The absolute quantity of all lactobacilli was lower than the CTBL in the centroid and amounted to 10^3.5 GE/ml (fig. 1B). The proportion of lactobacilli in the centroid reached 64.3% in relation to the CTBL. OA, GPFA, and GNFA were present simultaneously with Lactobacillus spp.
Cluster 3: the GPFA-dominated variant. It was identified in 27 (25.2%) out of 107 samples. CTBL was 10^3.7 GE/ml in the centroid. The absolute quantity of all GPFA was comparable to the CTBL and amounted to 10^3.7 GE/ml in the centroid (fig. 1C). The proportion of GPFA in the centroid reached 89.4% in relation to the CTBL. Most often this cluster was formed around Corynebacterium spp. and Streptococcus spp. in patients with normozoospermia.
Cluster 4: the EE-dominated variant. CTBL was 10^4.2 GE/ml in the centroid. The absolute quantity of all EE was less than the CTBL and amounted to 10^4.1 GE/ml in the centroid (fig. 1D). The proportion of EE in the centroid reached 80.8% in relation to the CTBL. This microbiota variant was identified in 15 (14.0%) out of 107 samples.
Analysis of the microbial clusters’ stability
To analyze the stability of the identified clusters, subsamples comprising 1–100% of the original sample were generated (1000 random subsamples drawn without replacement for each subsample size).
Fig. 2 shows the stability of the clusters obtained by 4-clustering of semen microbiota samples with normozoospermia. The most stable are the clusters with the predominance of GPFA (cluster 3; fig. 2C) and with the predominance of EE (cluster 4; fig. 2D).
DISCUSSION
In this study, microbial DNA in above-threshold values (at least 10^3 GE/ml) was found in 146 (64.3%) of 227 semen samples meeting the criteria for normozoospermia. In 81 (35.7%) samples bacterial DNA was absent or was detected in an amount of less than 10^3 GE/ml and could be kitome DNA (microbial DNA present in reagent kits) [19]. The results are consistent with the data of other researchers who noted the presence of microorganisms in the semen of men with normal semen parameters [1, 6–8, 20]. In 107 (47.1%) samples with the TBL of at least 10^3 GE/ml, up to 14 bacterial groups were found in above-threshold values. This is also consistent with the previously obtained data on the presence of polymicrobial associations in the seminal fluid of healthy men [1, 8, 20].
Bacteria of the Corynebacterium genus were identified in 17.2% of the studied samples, which was more often than other bacterial groups. Streptococcus spp., Peptostreptococcus spp. / Parvimonas spp., Bacteroides spp. / Porphyromonas spp. / Prevotella spp., Lactobacillus spp., Enterobacteriaceae spp. / Enterococcus spp. were present in 10.6–13.2% of samples. The rest of the analyzed bacterial groups were found in 3.5–9.7% of the samples. The simultaneous detection of several bacterial groups in various combinations makes it impossible to interpret the obtained results without additional mathematical analysis.
The positive samples, depending on the predominant group of microorganisms, were grouped into four clusters, similar to those obtained in the study of all semen types [21]: variants with the predominance of OA, Lactobacillus spp., GPFA, EE. The last two clusters are more stable than the first two. Although the clusters were identified exclusively mathematically, they are formed by microorganisms with similar physiological characteristics. In particular, three of the four identified clusters (with the predominance of OA, GPFA, EE) are formed by phylogenetically heterogeneous microorganisms with the same oxygen requirements, which was also noted in other studies [1]. Apparently, this is due to the presence of various ecological niches for the microorganisms colonizing semen, which is not surprising, since semen is a mixture of biomaterials from different parts of the urogenital tract [6].
The largest share of the positive samples (40.2%) was attributed to the cluster with the OA predominance; the proportion of OA in the centroid reached 81.1% of all detected microorganisms. Microbiota in these samples was characterized by significant heterogeneity within the OA group without dominance of any particular species. A similar cluster, consisting of obligate anaerobic bacteria, was identified in a study of semen microbiota by NGS [1]. However, routine culture-based analysis identified OA as the predominant group of microorganisms in only 15% of semen samples that had been OA-prevalent when tested by means of real-time PCR [12].
A quarter of all samples (25.2%) were assigned to a cluster with the predominance of GPFA. This is the microbiota variant that was previously described as typical for the urogenital tract of healthy men [4]. Among other microorganisms, bacteria of the Staphylococcus, Streptococcus, and Corynebacterium genera (assigned to the GPFA group) were detected in the semen of men without signs of sexually transmitted infections by the culture-based method [22]. However, identifying GPFA in semen does not always mean that this bacterial group is predominant in this biomaterial [12]. The use of modern molecular-based techniques also makes it possible to identify fastidious and non-culturable microorganisms, which clarifies their contribution to semen microbiota composition.
A smaller number of semen samples (20.6%) were attributed to the cluster with the predominance of Lactobacillus spp. The role of these bacteria, the main representatives of the normal vaginal microbiota, in the semen microbiota composition is less obvious. Some researchers noted the presence of lactobacilli in semen samples with normozoospermia and associated this with male fertility [8, 9]. Others believe that increased numbers of Lactobacillus spp. in semen are a marker of hormonal disorders and the basis for further comprehensive examination of the patient [23].
The EE-dominated cluster was the smallest in the sample pool; the predominance of this bacterial group was noted in only 14.0% of cases. Some representatives of EE, primarily Escherichia coli and Enterococcus faecalis, are considered to be a common cause of inflammatory pathology of the male urogenital tract [24]. Perhaps this is due to the high incidence of their detection by the culture-based technique. For example, a parallel study of semen samples using the culture-based technique and real-time PCR showed that in almost half of the cases in which enterobacteria and enterococci were predominant by culture, other predominant microorganisms were detected by real-time PCR. Most often, these were OA, which is most likely due to the reduced ability to identify anaerobes during in vitro culturing [12]. The role of E. coli and E. faecalis, as well as other representatives of the EE group, in fertility disorders and sperm quality has not been definitively established and requires further study.
This study once again demonstrates the frequent presence of microorganisms in semen samples meeting the criteria for normozoospermia. In most of the analyzed samples, microbiota was predominantly represented by obligate anaerobic bacteria, rather than gram-positive facultative anaerobes, which were detected using the culture-based method [22].
CONCLUSIONS
In about half of the cases, semen samples that met the criteria for normozoospermia contained microbiota in above-threshold values. The identified microorganisms were grouped using cluster analysis into four stable types according to the predominance of a certain group of microorganisms: obligate anaerobes, Lactobacillus spp., gram-positive facultative anaerobes, Enterobacteriaceae spp. / Enterococcus spp. Ranked by frequency of occurrence, the clusters were as follows: the obligate anaerobes-dominated variant; the gram-positive facultative anaerobes-dominated variant; the Lactobacillus spp.-dominated variant; the Enterobacteriaceae spp. / Enterococcus spp.-dominated variant (identified in 40.2, 25.2, 20.6 and 14.0% of positive samples respectively). The use of molecular methods may lead to a rethinking of ideas about the composition of the microbiota identified in semen samples with normozoospermia. The association of certain variants of semen microbiota with inflammatory pathologies of the reproductive tract and fertility disorders remains an unresolved question. It is possible that there are informative microbiological markers associated with these conditions. The study of the microbial composition of pathological semen samples is the next necessary step in the search for such diagnostic markers.