OPINION
Method for quantitative assessment of gut microbiota: a comparative analysis of 16S NGS and qPCR
1 Centre for Strategic Planning and Management of Biomedical Health Risks of the Federal Medical Biological Agency, Moscow, Russia
2 National Medical Research Center for Therapy and Preventive Medicine of the Ministry of Healthcare of the Russian Federation, Moscow, Russia
Correspondence should be addressed: Olga A. Zlobovskaya
Pogodinskaya, 10, str. 1, Moscow, 119121, Russia; OZlobovskaya@cspfmba.ru
Author contribution: Zlobovskaya OA — concept, literature review, manuscript writing; Kurnosov AS, Sheptulina AF, Glazunova EV — manuscript editing.
16S NGS: Broad Capabilities and Significant Limitations
The 16S NGS method has become an essential tool for studying the microbiota. Its main advantage is the ability to simultaneously sequence multiple samples and detect a wide range of microorganisms. Comprehensive assessment of the taxonomic composition of microbial communities and their diversity makes 16S NGS indispensable for fundamental research. However, despite its benefits, this method has several significant limitations that can lead to distorted quantitative results.
Uneven amplification (dependence on primers)
Universal primers are used to amplify the variable regions of the 16S rRNA gene. They exhibit different affinities for the DNA of various taxa, resulting in unequal amplification efficiency during library preparation [1]. As a result, the microbiota structure data can be skewed, with some taxa being overestimated, while others are underestimated or entirely missed.
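The effect of unequal primer affinity can be illustrated with a toy model. The sketch below assumes idealized exponential amplification, where each taxon grows by a factor of (1 + e) per cycle; the efficiency values, taxon names, and cycle count are hypothetical and chosen only for illustration, not taken from the cited studies.

```python
def amplified_proportions(initial_copies, efficiencies, cycles):
    """Return relative abundances after `cycles` rounds of idealized PCR.

    Each taxon i grows as N_i * (1 + e_i) ** cycles, where e_i in [0, 1]
    is a hypothetical per-cycle amplification efficiency.
    """
    amplified = {
        taxon: copies * (1.0 + efficiencies[taxon]) ** cycles
        for taxon, copies in initial_copies.items()
    }
    total = sum(amplified.values())
    return {taxon: n / total for taxon, n in amplified.items()}


# Two taxa present in equal amounts, differing only in primer affinity
# (assumed efficiencies of 95% and 85% per cycle).
start = {"taxon_A": 1000, "taxon_B": 1000}
eff = {"taxon_A": 0.95, "taxon_B": 0.85}
result = amplified_proportions(start, eff, cycles=25)
```

Under these assumptions, a true 50:50 mixture is reported with taxon_A at well over three quarters of the library, even though the per-cycle difference in efficiency is modest.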
Uneven amplification (dependence on taxonomic composition)
The most abundant taxa gain a significant advantage during the early stages of amplification [2], thus reducing the likelihood of accurately detecting rare taxa (up to 10% of the total community). Since each sample has a unique microbiota composition, it is impossible to apply a systematic correction for all samples, even when using the same protocols [3, 4].
Low sensitivity
On average, the 16S NGS method yields between 5,000 and 50,000 reads per sample. However, according to the Poisson distribution, the quantitative assessment of a taxon can be considered statistically reliable only when the sample contains at least 100 reads for that taxon [5]. This limits reliable quantification to taxa that make up at least 0.2–2% of the total reads (depending on the total number of reads). Increasing the number of reads per sample is not always effective, because dominant taxa are amplified preferentially in the early stages, leading to significant underrepresentation or even loss of minor taxa. Consequently, the taxonomic diversity saturation curve reaches a plateau at 20,000–50,000 reads, meaning that further increasing the number of reads will not improve data representativeness. This is especially important for minor opportunistic microorganisms that may have clinical significance at low concentrations but are often either undetected or inaccurately quantified. Additionally, there is no consensus among researchers on whether it is more accurate to compare samples with different numbers of reads or to introduce bias by unifying the number of reads [6, 7].
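The 0.2–2% threshold follows directly from the 100-read requirement and the typical sequencing depth. A minimal sketch of the arithmetic is given below; the function names are illustrative, and the coefficient of variation uses the standard Poisson property that the standard deviation of a count equals the square root of its mean.

```python
import math


def min_quantifiable_fraction(total_reads, min_reads=100):
    """Smallest relative abundance that still yields `min_reads` reads."""
    return min_reads / total_reads


def poisson_cv(reads):
    """Coefficient of variation of a Poisson count with mean `reads`."""
    return 1.0 / math.sqrt(reads)


low_depth = min_quantifiable_fraction(5_000)    # 0.02, i.e. 2%
high_depth = min_quantifiable_fraction(50_000)  # 0.002, i.e. 0.2%
cv_at_threshold = poisson_cv(100)               # 0.1, i.e. 10% sampling noise
```

At the 100-read threshold, sampling noise alone contributes a coefficient of variation of about 10%; for a taxon represented by 10 reads, the same calculation gives roughly 32%.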
Reduced specificity
When analyzing short regions (V1-V3, V3-V4, V6, etc.), the high degree of conservation in the 16S region often prevents taxonomic resolution at the species level, and sometimes even at the genus level [1, 8]. Using the full-length 16S gene increases the resolution of sequencing but is only available on such platforms as ONT, PacBio, and LoopSeq. A significant drawback of these platforms is their higher error rate compared to short-read platforms like Illumina.
Limitations of relative quantification of taxa
The 16S NGS method evaluates only the relative abundance of taxa, not their absolute quantity. This means that an increase in the relative abundance of one taxon, for example, due to dietary changes, will automatically reduce the proportion of other taxa in the analysis. Simultaneous changes in multiple taxa in either direction make the reconstruction of the true dynamics of the community impossible [4–6].
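A small numerical example makes this compositional effect concrete. The absolute counts below are hypothetical; only taxon_A changes between the two time points, yet the relative abundances of the other two taxa are halved.

```python
def relative_abundance(absolute_counts):
    """Convert absolute counts per taxon into fractions of the total."""
    total = sum(absolute_counts.values())
    return {taxon: count / total for taxon, count in absolute_counts.items()}


# Hypothetical absolute quantities before and after a dietary change.
before = {"taxon_A": 100, "taxon_B": 100, "taxon_C": 200}
after = dict(before, taxon_A=500)  # only taxon_A increases

rel_before = relative_abundance(before)  # A: 0.25,  B: 0.25,  C: 0.50
rel_after = relative_abundance(after)    # A: 0.625, B: 0.125, C: 0.25
```

From the relative data alone, an observer cannot tell whether taxon_B and taxon_C declined or taxon_A expanded.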
Impact of 16S rRNA gene copy number
Each microbial species has a unique number of 16S rRNA gene copies, which is rarely considered during analysis, particularly when identifying sequences to the genus or family level. Even when using specialized plugins for QIIME 2, biases usually persist. One reason is that in cases where the copy number data for a specific taxonomic group is absent from the rrnDB database, the algorithm automatically assigns a copy number of one.
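The consequence of the default copy number can be sketched as follows. This is a simplified illustration of copy-number normalization in general, not the algorithm of any particular QIIME 2 plugin; the read counts and copy numbers are hypothetical, and the fallback value of one mirrors the behaviour described above.

```python
def copy_number_corrected(read_counts, copy_numbers, default_copies=1):
    """Divide reads by 16S copy number, then renormalize to fractions.

    Taxa missing from `copy_numbers` fall back to `default_copies`.
    """
    corrected = {
        taxon: reads / copy_numbers.get(taxon, default_copies)
        for taxon, reads in read_counts.items()
    }
    total = sum(corrected.values())
    return {taxon: value / total for taxon, value in corrected.items()}


# Hypothetical sample: taxon_C has no copy-number record.
reads = {"taxon_A": 700, "taxon_B": 200, "taxon_C": 100}
known_copies = {"taxon_A": 7, "taxon_B": 2}

fractions = copy_number_corrected(reads, known_copies)
```

Here all three taxa end up at one third each. If taxon_C actually carried several 16S copies per genome, its corrected share would be overestimated, because the fallback treats every one of its reads as a separate genome.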
Uneven phylogenetic resolution
Different regions of the 16S rRNA gene have varying levels of phylogenetic resolution [1, 8–10]. This leads to inconsistent classification accuracy, complicating the comparison of data across different studies.
Differences in sequencing platforms and data processing methods
The choice of sequencing platforms and library preparation methods can lead to significant variations in results [1, 11, 12]. As mentioned above, this makes it more challenging to compare data across various studies.
Dependence on databases
Different databases (RDP, SILVA, Greengenes, etc.) can yield different quantitative assessments for the same sample [1, 13]. Additionally, databases are updated every few years, which means that newly introduced taxa may be missing.
qPCR: Specialized Tasks, High Accuracy
Unlike NGS, real-time PCR (qPCR) amplifies only specific DNA fragments. This results in several advantages.
High sensitivity and a broad quantitative range
qPCR enables the detection and quantification of even a few target copies in a reaction with high precision. This is especially important when studying rare clinically significant taxa, which may be missed by 16S NGS. Additionally, qPCR can reliably quantify up to 10^7–10^8 target copies in a reaction.
High specificity
Oligonucleotides are designed to distinguish even closely related microorganisms with high accuracy.
Improved precision
Because qPCR, unlike 16S NGS, does not amplify hundreds of different targets simultaneously, it provides a more reliable individual assessment of the abundance of a specific taxon.
Fast and simple interpretation
Unlike 16S NGS, qPCR does not require complex bioinformatics methods for data interpretation. This makes it more accessible and convenient for clinical research and diagnostics, where speed and accuracy are critical.
High reproducibility
qPCR provides higher reproducibility compared to 16S NGS due to the simplicity of the method and data analysis. This is particularly important for clinical diagnostics and long-term studies, and also facilitates data comparison between different studies and laboratories.
Absolute quantification
qPCR allows for both relative and absolute quantification of taxa. Thus, qPCR enables analysis of microbiota dynamics under different conditions, unlike the purely relative approach of NGS.
Reduced dependency on sample quality
qPCR analysis is less dependent on the initial quality of the sample (e.g., quantity, presence of PCR inhibitors) compared to the 16S NGS method, where these factors significantly impact the library preparation stage.
Nevertheless, the qPCR method also has certain limitations. Unlike those of NGS, however, many of the potential issues can be minimized if addressed properly.
Selection of target microorganisms
Preselected genetic targets are amplified in qPCR, which requires prior knowledge of the key representatives of the microbiota in the given study.
Target region selection
The most commonly studied region for the majority of bacteria is the 16S rRNA gene, making it the typical target for qPCR assay development. However, this is a highly conserved genomic region, so for some taxonomic units at the species level (and occasionally at the genus level, e.g., Oscillibacter/Dysosmobacter), it may not be possible to develop specific systems that amplify the 16S region. For some microorganisms, whole-genome data are available, allowing the selection of another region for detection. However, these organisms are in the minority, so the chosen target may be nonspecific, or the system may fail to amplify all members of the given taxonomic group.
Limitation on the number of taxa
High qPCR specificity limits the number of taxa that can be analyzed simultaneously. For accurate quantitative assessment, it is recommended to combine no more than two targets (if they exhibit a broad range and are consistently present in most samples) or three targets (for rare taxa) in a single tube. Moreover, due to the limited number of taxa analyzed in this method, qPCR does not provide information on the structure of the entire microbial community or its diversity, which may also hold clinical significance.
Biases related to gene copy number
This issue can arise if the system is designed to detect a taxonomic group at a higher level (e.g., family), where different
Need for data standardization
Converting the data obtained through qPCR into absolute values requires the use of calibration standards. For maximum accuracy, it is essential to pre-assess the standards using droplet digital PCR. In addition, the sensitivity and linear range of oligonucleotide systems should preferably be tested not on model samples (e.g., plasmid or amplicon titration) but on the genomic DNA of the corresponding taxon, ideally against a background of fecal DNA in clinically relevant quantities.
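One common way to use calibration standards is a log-linear standard curve, in which the quantification cycle (Cq) is regressed on the logarithm of the known copy number. The sketch below assumes this approach, which the text does not specify; the standard concentrations and Cq values are hypothetical, and the function names are illustrative.

```python
import math


def fit_standard_curve(copies, cq_values):
    """Least-squares fit of Cq = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Efficiency per cycle implied by the slope (1.0 means perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0


def copies_from_cq(cq, slope, intercept):
    """Invert the standard curve to estimate copies per reaction."""
    return 10 ** ((cq - intercept) / slope)


# Hypothetical tenfold dilution series of a standard (copies per reaction).
standard_copies = [1e2, 1e3, 1e4, 1e5, 1e6]
standard_cq = [33.36, 30.04, 26.72, 23.40, 20.08]

slope, intercept = fit_standard_curve(standard_copies, standard_cq)
efficiency = amplification_efficiency(slope)
estimate = copies_from_cq(26.72, slope, intercept)
```

With these illustrative values, the slope is close to -3.32, which corresponds to an efficiency of about 100%. The accuracy of any estimate obtained this way depends on how well the copy number of the standards themselves is known, which is why pre-assessment of the standards matters.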
CONCLUSION
A comparison of the 16S NGS and qPCR methods shows that NGS is better suited for studying the overall composition and diversity of the microbiota. However, its use for quantitative assessment is limited by several factors that currently lack practical solutions. Meanwhile, qPCR offers more accurate and reliable quantitative assessment, making it the preferred method for studies where high precision is required and the target markers are well defined.