WHOLE-GENOME SEQUENCING AND COMPARATIVE GENOMIC ANALYSIS OF MYCOBACTERIUM SMEGMATIS MUTANTS RESISTANT TO IMIDAZO[1,2-b][1,2,4,5]TETRAZINES, ANTITUBERCULOSIS DRUG CANDIDATES

The spread of multidrug-resistant and extensively drug-resistant Mycobacterium tuberculosis calls for the development of novel antituberculosis drugs. Previously, we studied compounds of the substituted imidazo[1,2-b][1,2,4,5]tetrazine class that are capable of inhibiting serine/threonine protein kinases (STPK) in the original M. smegmatis aphVIII+ test system. To unveil the mechanism of action of these drug candidates, it is necessary to search for mutations in the mycobacterial genome that confer resistance to them. The aim of our work was to find and describe such mutations in M. smegmatis strains. We carried out whole-genome sequencing of 9 mutants resistant to 3 imidazo[1,2-b][1,2,4,5]tetrazines. Seven of the 9 mutant strains were found to have the Y52H mutation in the highly conserved mycobacterial gene MSMEG_1601, which encodes a protein of unknown function. Additionally, three of those 7 strains each carried one of two mutations in MSMEG_1380, which encodes a transcriptional regulator. The remaining 2 mutant strains had mutations in the MSMEG_0641 and MSMEG_2087 genes, which encode transporter proteins. No mutations were found in STPK genes, suggesting that STPKs might not be the primary targets of the studied compounds. Further investigation of MSMEG_1601 function may be of interest, as this protein might be the biological target or a part of a new mechanism underlying resistance to antituberculosis drug candidates.

According to the World Health Organization, over 2 billion people (1/3 of the world population) are infected with Mycobacterium tuberculosis, the causative agent of tuberculosis (TB), one of the deadliest infectious diseases, which kills 1.8 million people every year [1]. The key challenge in the fight against TB is the emergence and spread of mycobacterial strains resistant to both rifampicin and isoniazid (multidrug-resistant TB, MDR-TB) and those additionally resistant to fluoroquinolones and at least one of the second-line injectable drugs (extensively drug-resistant TB, XDR-TB) [2,3]. Therefore, the development of antituberculosis drugs with a novel mechanism of action is a key objective in fighting TB.

DNA isolation and whole-genome sequencing
Mycobacterial DNA was isolated from 15 ml of liquid culture according to the protocol described in [7]. After preliminary isolation, DNA was treated with RNase A (Thermo Fisher Scientific, USA) and extracted with phenol-chloroform-isoamyl alcohol (25:24:1).
DNA libraries were prepared using Nextera kits (Illumina, USA); sequencing was carried out on the Illumina MiSeq platform using the MiSeq Reagent Kit v3 2 x 300 bp (Illumina, USA). Sequencing of the wild-type strain genomic DNA was conducted with the MiSeq Reagent Kit v2 2 x 150 bp (Illumina, USA). The obtained data were submitted to the NCBI Sequence Read Archive (SRA) (entry ID SRP145443).

Processing of whole-genome sequencing data and comparative genomic analysis
The obtained reads were aligned to the reference genome (NC_008596.1, PRJNA57701) using the BWA-MEM algorithm [8]. The pileup was generated by mpileup (-B -f) in SAMtools [9]. Single nucleotide variants were called by running mpileup2snp (--min-avg-qual 30 --min-var-freq 0.80 --p-value 0.01 --output-vcf 1) in VarScan 2.3.9 [10]. Annotation was created using vcf_annotate.pl (courtesy of Natalya Mikheecheva of the Laboratory of Bacterial Genetics, Vavilov Institute of General Genetics). Non-synonymous single nucleotide variants that were located within open reading frames and absent in the wild-type strain were selected for further analysis. The similarity search was conducted in BLAST (https://blast.ncbi.nlm.nih.gov).
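The final selection step described above (keep non-synonymous variants inside open reading frames that are absent in the wild-type strain) can be illustrated with a minimal Python sketch. It is not the authors' script: the `Variant` record, the `select_candidates` function, and the gene names and coordinates in the example are hypothetical and serve only to show the filtering logic applied to already annotated calls.

```python
from typing import List, NamedTuple


class Variant(NamedTuple):
    """Simplified annotated variant call (hypothetical layout, not VarScan's VCF)."""
    pos: int      # 1-based position on the reference genome
    ref: str      # reference allele
    alt: str      # variant allele
    gene: str     # locus tag of the open reading frame, "" if intergenic
    effect: str   # "nonsynonymous", "synonymous" or "intergenic"


def select_candidates(mutant_calls: List[Variant],
                      wild_type_calls: List[Variant]) -> List[Variant]:
    """Keep non-synonymous variants inside ORFs that the wild type does not share."""
    wild_type_keys = {(v.pos, v.ref, v.alt) for v in wild_type_calls}
    return [
        v for v in mutant_calls
        if v.gene                                   # located within an ORF
        and v.effect == "nonsynonymous"             # changes the protein sequence
        and (v.pos, v.ref, v.alt) not in wild_type_keys  # absent in the wild type
    ]


# Made-up example data: coordinates and gene names are illustrative only.
wild_type = [Variant(1000, "A", "G", "GENE_A", "nonsynonymous")]
mutant = [
    Variant(1000, "A", "G", "GENE_A", "nonsynonymous"),  # shared with the wild type
    Variant(2000, "C", "T", "GENE_B", "synonymous"),     # silent change
    Variant(3000, "G", "A", "", "intergenic"),           # outside any ORF
    Variant(4000, "T", "C", "GENE_C", "nonsynonymous"),  # candidate
]

candidates = select_candidates(mutant, wild_type)
print([(v.gene, v.pos) for v in candidates])  # [('GENE_C', 4000)]
```

Matching on position, reference allele and variant allele together means a wild-type variant at the same site but with a different allele does not mask a mutant-specific change.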

Comparative genomic analysis
After genome assembly, we conducted a comparative genomic analysis of mutant and wild-type strains. The following unique single nucleotide polymorphisms were identified: 1) CGT to AGT substitution in codon 233 (R>S) of MSMEG_0641 (binding-protein-dependent transporters inner membrane component) in mutant atR10; 2) ACG to GTG substitution in codon 52 (T>V) of MSMEG_1380 (transcriptional regulator) in mutant atR19; 3) insertion of the amino acids VG at position 51 of MSMEG_1380 (transcriptional regulator) in mutants atR11 and atR17; 4) TAC to CAC substitution in codon 52 (Y>H) of MSMEG_1601 (hypothetical protein) in mutants atR1, atR2, atR8, atR11, atR14, atR17, and atR19; 5) TAC to TGC substitution in codon 188 (Y>C) of MSMEG_2087 (transporter, small conductance mechanosensitive ion channel (MscS) family protein) in mutant atR9.
Genes containing the above-mentioned mutations are not pseudogenes, but the functions of the proteins they encode have not been confirmed experimentally.
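The codon changes and strain assignments listed above can be cross-checked with a short Python sketch. It translates each reference and mutant codon with the standard genetic code (only the codons named in the text are included) and counts how many strains carry a mutation in each gene. The data structures and function names are illustrative assumptions, and strains are identified by their numbers only.

```python
# Subset of the standard genetic code covering only the codons named in the text.
CODON_TO_AA = {
    "CGT": "R", "AGT": "S",
    "ACG": "T", "GTG": "V",
    "TAC": "Y", "CAC": "H", "TGC": "C",
}

# (gene, codon number, reference codon, mutant codon, reported change, strain numbers)
SUBSTITUTIONS = [
    ("MSMEG_0641", 233, "CGT", "AGT", "R>S", [10]),
    ("MSMEG_1380", 52, "ACG", "GTG", "T>V", [19]),
    ("MSMEG_1601", 52, "TAC", "CAC", "Y>H", [1, 2, 8, 11, 14, 17, 19]),
    ("MSMEG_2087", 188, "TAC", "TGC", "Y>C", [9]),
]

# The VG insertion is not a codon substitution, so it is tracked separately:
# (gene, position, inserted amino acids, strain numbers)
INSERTIONS = [("MSMEG_1380", 51, "VG", [11, 17])]


def predicted_change(ref_codon: str, alt_codon: str) -> str:
    """Return the amino-acid change implied by a codon substitution, e.g. 'Y>H'."""
    return f"{CODON_TO_AA[ref_codon]}>{CODON_TO_AA[alt_codon]}"


def strains_per_gene() -> dict:
    """Map each gene to the set of strain numbers carrying any mutation in it."""
    result = {}
    for gene, _codon, _ref, _alt, _reported, strains in SUBSTITUTIONS:
        result.setdefault(gene, set()).update(strains)
    for gene, _pos, _inserted, strains in INSERTIONS:
        result.setdefault(gene, set()).update(strains)
    return result


if __name__ == "__main__":
    for gene, codon, ref, alt, reported, _strains in SUBSTITUTIONS:
        status = "ok" if predicted_change(ref, alt) == reported else "MISMATCH"
        print(f"{gene} codon {codon}: {ref}>{alt} = {reported} [{status}]")
    for gene, strains in sorted(strains_per_gene().items()):
        print(f"{gene}: {len(strains)} strain(s) {sorted(strains)}")
```

Under these inputs, all four substitutions agree with the reported amino-acid changes, MSMEG_1601 is mutated in 7 strains, MSMEG_1380 in 3 of those 7, and the 5 mutations together account for all 9 mutants.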

Identification of homologous genes in the genome of M. tuberculosis
The similarity search carried out in BLAST returned M. tuberculosis homologs of the proteins carrying the above-mentioned mutations (Table).

DISCUSSION
A crucial phase in the development of any novel antibacterial drug is the study of its mechanism of action. Obtaining mutants resistant to the studied compound and identifying the mutations underlying this resistance is a classical approach to detecting possible targets of an antibiotic. We have conducted a comparative genomic analysis of 9 mutants cross-resistant to all three studied compounds representing the class of substituted imidazo[1,2-b][1,2,4,5]tetrazines. Having analyzed the mutants' genomes, we selected the most plausible drivers of drug resistance: 5 mutations in 4 genes. Two mutations were identified in genes encoding a transmembrane transporter (MSMEG_0641) and a mechanosensitive channel (MSMEG_2087); these mutations can affect transport of the studied compounds into and out of the cell. Two mutations were found in the MSMEG_1380 gene encoding a TetR family transcriptional regulator. TetR proteins can participate in the regulation of drug resistance by controlling expression of different membrane transporters. For example, the TetR protein of M. abscessus represses expression of the cell transporters MmpS5/MmpL5 implicated in the resistance to thioacetazone derivatives [11].
Of all the identified mutations, the most promising for further research might be the mutation in the MSMEG_1601 gene, as it is present in 7 out of 9 mutants. This is a highly conserved mycobacterial gene: it is found in all representatives of the Mycobacterium genus, including M. leprae with its highly reduced genome, and in some other actinobacteria. It belongs to the so-called "mycobacterial core hypotheticals" (highly conserved proteins with unknown functions) [12], though it is not vital for the growth of mycobacteria in vitro [13]. The proteomic analysis of different M. tuberculosis lineages demonstrated that the Rv3412 protein, homologous to MSMEG_1601, is found in greater abundance in virulent strains, including a LAM strain, than in attenuated strains of M. bovis BCG. This allowed the authors to suggest a possible involvement of the Rv3412 protein in the infection process [14].

CONCLUSIONS
We have discovered 5 mutations in 4 genes that possibly confer resistance to substituted imidazo[1,2-b][1,2,4,5]tetrazines. The contribution of each mutation is yet to be confirmed by reverse genetics. However, it is already clear that the mutation located within the MSMEG_1601 gene is of particular interest: unlike the other mutated genes, MSMEG_1601 is not linked to transmembrane transport and might be a direct biological target for substituted imidazo[1,2-b][1,2,4,5]tetrazines.