METHOD

Designing of custom barcodes for sequencing on the MGI platform

Shmitko AO, Bulusheva IA, Vasiliadis YuA, Suchalko ON, Syrko DS, Belova VA, Pavlova AS, Korostin DO
About authors

Pirogov Russian National Research Medical University

Correspondence should be addressed: Anna O. Shmitko
Ostrovityanova, 1/1, Moscow, 117997, Russia; moc.liamg@79imhsanna

About paper

Funding: this research was funded by the grant №075-15-2019-1789 from the Ministry of Science and Higher Education of the Russian Federation allocated to the Center for Precision Genome Editing and Genetic Technologies for Biomedicine.

Author contribution: Shmitko AO: study planning, data collection, writing (original draft preparation); Bulusheva IA: methodology, data analysis, writing (original draft preparation); Vasiliadis YuA, Suchalko ON, Syrko DS: data analysis, software, visualization; Belova VA: study planning, manuscript review and editing; Pavlova AS: data curation, data analysis, software; Korostin DO: conceptualization, supervision, methodology, manuscript review and editing.

Received: 2024-08-15 Accepted: 2024-09-17 Published online: 2024-10-10
Fig. 1. A. The concept of the quad method. Each MGI barcode serves as a root for the quad, and custom barcodes are generated by sequential changes at each position of the original barcode: A→T, T→G, G→C, C→A. B. An example of a barcode quad. 47A is the original MGI barcode, and 47B, 47C, 47D are the custom barcodes generated using the quad method.
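The substitution rule in the caption of Fig. 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' software: it assumes that barcodes B, C and D are obtained by applying the cyclic substitution once, twice and three times to every position of the root barcode. The function name `make_quad` and the example sequence are hypothetical; the sequence is not a real MGI barcode.

```python
# Cyclic substitution from the Fig. 1 caption: A→T, T→G, G→C, C→A.
SHIFT = str.maketrans("ATGC", "TGCA")


def make_quad(root):
    """Return [root, B, C, D]; each barcode is the previous one with the
    substitution applied at every position (assumed reading of the caption)."""
    root = root.upper()
    if set(root) - set("ACGT"):
        raise ValueError("barcode may contain only A, C, G and T")
    quad = [root]
    for _ in range(3):
        quad.append(quad[-1].translate(SHIFT))
    return quad


# Hypothetical 10-nt root sequence used only for illustration.
example_root = "ACGTACGTAC"
quad = make_quad(example_root)
for label, barcode in zip("ABCD", quad):
    print(label, barcode)
```

Because the substitution is a cycle of length four, applying it a fourth time returns the root barcode, so each position of a quad contains all four nucleotides exactly once.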
Fig. 2. The nucleotide balance in a pool of 4n + 2 barcodes. The colored lines represent the nucleotide fractions. The thick black lines mark the boundaries of the weak criterion, and the thin lines mark the boundaries of the strong criterion for barcode compatibility [8]
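The nucleotide fractions plotted in Fig. 2 can be computed per barcode position as sketched below. The sketch makes several assumptions: the function names are hypothetical, the example pool uses made-up 4-nt sequences, and the bounds passed to `within_bounds` are placeholders, because the actual limits of the weak and strong criteria are not given in this caption.

```python
from collections import Counter


def nucleotide_fractions(barcodes):
    """Return one dict per position with the fraction of A, C, G and T."""
    if len({len(b) for b in barcodes}) != 1:
        raise ValueError("pool must be non-empty and barcodes equal in length")
    n = len(barcodes)
    fractions = []
    for column in zip(*barcodes):  # iterate over positions
        counts = Counter(column)
        fractions.append({base: counts.get(base, 0) / n for base in "ACGT"})
    return fractions


def within_bounds(fractions, low, high):
    """True if every nucleotide fraction at every position lies in [low, high].
    low and high are caller-supplied; the paper's thresholds are not assumed."""
    return all(low <= f <= high for pos in fractions for f in pos.values())


# One complete quad (n = 1) plus two extra barcodes: a pool of 4n + 2 = 6.
quad_pool = ["ACGT", "TACG", "GTAC", "CGTA"]
pool_4n_plus_2 = quad_pool + ["AAAA", "CCCC"]

balanced = nucleotide_fractions(quad_pool)
unbalanced = nucleotide_fractions(pool_4n_plus_2)
```

A pool made of complete quads is balanced by construction (each fraction is 0.25 at every position), while the two extra barcodes in a 4n + 2 pool shift the fractions away from 0.25.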
Fig. 3. The incompatibility graph showing 63 quads and the MGI barcodes not included in the quads. All quads that passed filtering based on the number of mismatches are shown in green, the barcodes from the set of 96 MGI barcodes not included in the quads are shown in orange, the MGI barcodes from the set of 128 MGI barcodes are shown in blue, and the manufacturer's verification sequence (barcode 999) is shown in red. A line connects incompatible barcodes and quads; the number above the line indicates the lowest number of mismatches between them
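The edges of an incompatibility graph such as the one in Fig. 3 can be derived from pairwise mismatch counts. The sketch below is one possible way to do it, under stated assumptions: a mismatch is counted as a position where two equal-length barcodes differ (Hamming distance), two groups are treated as incompatible when their closest members differ in fewer than `min_mismatches` positions, and that threshold is a hypothetical parameter, since the caption does not state the value used. The group names and sequences are made up.

```python
from itertools import combinations


def mismatches(a, b):
    """Number of positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def incompatible_pairs(groups, min_mismatches):
    """Map (name1, name2) to the lowest mismatch count between the two groups,
    keeping only pairs whose lowest count is below min_mismatches."""
    edges = {}
    for (name1, g1), (name2, g2) in combinations(groups.items(), 2):
        lowest = min(mismatches(a, b) for a in g1 for b in g2)
        if lowest < min_mismatches:
            edges[(name1, name2)] = lowest
    return edges


# Hypothetical groups: one quad and two single barcodes.
groups = {
    "quad_1": ["ACGT", "TACG", "GTAC", "CGTA"],
    "single_x": ["ACGA"],
    "single_y": ["TTTT"],
}
edges = incompatible_pairs(groups, min_mismatches=2)
```

Each key of `edges` corresponds to a line in the graph, and its value to the number drawn above that line.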
Fig. 4. Venn diagram for comparing the sequences of custom and original MGI barcodes from the 128 barcode set and 999 validation barcode provided by MGI. The custom barcodes are shown in green, the original MGI barcodes which do not overlap with quads are shown in red, and the MGI barcodes overlapping with quads (which were used as roots for the quads) are shown in orange
Fig. 5. The average proportion of undecoded reads (%) and the total data (Gb) per lane for the libraries with MGI barcodes (blue, data from 40 lanes) and with custom barcodes (red, data from 7 lanes)
Table. The list of barcode sequences and the full sequences of the top and bottom oligonucleotides for adapter preparation. Ad_Bttm: bottom oligo