Whole-Genome Analysis of Mycobacterium avium subsp. paratuberculosis IS900 Insertions Reveals Strain Type-Specific Modalities
Published: 2021
Abstract:
Mycobacterium avium subsp. paratuberculosis (Map) is the etiological agent of Johne’s disease in ruminants. The IS900 insertion sequence (IS) has been used widely as an epidemiological marker and as a target for PCR diagnosis. Updated DNA sequencing technologies have led to a rapid increase in available Map genomes, which makes it possible to analyze the distribution of IS900 in this slow-growing bacterium. The objective of this study is to characterize the distribution of the IS900 element and how it affects genomic evolution and gene function of Map. A secondary goal is to develop automated in silico restriction fragment length polymorphism (RFLP) analysis using IS900. Complete genomes from the major phylogenetic lineages known as C-type and S-type (including subtypes I and III) were chosen to represent the genetic diversity of Map. IS900 elements were located in these genomes using BLAST software and the relevant fragments were extracted. An in silico RFLP analysis using the BstEII restriction site was performed to obtain the exact sizes of the DNA fragments carrying a copy of IS900, and the resulting RFLP profiles were analyzed and compared by digital visualization of the separated restriction fragments. The program developed for this study allowed automated localization of IS900 sequences, identifying their positions within each genome along with the exact number of copies per genome. The number of IS900 copies ranged from 16 in the C-type isolate to 22 in the S-type subtype I isolate. A locus-by-locus sequence alignment of all IS900 copies within the three genomes revealed new sequence polymorphisms that define three sequevars distinguishing the subtypes. Nine IS900 insertion site locations were conserved across all genomes studied, while smaller subsets were unique to a particular lineage. Preferential insertion motif sequences were identified for IS900, along with the genes bordering all IS900 insertions.
IS900 rarely inserted within coding sequences: only three genes were disrupted in this way. This study makes it possible to automate the analysis of IS900 distribution in Map genomes, to enrich knowledge of the distribution dynamics of this IS for epidemiological purposes, for understanding Map evolution, and for studying the biological implications of IS900 insertions.
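The in silico RFLP step described in the abstract can be illustrated with a minimal sketch. It is not the program developed in the study. It assumes an uppercase DNA string treated as linear (Map chromosomes are circular, so a real implementation would also join the first and last fragments), and it assumes the IS900 coordinates are already known, for example from BLAST hits. The function names and the toy sequence are hypothetical. The only external fact used is the BstEII recognition site, G^GTNACC.

```python
import re

# BstEII recognition site G^GTNACC (N = any base); the cut falls after the first G.
BSTEII = re.compile(r"GGT[ACGT]ACC")

def cut_positions(seq):
    """Return 0-based cut coordinates for every BstEII site in seq."""
    return [m.start() + 1 for m in BSTEII.finditer(seq)]

def fragments(seq):
    """Split a linear sequence into half-open (start, end) restriction fragments."""
    bounds = [0] + cut_positions(seq) + [len(seq)]
    return [(a, b) for a, b in zip(bounds, bounds[1:]) if b > a]

def fragments_with_element(seq, element_intervals):
    """Sizes (largest first) of fragments overlapping any element interval.

    element_intervals: half-open (start, end) coordinates of the element
    copies, assumed here to come from an earlier search step.
    """
    sizes = []
    for start, end in fragments(seq):
        if any(s < end and e > start for s, e in element_intervals):
            sizes.append(end - start)
    return sorted(sizes, reverse=True)

# Toy sequence with two BstEII sites; the "element" interval is invented.
toy = "AAAAA" + "GGTCACC" + "TTTTTTTTTT" + "GGTAACC" + "CCCCC"
print(cut_positions(toy))
print(fragments_with_element(toy, [(12, 20)]))
```

Sorting the sizes from largest to smallest mirrors the order in which bands would appear on a gel, which is one simple way to compare profiles between genomes.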