- Genome Report
- Open Access
Complete genome analysis of clinical Shigella strains reveals plasmid pSS1653 with resistance determinants: a triumph of hybrid approach
Gut Pathogens volume 11, Article number: 55 (2019)
Shigella is ranked as the second leading cause of diarrheal disease worldwide. Although infection occurs in people of all ages, most of the disease burden falls on children under 5 years of age in low- and middle-income countries. The recent increase in the incidence of drug-resistant strains has made Shigella a priority pathogen under antimicrobial resistance surveillance by the WHO. Despite this, only limited genomic studies on drug-resistant Shigella exist. Here we report the first complete genomes of clinical S. flexneri serotype 2a and S. sonnei strains, obtained using a hybrid approach that combines long-read MinION (Oxford Nanopore Technologies) and short-read Ion Torrent 400 bp sequencing platforms. This approach helped to identify the complete sequence of plasmid pSS1653, including the structural arrangement of AMR genes such as sulII, tetA, tetR, aph(6)-Id and aph(3'')-Ib. The presence of AMR genes on mobile elements in this human-restricted enteric pathogen poses a potential threat of dissemination to other gut pathogens. Genome-level information on Shigella could help us to understand the genome dynamics of existing and emerging resistant clones.
Shigella is the second leading cause of diarrheal deaths globally, mainly among children under 5 years of age. Shigella flexneri and Shigella sonnei are the leading causes of diarrhea in developing countries such as India, while the other two serogroups are relatively uncommon. Historically, S. sonnei has mainly been seen in developed countries, but its spread into developing countries over recent decades has raised major public health concerns. Owing to its low infectious dose, clinical severity, serotype-specific immunity, emerging antimicrobial resistance and the fact that humans are its only natural host, Shigella is categorized as a priority pathogen among enteric bacteria in the Global Antimicrobial Resistance Surveillance System (GLASS) of the World Health Organization (WHO).
The key virulence factors involved in the pathogenesis of Shigella are located on both the plasmid and the chromosome of the pathogen, enabling it to survive intracellularly. Shigellosis is generally self-limiting, but the use of antibiotics reduces the duration of symptoms and of pathogen shedding, which in turn reduces transmission. Increasing awareness of the disease burden and of the emerging threat posed by drug-resistant Shigella has resulted in interest in the development of Shigella vaccines, which are currently in the clinical trial stage.
There is increasing interest in exploring the molecular epidemiology of genetically encoded virulence and resistance factors in Shigella, as this provides information on the severity of infection, transmission and the pathogen's response to antimicrobials. In Shigella spp., the virulence and resistance determinants are mainly located on mobile genetic elements (MGEs) such as plasmids, insertion sequences, integrons, pathogenicity islands and bacteriophages. Horizontal gene transfer (HGT) of these elements is an important driver of bacterial evolution. Through HGT, genes are exchanged between the pathogen and the commensal and other pathogenic bacteria circulating locally, which enhances the pathogen's ability to establish infection and to acquire resistance, allowing it to outcompete susceptible bacteria in the gut [5, 6]. These MGEs can be predicted from whole genome sequencing (WGS) data through bioinformatics analysis. Recent advances in WGS methodologies have had a major impact on bacterial genome-wide studies and on the epidemiological analysis of bacterial pathogens.
In this study, we report the first complete genomes of S. flexneri serotype 2a and S. sonnei strains, obtained using a hybrid assembly approach that combines long-read MinION (Oxford Nanopore Technologies) and short-read Ion Torrent 400 bp sequencing platforms. The availability of complete genomes of clinical Shigella strains, and their subsequent analysis, provides a better understanding of their genome characteristics, including virulence, resistance and mobile genetic elements.
Materials and methods
The two clinical Shigella strains sequenced, S. flexneri 2a (FC906) and S. sonnei (FC1653), were isolated from stool specimens at the Department of Clinical Microbiology, Christian Medical College, Vellore, India.
Genomic DNA was extracted using the QIAamp DNA Mini Kit (QIAGEN, Hilden, Germany) according to the manufacturer's instructions. DNA quality and quantity were assessed using Nanodrop spectrophotometry (Thermofisher, USA) and Qubit 3.0 (Thermofisher, USA), respectively. To obtain closed genomes, a hybrid approach using long-read MinION and short-read IonTorrent sequencing was performed as described previously. Briefly, short-read sequencing was performed with 400-bp read chemistry on an IonTorrent™ Personal Genome Machine™ (PGM) (Life Technologies, Carlsbad, CA) as per the manufacturer's instructions. Long-read sequencing was performed using the SQK-LSK108 Kit, R9 version (Oxford Nanopore Technologies, Oxford, UK), with the 1D sequencing method according to the manufacturer's protocol.
Assembly and annotation
Fast5 files were generated from MinION sequencing and the reads were base called with Albacore 2.0.1 (https://nanoporetech.com/about-us/news/new-basecaller-now-performs-raw-basecalling-improved-sequencing-accuracy). Canu 1.7 was used for error correction of the reads and for assembly, with a genome size of 3.0 m as input. The quality of the MinION reads was assessed using MinIONQC (https://github.com/roblanf/minion_qc). To increase the accuracy and completeness of the genome, we performed hybrid assembly using both Ion Torrent and MinION reads with Unicycler (v0.4.7). By default, Unicycler uses SPAdes to assemble the short reads with different k-mer sizes and filters out low-depth regions, along with error correction and quality checks. It then trims and generates the short-read assembly graph. In addition, it uses Miniasm and Racon to assemble the MinION long reads; the long reads are then used to build bridges that resolve the genome repeats, producing a complete genome assembly. Finally, multiple rounds of short-read polishing were performed with Pilon to reduce base-level errors in the long-read assembly.
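The two assembly steps described above can be sketched as command lines. This is a minimal illustration rather than the authors' exact invocation: the file names, output directories and thread count are hypothetical, and only the genome size value (written `3.0m` in Canu's notation) comes from the text. The helpers only build argument lists, so nothing is executed unless a command is passed to `subprocess.run`.

```python
import shlex
from typing import List


def canu_command(nanopore_fastq: str, out_dir: str,
                 prefix: str = "shigella",
                 genome_size: str = "3.0m") -> List[str]:
    """Build a Canu command that corrects and assembles raw nanopore reads."""
    return [
        "canu",
        "-p", prefix,                     # prefix for output file names
        "-d", out_dir,                    # assembly directory
        f"genomeSize={genome_size}",      # estimated genome size given to Canu
        "-nanopore-raw", nanopore_fastq,  # uncorrected MinION reads
    ]


def unicycler_command(short_fastq: str, long_fastq: str, out_dir: str,
                      threads: int = 8) -> List[str]:
    """Build a Unicycler hybrid-assembly command.

    Ion Torrent reads are single-end, so they are passed as unpaired
    short reads (-s) rather than as a read pair.
    """
    return [
        "unicycler",
        "-s", short_fastq,   # unpaired short reads
        "-l", long_fastq,    # long reads
        "-o", out_dir,       # output directory
        "-t", str(threads),  # number of threads
    ]


if __name__ == "__main__":
    # Hypothetical input files, used only to show the resulting commands.
    print(shlex.join(canu_command("FC1653_minion.fastq", "canu_FC1653")))
    print(shlex.join(unicycler_command("FC1653_ion.fastq",
                                       "FC1653_minion.fastq",
                                       "unicycler_FC1653")))
```

Keeping the commands as lists rather than strings avoids shell-quoting problems if they are later run with `subprocess.run(cmd, check=True)`.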
After assembly, the genomes were annotated using the NCBI Prokaryotic Genome Annotation Pipeline (PGAP). Virulence and antimicrobial resistance genes (ARG) were detected in silico using the VirulenceFinder (https://cge.cbs.dtu.dk/services/VirulenceFinder/) and ResFinder (https://cge.cbs.dtu.dk/services/ResFinder/) databases, respectively, with a 90% identity threshold and a minimum length coverage of 60%. The sequence types of the isolates were determined using the MLST 2.0 (Multi Locus Sequence Typing) tool (https://cge.cbs.dtu.dk//services/MLST/). Shigella PAIs were compared with the reference sequences using BLASTn and visualized using Easyfig. The genomes were screened for prophages using the PHAST tool. ISsaga (https://www-issaga.biotoul.fr/issaga_index.php) was used to predict the number of insertion sequences in the genomes.
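The identity and coverage cut-offs used for gene detection can be illustrated with a small filter. This is a sketch of the thresholding idea only, not the internal logic of VirulenceFinder or ResFinder; the `Hit` record, the gene labels and all numeric values in the example are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Hit:
    """One alignment of a database gene against the assembled genome."""
    gene: str
    identity_pct: float   # percent identity over the aligned region
    aligned_len: int      # aligned length in bp
    reference_len: int    # full length of the database gene in bp

    @property
    def coverage_pct(self) -> float:
        """Percent of the database gene covered by the alignment."""
        return 100.0 * self.aligned_len / self.reference_len


def filter_hits(hits: Iterable[Hit],
                min_identity: float = 90.0,
                min_coverage: float = 60.0) -> List[Hit]:
    """Keep hits that meet both the identity and the coverage threshold."""
    return [h for h in hits
            if h.identity_pct >= min_identity
            and h.coverage_pct >= min_coverage]


if __name__ == "__main__":
    # Invented example hits, chosen to show one pass and two failures.
    example = [
        Hit("geneA", 100.0, 1000, 1000),  # passes both thresholds
        Hit("geneB", 99.5, 400, 1000),    # fails coverage (40%)
        Hit("geneC", 85.0, 500, 500),     # fails identity (85%)
    ]
    for hit in filter_hits(example):
        print(hit.gene, hit.identity_pct, round(hit.coverage_pct, 1))
```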
Species confirmation was performed by biochemical tests (motility, urea, citrate, indole, triple sugar iron) and species-specific PCR [20, 21]. A pure isolated colony was used for genomic DNA extraction. Strain identification was confirmed through BLAST annotation against the NCBI database, and the species was predicted using KmerFinder, available at the Center for Genomic Epidemiology.
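The general idea behind k-mer based species prediction can be sketched as follows. This toy example simply counts k-mers shared between a query and each reference and picks the best-scoring reference; it is not KmerFinder's actual algorithm or scoring scheme, and the sequences and reference names are made up.

```python
from typing import Dict, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all k-length substrings of a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_scores(query: str, references: Dict[str, str],
                       k: int = 5) -> Dict[str, int]:
    """Count how many query k-mers also occur in each reference."""
    query_kmers = kmers(query, k)
    return {name: len(query_kmers & kmers(ref, k))
            for name, ref in references.items()}


def best_match(query: str, references: Dict[str, str], k: int = 5) -> str:
    """Name of the reference sharing the most k-mers with the query."""
    scores = shared_kmer_scores(query, references, k)
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Toy sequences; real references would be whole genomes.
    refs = {
        "ref_A": "ACGTACGTTAGCCGATAGGCTA",
        "ref_B": "TTTTGGGGCCCCAAAATTTTGG",
    }
    print(best_match("CGTTAGCCGATAGG", refs))
```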
Results and discussion
The hybrid assembly approach yielded a complete single chromosome for S. flexneri (FC906), and a chromosome plus 3 plasmids of 8401 bp, 6015 bp and 2690 bp for S. sonnei (FC1653). On BLAST analysis, the plasmids showed 100%, 99.7% and 100% similarity to the previously identified plasmids S. sonnei FDAARGOS_524 plasmid unnamed2, S. sonnei IDH01791 plasmid pSSE3 and S. sonnei CFSAN030807 plasmid pCFSAN030807_8, respectively. The genetic content of each plasmid is compared with that of its respective reference plasmid in Fig. 1a–c. This approach facilitates complete genome analysis of clinical strains, especially the study of the structural arrangement of mobile genetic elements, which play a major role in AMR dissemination. The genome features of the sequenced isolates are given in Table 1.
The annotated chromosome of FC906 has been deposited in GenBank under accession number CP037996. For FC1653, the annotated chromosome has been deposited under accession number CP037997 and the plasmids under accession numbers CP037998, CP037999 and CP038000.
Virulence and resistance determinants
The S. flexneri genome possesses virulence genes such as the invasion plasmid antigen (ipaH), long polar fimbriae (lpfA) and serine protease autotransporter proteins (pic and sigA) belonging to the SPATE family. Likewise, the S. sonnei genome carried the invasion plasmid antigen (ipaH), long polar fimbriae (lpfA), enterotoxin ShET-2 (senB) and a serine protease autotransporter protein (sigA). Generally, the ipaH family genes are present in multiple copies on both the virulence plasmid and the chromosome of Shigella genomes. However, in the sequenced isolates the gene was identified on the chromosome.
Further, the toxin genes that belong to the SPATE family are commonly categorized into 2 classes. The sigA gene belongs to class 1, whose members are toxic to epithelial cells, whereas the pic gene is non-toxic and is usually involved in colonization. These genes were first reported in S. flexneri serotype 2a, which is in accordance with the present study. In addition, the gene encoding Shigella enterotoxin 2, identified in S. sonnei, is reported to be involved in the invasion process and to play an important role in the transport of electrolytes.
The genomes were also found to contain multiple resistance genes conferring resistance to streptomycin, beta-lactams, tetracycline, trimethoprim/sulfamethoxazole, aminoglycosides and chloramphenicol. Resistance genes such as aadA1, blaOXA-1, tetB, dfrA1 and catA1 were identified on the S. flexneri chromosome. In S. sonnei, the dfrA1 gene was identified on the chromosome, while the genes sulII, aph(6)-Id, aph(3'')-Ib and tet(A) were identified on plasmid 1, herein named pSS1653. These are acquired resistance genes commonly reported among Shigella spp. On mutation analysis of the quinolone resistance determining region (QRDR), S. flexneri had double mutations in gyrA (S83L and D87N) and a single mutation in parC (S80I). Similarly, S. sonnei had the mutations S83L and D87G in gyrA and S80I in parC. No mutations were observed in the gyrB gene. These mutations are commonly associated with fluoroquinolone resistance in Shigella spp., as reported in previous studies [25,26,27].
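The substitution notation used for the QRDR (for example S83L: serine at position 83 replaced by leucine) can be derived by comparing an aligned reference and query protein fragment. The sketch below assumes the two fragments are already aligned without gaps; the 8-residue fragments and the starting position of 81 are illustrative inputs, not sequences taken from the study.

```python
from typing import List


def call_substitutions(reference: str, query: str,
                       start: int = 1) -> List[str]:
    """List amino-acid substitutions between two gap-free aligned fragments.

    `start` is the protein position of the first residue in the fragments,
    so that positions are reported in full-length protein numbering.
    """
    if len(reference) != len(query):
        raise ValueError("fragments must be aligned and of equal length")
    return [f"{ref_aa}{pos}{qry_aa}"
            for pos, (ref_aa, qry_aa)
            in enumerate(zip(reference, query), start=start)
            if ref_aa != qry_aa]


if __name__ == "__main__":
    # Illustrative fragment spanning positions 81-88 (toy input).
    wild_type = "GDSAVYDT"
    isolate = "GDLAVYNT"
    print(call_substitutions(wild_type, isolate, start=81))
```

In practice the fragments would come from a translated alignment of the isolate's gyrA or parC gene against a susceptible reference, and gapped alignments would need position bookkeeping that this sketch omits.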
Mobile genetic elements and pathogenicity island
Mobile elements such as bacteriophages, integrons, IS elements and PAIs are the major drivers of Shigella genome evolution and plasticity. They play a crucial role in pathogen virulence and in the spread of resistance. Analysis revealed the presence of class 1 integrons in S. flexneri and no integrons in S. sonnei. In addition, insertion sequence (IS) elements in Shigella are known to contribute to antibiotic resistance and to the evolution of the pathogen. Shigella genomes naturally harbour hundreds of IS elements, which have inactivated genes (forming pseudogenes) either through IS-mediated interruption or through IS-mediated genome rearrangement. This inactivation of genes hinders the ability of Shigella to cause disease in humans [28, 29]. In this study, 735 and 857 pseudogenes were identified in S. flexneri and S. sonnei, respectively. A total of 391 and 535 IS elements were predicted in the S. flexneri and S. sonnei genomes, respectively. The most common family in both genomes was the IS1 family, accounting for approximately 29% and 32% of the IS elements in S. flexneri and S. sonnei, respectively, followed by the IS3_ssgr_IS3 family. The predicted IS elements are given in Table 1.
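As a quick check on scale, the reported percentages can be converted back into approximate IS1 counts. Because the percentages in the text are rounded, the results below are estimates only and are not counts reported by the study.

```python
def approx_count(total: int, percent: float) -> int:
    """Approximate number of elements represented by a rounded percentage."""
    return round(total * percent / 100)


if __name__ == "__main__":
    # Totals and percentages as stated in the text.
    is1_flexneri = approx_count(391, 29)
    is1_sonnei = approx_count(535, 32)
    print(f"S. flexneri: about {is1_flexneri} IS1 elements of 391")
    print(f"S. sonnei: about {is1_sonnei} IS1 elements of 535")
```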
In Shigella, serotype conversion is generally mediated by bacteriophages. The hybrid assembly analysis revealed 15 phage regions (8 intact, 4 incomplete, 3 questionable) in S. flexneri. Similarly, 15 phage regions (5 intact, 6 incomplete, 4 questionable) were identified in S. sonnei. The phage regions cover approximately 10% and 7% of the entire chromosome of S. flexneri and S. sonnei, respectively. In the third phage region of the S. flexneri chromosome, an intact SfII bacteriophage was identified, which is responsible for conferring serotype 2a. The details of the identified prophages, including length, position, number of CDS and GC content, are provided in Tables 2 and 3.
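The two summaries given above, the tally of prophage regions by completeness and the share of the chromosome they cover, can be computed from a list of predicted regions. The sketch below is a generic illustration: the coordinates and chromosome length in the example are invented, and overlapping regions are merged so that no base is counted twice.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def tally_completeness(calls: Iterable[str]) -> Counter:
    """Count prophage regions per completeness category."""
    return Counter(calls)


def covered_fraction(regions: List[Tuple[int, int]],
                     chromosome_len: int) -> float:
    """Fraction of the chromosome covered by the given regions.

    Regions are (start, end) pairs with 1-based inclusive coordinates;
    overlapping regions are merged before summing their lengths.
    """
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(regions):
        if cur_end is None or start > cur_end:
            # Close the previous merged block and open a new one.
            if cur_end is not None:
                covered += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlap: extend the current block if needed.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start + 1
    return covered / chromosome_len


if __name__ == "__main__":
    # Category counts as reported for S. flexneri in the text.
    calls = ["intact"] * 8 + ["incomplete"] * 4 + ["questionable"] * 3
    print(dict(tally_completeness(calls)))
    # Invented coordinates on an invented 1000 bp chromosome.
    print(covered_fraction([(1, 100), (50, 150), (301, 400)], 1000))
```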
Pathogenicity islands are clusters of mobile elements that encode various virulence factors. PAIs such as SHI-1 (also called she), SHI-2 and the Shigella resistance locus (SRL) were identified in the S. flexneri genome. SHI-1 contains virulence genes such as pic and sigA. SHI-2 comprises genes encoding an aerobactin operon, an iron acquisition siderophore system, transposases and several hypothetical proteins that are associated with increased virulence of the pathogen. The resistance locus, SRL, contains the aadA1, blaOXA-1, cat and tet genes, conferring resistance to streptomycin, beta-lactams, chloramphenicol and tetracyclines.
In contrast, SHI-1 was absent in S. sonnei, which possesses only the SHI-2 island. This could be due to the ability of SHI-1 to undergo spontaneous and specific excision via site-specific recombination. This suggests that S. sonnei might have lost its SHI-1 region in the course of evolution while acquiring other genes important for its successful survival. These pathogenicity islands are reported to be associated with phage integrases, suggesting a role for phages in the evolution of Shigella. The BLAST comparison of these islands with the reference sequences is shown in Figs. 2 and 3.
The present study provides insights into the genetic content and complete structure of various mobile genetic elements that carry virulence and resistance determinants. Although whole genome sequencing is a valuable tool for studying bacterial genomes, short-read assembly (IonTorrent) could provide only limited information, particularly on complete mobile genetic elements. Long-read assembly (MinION), in contrast, could generate a closed genome with enhanced information on the structural arrangement of mobile elements, but with a high error rate. The hybrid assembly approach, involving both short and long reads, provided a complete genome with an acceptable error rate (< 10%). Thus, the use of this approach in the present study helped to identify the complete sequence of plasmid pSS1653, including the structural arrangement of AMR genes such as sulII, tetA, tetR, aph(6)-Id and aph(3'')-Ib. The presence of AMR genes on mobile elements in this human-restricted enteric pathogen poses a potential threat of dissemination to other gut pathogens. Further, the limited information available on Shigella at the genome level calls for genomic surveillance studies to monitor the evolutionary trends and genome dynamics of emerging and existing resistant clones.
Availability of data and materials
References
Kotloff KL, Riddle MS, Platts-Mills JA, Pavlinac P, Zaidi AKM. Shigellosis. Lancet. 2018;391:801–12.
Thompson CN, Duy PT, Baker S. The rising dominance of Shigella sonnei: an intercontinental shift in the etiology of bacillary dysentery. PLoS Negl Trop Dis. 2015;9:e0003708.
World Health Organization. Global antimicrobial resistance surveillance system (GLASS) report: early implementation 2017–2018.
Juhas M. Horizontal gene transfer in human pathogens. Crit Rev Microbiol. 2015;41:101–8.
Ragupathi NK, Sethuvel DP, Gajendran R, Anandan S, Walia K, Veeraraghavan B. Horizontal transfer of antimicrobial resistance determinants among enteric pathogens through bacterial conjugation. Curr Microbiol. 2019;76:666–72.
Holt KE, Nga TV, Thanh DP, Vinh H, Kim DW, Tra MP, Campbell JI, Hoang NV, Vinh NT, Van Minh P, Thuy CT. Tracking the establishment of local endemic populations of an emergent enteric pathogen. Proc Natl Acad Sci. 2013;110:17522–7.
Vasudevan K, Ragupathi NK, Jacob JJ, Veeraraghavan B. Highly accurate-single chromosomal complete genomes using IonTorrent and MinION sequencing of clinical pathogens. Genomics. 2019. https://doi.org/10.1016/j.ygeno.2019.04.006.
Koren S, Walenz BP, Berlin K, Miller JR, Phillippy AM. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 2017;27:722–36.
Wick RR, Judd LM, Gorrie CL, Holt KE. Unicycler: resolving bacterial genome assemblies from short and long sequencing reads. PLoS Comput Biol. 2017;13:e1005595.
Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, Pevzner PA. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol. 2012;19:455–77.
Li H. Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences. Bioinformatics. 2016;32:2103–10.
Vaser R, Sović I, Nagarajan N, Šikić M. Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res. 2017;27:737–46.
Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, Cuomo CA, Zeng Q, Wortman J, Young SK, Earl AM. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE. 2014;9:e112963.
Joensen KG, Scheutz F, Lund O, Hasman H, Kaas RS, Nielsen EM, et al. Real-time whole genome sequencing for routine typing, surveillance, and outbreak detection of verotoxigenic Escherichia coli. J Clin Microbiol. 2014;52:1501–10.
Zankari E, Hasman H, Cosentino S, Vestergaard M, Rasmussen S, Lund O, et al. Identification of acquired antimicrobial resistance genes. J Antimicrob Chemother. 2012;67:2640–4.
Larsen MV, Cosentino S, Rasmussen S, Friis C, Hasman H, Marvig RL, et al. Multilocus sequence typing of total genome sequenced bacteria. J Clin Microbiol. 2012;50:1355–61.
Sullivan MJ, Petty NK, Beatson SA. Easyfig: a genome comparison visualizer. Bioinformatics. 2011;27:1009–10.
Zhou Y, Liang Y, Lynch KH, Dennis JJ, Wishart DS. PHAST: a fast phage search tool. Nucleic Acids Res. 2011;39:W347–52.
Varani AM, Siguier P, Gourbeyre E, Charneau V, Chandler M. ISsaga is an ensemble of web-based methods for high throughput identification and semi-automatic annotation of insertion sequences in prokaryotic genomes. Genome Biol. 2011;12:R30.
Bopp CA, Brenner FW, Fields PL, et al. Escherichia, Shigella, and Salmonella. In: Murray PR, Baron EJ, Jorgensen J, Pfaller MA, Yolken RH, editors. Manual of clinical microbiology. 8th ed. Washington, DC: American Society for Microbiology; 2003. p. 654–71.
Kim HJ, Ryu JO, Song JY, Kim HY. Multiplex polymerase chain reaction for identification of shigellae and four Shigella species using novel genetic markers screened by comparative genomics. Foodborne Pathog Dis. 2017;14:400–6.
Venkatesan MM, Buysse JM, Kopecko DJ. Use of Shigella flexneri ipaC and ipaH gene sequences for the general identification of Shigella spp. and enteroinvasive Escherichia coli. J Clin Microbiol. 1989;27:2687–91.
Nave HH, Mansouri S, Moghadam MT, Moradi M. Virulence gene profile and multilocus variable-number tandem-repeat analysis (MLVA) of enteroinvasive Escherichia coli (EIEC) isolates from patients with diarrhea in Kerman, Iran. Jundishapur J Microbiol. 2016;9:6.
Zaidi MB, Estrada-García T. Shigella: a highly virulent and elusive pathogen. Curr Trop Med Rep. 2014;1:81–7.
Zhu Z, Cao M, Zhou X, Li B, Zhang J. Epidemic characterization and molecular genotyping of Shigella flexneri isolated from calves with diarrhea in Northwest China. Antimicrob Resist Infect Control. 2017;6:92.
Gu B, Qin TT, Fan WT, Bi RR, Chen Y, Li Y, Ma P. Novel mutations in gyrA and parC among Shigella sonnei strains from Jiangsu Province of China, 2002–2011. Int J Infect Dis. 2017;59:44–9.
Cui X, Wang J, Yang C, Liang B, Ma Q, Yi S, Li H, Liu H, Li P, Wu Z, Xie J, Jia L, Hao R, Wang L, Hua Y, Qiu S, Song H. Prevalence and antimicrobial resistance of Shigella flexneri serotype 2 variant in China. Front Microbiol. 2015;6:435.
Prosseda G, et al. Shedding of genes that interfere with the pathogenic lifestyle: the Shigella model. Res Microbiol. 2012;163:399–406.
Wei J, et al. Complete genome sequence and comparative genomics of Shigella flexneri serotype 2a strain 2457T. Infect Immun. 2003;71:2775–86.
Parajuli P, Deimel LP, Verma NK. Genome analysis of Shigella flexneri serotype 3b strain SFL1520 reveals significant horizontal gene acquisitions including a multidrug resistance cassette. Genome Biol Evol. 2019;11:776–85.
Sakellaris H, Luck SN, Al-Hasani K, Rajakumar K, Turner SA, Adler B. Regulated site-specific recombination of the she pathogenicity island of Shigella flexneri. Mol Microbiol. 2004;52:1329–36.
Ingersoll M, Groisman EA, Zychlinsky A. Pathogenicity Islands of Shigella. In: Hacker J, Kaper JB, editors. Pathogenicity islands and the evolution of pathogenic microbes. Current topics in microbiology and immunology, vol 264/2. Berlin: Springer; 2002. p. 49–65.
The authors gratefully acknowledge the Institutional Review Board of the Christian Medical College, Vellore (83-i/11/13) for approving the study and providing lab space and facilities. The study is part of the Ph.D. dissertation under The Tamil Nadu Dr. M.G.R. Medical University.
The study was supported by the Indian Council of Medical Research, New Delhi (Ref. No: AMR/TF/55/13ECDII dated 23/10/2013).
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Muthuirulandi Sethuvel, D.P., Veeraraghavan, B., Vasudevan, K. et al. Complete genome analysis of clinical Shigella strains reveals plasmid pSS1653 with resistance determinants: a triumph of hybrid approach. Gut Pathog 11, 55 (2019). https://doi.org/10.1186/s13099-019-0334-5
- Hybrid assembly
- Pathogenicity island
- Insertion sequences