Draft genome sequences of the type strains of Shigella flexneri held at Public Health England: comparison of classical phenotypic and novel molecular assays with whole genome sequence
© Ashton et al.; licensee BioMed Central Ltd. 2014
Received: 17 February 2014
Accepted: 23 March 2014
Published: 31 March 2014
Public Health England (PHE) holds a collection of Shigella flexneri Type strains isolated between 1949 and 1972 representing 15 established serotypes and one provisional type, E1037. In this study, the genomes of all 16 PHE Type strains were sequenced using the Illumina HiSeq platform. The relationship between core genome phylogeny and serotype was examined.
The most common target gene for the detection of Shigella species in clinical PCR assays, ipaH, was detected in all genomes. The type-specific target genes were correctly identified in each genome sequence. In contrast to the S. flexneri serotype 5 strain described by Sun et al. (2012), the two PHE serotype 5 Type strains possessed an additional oac gene and were differentiated by the presence (serotype 5b) or absence (serotype 5a) of gtrX. The somatic antigen structure and phylogenetic relationship were broadly congruent for strains expressing serotype-specific antigens III, IV and V, but not for those expressing I and II. The whole genome phylogenies of the 15 isolates sequenced showed that the serotype 6 Type strain was phylogenetically distinct from the other S. flexneri serotypes sequenced. The provisional serotype E1037 fell within the serotype 4 clade, being most closely related to the serotype 4a Type strain.
The S. flexneri genome sequences were used to evaluate phylogenetic relationships between Type strains and validate genotypic and phenotypic assays. The analysis confirmed that the PHE S. flexneri Type strains are phenotypically and genotypically distinct. Novel variants will continue to be added to this archive.
Keywords: Shigella flexneri type strains; Next generation sequencing technology; Molecular serotyping
Shigella flexneri is the predominant cause of shigellosis in the developing world, making appropriate subtyping tools for tracking S. flexneri epidemiology vital to global public health. The S. flexneri serotyping scheme differentiates isolates serologically based on the expression of the major type-specific somatic antigen (I-VI) and common group factor antigens (3,4, designated Y, and 7,8, designated X). The common group factor antigens account for the complex intra-serotype relationships. Currently, there are 15 established serotypes. Traditional S. flexneri serotyping is performed by slide agglutination using antisera raised in rabbits against type-specific and group factor antigens. Recently, Sun et al. published a multiplex PCR approach for molecular serotyping of S. flexneri. This method differentiates the 15 accepted serotypes based on known differences in (i) their gtr genes encoding the type-specific antigens I, II, IV and V, group factor antigen 7,8 (X) and 1c (gtrI, gtrII, gtrIV, gtrV, gtrX and gtrIC); (ii) the oac gene, which mediates O-acetylation in serotypes 1b, 3a, 3b and 4b; and (iii) the wzx6 gene for detection of serotype 6.
Public Health England (PHE) holds an historic collection of 16 S. flexneri Type strains isolated between 1949 and 1972. Strains belonging to this set have been used to produce standardised antiserum for the phenotypic serotyping scheme at PHE for over 60 years. To increase the utility of this collection, we report the draft whole genome sequences of the 16 PHE S. flexneri Type strains in order to facilitate a greater understanding of how whole genome phylogenies compare to typing data generated from diagnostic and molecular serotyping targets.
Table 1 Comparison of the phenotypic and genotypic serotyping
Columns: Phenotypic serotype | Genes detected using PCR scheme | PCR serotype | Serotype derived from genome sequence according to PCR scheme | Genome sequencing results: number of SNPs different in the core regions compared with reference strain | Genome sequencing results: proportion of reference mapped to
Type I + Group 6
gtrI + oac
gtrI + gtrIC
Type II + Y
Type II + X
gtrII + gtrX
Type III + X
gtrIII + gtrX
Type III + Y
Type III + Y
3c Not included in the PCR scheme
Not included in the PCR scheme
Type IV + Y
Type IV + Group 6
gtrIV + oac
Type V + Y
gtrV + oac
5a Not included in the PCR scheme
Not included in the PCR scheme
Type V + X
gtrV + oac + gtrX
5b Not included in the PCR scheme
Not included in the PCR scheme
MASF IV-1 E1037
gtrIV + lpt-O
MASF IV-1 E1037
4a with lpt-O (opt) gene
Genome sequencing and analysis
Table 2 Genome statistics for the S. flexneri genomes sequenced in this study
Columns: Number of high quality mapped reads | Average coverage of S. flexneri 2457T reference | Kmer used in Velvet assembly | Number of contigs | De novo assembly genome size
Genomic data deposition
Wellcome Trust Sanger Institute sequence data is available in the Short Read Archive under the following accession numbers (serotype): ERS088060 (1a); ERS088061 (1b); ERS088062 (1c); ERS088063 (2a); ERS088064 (2b); ERS088065 (3a); ERS088066 (3b); ERS088067 (3c); ERS088068 (4a); ERS088069 (4b); ERS088071 (5a); ERS088072 (5b); ERS088073 (6); ERS088074 (X); ERS088075 (Y); ERS088076 (E1037).
Mapping of the sequencing reads to the 4.6 Mbp S. flexneri serotype 2a strain 2457T reference genome resulted in 99- to 455-fold coverage, with between 731 and 47 787 SNPs relative to the reference genome (Table 1). De novo assembly resulted in an average N50 of 31 621 bp and an average of 447 contigs (Table 2).
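For readers less familiar with the summary statistics quoted above, the following is a minimal sketch of how N50 and mean depth of coverage are conventionally computed. It is not the pipeline used in this study; the contig lengths, read count and 100 bp read length are hypothetical values chosen only to make the arithmetic easy to follow.

```python
def n50(contig_lengths):
    """Return the N50: the contig length L such that contigs of length >= L
    together contain at least half of all assembled bases."""
    if not contig_lengths:
        raise ValueError("no contigs supplied")
    half_total = sum(contig_lengths) / 2
    running = 0
    # Walk from the longest contig downwards until half the assembly is reached.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


def mean_depth(mapped_reads, read_length, reference_length):
    """Approximate mean depth of coverage as mapped bases / reference bases."""
    return mapped_reads * read_length / reference_length


# Hypothetical toy assembly (not data from this study).
toy_contigs = [50_000, 40_000, 30_000, 20_000, 10_000]
toy_n50 = n50(toy_contigs)

# Hypothetical read count and read length against a 4.6 Mbp reference.
toy_depth = mean_depth(4_600_000, 100, 4_600_000)
```

In the toy assembly the two longest contigs already hold 90 000 of 150 000 bases, so the N50 is the length of the second contig.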
It has long been reported that the somatic O antigen of S. flexneri serotype 6 differs considerably from that of the other S. flexneri serotypes and that strains of S. flexneri serotype 6 resemble strains of S. boydii immunochemically. Consistent with previous studies and phenotypic information, serotype 6 formed an outgroup to the other S. flexneri serotypes sequenced (data not shown), being more closely related to Shigella boydii CDC 3083–94 (GenBank: CP001063.1): it differed by 47 787 SNPs from S. flexneri 2a (Table 1) and by approximately 7300 SNPs from S. boydii CDC 3083–94 (data not shown).
In 1972, colleagues in our laboratory reported a provisional new serotype, designated E1037. Isolates of this type were frequently submitted to PHE between 2004 and 2013 (276 isolates submitted to the Gastrointestinal Bacteria Reference Unit, GBRU, since 2004). Phylogenetically, E1037 is closely related to serotype 4a (Figure 1). Other groups have supported the extension of the accepted classification scheme to include this novel type [11, 12].
The presence of key diagnostic and molecular serotyping genes was also determined. We confirmed the presence of the ipaH gene (the target gene for the detection of Shigella species in diagnostic PCR assays) in all the PHE Type strains. It was not possible to assemble the complete ipaH gene de novo in any strain analysed here because multiple homologues of ipaH are present in the genome. However, the entire length of ipaH was detected in all 16 genomes, either by BLAST comparison across multiple contigs or by mapping to the S. flexneri 2a 2457T reference genome.
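The idea of confirming a full-length gene from hits spread over several contigs can be illustrated with a short sketch. This is an assumption-laden illustration, not the authors' procedure: the function name, the 1000 bp gene length and the hit coordinates are all hypothetical, and the intervals stand in for query start and end positions such as those reported in tabular BLAST output.

```python
def covered_fraction(gene_length, hits):
    """Fraction of a query gene covered by the union of alignment intervals.

    hits: iterable of (start, end) positions on the query, 1-based inclusive,
    for example taken from alignments to several different contigs.
    """
    covered = 0
    current_start = current_end = None
    # Normalise orientation, then sweep the intervals from left to right.
    for start, end in sorted((min(s, e), max(s, e)) for s, e in hits):
        start, end = max(start, 1), min(end, gene_length)
        if start > end:
            continue  # interval lies entirely outside the gene
        if current_end is None or start > current_end + 1:
            # Gap before this interval: close the previous merged block.
            if current_end is not None:
                covered += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            # Overlapping or directly adjacent: extend the current block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start + 1
    return covered / gene_length


# Hypothetical hits from three contigs that together span a 1000 bp gene.
full = covered_fraction(1000, [(1, 400), (350, 800), (801, 1000)])
# Hypothetical hits leaving positions 401-599 uncovered.
partial = covered_fraction(1000, [(1, 400), (600, 1000)])
```

A fraction of 1.0 would indicate that every position of the gene is represented on at least one contig, even though no single contig carries the whole gene.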
The molecular serotyping scheme detailed in Sun et al. correlated with the phenotypic data for all isolates tested (Table 1). The provisional type, E1037, was the only Type strain to contain a copy of the plasmid-mediated seroconverting lpt-O (opt) gene. In contrast to the serotype 5 strain described by Sun et al. (2012), both PHE serotype 5 Type strains encoded an additional oac gene, which was intact according to de novo assembly; its presence was confirmed by PCR. The 5a and 5b serotypes were differentiable by the presence (serotype 5b) or absence (serotype 5a) of gtrX (Table 1).
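The gene-presence logic described here can be sketched as a simple lookup. The table below is deliberately restricted to gene combinations that are stated in this article for the PHE Type strains; it is not the complete multiplex PCR scheme of Sun et al., and the function name and the "unresolved" label are hypothetical.

```python
# Gene profiles limited to combinations stated in this article.
# This is NOT the full published PCR scheme; other serotypes are omitted.
PROFILE_TO_SEROTYPE = {
    frozenset({"gtrI", "oac"}): "1b",
    frozenset({"gtrI", "gtrIC"}): "1c",
    frozenset({"gtrII", "gtrX"}): "2b",
    frozenset({"gtrIV", "oac"}): "4b",
    frozenset({"gtrV", "oac"}): "5a",          # PHE serotype 5a Type strain
    frozenset({"gtrV", "oac", "gtrX"}): "5b",  # PHE serotype 5b Type strain
    frozenset({"gtrIV", "lpt-O"}): "E1037",    # provisional type
    frozenset({"wzx6"}): "6",
}


def call_serotype(genes_detected):
    """Return the serotype for an exact gene profile, or 'unresolved'."""
    return PROFILE_TO_SEROTYPE.get(frozenset(genes_detected), "unresolved")


# The two serotype 5 Type strains differ only in the presence of gtrX.
call_5a = call_serotype(["gtrV", "oac"])
call_5b = call_serotype(["gtrV", "oac", "gtrX"])
```

Using exact set matching means that any profile outside the table is reported as unresolved rather than being forced into the nearest serotype.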
The PHE S. flexneri Type strain data set has been used in the validation and evaluation of genotypic and phenotypic assays and has facilitated the study of phylogenetic relationships within this species during outbreak investigations (unpublished observations). Analysis of the genome sequences, in conjunction with the phenotypic serotyping data, provided new insights into this historic strain set. Comparisons with the PCR serotyping scheme highlighted the need to add novel variants in order to maintain a comprehensive collection of relevant Type strains.
SNPs: Single nucleotide polymorphisms.
The authors would like to acknowledge the support of Kathie Grant, Catherine Arnold, Jonathan Green and PHE NGS Implementation Group.
1. Kotloff KL, Nataro JP, Blackwelder WC, Nasrin D, Farag TH, Panchalingam S, Wu Y, Sow SO, Sur D, Breiman RF, Faruque AS, Zaidi AK, Saha D, Alonso PL, Tamboura B, Sanogo D, Onwuchekwa U, Manna B, Ramamurthy T, Kanungo S, Ochieng JB, Omore R, Oundo JO, Hossain A, Das SK, Ahmed S, Qureshi S, Quadri F, Adegbola RA, Antonio M: Burden and aetiology of diarrhoeal disease in infants and young children in developing countries (the Global Enteric Multicenter Study, GEMS): a prospective, case–control study. Lancet. 2013, 382 (9888): 209-222. 10.1016/S0140-6736(13)60844-2.
2. Ewing WH, Carpenter KP: Recommended designations for the subserotypes of Shigella flexneri. Int J Syst Bacteriol. 1966, 16: 145-149. 10.1099/00207713-16-2-145.
3. Sun Q, Lan R, Wang Y, Zhao A, Zhang S, Wang J, Wang Y, Xia S, Jin D, Cui Z, Zhao H, Li Z, Ye C, Zhang S, Jing H, Xu J: Development of a multiplex PCR assay targeting O-antigen modification genes for molecular serotyping of Shigella flexneri. J Clin Microbiol. 2011, 49: 3766-3770. 10.1128/JCM.01259-11.
4. Gross RJ, Rowe B: Serotyping of Escherichia coli. The Virulence of Escherichia coli: Reviews and Methods (Special Publication of the Society for General Microbiology no. 13). Edited by: Sussman M. 1985, London: Academic Press, 345-360.
5. Lohse M, Bolger AM, Nagel A, Fernie AR, Lunn JE, Stitt M, Usadel B: RobiNA: a user-friendly, integrated software solution for RNA-Seq-based transcriptomics. Nucleic Acids Res. 2012, 40 (Web Server issue): W622-W627.
6. Wei J, Goldberg MB, Burland V, Venkatesan MM, Deng W, Fournier G, Mayhew GF, Plunkett G, Rose DJ, Darling A, Mau B, Perna NT, Payne SM, Runyen-Janecky LJ, Zhou S, Schwartz DC, Blattner FR: Complete genome sequence and comparative genomics of Shigella flexneri serotype 2a strain 2457T. Infect Immun. 2003, 71: 2775-2786. 10.1128/IAI.71.5.2775-2786.2003.
7. DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis AA, del Angel G, Rivas MA, Hanna M, McKenna A, Fennell TJ, Kernytsky AM, Sivachenko AY, Cibulskis K, Gabriel SB, Altshuler D, Daly MJ: A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 2011, 43: 491-498. 10.1038/ng.806.
8. Zerbino DR, Birney E: Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 2008, 18: 821-829. 10.1101/gr.074492.107.
9. Slopek S, Mulczyk M: Concerning the classification of Shigella flexneri 6 bacilli. Arch Immunol Ther Exp (Warsz). 1967, 15: 600-603.
10. Yang J, Nie H, Chen L, Zhang X, Yang F, Xu X, Zhu Y, Yu J, Jin Q: Revisiting the molecular evolutionary history of Shigella spp. J Mol Evol. 2007, 64: 71-79. 10.1007/s00239-006-0052-8.
11. Pryamukhina NS, Khomenko NA: Suggestion to supplement Shigella flexneri classification scheme with the subserovar Shigella flexneri 4c: phenotypic characteristics of strains. J Clin Microbiol. 1988, 26: 1147-1149.
12. Sun Q, Knirel YA, Lan R, Wang J, Senchenkova SN, Jin D, Shashkov AS, Xia S, Perepelov AV, Chen Q, Wang Y, Wang H, Xu J: A novel plasmid-encoded serotype conversion mechanism through addition of phosphoethanolamine to the O-antigen of Shigella flexneri. PLoS One. 2012, 7: e46095. 10.1371/journal.pone.0046095.
13. Sun Q, Lan R, Wang J, Xia S, Wang Y, Wang Y, Jin D, Yu B, Knirel YA, Xu J: Identification and characterization of a novel Shigella flexneri serotype Yv in China. PLoS One. 2013, 8: e70238. 10.1371/journal.pone.0070238.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.