Comparative genomic analysis of Clostridium difficile ribotype 027 strains including the newly sequenced strain NCKUH-21 isolated from a patient in Taiwan
© The Author(s) 2017
Received: 13 October 2017
Accepted: 21 November 2017
Published: 29 November 2017
Clostridium difficile is a Gram-positive anaerobe and the leading cause of antibiotic-associated diarrhea worldwide. The emergence of ribotype 027 (RT027) strains is associated with increased incidence of infection and mortality. To further understand the relationship between C. difficile NCKUH-21, a RT027 strain isolated from a patient in Taiwan, and other RT027 strains, we performed whole-genome shotgun sequencing on NCKUH-21 and comparative genomic analyses.
The genome size, G+C content, and gene number for the NCKUH-21 strain were determined to be similar to those for other C. difficile strains. The core genome phylogeny indicated that the five RT027 strains R20291, CD196, NCKUH-21, BI1, and 2007855 formed a clade. A pathogenicity locus, tcdR-tcdB-tcdE-orf-tcdA-tcdC, was conserved in the genome. A genomic region highly similar to the Clostridium phage \(\upvarphi\)CD38-2 was present in the NCKUH-21 strain but absent in the other RT027 strains and designated as the prophage \(\upvarphi\)NCKUH-21. The prophage \(\upvarphi\)NCKUH-21 genes were significantly higher in G+C content than the other genes in the NCKUH-21 genome, indicating that the prophage does not match the base composition of the host genome.
This is the first whole-genome analysis of a RT027 C. difficile strain isolated from Taiwan. Due to the high identity with \(\upvarphi\)CD38-2, the prophage identified in the NCKUH-21 genome has the potential to regulate toxin production. These results provide important information for understanding the pathogenicity of RT027 C. difficile in Taiwan.
Clostridium difficile is a Gram-positive, endospore-forming obligate anaerobe and the current leading cause of antibiotic-associated diarrhea (AAD) within hospital settings worldwide. Estimates indicate that C. difficile infections (CDI) are responsible for 15–25% of all AAD cases. CDI can be triggered by disruption of the host's gut microbiota by broad-spectrum antibiotic treatment. Aging, prolonged stays in health care settings, and proton-pump inhibitor use all contribute to an increased risk of CDI. Although C. difficile has been characterized for decades, it first gained prominence in 2003 when an outbreak in North America was found to be caused by a strain with toxin hyperproduction capabilities. The rapid spread of the C. difficile NAP1/BI/027 strain (PCR ribotype 027 or RT027; the three designations refer to the same strain characterized by different methods) has resulted in outbreaks worldwide, although fewer cases have been reported in Asia and Latin America than in Europe and North America.
According to a previous case report, NCKUH-21 is the strain isolated from the first severe RT027 CDI in Taiwan, and it contains a deletion of 18 base pairs and a truncating mutation (D117A) in tcdC. To further understand the relationship between NCKUH-21 and other RT027 strains, including historic and hypervirulent strains, we determined the genome sequence of the C. difficile strain NCKUH-21 (accession numbers: BDSN01000001–BDSN01000094) and compared it with those of other sequenced RT027 strains. We assessed the NCKUH-21 genome for the presence of virulence and antibiotic resistance genes. We also compared the genome sequence of the NCKUH-21 strain with those of its close relatives to investigate genome synteny, reconstruct the phylogenetic tree, and identify NCKUH-21 strain-specific genes.
Analysis of the genomic features of Clostridium strains
C. difficile R20291: An epidemic strain, UK, 2006
C. difficile CD196: A patient with CDI, France, 1985
C. difficile NCKUH-21: A patient with severe PMC, Taiwan, 2014
C. difficile BI1: A human strain, USA, 1988
C. difficile 2007855: A bovine strain, USA, 2007
C. difficile Z31: A canine NTCD strain, Brazil, 2009
C. difficile CD630: A patient with severe PMC, Switzerland, 1982
C. difficile M68: A human strain, Ireland, 2006
C. difficile M120: A human strain, UK, 2007
C. mangenotii LM2: A reference genome from the rumen microbiome
Genomic DNA was purified from a pure culture of a single bacterial isolate of NCKUH-21. A BLAST search against a nonredundant database revealed no evidence of contamination in the genomic libraries.
Results and discussion
Illumina MiSeq sequencing was performed to determine the genome sequence of the C. difficile strain NCKUH-21. The de novo assembly contained 94 contigs with a total length of 4,217,149 bp, a G+C content of 28.4%, and a sequencing coverage of 1611×. Genome annotation yielded a total of 3810 protein-coding sequences (CDSs).
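The summary statistics reported for the assembly (contig count, total length, and G+C content) can be recomputed from a multi-FASTA file of contigs. The following is a minimal sketch in Python, not the pipeline used in this study; the function names and the toy sequences are hypothetical, and G+C content is computed over all bases, including any ambiguous ones.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def assembly_stats(handle):
    """Return (contig count, total length in bp, G+C percentage)."""
    n_contigs = total = gc = 0
    for _, seq in read_fasta(handle):
        n_contigs += 1
        total += len(seq)
        gc += seq.count("G") + seq.count("C")
    gc_percent = 100.0 * gc / total if total else 0.0
    return n_contigs, total, gc_percent


# Toy input standing in for a real contig file (hypothetical sequences).
toy = StringIO(">contig1\nATGCATAT\n>contig2\nGGAT\nTTAA\n")
stats = assembly_stats(toy)
print(stats)
```

For a real assembly, the `StringIO` object would be replaced by an open file handle to the contig FASTA file.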
Among the C. difficile strains analyzed in this paper, the genome size (Mb) ranged from 4.05 to 4.46, G+C content ranged from 28.4 to 29.2%, and CDS number ranged from 3485 to 4128 (Table 1). The general genomic features for the NCKUH-21 strain were thus similar to those of the other C. difficile strains.
Antibiotic resistance and virulence genes
Antibiotic resistance and virulence genes were searched using ABRicate. Homologous DNA sequences for the binary toxin genes cdtA and cdtB listed in the Virulence Factors Database (accessions AAF81760 and AAF81761, respectively) were detected in the NCKUH-21 genome. Homologous DNA sequences for the antibiotic resistance genes cdeA, vanRG, and vanG listed in the Comprehensive Antibiotic Resistance Database (accessions AJ574887.1:371–1697, DQ212986:2259–2967, and DQ212986:5985–7035, respectively) were also detected in the NCKUH-21 genome. Although NCKUH-21 carries genes with the potential to confer antibiotic resistance, this strain was shown to be susceptible to moxifloxacin (minimum inhibitory concentration 0.5 μg/mL), metronidazole (0.094 μg/mL), and vancomycin (0.5 μg/mL).
The genetic organization of the pathogenicity locus (PaLoc) of the CD630 strain is tcdR-tcdB-tcdE-orf-tcdA-tcdC (locus_tag: CD630_06590, CD630_06600, CD630_06610, CD630_06620, CD630_06630, and CD630_06640). The gene order was conserved in the NCKUH-21 genome (accession number: BDSN01000011; locus_tag: NCKUH21_00647, NCKUH21_00648, NCKUH21_00649, NCKUH21_00650, NCKUH21_00651, and NCKUH21_00652). Moreover, another sequence similar to tcdE (CD630_06610) was found in the NCKUH-21 genome (locus_tag: NCKUH21_03847) with 83% amino acid identity. The genes tcdB and tcdA, encoding Toxin B and Toxin A, respectively (locus_tag: CD630_06600 and CD630_06630; 2366 and 2710 amino acids in length), of the CD630 PaLoc were determined to be homologous to each other, with 48% amino acid identity. Additionally, these two genes partly matched a sequence encoding “N-acetylmuramoyl-l-alanine amidase LytC” (accession number: BDSN01000021; locus_tag: NCKUH21_02692; 644 amino acids in length) in the NCKUH-21 genome, with alignment lengths of 177 and 226 and amino acid identities of 32 and 34%, respectively. The PaLoc gene homologues may contribute to the virulence and pathogenicity of the C. difficile strain NCKUH-21.
NCKUH-21 strain-specific genes
To identify NCKUH-21 strain-specific genes, we searched for homologues of the NCKUH-21 proteins in the genome sequences of all C. difficile strains by using the gene screen method with TBLASTN in the large-scale blast score ratio (LS-BSR) pipeline. Of the 3810 protein-coding genes identified in NCKUH-21, 3579 were conserved in all the other RT027 strains (R20291, CD196, BI1, and 2007855), and 2832 were conserved in all the C. difficile strains used in this study. In pairwise comparisons, the largest numbers of NCKUH-21 genes were conserved in the RT027 strains (R20291, CD196, BI1, and 2007855), ranging from 3592 to 3655, followed by the other C. difficile strains (Z31, CD630, M68, and M120), ranging from 3153 to 3431, and finally the outgroup LM2 (761).
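The idea behind a blast score ratio (BSR) screen can be illustrated as follows: for each query gene, the bit score of its best hit in a target genome is divided by the bit score of the gene aligned against itself, giving a value between 0 (absent) and 1 (identical). The sketch below is an illustration only and does not reproduce the LS-BSR pipeline; the function names, the toy scores, and the thresholds (0.8 for "conserved", 0.4 for "absent") are assumptions, because the cutoffs applied in this study are not stated in the text.

```python
def blast_score_ratio(hit_bits, self_bits):
    """BSR = best-hit bit score in a target genome / self-alignment bit score."""
    return hit_bits / self_bits if self_bits > 0 else 0.0


def classify_genes(self_bits, hit_bits, present=0.8, absent=0.4):
    """Split query genes into conserved and strain-specific sets.

    self_bits: {gene: self-alignment bit score}
    hit_bits:  {genome: {gene: best-hit bit score in that genome}}
    A gene is 'conserved' if BSR >= present in every genome and
    'specific' if BSR < absent in every genome. Both thresholds are
    assumptions made for this example.
    """
    conserved, specific = [], []
    for gene, self_score in self_bits.items():
        ratios = [
            blast_score_ratio(hits.get(gene, 0.0), self_score)
            for hits in hit_bits.values()
        ]
        if all(r >= present for r in ratios):
            conserved.append(gene)
        elif all(r < absent for r in ratios):
            specific.append(gene)
    return conserved, specific


# Hypothetical bit scores for three query genes against two target genomes.
self_bits = {"geneA": 500.0, "geneB": 300.0, "geneC": 200.0}
hit_bits = {
    "strainX": {"geneA": 495.0, "geneB": 290.0, "geneC": 20.0},
    "strainY": {"geneA": 480.0, "geneB": 90.0},  # geneC has no hit here
}
conserved, specific = classify_genes(self_bits, hit_bits)
print(conserved, specific)
```

In this toy input, geneB falls in neither set because it is well conserved in one genome but divergent in the other; such intermediate genes are why a BSR screen typically uses two thresholds rather than one.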
A total of 140 protein-coding genes were present in the NCKUH-21 strain but absent in the other strains (Additional file 2: Table S1). The NCKUH-21 strain-specific genes could have been gained on the branch leading to the NCKUH-21 strain, and they could thus be linked to its specific phenotype (e.g., virulence and pathogenicity). Of the 140 NCKUH-21 strain-specific genes, 50 were encoded on the 40,525-bp-long contig sequence of the NCKUH-21 genome (the accession number: BDSN01000034), which showed a 99% identity match to the Clostridium phage \(\upvarphi\)CD38-2 (GenBank accession: HM568888). The genomic region highly similar to the Clostridium phage \(\upvarphi\)CD38-2 was designated as the prophage \(\upvarphi\)NCKUH-21.
The prophage \(\upvarphi\)NCKUH-21 detected in the draft genome of the C. difficile strain NCKUH-21 was further confirmed by phage induction and electron microscopy (data not shown). A previous study suggested that lysogenic \(\upvarphi\)CD38-2 replicates as a circular plasmid and boosts toxin production in C. difficile. The high sequence identity between \(\upvarphi\)NCKUH-21 and \(\upvarphi\)CD38-2 suggests that these prophages have a similar role in C. difficile.
Reports have shown that bacteriophages tend to be lower in G+C content than their hosts and that viruses generally match the G+C content of their hosts [11, 12], including the C. difficile bacteriophage \(\upvarphi\)CD119. Base composition statistics for the NCKUH-21 genes were calculated as the relative frequency of G+C at third codon positions (GC3). The median GC3 value for the prophage \(\upvarphi\)NCKUH-21 genes (0.21) was higher than that for the other genes (0.14) in the NCKUH-21 genome. A Wilcoxon rank sum test comparing the GC3 values between the two groups of genes was highly significant (P < 2.2e−16). This suggests that the prophage \(\upvarphi\)NCKUH-21 does not match the base composition of the host genome and may thus have been acquired by horizontal transfer, in line with the hypothesis of genome amelioration.
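The GC3 comparison described above can be sketched in a few lines of Python. This is an illustrative example under stated assumptions, not the code used in this study: the function names and toy values are hypothetical, stop codons are not excluded from the GC3 count, and the rank sum test uses a normal approximation with tie correction and no continuity correction, so its P values may differ slightly from those of a statistics package.

```python
import math


def gc3(cds):
    """Fraction of G or C at third codon positions of an in-frame CDS."""
    thirds = cds.upper()[2::3]  # bases at positions 3, 6, 9, ...
    if not thirds:
        return 0.0
    return sum(base in "GC" for base in thirds) / len(thirds)


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum (Mann-Whitney U) test.

    Returns (U statistic for x, approximate P value).
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n = n1 + n2
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j for tied values
        t = j - i
        tie_term += t**3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u_x, p_value


# Hypothetical GC3 values for prophage genes and for the remaining genes.
prophage_gc3 = [0.30, 0.40, 0.50]
other_gc3 = [0.10, 0.20]
u_stat, p_value = rank_sum_test(prophage_gc3, other_gc3)
print(gc3("ATGGCCAAGTAA"), u_stat, p_value)
```

With real data, `gc3` would be applied to every annotated CDS, and the two lists would hold the values for the prophage genes and for all other genes in the genome.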
From 2013 to 2014, three RT027 C. difficile strains were isolated from patients in Taiwan [5, 15, 16]. Among them, NCKUH-21 is the first strain to have a whole-genome sequence for genome comparison. Whether the other two RT027 isolates also carry a complete prophage, what their phylogenetic relation with NCKUH-21 is, and what the relative toxin production level is between the three isolates are all topics for further research.
HS conducted the bioinformatics analyses and drafted the manuscript. MT managed bioinformatics environments and helped write the manuscript. JWC performed the laboratory experiments and wrote the manuscript. IHH provided experimental suggestions and wrote the manuscript. PJT, WCK, and YPH provided the isolate and clinical characterizations. All authors read and approved the final manuscript.
Computational resources were provided by the Data Integration and Analysis Facility, National Institute for Basic Biology, Japan.
The authors declare that they have no competing interests.
Availability of data and materials
Nucleotide sequence accession numbers: The whole-genome shotgun sequencing data have been deposited in DDBJ/EMBL/GenBank under the accession numbers BDSN01000001–BDSN01000094 (94 entries).
Consent for publication
Ethics approval and consent to participate
This work was supported in part by research funding from Keio University and from Yamagata Prefecture and Tsuruoka City, Japan, and Ministry of Science and Technology, Taiwan, Grants (103-2320-B-006-028-MY2 to JC, 106-2633-B-006-002- to IH).
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Ananthakrishnan AN. Clostridium difficile infection: epidemiology, risk factors and management. Nat Rev Gastroenterol Hepatol. 2011;8(1):17–26.
- Bartlett JG, Gerding DN. Clinical recognition and diagnosis of Clostridium difficile infection. Clin Infect Dis. 2008;46(Suppl 1):S12–8.
- Jump RL. Clostridium difficile infection in older adults. Aging Health. 2013;9(4):403–14.
- O’Connor JR, Johnson S, Gerding DN. Clostridium difficile infection caused by the epidemic BI/NAP1/027 strain. Gastroenterology. 2009;136(6):1913–24.
- Hung YP, Cia CT, Tsai BY, Chen PC, Lin HJ, Liu HC, Lee JC, Wu YH, Tsai PJ, Ko WC. The first case of severe Clostridium difficile ribotype 027 infection in Taiwan. J Infect. 2015;70(1):98–101.
- Kurka H, Ehrenreich A, Ludwig W, Monot M, Rupnik M, Barbut F, Indra A, Dupuy B, Liebl W. Sequence similarity of Clostridium difficile strains by analysis of conserved genes and genome content is reflected by their ribotype affiliation. PLoS ONE. 2014;9(1):e86535.
- Pereira FL, Oliveira Junior CA, Silva ROS, Dorella FA, Carvalho AF, Almeida GMF, Leal CAG, Lobato FCF, Figueiredo HCP. Complete genome sequence of Peptoclostridium difficile strain Z31. Gut Pathog. 2016;8:11.
- Gerding DN, Johnson S, Rupnik M, Aktories K. Clostridium difficile binary toxin CDT: mechanism, epidemiology, and potential clinical importance. Gut Microbes. 2014;5(1):15–27.
- Monot M, Eckert C, Lemire A, Hamiot A, Dubois T, Tessier C, Dumoulard B, Hamel B, Petit A, Lalande V, et al. Clostridium difficile: new insights into the evolution of the pathogenicity locus. Sci Rep. 2015;5:15023.
- Sekulovic O, Meessen-Pinard M, Fortier LC. Prophage-stimulated toxin production in Clostridium difficile NAP1/027 lysogens. J Bacteriol. 2011;193(11):2726–34.
- Rocha EP, Danchin A. Base composition bias might result from competition for metabolic resources. Trends Genet. 2002;18(6):291–4.
- Cardinale DJ, Duffy S. Single-stranded genomic architecture constrains optimal codon usage. Bacteriophage. 2011;1(4):219–24.
- Govind R, Fralick JA, Rolfe RD. Genomic organization and molecular characterization of Clostridium difficile bacteriophage PhiCD119. J Bacteriol. 2006;188(7):2568–77.
- Lawrence JG, Ochman H. Amelioration of bacterial genomes: rates of change and exchange. J Mol Evol. 1997;44(4):383–97.
- Liao TL, Lin CF, Chiou CS, Shen GH, Wang J. Clostridium difficile PCR ribotype 027 emerges in Taiwan. Jpn J Infect Dis. 2015;68(4):338–40.
- Lai MJ, Chiueh TS, Huang ZY, Lin JC. The first Clostridium difficile ribotype 027 strain isolated in Taiwan. J Formos Med Assoc. 2016;115(3):210–2.