Analysis of proteomes released from in vitro cultured eight Clostridium difficile PCR ribotypes revealed specific expression in PCR ribotypes 027 and 176 confirming their genetic relatedness and clinical importance at the proteomic level

Background: Clostridium difficile is the causative agent of C. difficile infection (CDI), which can manifest as diarrhea, pseudomembranous colitis or life-threatening toxic megacolon. The spread of certain strains represents a significant economic burden for health care. Epidemically successful strains are also associated with severe clinical features of CDI. Therefore, a proteomic study was conducted on the proteomes released from an in vitro cultured panel of eight different PCR ribotypes (RTs), combining shotgun proteomics with a label-free quantification (LFQ) approach.
Results: The comparative semi-quantitative analyses enabled investigation of a total of 662 proteins. Both hierarchical clustering and principal component analysis (PCA) created eight distinctive groups. From these quantifiable proteins, 27 were significantly increased in functional annotations. Among them, several known factors connected with virulence were identified, such as toxin A, toxin B, binary toxin, flagellar proteins, and proteins associated with the Pro–Pro endopeptidase (PPEP-1) functional complex. Comparative analysis of protein expression showed a higher or unique expression of proteins linked to pathogenicity or iron metabolism in RTs 027 and 176, supporting their genetic relatedness and clinical importance at the proteomic level. Moreover, the absence of putative nitroreductase and the abundance of the Abc-type Fe3+ transport system protein were observed as biomarkers for the RTs possessing binary toxin genes (027, 176 and 078). Higher expression of selected flagellar proteins clearly distinguished RTs 027, 176, 005 and 012, confirming the pathogenic role of the assembly in CDI. Finally, the histidine synthesis pathway regulating protein complex HisG/HisZ was observed only in isolates possessing the genes for toxin A and B.
Conclusions: This study showed the applicability of the LFQ approach and provided the first semi-quantitative insight into the proteomes released from an in vitro cultured panel of eight RTs. The observed differences point to a new direction for studies focused on elucidating the mechanisms underlying the nature of CDI.
Electronic supplementary material: The online version of this article (doi:10.1186/s13099-017-0194-9) contains supplementary material, which is available to authorized users.


Background
Clostridium difficile is a ubiquitous Gram-positive, spore-forming anaerobic bacterium. Toxin-producing strains of C. difficile can cause infection (CDI) manifested by diarrhea, pseudomembranous colitis, or, in its severe form, toxic megacolon. The spread of certain C. difficile PCR ribotypes in health-care settings has been reported, and the global trend of increasing CDI incidence is unfavorable [1].
Several well-studied major virulence factors, such as the C. difficile toxins (toxin A, TcdA, and toxin B, TcdB), which influence the severity of CDI, are at the focus of current research [2]. Different levels of toxin expression in vivo have been reported [1], as has the connection between the expression of toxins and flagellar proteins, which are involved in motility and gut colonization [3].
Previous proteomic studies were based on investigation of either whole cell lysates [4,5] or culture supernatants representing the possible secretome in vitro [2,6]. For comparative proteomic analyses, difference gel electrophoresis (DIGE) [4] or isotopic labelling of selected proteins [7] was used.
The aforementioned studies enabled the analysis of the large toxins TcdA and TcdB [2] and also highlighted the role of proteins involved in adhesion and cell surface composition. Differences in the level of expression of proteins including adhesins, S-layer proteins, cell wall proteins, a number of S-layer protein paralogues [4,5,6], and other potential virulence factors were identified and proposed to play a role in the virulence characteristics of individual isolates.
This proteomic study used a combination of shotgun proteomics, to obtain the proteomic profile, and the label-free quantification (LFQ) approach, for semi-quantitative analysis. Shotgun proteomics, which uses high-resolution tandem mass spectrometry, enables the analysis of hundreds of proteins in a cost-effective manner.
In spite of its accuracy and sensitivity, the major drawbacks of the gel-based approach are that it requires a relatively high amount of protein sample and that it is labor-intensive. Moreover, labelling-based techniques are limited by the need for expensive consumables, an inability to add further samples to the experiment, and a limited number of compared groups [8]. In contrast, LFQ does not require any labelling step in the sample preparation workflow and relies only on spectral counting or the MS1 intensity of the quantified feature. Because labels are unnecessary, the method has several attractive benefits: the implementation cost is low; the lack of additional steps reduces undesirable biases in the analyses; and the number of treatment conditions and sample replicates is essentially unrestricted. Collectively, these features allow for flexibility in experimental design. Furthermore, recent developments in label-free quantification software have increased the robustness of label-free quantitation workflows by introducing sophisticated normalization and feature alignment algorithms [9]. In bacterial proteomics, LFQ has recently been applied to the comparison of lysates [10], to the relative abundance of ribosomal proteins in Pseudomonas aeruginosa [11], and to phosphoproteomes of Bacillus subtilis [12].
For the current proteomic comparative analysis, we decided to analyze proteomes released from in vitro cultured panel for the following reasons: (i) the released proteins are of high relevance for Clostridia pathogenicity and virulence; (ii) the complexity of the released fraction is much lower than that of cellular proteome and therefore it is more amenable for the scope of our analysis.
The panel of eight C. difficile isolates for the study was selected from the Czech C. difficile strain collection [13]. RTs 001 and 176 belong to the predominant RTs in the Czech Republic (26.7 and 20.7%, respectively), followed by RTs 014 and 012 (8.0 and 5.8%, respectively) [13]. The occurrence of RTs 027 and 078 in the Czech Republic is rare (0.2 and 1.6%, respectively) [13]; however, these RTs have been suggested to be "hypervirulent" [14,15]. Moreover, a higher expression of TcdA and TcdB in RT 027 was previously observed [1]. RTs 027 (19%), 001 (11%) and 014 (10% together with RT 020) are among the most frequently found RTs in Europe [16]. The protein expression profile of RT 005 together with RTs 001 and 027 was studied previously [4]. RT 010 was included in the study as a "negative control" due to the absence of a pathogenicity locus.

Clostridium difficile isolates
Eight well characterized clinical isolates of C. difficile were selected from the Czech National C. difficile strain collection (Table 1) [13].
Seven C. difficile isolates were cultured from diarrheal glutamate dehydrogenase (GDH) and toxin A/B positive stool samples of hospitalized patients with CDI. One C. difficile isolate (RT 010, non-toxigenic) was cultured from diarrheal GDH positive and toxin A/B negative stool sample of patient with Candida-acquired diarrhea. All isolates were sensitive to metronidazole (MTZ).

Keywords: Clostridium difficile, Label-free quantification, Proteome, PCR ribotype 027, PCR ribotype 176, Binary toxin, Toxins A/B, Flagellins

Culture of C. difficile isolates and supernatant precipitation
Clostridium difficile isolates were recovered from the frozen stocks by inoculating on Schaedler Anaerobe Agar CM0437 (Oxoid) and cultured for 48 h at 37 °C under anaerobic conditions. Toxin production of all strains in the study was confirmed using a commercial immunochromatographic assay (Vidia, Czech Republic) designed to detect free toxins A and B in stool samples; the bacterial suspension was tested in place of a stool sample. The bacterial mass was resuspended in Thioglycolate medium USP (Oxoid) and the number of bacteria (CFU) was estimated from the optical density (OD) at 595 nm (Multiskan Spectrum plate reader, Thermo Fisher Scientific), considering that OD 1 in 1 mL of Thioglycolate medium corresponds to 2.4 × 10^6 CFU. Later, 9 mL of Thioglycolate medium was inoculated in triplicate for each representative strain to OD 1.99 and cultivated for 5 days at 37 °C under anaerobic conditions. OD was also measured at the end of the cultivation and reached comparable values among the cultures (Additional file 1: Table S1). Capillary electrophoresis ribotyping of C. difficile isolates was performed, using primers described elsewhere [17], before resuspension in Thioglycolate medium USP and after 5 days of culture before proteomic analysis.
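The OD-to-CFU conversion stated above can be sketched as follows. This is only an illustration that assumes the relationship is linear over the range used; the function and constant names are hypothetical and are not taken from the study.

```python
# Conversion factor stated in the text: OD 1 in 1 mL of Thioglycolate
# medium corresponds to 2.4 x 10^6 CFU.
CFU_PER_OD_PER_ML = 2.4e6


def estimate_cfu(od_595: float, volume_ml: float = 1.0) -> float:
    """Estimate CFU from OD at 595 nm.

    Assumption (not stated in the source): CFU scales linearly with
    both OD and culture volume.
    """
    if od_595 < 0 or volume_ml < 0:
        raise ValueError("OD and volume must be non-negative")
    return od_595 * CFU_PER_OD_PER_ML * volume_ml


if __name__ == "__main__":
    # CFU estimate for 1 mL of suspension at OD 1
    print(f"{estimate_cfu(1.0):.2e} CFU")
```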

Sample preparation
After the bacterial cells were pelleted by centrifugation (18,000g/20 min) to remove them from the proteomes released from in vitro cultures, the pH of the supernatants was adjusted to 3.5 with 3 M sulfuric acid (Sigma-Aldrich). After an overnight precipitation at 4 °C, the pellets were recovered by centrifugation (18,000g/20 min). Because the supernatant potentially contained interfering substances, a sample preparation workflow based on filter-aided sample preparation (FASP) was applied [18]. Pellets were resuspended in 100 mM ammonium bicarbonate (Sigma-Aldrich), and proteins were quantified by bicinchoninic acid assay (QuantiPro™ BCA Assay Kit, Sigma-Aldrich) [19]. Data are available in the Additional file 1: Table S1. Resuspended pellets were transferred onto Amicon® Ultra-10 kDa filters (Millipore) and washed twice with 100 mM ammonium bicarbonate. Subsequently, the samples were denatured with 8 M guanidinium chloride (Sigma-Aldrich), reduced with 100 mM Tris(2-carboxyethyl)phosphine hydrochloride (TCEP, Sigma-Aldrich) and alkylated with 300 mM iodoacetamide (Sigma-Aldrich). Finally, the samples were digested with 2 μg of sequencing grade trypsin (Promega) overnight at 37 °C. Empore™ SPE Cartridges, C18, standard density, bed I.D. 4 mm (Sigma-Aldrich) were used to desalt peptide mixtures before drying to completion in a speed-vac. Before the mass spectrometry analysis, the samples were resuspended in 30 μL of 2% acetonitrile (ACN)/0.1% trifluoroacetic acid. The samples were further analyzed by LC-MS/MS techniques involving targeted mass spectrometry and LFQ. Subcellular localization of the proteins released from the in vitro cultured panel was evaluated by bioinformatic tools. A detailed description of the procedure is provided in the Additional file 1.

Results
Using the LFQ approach, a total of 662 quantifiable proteins were analyzed (Additional file 1: Table S3). The observed quantities of the proteins are depicted as log2-transformed LFQ intensities [9]. The values, ranging from 22 to 34, reflect the dynamic range of the mass spectrometry based workflow. LFQ intensities below the lower bound of this range are considered non-analyzable by the implemented qualitative test [20]. The largest number of shared proteins was observed between RTs 027 and 176 (n = 563), and the lowest for RT 078 compared with all other RTs in the study (n = 454-479); see Fig. 1a. Pathway mapping in KEGG was done using the DAVID classification tool [21] against C. difficile strain 630, and several biological processes were annotated to 40.1% of the quantifiable proteins (Fig. 1b).
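The log2 transformation and the reported 22 to 34 range can be illustrated with a short sketch. The function names are hypothetical, and treating the lower bound of 22 as the cut-off for analyzable values is an assumption made here for illustration; the qualitative test of [20] is not reproduced.

```python
import math

# Range of log2 LFQ intensities reported in the text.
LOG2_MIN, LOG2_MAX = 22.0, 34.0


def log2_lfq(intensity):
    """Return log2 of an LFQ intensity, or None for missing or zero values."""
    if intensity is None or intensity <= 0:
        return None
    return math.log2(intensity)


def is_analyzable(intensity):
    """True if the log2 intensity reaches the lower bound of the reported range.

    Assumption: the lower bound (22) is used as a simple threshold.
    """
    value = log2_lfq(intensity)
    return value is not None and value >= LOG2_MIN


if __name__ == "__main__":
    for raw in (2 ** 20, 2 ** 22, 2 ** 30, 0):
        print(raw, log2_lfq(raw), is_analyzable(raw))
```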

Comparison of proteomes of individual RTs
To assess the applicability of the LFQ approach, we examined the similarity of individual proteomes using hierarchical clustering and principal component analysis. Unsupervised cluster analysis of protein expression profiles was performed using Euclidean distances. Statistical procedures were performed using the computational platform Perseus [9]. Both hierarchical clustering (Fig. 1c) and PCA (Fig. 1d) generated eight distinctive groups encompassing each biological triplicate of the analyzed RT representatives and showed the applicability of this workflow. For example, RTs 027 and 176 nearly co-cluster, whereas proteomes from RT 078 formed a distinctive group.
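A minimal sketch of agglomerative clustering on Euclidean distances is given below to illustrate how replicates of one RT are expected to group together. It is not the Perseus implementation: average linkage is an assumption (the linkage is not stated in the text), and the sample names and log2 LFQ values are invented toy data.

```python
import math
from itertools import combinations


def average_distance(cluster_a, cluster_b, profiles):
    """Mean pairwise Euclidean distance between two clusters (average linkage)."""
    total = sum(
        math.dist(profiles[a], profiles[b]) for a in cluster_a for b in cluster_b
    )
    return total / (len(cluster_a) * len(cluster_b))


def agglomerate(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    profiles: dict mapping sample name -> list of log2 LFQ values.
    """
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            d = average_distance(clusters[i], clusters[j], profiles)
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return [sorted(c) for c in clusters]


# Hypothetical toy profiles: two replicates for each of three ribotypes.
toy = {
    "A_1": [30.0, 25.0, 22.0], "A_2": [30.1, 25.2, 22.1],
    "B_1": [24.0, 31.0, 28.0], "B_2": [24.2, 30.9, 28.1],
    "C_1": [27.0, 27.0, 33.0], "C_2": [26.9, 27.1, 32.8],
}

if __name__ == "__main__":
    print(sorted(agglomerate(toy, 3)))
```

With well-separated toy profiles such as these, each cluster contains the replicates of a single ribotype, which mirrors the grouping of biological triplicates described above.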

The selection of proteins significantly increased in functional annotations
The selected proteins were chosen as candidates passing an ANOVA statistical test and a subsequent Fisher exact test for increase in functional annotations. The details of the procedures are described below. Missing values were imputed from a normal distribution (Gaussian distribution width 0.3 SD and down-shift 1.8 SD of the original data), and proteins were annotated by Gene Ontology (GO) terms and UniProt keywords for the strain R20291 or for the strain 630 (downloaded on December 19th 2015). Furthermore, ANOVA (permutation-based FDR 5%, S0 = 0) was used to identify significant differences in protein expression between the RTs. Only ANOVA-significant hits were used for subsequent hierarchical clustering using Euclidean distances to group proteins with similar expression profiles. Finally, the Fisher exact test at 2% Benjamini-Hochberg FDR was applied to determine significantly overrepresented functional annotations for each of the identified clusters. In total, 27 proteins were found to be significantly increased in functional annotations. The MS data of these proteins are shown in the Additional file 1: Table S4; the results for proteins significantly increased in functional annotations are shown in Table 2.
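The imputation and enrichment steps described above can be sketched in plain Python. This is an illustration under stated assumptions, not the Perseus code: the width (0.3) and down-shift (1.8) are taken from the text, while the one-sided form of the Fisher exact test, the fixed random seed, and all function names are assumptions made here.

```python
import math
import random
import statistics


def impute_downshifted(values, width=0.3, downshift=1.8, seed=0):
    """Replace None with draws from N(mean - downshift*sd, (width*sd)^2).

    mean and sd are computed from the observed (non-missing) values.
    """
    observed = [v for v in values if v is not None]
    mean = statistics.mean(observed)
    sd = statistics.stdev(observed)
    rng = random.Random(seed)  # fixed seed only for reproducibility
    return [
        v if v is not None else rng.gauss(mean - downshift * sd, width * sd)
        for v in values
    ]


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher exact p-value, P(X >= k), X ~ Hypergeometric(N, K, n).

    k: annotated proteins in the cluster, n: cluster size,
    K: annotated proteins overall, N: all proteins tested.
    """
    total = math.comb(N, n)
    tail = sum(
        math.comb(K, i) * math.comb(N - K, n - i)
        for i in range(k, min(K, n) + 1)
    )
    return tail / total


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

Under these assumptions, an annotation would be called overrepresented in a cluster when its adjusted p-value is at most 0.02, corresponding to the 2% FDR stated in the text.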

Bioinformatic analysis of subcellular localization of the proteins released from in vitro cultured panel
To identify the subcellular localization of the proteins released from the in vitro cultured panel, bioinformatic analyses focused on markers of the Sec pathway and of alternative secretion modes were performed. The sequences of all identified proteins were processed with SignalP 4.1 (http://www.cbs.dtu.dk/services/SignalP/) [22]. The secretion type of an identified protein was predicted as "classical" if a signal peptide was identified with a SignalP score > 0.5. Furthermore, the SecretomeP 2.0 tool (http://www.cbs.dtu.dk/services/SecretomeP/) [23] was employed. Using default parameters for Gram-positive bacteria and a SecretomeP score > 0.5, proteins were predicted as "alternatively secreted". A majority of the proteins found to be significantly increased in functional annotations were predicted to be secreted via the Sec-dependent secretion pathway or via an alternative secretion system (see Additional file 1: Table S4; proteins are designated as SigP for classical and SecP for alternative secretion), suggesting an overlap between the supernatant proteome and the secretome. However, zinc binding protein (C9YLG4), putative nitroreductase C9YRI2, FliC, FliL, HisG, and HisZ were not predicted to be secreted via any secretion system and are reportedly localized intracellularly.
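The two score thresholds described above amount to a simple decision rule, sketched below. The function name and the returned labels are hypothetical, and giving the SignalP result precedence over the SecretomeP result is an assumption; the scores themselves would come from the respective tools and are not computed here.

```python
def predict_secretion(signalp_score, secretomep_score, threshold=0.5):
    """Classify a protein from precomputed SignalP and SecretomeP scores.

    Assumption: a SignalP score above the threshold takes precedence,
    and SecretomeP is consulted only otherwise.
    """
    if signalp_score > threshold:
        return "classical"
    if secretomep_score > threshold:
        return "alternatively secreted"
    return "not predicted as secreted"


if __name__ == "__main__":
    # Invented example scores for three hypothetical proteins.
    for scores in ((0.9, 0.2), (0.1, 0.8), (0.1, 0.3)):
        print(scores, predict_secretion(*scores))
```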

Proteins connected with pathogenicity
Using targeted mass spectrometry, TcdA was detected as highly produced in RTs 027 and 176. Lower quantities of TcdA were observed in RTs 005 and 012. The protein was not detected in RTs 010, 078, and 001. TcdB was detected exclusively in RTs 027 and 176. The analysis employing heavy labeled peptides and targeted mass spectrometry for TcdA and TcdB confirmed these results (Additional file 1: Table S5, Figure S2) and supported the relevance of the LFQ approach. The expression differences of selected proteins connected with pathogenicity are depicted in Table 2.
CdtA and CdtB (binary toxin) were observed in the secreted fractions of RTs 027, 078 and 176, with high levels in RTs 027 and 176.
The expression of Sigma-54 dependent regulatory protein (C9YK92), responsible for creating swift and precise responses to environmental change [24], was observed among RTs 027, 176, and 005. Furthermore, high expression of the nucleic acid zinc binding protein (C9YLG4) and of the cell surface, putative penicillin binding protein, which belongs to the cell wall binding repeat 2 family (C9YLL2), was observed exclusively in RTs 027 and 176.

Proteins connected with nitric group reduction and iron metabolism
The nitroreductase C9YRI2, an enzyme putatively involved in the activation of MTZ, was expressed at similar levels in RTs 001, 005, 010, 012, and 014, but was not detected in RTs 027, 176, and 078. In contrast, Abc-type Fe3+ transport system periplasmic component-like protein C9YQW5 showed higher expression in RTs 027, 176, and 078.

Proteins involved in the assembly of C. difficile flagellum
Higher expression of the flagellar proteins FlgE, FlgG, FlgK, FlgL, FliC, FliD, FliK, and flagellar basal body protein C9YI80 was observed among RTs 027, 176, 005, and 014, with the exception of FlgK and FliD, which were also quantifiable in RT 078. Moreover, FlgC levels were increased only in RTs 027, 176, and 014, whilst FlgM and FliL were found to be increased in RTs 027, 176, and 005 (Table 2). Interestingly, FliE was expressed only in RTs 027 and 176, and glycosyltransferase C9YI34, which is involved in the posttranslational modification of the flagellum, was expressed in RTs 027, 176, 005, and 014.

Pro-Pro endopeptidase (PPEP-1)
The Pro-Pro endopeptidase (PPEP-1) expression was observed in all RTs without relevant differences (Table 2). In addition, the search for substrates of PPEP-1 was also performed. Since PPEP-1 gene is probably not present in R20291 genome/proteome, the reference strain 630 [25,26] was employed. The substrates CD2831/Q183R6 and CD3246/Q17ZZ0 were detected only in the replicates of RT 078 isolate (Table 2).

Proteins involved in histidine pathway metabolism
HisG (C9YLE5) and HisZ (C9YLE4) were not detected in the non-toxigenic RT 010 (Table 2). In the other RTs, these proteins were expressed at high levels but without distinct differences.

Discussion
In this study, we used a combination of MS-based shotgun proteomics and the LFQ approach. Previously published qualitative studies involved fewer than three C. difficile representatives [4,6]. In contrast, the unlimited number of compared groups in the LFQ technique enabled the semi-quantitative investigation of the proteomes released from an in vitro cultured panel of eight C. difficile isolates, and the applicability of this workflow was shown by hierarchical clustering and PCA; both methods generated eight distinctive groups encompassing each biological triplicate of the RTs analyzed (see Fig. 1). RTs 027 and 176 revealed higher expression of proteins connected to pathogenicity (TcdA, TcdB, CdtA, CdtB, sigma-54 dependent regulatory protein, nucleic acid zinc binding protein, the cell wall binding repeat 2 family), which confirms their clinical importance [1,27] and evolutionary relationship at the proteomic level. Finally, RT 078, also referred to as hypervirulent [15], showed measurable levels of toxins, but at lower levels than RTs 027 and 176.
Recent studies reported that the orphan response regulator CdtR enhances production not only of CDT from the same locus (CDT locus), but also of TcdA and TcdB from the Pathogenicity Locus. This was confirmed in two RT 027 human strains and also supported in the animal model. In contrast, in an RT 078 strain where cdtR is a pseudogene, and in an RT 012 strain where cdtA/B are pseudogenes, the function of CdtR was not proven [28].
The inability to detect the large C. difficile toxins in RTs 014 and 001 in this proteomic study, in spite of the positivity in the immunochromatographic assay, could be caused by the generally lower sensitivity of even high-resolution tandem mass spectrometry compared to tests based on antibodies (the detection limit of the immunochromatographic assay used here was 12.5 ng/mL). These findings are in agreement with previously published studies showing that very low levels of toxins produced in vitro by toxigenic strains are hardly detectable by mass spectrometry [2], especially in the non-hypervirulent representatives of C. difficile RTs [4].
RTs 027, 176, and 078 also showed higher expression of Abc-type Fe3+ transport system periplasmic component-like protein C9YQW5. Iron is a crucial nutritional factor, and competition over its bioavailability plays an essential role within complex microbial communities as well as between bacterial pathogens and their eukaryotic hosts [29]. The pathways of iron inside the cell could involve high-affinity iron chelators known as siderophores, translocated by specific ABC transporters [30]. On the other hand, high levels of intracellular iron can increase oxidative damage, and therefore the expression of iron acquisition mechanisms is tightly controlled by transcriptional regulators [31].
The homologues of C9YQW5 are present only in the UniProt proteomes of RTs also expressing the binary toxins. This could indicate the presence of a specific mechanism that is responsible for the iron uptake in RTs possessing binary toxin genes. However, the lack of other differentially expressed proteins participating in the metabolism of the iron does not support the hypothesis of increased ability of the iron uptake in these RTs.
Interestingly, no expression of C9YRI2, a putative nitroreductase, was observed in RTs 027, 176, and 078. However, all isolates involved in this study were susceptible to metronidazole (MTZ). Based on the reference proteomes, the genes for the homologs of nitroreductase C9YRI2 have been reported in RT 027 [25,26] and RT 078 [15]; therefore, the probable absence of expression among these RTs may be imposed by some regulatory mechanism. The effect of the reduced levels of C9YRI2 is thus probably compensated by another, unidentified mechanism of activation of the prodrug. This hypothesis is supported by a study that compared MTZ resistant and susceptible North American pulsed-field type 1, PCR ribotype 027 (NAP1/027) isolates, where no changes in this protein were observed [32]. Thus, the absence of expression of the protein C9YRI2 probably does not play a crucial role in MTZ resistance.
Despite the shared presence of genes for large and binary toxins, the RTs 027, 176, and 078 represent clearly distinctive entities based on their proteome profiles. The only common characteristics observed are changed expressions of Abc-type Fe3+ transport system protein C9YQW5 and the putative nitroreductase C9YRI2. In spite of the findings in this study, the comparative proteomic study on NAP1/027 clinical isolates resistant to MTZ showed upregulation of ferric uptake regulator (Fur) [32], but the connection between iron uptake regulation and MTZ resistance was not confirmed.
The proteins involved in the assembly of the C. difficile flagellum and the Pro-Pro endopeptidase were among the proteins showing higher expression. The quantification of the proteins involved in the assembly of the C. difficile flagellum pointed toward RTs 027 and 176, in which the main proteins constituting the hook-basal-body complex and the rotating filament were observed to be overexpressed. The discriminatory proteins with expression characteristic only of RTs 027 and 176 were the FliE protein and glycosyltransferase C9YI34. FliE participates in the normal export of other substrates. However, a very low basal level of export function was previously described even in the absence of FliE. This argues against a vital role for FliE in export and suggests that the primary role of FliE is as a structural adapter between the annular symmetry of the membrane and supramembrane ring and the helical symmetry of the rod and all subsequent axial structures [33].
A homologue of glycosyltransferase C9YI34 (CD0240 in C. difficile 630) was proven to be involved in the glycosylation process. Inactivation of CD0240 led to loss of the surface-associated flagellin protein and rendered the strain non-motile. However, the strain still produced truncated polymerized flagella filaments [34,35]. In our study, this protein was also observed in RTs 014 and 005. However, the expression levels were lower. Thus, flagellin glycosylation was confirmed to be important in C. difficile flagellum assembly and virulence.
Regarding RT 078, the absence of differential expression of most proteins involved in the assembly of the C. difficile flagellum (with the exception of FlgK and FliD) could be explained by a previously published genomic study which confirmed the complete loss of the F3 flagellar region while retaining the F1 region (containing fliK and fliD genes). This has been corroborated using microarray data from phylogenetic studies [25,26]. The low protein expression of FliC, FliD and a putative glycosyltransferase (in comparison with RTs 027 and 176) is in agreement with the studies on non-flagellated C. difficile serotypes that retain transcription of the fliC and fliD genes and report the absence of their protein products [36,37].
The question of a correlation between particular flagellins and toxin levels among RTs 027, 176, and 005 could be raised. However, the lower levels of TcdA and TcdB in RT 014 and the inability to detect the flagellar proteins by mass spectrometry suggest a greater complexity of the C. difficile virulence factors.
The Pro-Pro endopeptidase, PPEP-1, CD2830 alias C9YQ56 in the strain R20291 reference proteome, was analysed as a highly active secreted metalloprotease and potential marker of virulence. The identification of two C. difficile LPXTG surface proteins, CD2831 and CD3246, as highly efficient substrates for PPEP-1 indicated a role for this enzyme in bacterial motility [38]. However, a decreasing tendency of the LFQ intensities of PPEP-1 toward RTs 027 and 176 was observed in this study. Clearly, the question of the functionality of these findings remains to be addressed.
Regarding the substrates, a PPEP-1 knockout strain was demonstrated to have higher affinity for collagen type I with attenuated virulence in hamsters due to the cleavage of collagen binding protein CD2831/C9YQ57 [39], and this protein was described to be completely released from the cells [40]. In our study the latter substrate, CD3246, was observed only in RT 078, probably due to the loss of the corresponding genes in other reference genomes. The evidence of production of high levels of both PPEP-1 substrates in this RT only supports the exclusive role of PPEP-1 in this representative and confirms pathophysiological mechanisms distinct from those of RTs 027 and 176.
L-Histidine biosynthesis is an ancient metabolic pathway present in bacteria, archaea, lower eukaryotes, and plants, and several proteins involved in this synthetic pathway were observed in this study. The pathway is regulated at the first committed step by the hetero-oligomeric complex HisG/HisZ. HisG (C9YLE5) acting as aminoacyl-tRNA synthetase catalyzes the condensation of ATP and 5-phosphoribose 1-diphosphate to form N'-(5'-phosphoribosyl)-ATP (PR-ATP) and has a crucial role in the pathway because the rate of histidine biosynthesis seems to be controlled primarily by regulation of HisG enzymatic activity [41]. HisZ (C9YLE4) is an ATP phosphoribosyltransferase regulatory subunit essential for the catalytic activity of the whole complex [42]. In addition, the global repressor CodY, which is also responsible for suppressing the 19.6-kb Pathogenicity Locus, negatively regulates hisZ gene expression [43]. In this study, high levels of both of these proteins were determined in all representative strains possessing tcdA and tcdB genes. Moreover, in the non-toxigenic representative strain the levels remained undetected, emphasizing the role of histidine biosynthesis in the virulence of C. difficile.

Conclusions
Comparative proteomic analysis using label-free quantification (LFQ) of proteomes released from in vitro cultured C. difficile RTs 001, 005, 010, 012, 014, 027, 078, and 176 revealed several protein groups displaying varying protein levels between individual PCR ribotypes. These differences point to a new direction for studies aimed at elucidating the mechanisms underlying pathogenicity. The higher and/or unique expression of proteins linked to pathogenicity or iron metabolism supports the clinical importance and genetic relatedness of RTs 027 and 176 at the proteomic level. The nucleic acid zinc binding protein, cell wall binding repeat 2 family, Sigma-54 dependent regulatory protein and FliE were suggested as potential novel biomarkers of virulence based on differential expression among PCR ribotypes in this study.