Genome sequencing, assembly, annotation and analysis of Staphylococcus xylosus strain DMB3-Bh1 reveals genes responsible for pathogenicity

Background Staphylococcus xylosus is coagulase-negative staphylococci (CNS), found occasionally on the skin of humans but recurrently on other mammals. Recent reports suggest that this commensal bacterium may cause diseases in humans and other animals. In this study, we present the first report of whole genome sequencing of S. xylosus strain DMB3-Bh1, which was isolated from the stool of a mouse. Results The draft genome of S. xylosus strain DMB3-Bh1 consisted of 2,81,0255 bp with G+C content of 32.7 mol%, 2623 predicted coding sequences (CDSs) and 58 RNAs. The final assembly contained 12 contigs of total size 2,81,0255 bp with N50 contig length of 4,37,962 bp and the largest contig assembled measured 7,61,338 bp. Further, an interspecies comparative genomic analysis through rapid annotation using subsystem technology server was achieved with Staphylococcus aureus RF122 that revealed 36 genes having similarity with S. xylosus DMB3-Bh1. 35 genes encoded for virulence, disease and defense and 1 gene encoded for phages, prophages and transposable elements. Conclusions These results suggest co linearity in genes between S. xylosus DMB3-Bh1 and S. aureus RF122 that contribute to pathogenicity and might be the result of horizontal gene transfer. The study indicates that S. xylosus DMB3-Bh1 may be a potential emerging pathogen for rodents. Electronic supplementary material The online version of this article (doi:10.1186/s13099-016-0139-8) contains supplementary material, which is available to authorized users.


Open Access
Gut Pathogens  [14]. S. xylosus persists as a commensal on the skin of mammals but occasionally in humans. S. xylosus is ubiquitous and can be noticed in diverse niches viz. polluted water, fodder, soil surface, etc. [15]. Since, S. xylosus is increasingly becoming infectious along with the other staphylococci, therefore it is imperative to explore the genome of this commensal [16][17][18][19][20]. In the public domain totally seven strains of S. xylosus have been sequenced at genome level, in which three strains are completely sequenced whereas four strains were having draft genome sequence. Although previous studies in the literature have performed genome annotation and analysis of S. xylosus, but a deep analysis on the pathogenicity of this organism obtained from the genomic information through next generation sequencing (NGS) was absent in the literature. Therefore, we generated the draft genome of S. xylosus strain DMB3-Bh1 which was isolated from the stool of mouse. Although S. xylosus have been reported to be isolated from several sources, we isolated this strain from mouse stool sample during the process of screening several other isolates from mouse stool in order to study mouse gut microbiota. Further a function based comparative genomic analysis of S. xylosus strain DMB3-Bh1 with S. aureus strain RF-122 and S. xylosus strain SMQ-121 were performed using RAST server that revealed genes contributing to pathogenicity in S. xylosus strain DMB3-Bh1.

Strain identification by 16S rRNA gene sequencing
An almost complete 16S rRNA gene sequence of the strain (1477 bp) was obtained and phylogenetic analysis showed that the strain DMB3-Bh1 should be assigned in the genus Staphylococcus. The strain DMB3-Bh1 showed highest degree of similarity with S. xylosus strain ATCC 29971 T (100%) followed by Staphylococcus saprophyticus subsp. saprophyticus strain ATCC 15305 T (99.80%), Staphylococcus saprophyticus subsp. bovis strain GTC 843 T (98.66%). The phylogenetic relationship among the species of genus Staphylococcus in which strain DMB3-Bh1 formed a separate branch along with S. xylosus (Fig. 1). The snp count of the 16S rRNA gene present in the genome of DMB3-Bh1 is 1, its having only a single copy of 16S rRNA gene.

Genomic features of strain DMB3-Bh1
The draft genome of S. xylosus strain DMB3-Bh1 consisted of 28,10,255 bp with G+C content of 32.7 mol%, 2623 predicted CDSs and 58 RNAs. The molar G+C content of the genus Staphylococcus ranges from 32.40 to 32.76%. The final assembly contained 12 contigs of total size 28,10,255 bp with N50 contig length of 4,37,962 bp and the largest contig assembled measured 7,61,338 bp. The strain showed forty-nine virulence genes and two genes encoding sub-category adhesions as revealed by RAST annotation server. No OMP's were detected in the genome of strain DMB3-Bh1. Genome sequencing project information is given in Table 1. Genome Statistics is given in Table 2. Sub-system distribution of S. xylosus strain DMB3-Bh1 is depicted in Fig. 2 based on RAST annotation server. The graphical circular map of the genomes is shown in Fig. 3.

Genes involved in virulence, disease and defense
Whole genome annotation of Staphylococcus xylosus strain DMB3-Bh1 in RAST server revealed a total of 1657 genes. Forty-nine genes encoded for virulence, disease and defense. Some of the genes coding functional proteins are fibronectin binding protein, chaperonin, two component response regulator BceR, bacitracin export ATP binding protein BceA, bacitracin export permease protein BceB, dihydrofolate synthase, folylpolyglutamate synthase, amidophosphoribosyl transferase, acetyl-coenzyme A, carboxyl transferase beta chain, colicin V production protein, tRNA pseudouridine synthase A, copper translocating P type ATPase, MerR family, multidrug resistance protein, membrane component of multidrug resistance system, TetR family regulator protein of MDR cluster, mercuric ion reductase, TcaR arsenical resistance protein ACR3, arsenic efflux pump protein and arsenate reductase (Fig. 4). Numbers of genes associated with the general cluster of orthologous groups (COG) functional categories are given in Table 3.

Phages, prophages, transposable elements, plasmids
A total of 7 protein coding genes were identified in this class. These include, phages DNA binding protein, phage terminase large subunit, phage portal protein, phage tail length tape-measure protein, HNH homing endonuclease and zinc metalloproteinase precursor (Fig. 5).

Function based comparative genomic analysis
Function based comparative genomic analysis of S. xylosus strain DMB3-Bh1, newly sequenced S. xylosus strain SMQ-121 and S. aureus strain RF122 was achieved with RAST server. A total of 67 protein coding genes were obtained among genomes of S. xylosus strain DMB3-Bh1, S. xylosus strain SMQ-121 and S. aureus strain RF122. Out of the several classes, the prime focus was narrowed down to only two classes (1) virulence, disease and defense; (2) phages, prophages, transposable elements, plasmids in all the three strains. A comparative analysis of these two classes showed 42 genes were present in these three strains. Forty-one genes were common in class virulence, disease and defense, where as one gene was common in class phages, prophages, transposable elements and plasmids. The sequence similarity of the 42 potential virulence factors were determined and it ranges from 30 to 98% for strains RF122/DMB3-Bh1 and 71-100% for strains SMQ121/DMB3-Bh1. Detailed similarity values of the virulence genes among the strains were in Additional file 1: Table S1. Common genes present in both the classes were: Chaperonin (heat shock protein 33), Fibronectin-binding protein, Bacitracin export ATPbinding protein BceA, Bacitracin export permease protein BceB, DNA-directed RNA polymerase beta' subunit (EC 2.7.7.6), Translation initiation factor 3, Arsenic efflux pump protein, Beta-lactamase (EC 3.5.2.6), Choloylglycine hydrolase (EC 3.5.1.24), cobalt-zinc-cadmium resistance protein, Transcriptional regulator, MerR family. Remarkably five genes contributing pathogenicity were exclusively present in our strain S. xylosus strain DMB3-Bh1 and were absent in S. xylosus strain SMQ-121. These were Fosfomycin resistance protein FosB, Phage portal protein, Phage tail length tape-measure protein, Phage tail length tape-measure protein and Phage terminase large subunit. The Venn-diagram revealed the number of shared genes in the genome of two closely related strains of genus Staphylococcus i.e. S. xylosus strain DMB3-Bh1 and S. aureus strain RF122. Out of a total of 1498 COG classes, 1294 were common between S. xylosus strain DMB3-Bh1 and S. aureus strain RF122. These 1294 COGs corresponded to 19 unique COG classes (B, C, D, E, F, G, H, I, J, L, M, N, O, P, Q, R, S, T, U) (Fig. 6). 13 hypothetical genes found in the genome of strain DMB3-Bh1. One  As these multiple classes are closely related to similar types of biological process, we verified their biological roles by employing the KEGG database [21] and using the corresponding KEGG IDs (KO) to inspect the respective pathways (data not shown). Further, a comparative genomic analysis strategy was also performed with the virulence genes of S. aureus RF122 obtained from VFDB (http://www.mgc.ac.cn/ cgi-bin/VFs/genus.cgi?Genus=Staphylococcus), which showed the absence of autolysin, toxins and type-VII secretion systems in genome of S. xylosus DMB3-Bh1 compared to S. aureus RF122.

Discussion
This study reports the whole genome sequencing of S. xylosus strain DMB3-Bh1 isolated from the mouse stool. The annotated draft genome of strain DMB3-Bh1 was 28, 10,255 bp in size with 2, 623 protein coding sequences. It has 400 subsystems, in which two major classes' virulence disease and defense; phages, prophages, transposable elements, plasmids were analyzed for genes responsible for infection in mice. This suggests that S. xylosus strain DMB3-Bh1 is a potential mouse pathogen and can cause zoonotic diseases. Pathways involved in the pathogenicity and host resistance in S. xylosus strain DMB3-Bh1 could be analyzed for novel targets for designing antimicrobial drugs and vaccines. Further, a comparative genomic analysis was performed among S. xylosus strain DMB3-Bh1, S. xylosus strain SMQ-121 and S. aureus strain RF122 that revealed 42 genes were present in all the three strains responsible for infection in the host. This demonstrates that there could be frequent acquisition of virulent factors by S. xylosus strain DMB3-Bh1 from S. xylosus strain SMQ-121 and S. aureus strain RF122, through lateral genetic transfer (LGT) via the mechanisms of transduction, transformation and/ or conjugation. The exogenous genetic material can be integrated into the recipient genome through genetic recombination also. In Staphylococcus, phage-mediated conjugation is one of the more common mechanisms of genetic transfer, in which the presence of bacteriophages can increase the adhesiveness of the bacterial cell surface and therefore assist in the conjugative transfer of genetic materials between two organism [22]. There were five genes not present in S. xylosus strain SMQ-121 but present in S. xylosus strain DMB3-Bh1.This signifies that the strain DMB3-Bh1 has more pathogenic potential as compared to S. xylosus strain SMQ-121. The commensal S. xylosus is zoonotic agent and can be a threat to human beings [23]. The current study envisages that its genome sequence information will have significant implications in developing novel drug targets drugs and vaccines.

Conclusions
In the present study we have provided a genomic insight into the genome of S. xylosus strain DMB3-Bh1. We were able to map the virulence profile of this organism using in silico approaches. Several putative factors contributing pathogenicity were found in strain DMB3-Bh1 when a comparative pathogenomics approach was employed with other reference strain of genus Staphylococcus The coverage of the assembly at each base pair can be seen by the grey colored track. Annotation descriptors for "virulent" genes (inner label: red) and "phage" related genes (outer label: blue) are mapped onto their respective contig positions. As many annotation descriptors occupy neighboring positions on the contigs, the descriptors are stacked to allow better visualization from publically available databases. Though the data is a preliminary report on the virulence profile of S. xylosus strain DMB3-Bh1, such data adds to the repository of virulent pathogens and acts as a building platform for the development of novel therapeutics against emerging pathogens.

Isolation of bacterial strains and growth conditions
S. xylosus strain DMB3-Bh1 was isolated from the stool sample of BALB/c. The stool was homogenised in sterile PBS (1X) and centrifuged at 1000 rpm for 2 min to remove debris. Supernatant was serially diluted and plated on tryptone soya agar (TSA, HiMedia, Mumbai, India) and later incubated at 37 °C for 36 h to isolate pure bacterial colonies. Microscopic examination was done to examine cell morphology, motility and sporulation. Cells of strain DMB3-Bh1 are Gram-positive, 0.7-1.0 μm in size (Fig. 7). Different biochemical features were tested. Strain DMB3-Bh1 was positive for urease, nitrate reduction and catalase. Negative for tween 80 and aesculin hydrolysis; oxidase. Acid production was also determined by using different sugars and the strain is positive for glucose, fructose, maltose, xylose, lactose, mannitol, arabinose and negative for cellobiose, galactose, salicin, adonitol, fucose and raffinose. Capable of growing between 20 and 42 °C and tolerant up to 8.0% NaCl. Classification and general features of strain DMB3-Bh1 are accordance with the MIGS recommendations shown in Table 4.

Phylogenetic analysis of strain DMB3-Bh1
Phylogenetic neighbor identification and the computation of pairwise 16S rRNA gene sequence similarities were achieved using the EzTaxon server and alignment was performed using MEGA version 6.0 [25,26]. The neighbor-joining, maximum parsimony and maximum likelihood algorithms were used to construct phylogenetic trees. Bootstrap analysis was carried out to assess the confidence limits of the branching.

Table 4 Classification and general features of Staphylococcus xylosus strain DMB3-Bh1 accordance with the MIGS recommendations
Evidence codes-IDA inferred from direct assay (first time in publication), TAS traceable author statement (i.e., a direct report exists in the literature); If the evidence code is IDA, then the property was observed by one of the authors or an expert mentioned in the acknowledgements

Genome sequencing, assembly and annotation
The genome of S. xylosus strain DMB3-Bh1 was sequenced using a standard run of Illumina HiSeq 1000 sequencing technology at c-CAMP by Next Generation Genomic Facility (Bengaluru, India http://www.ccamp. res.in). The genome of strain DMB3-Bh1 formed a total of 2,91,86,504 paired-end reads [paired distance (insert size) ~330 bp] of 101 bp. The data was preprocessed to trim and remove low quality sequences using CLC Bio Workbench v6.0.4 (CLC Bio, Aarhus, Denmark). A total of 2,90,47,554 high quality, vector filtered reads (~1145 times coverage) were employed for assembly (at word size of 45 and bubble size of 98). The genome coverage was 1145.46× and calculated using formula coverage = (read count × read length)/total genome size. Final draft genome was used for genome annotation employing RAST server [27] and RNAmmer 1.2 server [28]. The graphical circular map of the genomes was made by [29]. COG analysis was performed using the Reversed Position Specific BLAST (RPS BLAST using NCBI COG version 2/2/2011) [30] on the prokaryotic protein database with e-value of 0.001. MS Excel was used to compare the output COG ids between different strains. [31]. The Venn-diagram was made by using [32]. For comparative genomic analysis, the annotated genes were extracted from RAST server into an excel table and manually compared for genomic features. [33].

Sequence data access
This Whole Genome Shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession AURW00000000. The version described in this paper is the first version, AURW01000000.ftp://ncbi. nlm.nih.gov/genomes/ASSEMBLY_REPORTS/All/ GCF_000467225.1.assembly.txt.

Authors' contributions
Performed experiments: GK, AA, SS, NM and SV. Planned and executed experiments analyzed data and wrote manuscript: SM and JNA. All authors read and approved the final manuscript.