Complete genome of Vibrio parahaemolyticus FORC014 isolated from the toothfish
- Sojin Ahn†1, 2,
- Han Young Chung†2, 3,
- Sooyeon Lim4,
- Kwondo Kim1, 5,
- Suyeon Kim2, 3,
- Eun Jung Na2, 3,
- Kelsey Caetano-Anolles6,
- Ju-Hoon Lee2, 7,
- Sangryeol Ryu2, 3,
- Sang Ho Choi2, 3 and
- Heebal Kim1, 2, 5, 6
© The Author(s) 2016
Received: 15 August 2016
Accepted: 19 October 2016
Published: 17 November 2016
Foodborne illness can be caused by various pathogenic bacteria, such as Staphylococcus aureus, Escherichia coli and Vibrio parahaemolyticus, and can produce severe gastroenteritis symptoms. In this study, we completed the genome sequence of the foodborne pathogen V. parahaemolyticus FORC_014, which was isolated from toothfish suspected of contamination in South Korea. Additionally, we extended our knowledge of the genomic characteristics of the FORC_014 strain through comparative analysis with other V. parahaemolyticus strains whose complete genomes have previously been reported.
The complete genome sequence of V. parahaemolyticus FORC_014 was generated using the PacBio RS platform with single molecule, real-time (SMRT) sequencing. The FORC_014 genome consists of two circular chromosomes (3,241,330 bp for chromosome 1 and 1,997,247 bp for chromosome 2), one plasmid (51,383 bp), and one putative phage sequence (96,896 bp). The genome contains a total of 4274 putative protein coding sequences, 126 tRNA genes and 34 rRNA genes. Furthermore, we found 33 type III secretion system 1 (T3SS1) related proteins and 15 type III secretion system 2 (T3SS2) related proteins on chromosome 1. This is the first report of a type III secretion system 2 located on chromosome 1 of a V. parahaemolyticus strain that lacks thermostable direct hemolysin (tdh) and thermostable direct hemolysin-related hemolysin (trh).
By investigating the complete genome sequence of V. parahaemolyticus FORC_014, which differs from previously reported strains, we revealed two type III secretion systems (T3SS1 and T3SS2) located on chromosome 1 in a strain that does not carry the tdh and trh genes. We also identified several virulence factors carried by our strain, including iron uptake systems, hemolysin and secretion systems. These results suggest that the FORC_014 strain may be a pathogen responsible for outbreaks of foodborne illness. Our results provide genomic clues that will assist future understanding of virulence at the genomic level and help distinguish between clinical and non-clinical isolates.
Keywords: Vibrio parahaemolyticus; Type III secretion system-2; Whole genome sequencing; Comparative genomics
Vibrio parahaemolyticus is an important gastrointestinal pathogen: a gram-negative, rod-shaped, halophilic organism that causes foodborne illness. When people eat oysters, shrimp, fish and other seafood contaminated with V. parahaemolyticus, they may develop a foodborne illness with symptoms such as acute gastroenteritis and vomiting, and in severe cases the illness can be fatal.
The first recorded outbreak of foodborne illness caused by V. parahaemolyticus occurred in Japan in the early 1950s. From that point on, food poisoning outbreaks caused by V. parahaemolyticus began to occur frequently worldwide. To better understand the spread and prevention of the disease, numerous studies have been performed on V. parahaemolyticus, focusing particularly on how its toxins are associated with food poisoning. Environmental strains rarely contain the pathogenicity genes thermostable direct hemolysin (tdh) and thermostable direct hemolysin-related hemolysin (trh), whereas clinical strains that cause foodborne illness possess virulence factors including tdh and trh. Therefore, tdh and trh, which have an enterotoxic effect on the intestinal cells of the affected mammal, have been considered indicators of V. parahaemolyticus pathogenicity [4, 5]. Recent studies, however, have reported that some clinical strains are negative for the tdh and trh genes [4, 5]. In addition to these two pathogenicity indicators, T3SS2, which is required for intestinal colonization, has been proposed as a possible indicator of V. parahaemolyticus pathogenicity [5–8]. However, the major virulence indicators of V. parahaemolyticus at the genomic level remain unclear despite the many studies that have attempted to identify them.
In this study, we sequenced the putative clinical strain V. parahaemolyticus FORC_014, which was isolated from toothfish suspected of causing foodborne illness in South Korea. The whole genome sequence of this strain will help clarify the genetic variation between non-pathogenic and pathogenic strains. In addition, we compared the FORC_014 strain with eight other complete genome sequences from public databases to gain genomic-level information and a greater understanding of this strain.
Genomic DNA preparation and whole genome sequencing
Vibrio parahaemolyticus FORC_014, a strain isolated from contaminated fried toothfish in Busan, South Korea, was received from the Ministry of Food and Drug Safety. Total genomic DNA was prepared using a Qiagen blood and tissue kit following the manufacturer’s protocol.
Approximately 5 μg of DNA was fragmented to 8–12 kbp using the Hydroshear system at a shearing speed of 9 for 20 cycles. The PacBio DNA Template Prep Kit 2.0 (3–10 kbp), intended for SMRT sequencing with C2 chemistry on the PacBio RS, was used for SMRTbell library preparation following the manufacturer’s instructions. The size distribution of the purified DNA template was measured using an Agilent DNA 12000 kit and the concentration of the template was measured using an Invitrogen Qubit. Primers were annealed to the template and DNA polymerase C2 was added following the manufacturer’s recommendations. Enzyme-template complexes were set up with the DNA/Polymerase Binding Kit P4 (PacBio) on the 75,000 zero-mode waveguides (ZMWs). The DNA Sequencing Reagent 2.0 kit (Pacific Biosciences) was used for SMRTbell library sequencing on the PacBio RS II with a long (1 × 120 min) sequence capture protocol to maximize read length. A summary of the sequencing results is included in Additional file 1.
Genome assembly and annotation
Sequencing reads were assembled within the SMRT portal system. The whole genome was assembled using the HGAP assembly version 3 algorithm with the genome size parameter set to 5,100,000 bp. Further statistics from the HGAP assembly are provided in Additional file 1. Re-sequencing and variant polishing were performed on the contigs generated by the first draft assembly to reduce the high error rate associated with the PacBio RS II sequencing system. The orientation and direction of the assembled sequences were determined using the Basic Local Alignment Search Tool (BLAST) and MUMmer by comparison with the reference genome, V. parahaemolyticus CDC_K4557. The polished sequence was manually curated using BioEdit software.
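The orientation step described above can be illustrated with a minimal sketch. It assumes that the strand of each contig's best alignment to the reference has already been obtained from BLAST or MUMmer output; the function names and the strand input are hypothetical and are not taken from the pipeline used in this study.

```python
# Complement table for unambiguous bases and N (assumption: other IUPAC
# ambiguity codes are left unchanged by translate()).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def orient_to_reference(contig_seq, strand):
    """Return the contig in the same direction as the reference.

    strand is '+' or '-' and is assumed to come from the best alignment
    of the contig against the reference genome (hypothetical input).
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if strand == "+":
        return contig_seq
    return reverse_complement(contig_seq)


# Toy sequence for illustration only (not data from this study).
oriented = orient_to_reference("AAGT", "-")
print(oriented)
```

A contig aligned on the plus strand is returned unchanged, so the function can be applied uniformly to every contig in an assembly.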
Rapid Annotation of Prokaryotic Genomes (PROKKA), which includes prediction tools such as Prodigal, RNAmmer, Aragorn, SignalP, and Infernal, was used for open reading frame (ORF), tRNA and rRNA prediction in V. parahaemolyticus FORC_014. We also used the Rapid Annotations using Subsystems Technology (RAST) server to confirm the ORFs. After gene prediction, we characterized gene function based on Cluster of Orthologous Groups (COG) annotation using the Web server for fast Metagenomic Sequence Analysis (WebMGA) with default options. For subsystem functional categorization, SEED annotation was performed using the SEED viewer within the RAST server. Sequences of virulence factors from the Virulence Factor Database (VFDB; www.mgc.ac.cn/VFs/) were used to define virulence factors in all strains, except for the well-defined strain RIMD2210633, using the BLASTn method (identity ≥0.90; query coverage ≥0.90).
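The identity and query coverage thresholds above can be applied to BLASTn output with a short filter such as the following sketch. It assumes tabular output with the columns qseqid, sseqid, pident and qcovs, in which identity and coverage are reported as percentages; this column layout and the example rows are assumptions for illustration, not the exact settings or results of this study.

```python
import csv
import io

# Assumed column layout: BLAST+ tabular output requested with the fields
# qseqid, sseqid, pident and qcovs (percentages on a 0-100 scale).
COLUMNS = ("qseqid", "sseqid", "pident", "qcovs")


def filter_hits(tabular_text, min_identity=0.90, min_coverage=0.90):
    """Keep hits whose identity and query coverage both meet the thresholds.

    Thresholds are fractions (0-1), matching how they are stated in the text.
    """
    kept = []
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        identity = float(hit["pident"]) / 100.0
        coverage = float(hit["qcovs"]) / 100.0
        if identity >= min_identity and coverage >= min_coverage:
            kept.append((hit["qseqid"], hit["sseqid"], identity, coverage))
    return kept


# Hypothetical example rows (not data from this study).
example = (
    "vf_gene_A\tcontig_1\t98.5\t100\n"
    "vf_gene_B\tcontig_1\t85.0\t99\n"
    "vf_gene_C\tcontig_2\t93.2\t72\n"
)
hits = filter_hits(example)
print(hits)
```

In this toy input only the first row passes both thresholds: the second fails on identity and the third fails on query coverage.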
Comparative genome analysis
In this study, the complete genome sequences of eight V. parahaemolyticus strains: RIMD2210633, CDC_K4557, BB22OP, FORC_008, UCM-V493, FORC_006, FORC_004, and FDA_R31 were downloaded from NCBI (http://www.ncbi.nlm.nih.gov/genome/genomes/691) and used for comparative analysis.
To calculate the average nucleotide identity (ANI) values among the nine strains, the JSpecies tool based on the BLAST algorithm was used. Each query genome was cut into small fragments of 1020 bp, and high-scoring pairs between two sequences were selected using the BLAST algorithm to calculate the ANI values. A genome tree was then constructed using the unweighted pair group method in R software. After genomes were selected on the basis of their ANI values, the genome sequences were compared using the Artemis comparison tool (ACT) and unmatched regions were confirmed.
A BLAST search was also used to predict the virulence factors of the FORC_014 strain. The Virulence Factors Database (www.mgc.ac.cn/VFs/) was used as the subject sequence database and the FORC_014 strain sequence was used as the query sequence.
The 16S rRNA gene of V. parahaemolyticus FORC_014 was extracted from the completely assembled sequence using RNAmmer within the PROKKA annotation tool. Complete genome sequences of strains of the same species were used to calculate distances by comparing ANI values.
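The fragmentation step of the ANI calculation can be sketched as follows. This is a minimal illustration and not the JSpecies implementation: the handling of a final fragment shorter than 1020 bp is an assumption, and the per-fragment identities passed to the averaging function are assumed to come from BLAST hits that have already been selected.

```python
def fragment_genome(seq, size=1020, keep_tail=False):
    """Cut a genome sequence into consecutive fragments of `size` bp.

    keep_tail controls whether a shorter final fragment is kept
    (assumption: it is dropped by default).
    """
    fragments = [seq[i:i + size] for i in range(0, len(seq), size)]
    if fragments and len(fragments[-1]) < size and not keep_tail:
        fragments.pop()
    return fragments


def mean_identity(identities):
    """Average the percent identities of the selected fragment hits."""
    if not identities:
        raise ValueError("no fragment identities supplied")
    return sum(identities) / len(identities)


# Toy sequence and hypothetical identities (not data from this study).
toy_genome = "A" * 2500
fragments = fragment_genome(toy_genome)
ani = mean_identity([98.0, 99.0])
print(len(fragments), ani)
```

Because ANI computed this way depends on which genome is fragmented as the query, the calculation is usually run in both directions for each pair of genomes.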
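The construction of a genome tree from pairwise ANI values by the unweighted pair group method can be sketched in plain Python as below. The conversion of ANI to a distance as 100 minus ANI, the strain labels and the ANI values are assumptions for illustration; this is not the R code used in this study.

```python
from itertools import combinations


def ani_to_distance(ani_percent):
    """Convert ANI in percent to a dissimilarity (assumption: 100 - ANI)."""
    return 100.0 - ani_percent


def upgma(labels, distance):
    """Cluster labels by the unweighted pair group method (UPGMA).

    distance maps frozenset({a, b}) to the dissimilarity between a and b.
    Returns a nested tuple (left, right, height); a leaf is its label.
    """
    # Active clusters: id -> (subtree, number of leaves)
    clusters = {i: (label, 1) for i, label in enumerate(labels)}
    # Distances between active clusters, keyed by a frozenset of cluster ids
    d = {
        frozenset((i, j)): distance[frozenset((labels[i], labels[j]))]
        for i, j in combinations(range(len(labels)), 2)
    }
    next_id = len(labels)
    while len(clusters) > 1:
        pair = min(d, key=d.get)  # closest pair of active clusters
        a, b = sorted(pair)
        tree_a, n_a = clusters.pop(a)
        tree_b, n_b = clusters.pop(b)
        height = d.pop(pair) / 2.0
        # Size-weighted mean equals the average over all leaf pairs
        new_d = {}
        for c in clusters:
            d_ac = d.pop(frozenset((a, c)))
            d_bc = d.pop(frozenset((b, c)))
            new_d[frozenset((next_id, c))] = (n_a * d_ac + n_b * d_bc) / (n_a + n_b)
        d.update(new_d)
        clusters[next_id] = ((tree_a, tree_b, height), n_a + n_b)
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree


# Hypothetical ANI values in percent (not results from this study).
ani_values = {
    frozenset(("strain_A", "strain_B")): 99.0,
    frozenset(("strain_A", "strain_C")): 98.0,
    frozenset(("strain_B", "strain_C")): 97.0,
}
dist = {pair: ani_to_distance(v) for pair, v in ani_values.items()}
tree = upgma(["strain_A", "strain_B", "strain_C"], dist)
print(tree)
```

In the toy input, strain_A and strain_B have the highest ANI and are therefore joined first; strain_C is then attached at the average of its distances to the two.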
Table: Genomic features of V. parahaemolyticus FORC_014 (genome size in bp, GC content in %, and open reading frames)
Results and discussion
Genome tree analysis was performed using the 8 complete genomes of V. parahaemolyticus strains gathered from the NCBI database. Average nucleotide identity (ANI) values were calculated with these 8 strains and a dendrogram was constructed from the ANI values. All pairwise values among the strains were higher than 95% identity, which is the accepted criterion for membership of the same species. The FORC_014 strain clustered with the FORC_006 and UCM_V493 strains. The FORC_006 strain was isolated in South Korea, and the UCM_V493 strain is an environmental isolate from Spain. These comparison data are shown as a dendrogram and table in Additional file 2. We note that our strain scored slightly higher with the UCM_V493 strain than with the clinical strains.
Our results also revealed that the FORC_014 strain does not encode the tdh and trh genes, which are known to be major virulence factors of V. parahaemolyticus. However, using the BLAST method, we detected that the FORC_014 strain encodes various virulence factors, including two type III secretion systems (T3SSs) (Additional file 4). FORC_014 contains various iron uptake-associated genes (enterobactin receptors: irgA and vdtA; periplasmic binding protein-dependent ABC transport systems: vctP, vctD, vctG, and vctC; heme receptors: hutA and hutR; vibrioferrin-associated genes: pvuA,B,C,D,E, pvsA,B,C,D,E, and psuA) and a hemolysin (tlh; FORC14_3316). Additionally, we performed an LDH release assay using INT-407 cells to test cytotoxic activity (Additional file 5). The assay result supported the pathogenic activity of the FORC_014 strain. Based on these results, we suggest that FORC_014 is pathogenic, even though it is tdh and trh negative [5, 6, 27, 28].
In conclusion, we completed the genome sequence of V. parahaemolyticus FORC_014, a member of a species considered a leading cause of foodborne illness, and compared it with previously published strains. As a result, we found pathogenicity island regions in FORC_014 in which T3SS1-related and T3SS2-related genes are clustered on chromosome 1. Our findings not only provide new information about virulence-related genes, especially T3SS2 on chromosome 1 of V. parahaemolyticus, but could also support the results of previous studies on the pathogenicity of tdh- and trh-negative clinical strains. Further comparative genome studies of clinical and environmental isolates together with our V. parahaemolyticus strain will provide information crucial to revealing the major pathogenic mechanism.
Abbreviations
FORC: Food-borne Pathogen Omics Research Center
KRIBB: Korean Research Institute of Bioscience and Biotechnology
T3SS1: type III secretion system 1
T3SS2: type III secretion system 2
tdh: thermostable direct hemolysin
trh: thermostable direct hemolysin-related hemolysin
ACT: Artemis comparison tool
ANI: average nucleotide identity
ORF: open reading frame
COG: Cluster of Orthologous Groups
WebMGA: Web server for fast Metagenomic Sequence Analysis
BLAST: Basic Local Alignment Search Tool
JHL, SR, SHC, and HK designed and led the study. SA, HYC, SL, and KK drafted the manuscript. SA and KK performed assembly and annotation of the sequencing data. HYC, EJN, and SK performed experiments. SA, HYC, EJN, and SK analyzed the sequencing data and interpreted the results. SL, KCA, JHL, SR, and SHC contributed to the interpretation of the results. SA, HYC, and HK discussed the results and wrote the manuscript. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Availability of data and materials
The genome sequence of Vibrio parahaemolyticus FORC_014 has been deposited in NCBI GenBank under the accession numbers CP011406–CP011408 for chromosome 1, chromosome 2 and the plasmid.
This research was supported by a grant (14162MFDS972) from the Ministry of Food and Drug Safety, Korea in 2016.
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Su YC, Liu C. Vibrio parahaemolyticus: a concern of seafood safety. Food Microbiol. 2007;24(6):549–58.
- Fujino T, Okuno Y, Nakada D, Aoyama A, Fukai K, Mukai T, Ueho T. On the bacteriological examination of shirasu food poisoning. Med J Osaka Univ. 1953;4:299–304.
- DePaola A, Ulaszek J, Kaysner CA, Tenge BJ, Nordstrom JL, Wells J, Puhr N, Gendel SM. Molecular, serological, and virulence characteristics of Vibrio parahaemolyticus isolated from environmental, food, and clinical sources in North America and Asia. Appl Environ Microbiol. 2003;69(7):3999–4005.
- Lüdeke CH, Kong N, Weimer BC, Fischer M, Jones JL. Complete genome sequences of a clinical isolate and an environmental isolate of Vibrio parahaemolyticus. Genome Announc. 2015;3(2):e00216–20.
- Ottaviani D, Leoni F, Serra R, Serracca L, Decastelli L, et al. Nontoxigenic Vibrio parahaemolyticus strains causing acute gastroenteritis. J Clin Microbiol. 2012;50(12):4141–3.
- Caburlotto G, Lleò MM, Hilton T, Huq A, Colwell RR, Kaper JB. Effect on human cells of environmental Vibrio parahaemolyticus strains carrying type III secretion system 2. Infect Immun. 2010;78(7):3280–7.
- Broberg CA, Calder TJ, Orth K. Vibrio parahaemolyticus cell biology and pathogenicity determinants. Microbes Infect. 2011;13(12):992–1001.
- Ritchie JM, Rui H, Zhou X, Iida T, Kodoma T, Ito S, Davis BM, Bronson RT, Waldor MK. Inflammation and disintegration of intestinal villi in an experimental model for Vibrio parahaemolyticus-induced diarrhea. PLoS Pathog. 2012;8(3):e1002593.
- Chin C-S, Alexander DH, Marks P, Klammer AA, Drake J, et al. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nat Methods. 2013;10(6):563–9.
- Delcher AL, Salzberg SL, Phillippy AM. Using MUMmer to identify similar regions in large sequence sets. Curr Protoc Bioinform. 2003;10(3):1–3.
- Hall TA. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp Ser. 1999;41:95–8.
- Hyatt D, Chen G-L, LoCascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinform. 2010;11(1):1–11.
- Lagesen K, Hallin P, Rødland EA, Stærfeldt H-H, Rognes T, Ussery DW. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res. 2007;35(9):3100–8.
- Laslett D, Canback B. ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences. Nucleic Acids Res. 2004;32(1):11–6.
- Petersen TN, Brunak S, von Heijne G, Nielsen H. SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat Methods. 2011;8(10):785–6.
- Kolbe DL, Eddy SR. Fast filtering for RNA homology search. Bioinformatics. 2011;27(22):3102–9.
- Seemann T. Prokka: rapid prokaryotic genome annotation. Bioinformatics (Oxford, England). 2014;30(14):2068–9.
- Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, et al. The RAST Server: rapid annotations using subsystems technology. BMC Genom. 2008;9:75.
- Wu S, Zhu Z, Fu L, Niu B, Li W. WebMGA: a customizable web server for fast metagenomic sequence analysis. BMC Genom. 2011;12:444.
- Richter M, Rossello-Mora R. Shifting the genomic gold standard for the prokaryotic species definition. Proc Nat Acad Sci USA. 2009;106(45):19126–31.
- Goris J, Konstantinidis KT, Klappenbach JA, Coenye T, Vandamme P, Tiedje JM. DNA–DNA hybridization values and their relationship to whole-genome sequence similarities. Int J Syst Evolut Microbiol. 2007;57(1):81–91.
- Carver TJ, Rutherford KM, Berriman M, Rajandream MA, Barrell BG, Parkhill J. ACT: the Artemis comparison tool. Bioinformatics. 2005;21(16):3422–3.
- Kalburge S, Polson S, Crotty KB, Katz L, Turnsek M, Tarr C, Martinez-Urtaza J, Boyd E. Complete genome sequence of Vibrio parahaemolyticus environmental strain UCM-V493. Genome Announc. 2014;2(2):e00159–60.
- Makino K, Oshima K, Kurokawa K, Yokoyama K, Uda T, et al. Genome sequence of Vibrio parahaemolyticus: a pathogenic mechanism distinct from that of V. cholerae. Lancet. 2003;361(9359):743–9.
- Hubbard TP, Chao MC, Abel S, Blondel CJ, zur Wiesch PA, Zhou X, Davis BM, Waldor MK. Genetic analysis of Vibrio parahaemolyticus intestinal colonization. Proc Nat Acad Sci. 2016;113(22):6283–8.
- NorieaIii N, Johnson C, Griffitt K, Grimes DJ. Distribution of type III secretion systems in Vibrio parahaemolyticus from the northern Gulf of Mexico. J Appl Microbiol. 2010;109(3):953–62.
- Park K-S, Ono T, Rokuda M, Jang M-H, Okada K, Iida T, Honda T. Functional characterization of two type III secretion systems of Vibrio parahaemolyticus. Infect Immun. 2004;72(11):6659–65.
- Ham H, Orth K. The role of type III secretion system 2 in Vibrio parahaemolyticus pathogenicity. J Microbiol. 2012;50(5):719–25.