RNA-seq analysis of Macrobrachium rosenbergii hepatopancreas in response to Vibrio parahaemolyticus infection

Background The Malaysian giant freshwater prawn, Macrobrachium rosenbergii, is an economically important crustacean worldwide. However, production of this prawn is facing a serious threat from Vibriosis disease caused by Vibrio species such as Vibrio parahaemolyticus. Unfortunately, the mechanisms involved in the immune response of this species to bacterial infection are not fully understood. We therefore used a high-throughput deep sequencing technology to investigate the transcriptome and comparative expression profiles of the hepatopancreas from this freshwater prawn infected with V. parahaemolyticus to gain an increased understanding of the molecular mechanisms underlying the species’ immune response to this pathogenic bacteria. Result A total of 59,122,940 raw reads were obtained from the control group, and 58,385,094 reads from the Vibrio-infected group. Via de novo assembly by Trinity assembler, 59,050 control unigenes and 73,946 Vibrio-infected group unigenes were obtained. By clustering unigenes from both libraries, a total of 64,411 standard unigenes were produced. The standard unigenes were annotated against the NCBI non-redundant, Swiss-Prot, Kyoto Encyclopaedia of Genes and Genome pathway (KEGG) and Orthologous Groups of Proteins (COG) databases, with 19,799 (30.73%), 16,832 (26.13%), 14,706 (22.83%) and 7,856 (12.19%) hits respectively, giving a final total of 22,455 significant hits (34.86% of all unigenes). A Gene Ontology (GO) analysis search using the Blast2GO program resulted in 6,007 unigenes (9.32%) being categorized into 55 functional groups. A differential gene expression analysis produced a total of 14,569 unigenes aberrantly expressed, with 11,446 unigenes significantly up-regulated and 3,103 unigenes significantly down-regulated. The differentially expressed immune genes fall under various processes of the animal immune system. Conclusion This study provided an insight into the antibacterial mechanism in M. rosenbergii and the role of differentially expressed immune genes in response to V. parahaemolyticus infection. Furthermore, this study has generated an abundant list of transcript from M.rosenbergii which will provide a fundamental basis for future genomics research in this field. Electronic supplementary material The online version of this article (doi:10.1186/s13099-015-0052-6) contains supplementary material, which is available to authorized users.


Background
The Malaysian giant freshwater prawn, Macrobrachium rosenbergii (locally known as 'udang galah'), belongs to the genus Macrobrachium, which is the largest genus of the family Palaemonidae [1]. They are found in most inland freshwater areas, including lakes, rivers, swamps, estuarine areas, ponds, canals as well as in irrigation ducts [2]. M. rosenbergii spends its adult life in fresh water, but requires brackish water during the initial stages of its life cycle [3]. High demand from the aquaculture industry has led to large-scale farming of this prawn in many countries; the major producers being Bangladesh, Brazil, China, Ecuador, India, Thailand, Taiwan Province of China, and Malaysia [4].
The global production of this prawn had increased to over 200 000 tonnes/year by 2002, and income in Asia alone is now worth US$1 billion per annum [5,6]. In Malaysia, the production of cultured M. rosenbergii reached 281 metric tonnes by 1998 [4]. Generally, M. rosenbergii is assumed to be less resistant towards diseases than penaeid shrimp [7]. However, with the rise of large-scale high density prawn aquaculture techniques, production of this prawn worldwide is facing a serious threat from fatal diseases caused by nodaviruses and bacteria, particularly from the Vibrio species [8,9]. The emergence of these pathogens has had a detrimental impact on the M. rosenbergii farming industry, causing considerable economic losses.
Vibrio is a Gram-negative halophilic bacterium found abundantly in marine and estuarine environments [10,11]. Among the different species, Vibrio parahaemolyticus has emerged as an important pathogen for M. rosenbergii [12]. Several other marine shrimps such as Penaeus monodon, Penaeus japonicas and Litopenaeus vannamei have also been found to be susceptible to Vibrio infection [13]. Severe V. parahaemolyticus infection in prawns leads to a disease known as 'Vibriosis' [14,15]. M. rosenbergii suffering from vibriosis may appear black in colour on the carapace, with red discolouration of the exoskeleton and loss of appendages within six days, leading to an 80% mortality rate [12].
Acquiring and establishing knowledge regarding host pathogen interactions is necessary to unlock the pathogenesis of a particular disease. Host pathogen interactions can result in acute and adaptive immune responses against an invader; however, this has been lacking in M. rosenbergii [16]. The species defends itself against pathogen invasion using an innate immune system involving the cellular and humoral mechanisms [17,18]. Recently, some progress has been made in analysing the molecular mechanisms of shrimp-pathogen interactions, and several immune genes from shrimp have been discovered such as lectins, antimicrobial peptides, prophenoloxidase and manganese superoxide dismutase, using methods such as suppression subtractive hybridization (SSH) and expressed sequence tags (EST) [19][20][21]. However, these two methods have been found to be laborious and costly, which limits their use for the production of large-scale transcripts [22].
A cutting edge technology has emerged recently, known as Next Generation Sequencing technology (NGS). Currently, there are four established platforms which uses NGS technology: the Illumina Genome Analyzer, the Roche/454 Genome Sequencer FLX Instrument, and the ABI SOLiD System [23,24]. These platforms have proven versatile and cost-effective tools for advanced research in various genomic areas, such as genome sequencing and re-sequencing, DNA methylation analysis, miRNA expression profiling, and also in non-model organisms as the de novo transcriptome sequencing [25]. By using the NGS platform, transcriptome analysis can be performed faster and more easily, because it does not require any bacterial cloning of cDNAs [26]. NGS sequencing has the further advantage of generating greater depth of short reads with minimum error rates [27]. Moreover, it is more reliable and efficient than previous methods in measuring transcriptome composition, revealing RNA expression patterns, and discovering new genes on a larger scale [28]. The superiority of this technology also lies in its sensitivity, which allows the detection of low-abundance transcripts [29].
Previous studies have been performed on whole transcriptome sequencing of the hepatopancreas, gill and muscle tissues of M. rosenbergii using the Illumina Genome Analyzer IIx platform (Illumina). They successfully produced a comprehensive transcript data for this freshwater prawn, leading to the discovery of new genes [30]. This present study utilised a similar approach to analyse transcriptome data obtained from the hepatopancreas of M. rosenbergii experimentally infected with V. parahaemolyticus. The aim was to discover, and determine the role of, immune genes in M. rosenbergii involved in V. parahaemolyticus infection, which in turn could provide insights into the hostpathogen interactions between these two organisms.

Material and methods
M. rosenbergii and V. parahaemolyticus PCV08-7 challenge M. rosenbergii prawns (5-8 g body weight) purchased from a local hatchery (Kuala Kangsar, Perak, Malaysia) were acclimatized at 28 ± 1°C in aerated and filtered freshwater for one week prior to challenge with V. parahaemolyticus. During the challenge experiment, the prawns (n = 10) were intramuscularly injected with 100 μl 1X10 5 cfu cultured V. parahaemolyticus [31] whereas another batch of prawns (n = 10) were injected with 100 μl 2% NaCl (1:10, w/v) solution which serves as negative control group. The hepatopancreas tissues of the prawns were dissected at 12 hours post-infection. The tissues were rapidly frozen in liquid nitrogen and stored at −80°C until total RNA extraction. The 12 hour time point was chosen based on our previous work regarding immune related genes from M. rosenbergii in response to pathogen such as viruses showing significant gene expression at this time point [32][33][34][35].
Total RNA extraction and next-generation sequencing Total RNA (~20 mg) was isolated from both the V. parahaemolyticus-challenged and negative control group hepatopancreases. The RNA extraction process was performed by using the Macherey-Nagel NucleoSpin RNA II extraction kit in accordance with the manufacturer's protocols and stored at −80°C prior to RNA sequencing. The purity and integrity of the RNA was assessed by using the Bioanalyzer 2100 (Agilent technologies, USA). In each group, the total RNA samples were pooled from 10 prawns after which cDNA was synthesized followed by sequencing. The sequencing run was conducted on an Illumina HiSeq™ 2000 platform at the Beijing Genome Institute, Shenzhen, China. The sequencing data constituted 90 bp paired end read data, with~117 million raw reads.

Assembly and functional annotation
The raw reads were primarily quality filtered to remove adaptor sequences followed by removal of ambiguous 'N' nucleotides (with a ratio of 'N' more than 10%) and sequences with a phred quality score of less than 20 before proceeding to de novo assembly by using the Trinity software [36]. The Trinity programme assembles the reads into contigs and these contigs were assembled to unigenes. Finally, the TIGR Gene Indices clustering tools (TGICL) [37] with default parameters was applied to cluster the unigenes from both groups which produces nonredundant unigenes.
The non-redundant unigene sequences were aligned to databases which included NCBI non-redundant (Nr), Swissprot [38], Cluster of Orthologous Groups (COG) [39] and Kyoto Encyclopaedia of Genes and Genome (KEGG) [40] using BLASTX [41] with an E-value cut-off of 10 −5 . Gene Ontology (GO) was conducted utilizing default parameters using the BLAST2GO software [42,43]. It was from the above mentioned databases that the gene direction of the unigenes which were annotated and the coding sequence were determined from the BLAST results. The prediction for the coding sequence and the gene direction was performed by ESTscan [44] for those sequences with no defined annotation by using BLAST predicted coding sequence data as the training set.

Identification of differentially expressed unigenes
The FPKM method (Fragments Per kb per Million fragments) was used to calculate the transcript expression levels [45]. An FDR (false discovery rate) of <0.001 was used as the threshold p-value in multiple tests to judge the degree of differences in gene expression [46]. In a given library when the p-value was less than 0.001 and when the expression level showed greater than two-fold change between two groups genes were considered as differentially expressed.

Quantitative RT-PCR analysis
We selected seven differentially expressed M. rosenbergii unigenes (arginine kinase 1, anti-lipopolysaccharide factor, inhibitor of apoptosis protein, caspase, heat shock protein 21, lectin 1, and NF-kappa B inhibitor alpha) for quantitative RT-PCR analysis (qRT-PCR) to evaluate our Illumina sequencing result. The primer design for the seven unigenes was performed by using Primer3 software http:// www.bioinformatics.nl/cgi-bin/primer3plus/primer3plus.cgi/ and listed in Additional file 1. Using 1 μg of RNA, first strand cDNA synthesis was carried out (similar to the sample used for transcriptome sequencing) by using the ImProm-II™ Reverse Transcription System (Promega). The qRT-PCR reaction (20 μl) consisted of a 10 μl TaqMan Universal RT-PCR Master Mix (Applied Biosystems, Foster City, CA, USA), a 1 μl of primers/probe set containing 900 nM of forward reverse primers, a 300 nM probe and 2 μl of template cDNA. The qRT-PCR program was set with an incubation step at 50°C for 2 min, 40 cycles at 95°C for 10 min, 95°C for 15 sec, and 60°C for 1 min, carried out by using Step One Plus Real-Time PCR System® (Applied Biosystems). Similar qRT-PCR cycle profile was applied for the internal control gene, Elongation factor 1-alpha (primer sequences are listed in Additional file 1). The expression level of the seven immune genes were analysed by using the comparative CT method (2 -ΔΔCT method) [47].

Illumina sequencing and assembly
The task of profiling all the immune-related genes involved in V .parahaemolyticus infection began with sequencing the two cDNA libraries prepared from pooled mRNAs obtained from the hepatopancreases of the control and infected groups using the Illumina HiSeq™ 2000 platform. A total of 59,122,940 raw reads were obtained from the control group, and 58,385,094 reads from the Vibrio-infected group. The raw reads were further filtered to remove adaptor sequences, ambiguous reads and low quality reads, thereby generating 90-bp of 54,708,014 and 54,295,342 clean reads for the control and infected groups respectively ( Q 20~9 8% and percentage of unknown nucleotide is 0%). All sequencing reads were deposited into the Short Read Archive (SRA) of the National Centre for Biotechnology Information (NCBI), and can be accessed under the accession number SRR1424572 for control and SRR1424574 for Vibrio-infected ones.
All the clean reads were subjected to de novo assembly using the Trinity program which uses three independent software modules -Inchworm, Chrysalis, and Butterfly applied sequentially to process the huge sequencing data of RNA-seq reads. The assembly of the reads produced 95,645 contigs (with an N 50 of 467 bp and mean length of 313 bp) for the control group and 123,141 contigs (with an N 50 of 482 bp and mean length of 318 bp) for Vibrio-infected group. These contigs were further assembled into unigenes, producing 59,050 control unigenes (with an N 50 of 685 bp and mean length of 479 bp) and 73,946 infected group unigenes (with an N 50 of 829 bp and mean length of 532 bp). The length distribution of control and Vibrio-infected contigs and unigenes are shown in Additional file 2. By clustering unigenes from both libraries, a total of 64,411 standard unigenes were produced, with a mean size of 698 bp and an N 50 of 1137 bp. An overview of the sequencing and assembly is shown in Table 1.
The standard unigenes were annotated by searching the sequences using BLASTX against the NCBI nonredundant, Swiss-Prot, Kyoto Encyclopaedia of Genes and Genome pathway (KEGG) and Orthologous Groups of Proteins (COG) databases, which produced 19,799 (30.73%), 16,832 (26.13%), 14,706 (22.83%) and 7,856 (12.19%) hits respectively, giving a final total of 22,455 significant hits (34.86% of all unigenes). The size distribution profile for the coding sequences (CDS) and identified proteins are shown in Additional file 2. A Gene Ontology (GO) analysis search using the Blast2GO program resulted in 6,007 unigenes (9.32%) being categorized into 55 functional groups. The unigenes without hits using the BLASTX analysis were subjected to an ESTScan, producing 4,977 unigenes (7.82%) predicted to contain coding sequences. The size distribution of the ESTs and proteins are shown in Additional file 2.
The species distribution of the unigenes using the BLASTX results is shown in Figure 1. The M. rosenbergii unigenes were matched against Daphnia pulex sequences (10.3%), Tribolium castaneum (6.1%) and Pediculus humanus corporis (4.2%). The unigenes showed a match with those of D. pulex and T. castaneum probably because of their closer phylogenetic relationship and the availability of vast genomic information. The remaining unigenes (66.5%) which matched were similar to other species due to limited genome information in crustaceans.

Functional annotation
The standard unigenes were analysed using the COG database to classify them and predict their functions. A total of 7,856 unigenes were assigned to COG classifications and functionally classified into 25 protein families which mainly were involved in cellular structure, biochemistry metabolism, molecular processing, and signal transduction ( Figure 2). The cluster predicted for general function (3,431, 43.67%) emerged as the largest group, followed by translation, ribosomal structure, biogenesis cluster (1,760, 22.40%) and the replication, recombination, repair clusters (1,413, 17.98%). The clusters with the lowest number of unigenes were nuclear structure and extracellular structures (<1% in each cluster).
The standard unigenes with Nr annotations were subjected to Gene Ontology (GO) analysis, which provides a dynamic controlled vocabulary and hierarchical relationship for the representation of information on molecular functions, cellular components and biological processes, allowing a coherent annotation of gene products. The GO annotations produced 6,007 unigenes (biological process: 2,177 unigenes; cellular component: 1,563 unigenes; and molecular function: 2,267 unigenes) which were assigned to 55 GO ontology sub-categories ( Figure 3). In the molecular function category, most of the genes fell into the 'binding' and 'catalytic activity' groups, whereas in the biological process category, the majority of the genes fell into the categories of 'cellular processes' , 'metabolic processes' and 'single-organism processes'. Finally, in the cellular component category, a high percentage of the genes fell into the 'cell', 'cell part' and 'organelle' subcategories.
The Kyoto Encyclopaedia of Genes and Genomes (KEGG) Pathway is a collection of manually drawn pathway maps representing knowledge on molecular interactions and reaction networks. Pathway-based analysis are helpful in identifying biological functions and gene interactions in the pathway. Using the KEGG database, 14,706 unigenes were grouped into 253 pathways. The majority of the unigenes fell into the categories of "metabolic   pathways" (2192 members, 14.91%), "regulation of actin cytoskeleton" (557 members, 3.79%), "spliceosome" (509 members, 3.46%), "RNA transport" (498 members, 3.38%) and "focal adhesion" (475 members, 3.22% each). The least represented pathways, with less than 10 unigenes categorized in each pathway, were "biotin metabolism", "phenylalanine, tyrosine and tryptophan biosynthesis", "vitamin B6 metabolism", "lipoic acid metabolism" and "thiamine metabolism". The top twenty of these KEGG biological pathway classifications are shown in Figure 4.

Identification of aberrantly expressed genes
We identified differentially expressed genes between these two groups by comparing the relative transcript abundance in each unigene by using the FPKM method (Fragments Per kb per Million fragments). A total of 14,569 unigenes were found to be aberrantly expressed; 11,446 of these were significantly up-regulated, whereas 3,103 unigenes were significantly down-regulated ( Figure 5). The differentially expressed genes were annotated against the NR, Swiss-Prot, GO, COG and KEGG databases by BLASTX with a cut-off E-value of 10 −5 . The annotation analysis is presented in Additional file 3. The 9,469 (65%) unigenes containing low sequence homology to known sequences in public databases could represent non-coding RNA, misassembled unigenes or unknown genes of M. rosenbergii which responded to V. parahaemolyticus-infection.
For validation of the Illumina sequencing result, seven unigenes were chosen randomly for quantitative real time-PCR (qRT-PCR) analysis. The qRT-PCR results showed similar trends for all the genes to the sequencing data ( Figure 6). For example, based on the Illumina sequencing analysis, arginine kinase 1, antilipopolysaccharide factor, inhibitor of apoptosis protein, caspase, heat shock protein 21, lectin 1 and NF-kappa B inhibitor alpha were up-regulated 4.67, 4.13, 1.02, 3.08, 4.61, 4.4 and 3.11 log2-fold respectively; and the same elements showed 2.5, 4.4, 1.5, 2.1, 4.8, 3.6 and 2.7 log2-fold change respectively in the qRT-PCR analysis. While the results from these two analyses did not match perfectly, perhaps due to sequencing biases, the qRT-PCR analysis broadly confirmed the direction of change obtained from the Illumina sequencing analysis.
Potential immune-related genes involved in M. rosenbergii immune response Many of the aberrantly expressed genes found in the V. parahaemolyticus-infected groups compared to the control group are known to belong to various processes clustered under the animal immune system ( Table 2). These immune genes are grouped into 11 functions, including antimicrobial proteins, proteases and proteinases, signal transduction, blood clotting system, cell death, cytoskeletal, heat shock proteins, oxidative stress, pathogen recognition immune receptors, prophenoloxidase system and other immune genes. The role of these groups is described in more detail in the next (Discussion) section.

Discussion
Apart from viral diseases, Vibrio infections causing Vibriosis is another factor hindering the shrimp aquaculture industry worldwide [9]. This fatal disease has contributed to mass mortality and severe economic losses in India, Thailand, Philippines, Japan, Ecuador, Peru, Colombia and Central America [13]. Knowledge about the interaction between M. rosenbergii and Vibrio species is in its infancy, and in-depth study is urgently needed to address this issue. Discovery of the molecular mechanisms surrounding the innate immune system against Vibrio infection in freshwater prawns should be beneficial to both scientific research and the aquaculture industry. Identifying and quantifying immune-related gene expression on a large scale is a promising method to investigate the host response against pathogens and provide a platform for further studies in this area.
Microarray and EST analyses have long been used to study the molecular mechanisms underlying the innate immune system and to identify genes aberrantly expressed during infection [48][49][50]. However, the most recent NGS platforms such as the Illumina HiSeq™ 2000 appear much better at quantifying transcripts expressed at low levels than microarrays or EST analysis [29]. This is because this revolutionary technique verifies direct transcript profiling without compromise or bias, unlike previous methods [27]. Furthermore, this technology has been successfully used in transcriptome profiling studies on non-model organisms where there is no complete genome database [51][52][53]. The introduction of NGS technology has led to various studies on host-viral interactions in shrimps to identify potential immune-related genes [54-56]but not so far on the interaction between the freshwater prawn and Vibrio species. To our knowledge, this study could be the very first to use the Illumina HiSeq™ 2000 platform to explore the immune-related gene response in M. rosenbergii against V. parahaemolyticus.
Taking advantage of the Illumina HiSeq™ 2000 platform's capability to sequence with a high throughput data providing more candidate genes, the total RNA extracted at the 12th hour time point from a pool of control and infected hepatopancreases was sequenced and assembled using the Trinity assembler. The overall analysis yielded 14,549 differentially expressed unigenes, with 11,446 unigenes significantly up-regulated and 3,103 unigenes significantly down-regulated. The sequencing data analyses obtained clearly showed a significant impact of V. parahaemolyticus infection on the M. rosenbergii transcriptome.
M. rosenbergii possesses an innate immune system, consisting of cellular and humoral components which work individually or cooperatively to protect the species from invading pathogens such as V. parahaemolyticus [17,18]. This immune response is activated when the animal detects the invading pathogen through pattern recognition proteins (PRPs) [57]. Two important PRPs molecules identified in shrimp are lectins and the beta-1,3-glucan binding protein (LGBP) [58][59][60][61]. Besides pathogen recognition, lectins are also involved in phagocytosis through opsonisation in crustaceans [62]. Our data    showed that challenge by V. parahaemolyticus greatly affects the expression of PRPs, as observed in previous studies using different Vibrio strains [63][64][65][66][67]. The activation of LGBP molecules triggers the melanisation process, a prophenoloxidase-activating system (proPO-AS) which is an enzymatic cascade involving several enzymes, including the key enzyme phenoloxidase (PO) [68][69][70]. The active PO converts phenols into quinones. These build a nonspecific crosslink between neighbouring molecules to form melanin, which provides defence against invading microorganisms [71]. Increased activity of prophenoloxidase against Vibrio species has also been noted in Fenneropenaeus indicus [72], L. vannamei [73] and P. monodon [74]. Antimicrobial peptides (AMPs) play a pivotal role in killing or clearing infected pathogens, especially Vibrio species [75]. Notable shrimp AMPs, such as penaeidins, lysozymes, crustins, anti-lipopolysaccharide factors (ALFs) and stylicins have been identified and characterized previously in shrimps [76][77][78][79][80]. However, only four types of AMPslysozyme, crustin, NF-kappa B inhibitor alpha and ALF (3 isoforms)were detected in our transcriptome data and found to be highly expressed. The up-regulation of these AMPs correlated with previous studies showing their antimicrobial properties against Vibrio species and other bacteria [81][82][83][84]. Blood clotting is vital in crustaceans to prevent excess blood loss from a wound and prevent micro-organisms from invading the wound [85]. We found four molecules of the blood clotting systemtransglutaminase, clottable protein, proclotting enzyme and coagulation factor XIIto be highly induced in our transcriptome data after challenge by V. parahaemolyticus. A similar expression of these molecules after bacterial challenge has been reported in previous studies [72,86,87].
Stress conditions such as bacterial infections lead to an accumulation of reactive oxygen species (ROS) in a cell [88]. Increased levels of ROS causes oxidative damage to important cellular macromolecules (lipids, proteins, carbohydrates and nucleotides) which are components of the membranes, cellular enzymes and DNA [89]. In order to restrict the production of ROS, antioxidant genes are activated to produce antioxidant enzymes which eliminate ROS. Several antioxidant enzymes have been isolated and characterised in the penaeid shrimp in previous studies [90][91][92]. In this study, we found six antioxidant unigenes to be over-expressed after V. parahaemolyticus challenge the exception being glutamine synthetase. High expression levels of these genes have similarly been observed in other shrimps and scallops after Vibrio challenge [93][94][95][96]. The up-regulation of actin and tubulin genes play a crucial role in a wide range of cellular functions such as nodule formation, phagocytosis, encapsulation, as well as cell shape change, cell motility and adhesion, all of which may aid in clearing the pathogen [97].
Heat shock proteins (HSPs) are highly generated when induced by stress. They are known to play a major role in protein folding, the protection of proteins from denaturation or aggregation, and aiding protein transport through membrane channels [98,99]. In addition to molecular chaperones, HSPs have been reported to play important roles in innate immune responses, and have been well studied in crustaceans [100][101][102]. In this study, we noted higher levels of expression of all heat shock proteins in M. rosenbergii when challenged by V. parahaemolyticus. The increased expression of these HSPs is in line with previous reports, which tends to confirm the important role of these proteins in protecting this species from the stress induced by Vibrio challenge [97,103,104]. The general higher expression of proteinases and their inhibitors was to be expected in our data, as these are known to modulate elements of the innate immune system such as haemolymph coagulation, antimicrobial peptide synthesis, cell adhesion, and melanisation [105].
Bacteria like Vibrio are known to induce cell apoptosis through a variety of mechanisms such as pore-forming proteins, secretion of protein synthesis inhibitors, molecules activating the endogenous death machinery in an infected cell, lipopolysaccharides, and other superantigens [106]. Increased levels of apoptosis contribute to the degradation of DNA and RNA, which may contribute to shrimp mortality [107]. In our transcriptome data, an up-regulation of genes involved in apoptosis was observed, similar to the trends reported in previous studies [74,108,109]. Apoptosis may also serve as host defence against bacterium by allowing other healthy cells to phagocytise apoptotic bodies containing bacteria from target cells, effectively clearing the pathogen [110].
The signalling pathways involved in the M. rosenbergii innate immune response against V. parahaemolyticus were observed to be highly induced in our transcriptome data. The Toll protein initially identified in Drosophila had been reported to play a key role in the anti-fungal and anti-Gram-positive bacterial responses of flies in the Toll pathway [111]. Other Toll components have also been found to be activated in penaeid shrimps when challenged with Vibrio species [109,112,113]. In Caenorhabditis elegans, the mitogen-activated protein kinase (MAPK) pathways are transcriptionally up-regulated by the pore-forming toxin released by Bacillus thuringiensis, which provide a cellular defence against this toxin [114]. This could explain the higher expression of this signalling pathway in our data, as V. parahaemolyticus is known to release a thermostable direct haemolysin, which is a pore-forming toxin [115]. The Janus kinase (JAK) and signal transducer and activator of transcription (STAT) pathways have been reported to be activated when Fenneropenaeus chinensis is challenged with Vibrio anguillarum, which suggest that these pathways are important for immune responses against bacteria [116]. In addition, Rab-related proteins have been reported to regulate the hemocytic phagocytosis of bacteria in Marsupenaeus japonicas [117]. Several genes in the other immune gene group were found to be aberrantly expressed in our transcriptome data. Calmodulin, which plays an important role in calcium-dependent signal transduction pathways, was over-expressed in our dataas was the case in L. vannamei when challenged with V. parahaemolyticus [118]. Ferritin, an iron storage protein crucial for the metabolism of iron and maintaining iron homeostasis in a cell, was found to be down-regulated. The reduced expression of this gene could possibly lead to prawn mortality, as increased expression of this gene has been found to protect P. monodon from Vibrio harveyi [119]. Arginine kinase (AK), a phosphagen kinase in the invertebrate energy metabolism, has previously been reported to play an immune role against viral infection [32]. However, we observed a higher expression of this gene in M. rosenbergii after challenge by V. parahaemolyticus, which could suggest that AK plays a similar role in bacterial infection. Haemocyanin, an important immune gene in crustaceans, is involved in prophenoloxidase activity [120]. It has antiviral properties against WSSV [121], and its increased expression in our transcriptome data tends to bear out its importance as a defence molecule against challenge by V. parahaemolyticus. Finally, metallothioneins, a metalbinding protein, was found to be highly expressed in our transcriptome data. The increased expression of this gene was to be expected, as it is known as a scavenger of reactive oxygen intermediates and generally shows higher expression levels during immune responses in invertebrates against pathogens [49].

Conclusion
We utilised the Illumina HiSeq™ 2000 platform and Trinity assembler package to perform a de novo transcriptome profiling of the hepatopancreases isolated from M. rosenbergii challenged with V. parahaemolyticus. The differential expression analysis between V. parahaemolyticus-infected and control groups revealed significant differences in the gene expression with 11,446 unigenes found to be significantly up-regulated and 3,103 unigenes observed to be significantly down-regulated. This study provided a valuable insight into antibacterial mechanisms of freshwater prawn against V. parahaemolyticus with majority of the differentially expressed unigenes were grouped into 11 animal immune system categories. Furthermore, this study has generated an abundant list of transcript from M.rosenbergii which will provide a fundamental basis for future genomics research in this field.

Additional files
Additional file 1: Table S1. List of genes, primers and probes used in real-time TaqMan PCR assay.
Additional file 2: Figure S1. Overview of the control and V. parahaemolyticus infected transcriptome assembly. (A) The size distribution of the contigs obtained from our denovo assembly of high-quality clean reads. (B) The size distribution of the unigenes produced from further assembly of contigs (i.e., contig joining, gap filling, and scaffold clustering). (C) Size distributions of the coding sequences (CDS) and identified proteins and (D) Size distributions of the ESTs and proteins obtained from the ESTScan results. For unigene CDS that had no hits in the databases (Nr, SwissProt, KEGG and COG), the BLAST results were subjected to ESTScans and then translated into peptide sequences.