Phylogenetic characterization of norovirus strains detected from sporadic gastroenteritis in Seoul during 2014–2016

Background Phylogenetic analysis of norovirus (NoV) is an efficient way to track NoV transmission. To determine which NoV strains were widespread in Seoul, we conducted an extensive phylogenetic characterization of the NoV-positive samples among 1659 diarrheal specimens collected in 2014–2016 for the Seoul NoV-surveillance. Results By analyzing a large number of partial VP1 sequences of NoV from acute gastroenteritis patients together with their phylogenetic characterization, we identified molecular epidemiologic patterns based on the genetic characteristics of sporadic NoV strains circulating in Seoul, which provide a detailed description of genome-wide and community-wide NoV evolution in each genotype. The average NoV detection rate over the study period was 16.34%; the annual rate rose by 7.44 percentage points, from 13.17% in 2014 to 20.61% in 2016. The prevalence of NoV GI and GII was 4.43% and 93.36%, respectively, and GII.4, GII.17, and GII.3 were the major types among the 17 NoV genotypes. The most prevalent was GII.4 (50.92%), followed by GII.17 (18.08%) and GII.3 (9.96%). According to an extensive phylogenetic analysis based on partial VP1 sequences of 1008 NoV (276 sporadic, 518 outbreak and 214 reference), pandemic strains of GII.17, GII.4 and GII.3 emerged in succession during the 2014-2016 Seoul NoV-surveillance. GII.17 emerged as GII.17|Kawasaki323 in 2014 and became the predominant genotype in 2015 with the GII.17|2014_Kawasaki lineages (CUHK-NS-616/Kawasaki308). The formerly predominant GII.4 remained at a high level as GII.4|2012_Sydney in 2014 and was replaced internally by the GII.4|2016_Kawasaki194 lineage (NOR-2565/NOR-2558/OH16002), which caused the explosive rise in sporadic NoV from December 2015. The sporadically prevalent GII.3|Hu/Aichio334-13/2013 did not develop into any outbreaks, whereas sporadic GII.3|Hu/3-28/2015/HNZZ/CHN caused heavy outbreaks in Seoul, without preparation time, from November 2016.
Conclusions This is the first extensive phylogenetic study to reveal the important events among NoV strains circulating in Seoul. In particular, our study period from 2014 to 2016 was very dynamic, with one of the three main NoV strains (GII.17|2014_Kawasaki, GII.4|2016_Kawasaki194 and GII.3|Hu/3-28/2015/HNZZ/CHN) emerging each year. Such findings would be hard to detect by simple conventional analysis. Our study presents a paradigm for future NoV molecular epidemiology, which may be highly valuable for tracking new strains and predicting oncoming outbreaks. Electronic supplementary material The online version of this article (10.1186/s13099-018-0263-8) contains supplementary material, which is available to authorized users.


Background
Acute gastroenteritis (AGE) is a major public health problem [1], and NoV has been reported as its most common cause [2]. NoV is a nonenveloped, positive-sense, single-stranded RNA virus with a linear genome (7.5-7.7 kb) that belongs to the family Caliciviridae; its three open reading frames (ORFs) encode nine structural and nonstructural proteins [3][4][5]. ORF1 encodes nonstructural proteins such as NTPase, protease, and RNA-dependent RNA polymerase (RdRp). ORF2 overlaps ORF1 by a short region and encodes the major capsid protein, VP1. ORF3 encodes the minor capsid protein, VP2 [6]. NoVs are highly diverse and are currently divided into six genogroups (GI/GII/GIII/GIV/GV/GVI) with more than 40 genotypes based on their VP1 sequences [7,8].
NoV has been reported to have caused at least six pandemics of AGE (defined as taking place on at least three continents over a similar time-frame) since 1995, beginning in 1995-1996 [8]. More than 40 NoV genotypes cocirculate within the population; however, only GII.4 has repeatedly given rise to novel variants, about every 2-4 years, causing massive outbreaks and pandemics [9]. As with influenza virus, population immunity may drive the evolution of NoV and the emergence of its new variants [10]; NoV undergoes genetic and antigenic evolution through the accumulation of point mutations and intra- and intergenotype recombination [11].
As awareness and knowledge of NoV epidemiology in Seoul have grown, the question has been raised of how to track the emergence of new NoV strains effectively and how to monitor their spread. We therefore undertook a more extensive phylogenetic characterization of the data obtained from the Seoul NoV surveillance. This NoV-surveillance system aims to control the spread of future NoV outbreaks by monitoring the circulating strains. Here, we present the widespread and newly emerged NoV strains in Seoul and characterize their molecular epidemiology in the 2014-2016 Seoul surveillance. During the 3-year study period, we observed novel epidemic strains with a global distribution; however, their sub-lineages differed in the scale and impact of their distribution or prevalence alongside co-circulating strains in Seoul. We also report sporadic strains that developed into outbreaks in our NoV-surveillance system. A total of 1008 sequences were analyzed phylogenetically in five different NoV models (GI/GII.4/GII.17/GII.3/other types of GII).

Ethics statement
All processes from sample collection to the diagnosis of NoV followed the National Norovirus Surveillance System in Korea (K-CaliciNet) and the "Guideline for water and foodborne diseases prevention and control" [12] under the Korean "Enforcement Regulations of the Infectious Disease Control and Prevention Act". All data were handled according to the regulations of the Korea Centers for Disease Control and Prevention (KCDC). The present study was carried out after the diagnosis of NoV, using anonymous epidemiological data and a phylogenetic characterization of NoV sequences. According to the Human Subjects Institutional Review Board (IRB) of the Korea National Institute for Bioethics Policy, the present study is exempt from deliberation.

Specimens and diagnosis
A total of 1659 diarrheal fecal specimens were collected from AGE patients aged 0-84 years at ten local hospitals in Seoul during 2014-2016 (Fig. 1). The samples were collected weekly, pretreated on arrival, and tested by qRT-PCR using PowerCheck™ Norovirus GI/GII (Kogene, Korea). For phylogenetic analysis, partial regions of VP1 for GI and GII genotypes were amplified by RT-PCR and semi-nested RT-PCR with the primers described in Table 1 [13] using the HyQ™ One-step RT-PCR kit (SNC, Korea). For one-step RT-PCR, the specific primer pairs (GI-F1M/GI-R1M and GII-F1M/GII-R1M) targeting VP1 in ORF2 were used. Semi-nested PCR was performed using the one-step RT-PCR product (2 μl) and the primers GI-F2/GI-R1M and GII-F3/GII-R1M. The RT-PCR products were purified using the QIAquick Gel Extraction Kit (Qiagen, Hilden, Germany) before being submitted for DNA sequencing to Macrogen, Inc. (Seoul, Korea). Nucleotide sequencing was carried out by Macrogen using the Big Dye Dideoxy cycle sequencing kit and the ABI PRISM 3730XL Analyzer (Applied Biosystems, USA). NoV detection was conducted according to the guidelines of the KCDC [12] and the manufacturers' instructions.
In the present study, a total of 4073 diarrheal specimens (1659 sporadic, 2414 outbreak) from AGE patients covered part of the population of Seoul and may represent the NoV trend in Korea. In addition, we extended the study period of Seoul NoV epidemicity from 3 years (2014-2016) to 10 years (2007-2016) by reanalyzing unreported NoV-surveillance data (Fig. 3).
To detect sporadic strains that developed into outbreaks in the Seoul NoV-surveillance system, 518 strains from 2414 AGE patients in outbreaks in Seoul during January 2014-June 2017 were compared with 276 strains from 1659 AGE patients in the surveillance during January 2014-December 2016. To examine phylogenetic relationships with reference strains, 126 candidate standard strains (Additional file 1: Table S1) and 88 global strains (selected as those highly similar to our strains among updated NCBI sequences in each genotype) of NoV GI and GII were collected from NoroNet and GenBank and analyzed phylogenetically together with the above-mentioned 794 (518 + 276) strains in the five NoV models, GI/GII.4/GII.17/GII.3/other types of GII (Fig. 2). To investigate whether unreported NoV epidemic curves were present, we reanalyzed the data [14,15] previously obtained from the Seoul NoV-surveillance system in 2007-2013 (Fig. 3).
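The comparison of sporadic and outbreak strains described above can be illustrated with a minimal sketch. It is not the procedure used in this study, which relied on phylogenetic trees; it only flags candidate pairs by exact sequence identity and collection date. The records, strain identifiers, and sequences below are hypothetical, and the function name `sporadic_before_outbreak` is introduced purely for illustration.

```python
from datetime import date

# Hypothetical records: (strain_id, source, collection_date, partial_VP1_sequence).
# The sequences are short made-up strings, not real norovirus data.
records = [
    ("S1", "sporadic", date(2014, 7, 10), "ATGGCGTCGA"),
    ("S2", "sporadic", date(2015, 11, 3), "ATGGCTTCGA"),
    ("O1", "outbreak", date(2015, 1, 20), "ATGGCGTCGA"),
    ("O2", "outbreak", date(2015, 6, 2), "ATGACGTCGA"),
]


def sporadic_before_outbreak(records):
    """Return (sporadic_id, outbreak_id) pairs with identical sequences
    in which the sporadic case was collected before the outbreak case."""
    first_sporadic = {}
    for strain_id, source, day, seq in records:
        if source == "sporadic":
            # Keep only the earliest sporadic detection of each sequence.
            if seq not in first_sporadic or day < first_sporadic[seq][1]:
                first_sporadic[seq] = (strain_id, day)

    pairs = []
    for strain_id, source, day, seq in records:
        if source == "outbreak" and seq in first_sporadic:
            sporadic_id, sporadic_day = first_sporadic[seq]
            if sporadic_day < day:
                pairs.append((sporadic_id, strain_id))
    return pairs


pairs = sporadic_before_outbreak(records)
print(pairs)
```

Exact identity over the trimmed region is a deliberately strict criterion; a real analysis would also consider near-identical sequences and their position in the tree.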

Phylogenetic analysis
With the rapid accumulation of a huge amount of norovirus sequence data [16] and an attempt by the CDC to establish a unified norovirus classification [17], the complete VP1 was replaced as the typing target by a partial nucleotide sequence of a highly variable N-terminal region of VP1 (277 nucleotides). The partial VP1 capsid region in ORF2 has therefore been used routinely since 2006 for genotyping and for detecting norovirus variants with a web-based automated typing tool [18]. For convenience of analysis, the primer sets [13] targeting a partial VP1 region have been recommended for PCR cloning and sequencing in the National Norovirus Surveillance Guideline in Korea [12]. In the five phylogenetic trees (GI/GII.4/GII.17/GII.3/other types of GII), 276 sporadic sequences and 518 outbreak sequences were aligned with the candidate standard strains (Additional file 1: Table S1) and additional global strains (Fig. 2). Phylogenetic analysis was performed using MEGA 7.0 (Fig. 2) based on the partial VP1 sequences (289 nucleotides at positions 5360-5648 with reference to NC_001959 for GI, and 279-293 nucleotides from position 5085 to 5363-5377 with reference to NC_029646 for GII). All aligned sequences were trimmed to the above-mentioned nucleotide positions for NoV GI and GII, respectively. Maximum Likelihood phylogenetic trees were constructed in MEGA 7.0 based on the Tamura-Nei model, with a bootstrap consensus tree inferred from 1000 replicates; bootstrap values above 70% are shown.
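As a check on the trimming coordinates quoted above, the following sketch recomputes the window lengths and shows how a sequence could be cut to such a window. It assumes 1-based inclusive coordinates and an ungapped sequence whose first character sits at a known reference position; the helper names `window_length` and `trim` are hypothetical and do not correspond to any MEGA function.

```python
# Coordinates quoted in the text (assumed 1-based and inclusive).
GI_WINDOW = (5360, 5648)        # relative to NC_001959
GII_START = 5085                # relative to NC_029646
GII_END_RANGE = (5363, 5377)    # the GII window end varies within this range


def window_length(start, end):
    """Number of nucleotides in an inclusive window."""
    return end - start + 1


def trim(seq, start, end, offset=1):
    """Cut `seq` to the inclusive window [start, end].

    `offset` is the reference position of the first character of `seq`
    (an assumption of this sketch; gaps are not handled).
    """
    return seq[start - offset:end - offset + 1]


gi_len = window_length(*GI_WINDOW)
gii_lens = tuple(window_length(GII_START, end) for end in GII_END_RANGE)
print(gi_len, gii_lens)  # 289 (279, 293)

# Toy sequence covering positions 1..10, not real norovirus data.
toy = "ACGTACGTAC"
print(trim(toy, 3, 6))  # GTAC
```

The computed lengths agree with the 289 nucleotides given for GI and the 279-293 nucleotides given for GII.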

Prevalence of NoV
NoV was detected in 271 of 1659 specimens (16.34%), and the annual detection rate increased from 13.17% in 2014 to 20.61% in 2016 (Table 2).
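The detection rates reported in this paper can be reproduced from the specimen counts it gives (271/1659 overall, 89/676 in 2014 and 108/524 in 2016). The sketch below is only an arithmetic check; the function name `detection_rate` is introduced here for illustration.

```python
def detection_rate(positives, specimens):
    """Detection rate as a percentage, rounded to two decimals."""
    return round(100 * positives / specimens, 2)


# Counts as quoted in the text.
overall = detection_rate(271, 1659)
rate_2014 = detection_rate(89, 676)
rate_2016 = detection_rate(108, 524)

# Change between 2014 and 2016, in percentage points.
increase_points = round(rate_2016 - rate_2014, 2)

print(overall, rate_2014, rate_2016, increase_points)  # 16.34 13.17 20.61 7.44
```

Note that the 7.44 figure is a difference in percentage points, not a relative increase.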

Discussion
To characterize the phylogenetic epidemiology of NoV strains circulating in Seoul, we first conducted an extensive phylogenetic analysis based on partial VP1 sequences of a total of 1008 NoV (794 NoV positives from 4073 AGE specimens and 214 global references from NoroNet and GenBank). This extensive phylogenetic analysis revealed In the GI phylogenetic tree, the three sporadic sequences (GI.4|Valetta, GI.3|Beijing55042 and GI.5|Musgrove) found in 2014 were identical to sequences from nine outbreaks in 2015-2016. They were also identical to the global strains (AB545482 and KT383937 in GI.4|Valetta,  (Fig. 4). These data suggest that the Seoul sporadic GI strains may have originated in Japan.
In the three major phylogenetic trees of GII.4, GII.17 and GII.3, all the strains reported in the present study were the same as global strains reported in NoroNet and GenBank. In the GII.17 phylogenetic tree, the first occurrence was detected as GII.17|Kawasaki323 (AB983218) in July 2014, which was identical to the new GII.P17-GII.17 reported in Japan in March 2014 [19]. The new GII.17 shared 96% amino acid sequence identity with the GII.17 strain reported in Korea by the KCDC [20]. After the first emergence of GII.17|Kawasaki323, GII.17|CUHK-NS-616 and GII.17|Kawasaki308 increased sharply during the winter of 2014-2015, and GII.17 became the most dominant genotype in Seoul in January 2015 (Figs. 3 and 6). This epidemicity was consistent with the global trend of the new GII.17 (GII.17|Kawasaki). According to the previous report by the KCDC, GII.17, previously considered a minor type in Korea, has been the predominant NoV since December 2014 [20]. During the winter of 2014-2015, the new GII.17 also emerged and became the predominant genotype in Japan [19], several major cities in mainland China [21,22], and Hong Kong [23]. The new GII.P17-GII.17 was also detected sporadically outside Asia, in countries such as Italy, Romania, and the United States [24][25][26]. In our previous

5).
In the NoV GII.4 phylogenetic tree, all the sporadic GII.4 sequences clustered tightly together and were largely divided into two sub-clusters (GII.4|2012 and GII.4|2016) (Figs. 2 and 5). These sub-clusters differed distinctly in the progression from sporadic strains to outbreaks and seasonal epidemics. According to a previous report, GII.4|2012_Sydney was the most frequently found sub-genotype (60.4%) in Korea between November 2012 and January 2013 [27]. GII.4|2012_Sydney was still the most frequently found sub-genotype (51.69%, 46/89 positives) in the Seoul NoV-surveillance in 2014 (Fig. 5). From November 2015, GII.4|2012 was replaced internally by the novel GII.4|2016_Kawasaki194 (NOR-2565/NOR-2558/OH16002), a variant of GII.4|2012_Sydney (JX459908). The sharp increase in sporadic NoV in the winter of 2015 (Fig. 3) was mainly caused by GII.4|2016_Kawasaki194 (LC175468), which was first detected in AGE patients in Kawasaki City in 2016 [28]. After its first detection, GII.4|2016_Kawasaki194 spread very rapidly and caused an explosive rise in sporadic NoV, which raised the NoV detection rate from 13.17% (89/676 samples in 2014) to 20.61% (108/524 samples in 2016) during the present study period. Owing to the unusually high NoV activity, the seasonal epidemic curve in 2016 was skewed to a novel "right-sided W-shaped" curve (Fig. 3). By reanalyzing the Seoul NoV-surveillance data from 2007-2013, we found that the epidemic curve in 2007 was similar to our novel "right-sided W-shaped" curve in 2016 (dotted box in Fig. 3). Additional evolutionary studies are required to investigate why the estimated GII.4|2006b and GII.4|2016_Kawasaki194 strains spread more rapidly and caused heavy explosions in Seoul in 2007 and 2016 compared with other pandemic strains.
In the GII.4|2016_Kawasaki194 sub-cluster, our 11 sporadic sequences were identical to four global strains (KY887601, KY905335, KX764665 and KX764664), which were collected from human stools and stream waters in January and September 2016. These four global strains and the GII.4|2016_Kawasaki194 strain were all reported as GII.P16-GII.4 Sydney2012 recombinants [29,30]. The Seoul sporadic Kawasaki194 strain first appeared in November 2015, at least a few months ahead of the identical global strains and the GII.4|2016_Kawasaki194 candidate standard strain.

Limitation
We acknowledge several limitations of the present study in determining the geographical distribution and mechanism of NoV outbreaks, although the study abided by K-CaliciNet and the national norovirus surveillance guidelines [12]. First, the analysis of target sequences was confined to the partial VP1 region. Although it is not long enough to cover the whole genome of NoV, the partial VP1 was designated as the NoV surveillance target sequence because of its high sequence variability and efficiency. Sequencing of the partial VP1 also allows large numbers of NoV to be genotyped economically and related to epidemiologic trends. A novel NoV lineage containing the GII.P16 polymerase with pandemic GII.4 Sydney and other GII capsids was recently detected in Asia and Europe during the winter of 2016-2017 [29]. To examine NoV evolution through recombination and in surface-exposed antigenic regions, future studies should explore larger target sequences covering RdRp and the complete VP1 region. Second, the 10 hospitals enrolled in the present study were not evenly distributed across Seoul (three of the 10 hospitals were located in one administrative district, or Gu) (Fig. 1). Although we could track new strains through phylogenetic analysis with outbreaks and global strains, this was insufficient to resolve detailed NoV transmission routes. Lastly, our sample collection was limited to patients with symptomatic infection, whereas over 30% of NoV infections are asymptomatic with virus shedding [33].

Conclusions
During 2014-2016, we identified 17 NoV genotypes and their sub-genotypes widespread in Seoul. Through the first extensive phylogenetic characterization of 1008 sequences, we could track the emergence of new NoV strains able to cause massive outbreaks or sporadic AGE infections globally. Most of them were found to be the novel variants of three major genotypes (GII. By analyzing the development from sporadic strains to outbreaks in various phylogenetic trees, we showed distinctly different patterns for each NoV lineage. Our report has important implications for understanding NoV incidence and developing a vaccine against NoV.

Additional file
Additional file 1: Table S1. Candidate standard strains for genotyping and sub-clustering.