Detection and genomic characterization of coronaviruses among migratory birds in Guangdong Province, China

The recent severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic highlights the significant threat that coronaviruses (CoVs) pose to public health. With their extensive cross-continental movements, migratory birds have the potential to serve as reservoirs and vectors for CoVs. This study aimed to investigate the prevalence of CoVs in birds in densely populated areas of Guangdong Province, China. Of the 128 samples collected from birds, six tested positive for CoVs (4.7%, 95% CI: 1.7–9.9%).
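The prevalence and its confidence interval can be recomputed from the two counts given above. The sketch below assumes an exact (Clopper-Pearson) binomial interval; the text does not state which interval method was used, so the function names and the choice of method are illustrative only.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Exact two-sided (1 - alpha) interval for a binomial proportion.

    Assumption: an exact interval is used here for illustration; the
    source does not name its interval method.
    """

    def solve(f, target: float) -> float:
        # f(p) decreases monotonically on [0, 1]; bisect until f(p) == target.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound solves P(X >= k | p) = alpha / 2.
    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper bound solves P(X <= k | p) = alpha / 2.
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


positives, total = 6, 128  # counts reported in the text
prevalence = positives / total
low, high = clopper_pearson(positives, total)
print(f"prevalence = {100 * prevalence:.1f}%")
print(f"95% CI = {100 * low:.1f}% to {100 * high:.1f}%")
```

With six positives out of 128, the point estimate rounds to 4.7%, and the exact interval should fall close to the range quoted above.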


Main text
Coronaviruses (CoVs) are considered major threats to humans and animals following several zoonotic epidemics that originated from animal reservoirs (Shi and Hu 2008; Woo et al. 2009; Forni et al. 2017). For example, SARS-CoV-2 likely originated from bats; it has infected more than 775 million people and killed close to 7 million people worldwide (WHO n.d.; Decaro and Lorusso 2020; Lu et al. 2020; Huang et al. 2020). Thus, monitoring potentially pathogenic CoVs in animals is important for predicting and preventing viral epidemics.
CoVs are enveloped, positive-sense, single-stranded RNA viruses that belong to the family Coronaviridae, which can be divided into three subfamilies. One of these, the subfamily Orthocoronavirinae, includes four genera: Alphacoronavirus, Betacoronavirus, Gammacoronavirus, and Deltacoronavirus (Woo et al. 2012; Gorbalenya et al. 2006). CoVs from Alphacoronavirus and Betacoronavirus mainly infect bats, humans, and other mammals. Birds are among the major hosts of Gammacoronavirus and Deltacoronavirus. The representative Gammacoronavirus in birds is infectious bronchitis virus (IBV), which can damage the respiratory, reproductive, and urinary systems of chickens, resulting in economic losses to the poultry industry (Sjaak de Wit et al. 2011).
Recently, multiple novel CoVs were discovered in birds, including gammacoronaviruses, such as goose coronavirus CB17 (Papineau et al. 2019) and duck coronavirus 2714 (Zhuang et al. 2015), and deltacoronaviruses, such as bulbul coronavirus HKU11 (Woo et al. 2009) and white eye coronavirus HKU16 (Woo et al. 2012). These findings indicated that some undiscovered CoVs may circulate through the migration of birds. Wild birds are both natural and intermediate hosts of multiple viruses and play important roles in long-distance viral transmission through their migratory habits (Rahman et al. 2021). However, only a few studies have investigated CoVs in wild birds in Guangdong Province. We performed this study to discover CoVs carried by wild birds, rare birds, and free-range poultry in Guangdong Province, China.

Viral metagenomics
Among the six positive samples, only three passed RNA quality control for viral metagenomics (Supplemental Table A4), and these three samples were sequenced. In the XN11 library, a total of 122,975 reads were mapped to the night heron coronavirus HKU19-6918 template (JQ065047). In the XN18 library, 1,304,623 reads were assembled into a complete sequence with the highest similarity to the IBV ck/CH/LHB/130578 (KP118890) sequence. In the DWY40 library, 1,408 reads were mapped to the Anser fabalis coronavirus NCN2 template (MW436465). The full genomes of the three CoVs were then obtained after confirmation by Sanger sequencing with overlapping primers and remapping against the libraries (Supplemental Figure A4). The three viruses were named Chinese pond heron coronavirus XN11 (CP_XN11, GenBank No. OR208630), Mallard infectious bronchitis virus XN18 (MD_XN18, GenBank No. OR208629), and Swan goose coronavirus DWY40 (SG_DWY40, GenBank No. OR346994). The sequence read archive (SRA) data and genomes of all three viruses were submitted to NCBI (Supplemental Table A4).

Phylogenetic analysis
Figure 2 shows that both MD_XN18 and SG_DWY40 clustered with Gammacoronavirus. CP_XN11 clustered with the night heron coronavirus HKU19 strain and belongs to the genus Deltacoronavirus. MD_XN18 clustered with multiple avian IBVs rather than with duck CoVs. Further analysis demonstrated that MD_XN18 clustered with IBV Moroccan-G/83, which is a representative strain of the IBV GI-13 genotype (Supplemental Figure A3). Therefore, the strain obtained in this study belongs to the GI-13 genotype.

The Chinese pond heron coronavirus XN11
CP_XN11 is a new Deltacoronavirus strain discovered in China. Deltacoronaviruses were first identified in 2009 and were subsequently detected worldwide in multiple hosts, including pigs, quails, sparrows, houbaras, and falcons. These findings suggested that deltacoronaviruses have the potential to be transmitted between avian species and from avian to mammalian hosts (Lau et al. 2018). Currently, deltacoronavirus sequences are still rarer than those of other CoV genera. In this study, we identified CP_XN11 in the Chinese pond heron, which is a migratory bird in East Asia. The genome of CP_XN11 is 26,077 base pairs (bp) long, with a GC content of 38.50%. It includes a 5' untranslated region (UTR) of 481 bp and a 3' UTR of 201 bp (Supplemental Figure A2 and Supplemental Table A5). The complete genomes, genes, and proteins of CP_XN11 shared 95.2-98.8% similarity with those of the night heron coronavirus HKU19 strain (Table 1).
Fig. 2 The phylogenetic tree was constructed from the whole viral genomic sequences of gammacoronaviruses (marked with green, upper branch) and deltacoronaviruses (marked with blue, lower branch). The viruses discovered in this study are marked in red. The viruses isolated from similar hosts are marked by black lines with a host sketch on the right

Mallard infectious bronchitis virus XN18
IBV causes a highly contagious disease that has led to significant economic losses in the poultry industry worldwide (Zhao et al. 2023). Migratory birds are suspected to be wild reservoirs that spread IBV strains worldwide; however, direct evidence of this phenomenon has rarely been reported. Recently, Hemnani et al. found partial IBV-like sequences of 406 nt in Anseriformes in Portugal (Hemnani et al. 2022). In this study, we obtained the complete genome sequence of MD_XN18, which is 27,645 bp long, with a GC content of 38.19%, a 5' UTR of 519 bp, and a 3' UTR of 257 bp (Supplemental Figure A2 and Supplemental Table A5). The complete genomes, genes, and proteins of MD_XN18 were more similar to those of chicken IBVs than to those of duck CoVs and shared 95-99% similarity with those of the Gammacoronavirus sp. DLSL21 strain, which was discovered in chickens in Jilin Province, Northeast China (Table 1). Further analysis revealed approximately 199 amino acid differences in viral structural proteins between MD_XN18 and Gammacoronavirus sp. DLSL21, which may have contributed to the cross-species transmission of avian IBV to mallards (Supplemental Table A6). The phylogenetic tree showed that MD_XN18 belongs to the GI-13 genotype, which is one of the three most widely circulating IBV genotypes in China over the past 30 years (Fan et al. 2022). This finding may provide direct evidence that migratory mallards can spread avian IBVs.
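The genome length, GC content, and UTR lengths reported for each virus are simple summary statistics of the assembled sequence and its annotation. A minimal sketch of how such values can be derived is given below; the function names and the toy sequence and coordinates are hypothetical and are not taken from the study.

```python
def genome_stats(seq: str) -> dict:
    """Return the length and GC percentage of a nucleotide sequence."""
    seq = "".join(seq.split()).upper()  # drop whitespace and line breaks
    gc = seq.count("G") + seq.count("C")
    # Only unambiguous bases enter the denominator (an illustrative choice).
    called = sum(seq.count(base) for base in "ACGTU")
    gc_percent = round(100 * gc / called, 2) if called else 0.0
    return {"length": len(seq), "gc_percent": gc_percent}


def utr_lengths(genome_length: int, first_cds_start: int, last_cds_end: int) -> tuple[int, int]:
    """5' and 3' UTR lengths from 1-based, inclusive coding coordinates."""
    five_prime = first_cds_start - 1
    three_prime = genome_length - last_cds_end
    return five_prime, three_prime


# Toy example with a made-up 20-nt sequence and made-up coordinates.
toy = "ATGCGCATTA\nGGCCTTAATA"
stats = genome_stats(toy)
print(stats)
print(utr_lengths(stats["length"], first_cds_start=4, last_cds_end=15))
```

Applied to a real genome, the same two functions would take the assembled sequence and the start of the first and end of the last annotated coding region as input.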

Swan goose coronavirus DWY40
Viruses belonging to the Canada goose coronavirus CB17 species have been found in Canada geese, snow geese, tundra swans, greater white-fronted geese, Indian spot-billed ducks, and bean geese (Papineau et al. 2019; Zhu et al. 2021). This study reports the first case of a Canada goose coronavirus CB17 virus infecting the swan goose (Anser cygnoides) in Asia. The complete genome sequence of SG_DWY40 is 28,474 bp long, the GC content is 38.20%, the 5' UTR length is 520 bp, and the 3' UTR length is 626 bp (Supplemental Figure A2 and Supplemental Table A5). The complete genomes, genes, and proteins of SG_DWY40 shared 93.6-99% similarity with those of the NW436456 Anser fabalis-NCN2 strain, except for the S gene and S protein. The S gene and S protein of SG_DWY40 shared 90.1% and 92.6% similarity with those of the goose coronavirus CB17 strain but shared only 77.2% and 69.8% similarity with those of the Anser fabalis-NCN2 strain (Table 1).
The results of the recombination analysis based on the SG_DWY40 genome are shown in Fig. 3A. The 1-20,000 nt sequence was most similar to that of the Anser fabalis-NCN2, followed by the 23,238-23,514 nt sequence, which was most similar to that of the Canada goose CB17. Interestingly, two recombination events with unknown sources may have occurred at nucleotides 19,997-21,238 and 23,514-24,122 in the SG_DWY40 genome, potentially affecting the expression of the S protein and of the proteins encoded between the S gene and the E gene.
Thus, the putative viral proteins were identified by ORF Finder, and their functional domains, such as transmembrane domains, signal sequences, and N-linked glycosylation sites, were subsequently predicted. The results are shown in Fig. 3B. We found that two genome regions near the S gene of SG_DWY40 may originate from an unknown parent sequence. Interestingly, two putative proteins of SG_DWY40 may differ from those of other Canada goose CoVs. One is the ORF3a protein, which can play a role in host responses, viral replication, virus pathogenicity, and host-virus interactions during coronavirus infection (Si et al. 2023). Compared with those of Canada goose CB17, the S and ORF3b proteins of SG_DWY40 have similar functional domains. However, compared with that of other Canada goose CB17 viruses, the ORF3a protein is smaller and contains an extra N-linked glycosylation site and transmembrane domain. The other is the putative NS3.5 protein, a novel putative viral protein that was predicted to be located between the ORF3a and ORF4a proteins in the SG_DWY40 genome. NS3.5 is 148 amino acids long and has two N-linked glycosylation sites and one transmembrane domain (Fig. 3B). The existence and functions of the putative NS3.5 protein need further confirmation.
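A similarity plot of the kind shown in Fig. 3A can be approximated by computing percent identity in sliding windows along a pairwise alignment and comparing the curves for each candidate parent. The sketch below is an illustration only: the text does not name the software or the window settings that were used, so the function names, window size, and step size here are hypothetical.

```python
def window_identity(query: str, ref: str, window: int = 200, step: int = 20):
    """Sliding-window identity between two aligned sequences.

    Returns (window midpoint, identity) pairs. Columns with a gap ('-')
    in either sequence are skipped. Window and step are arbitrary defaults.
    """
    if len(query) != len(ref):
        raise ValueError("sequences must be aligned to the same length")
    points = []
    for start in range(0, len(query) - window + 1, step):
        pairs = [
            (q, r)
            for q, r in zip(query[start:start + window], ref[start:start + window])
            if q != "-" and r != "-"
        ]
        identity = sum(q == r for q, r in pairs) / len(pairs) if pairs else float("nan")
        points.append((start + window // 2, identity))
    return points


def closest_parent(query: str, refs: dict, window: int = 200, step: int = 20):
    """For each window, name the reference with the highest identity."""
    curves = {name: window_identity(query, ref, window, step) for name, ref in refs.items()}
    names = list(curves)
    n_windows = len(curves[names[0]])
    return [
        (curves[names[0]][i][0], max(names, key=lambda name: curves[name][i][1]))
        for i in range(n_windows)
    ]


# Toy alignment: the first half of the query matches parent_a, the second half parent_b.
query = "A" * 40
refs = {"parent_a": "A" * 20 + "C" * 20, "parent_b": "C" * 20 + "A" * 20}
print(closest_parent(query, refs, window=10, step=10))
```

A switch in the best-matching reference along the genome, as in the toy output, is the pattern that suggests a recombination breakpoint; windows in which no reference scores well correspond to regions with an unknown parent.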
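Two of the prediction steps described above, finding open reading frames and locating candidate N-linked glycosylation sites, can be illustrated in a few lines. This is a simplified sketch, not a reimplementation of ORF Finder or of the domain predictors used in the study: it scans forward-strand frames only, uses the standard genetic code, and flags the N-X-S/T sequon (X is any residue except proline). The function names and the toy sequence are hypothetical. Transmembrane domains and signal sequences require dedicated predictors and are not covered.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(nt: str) -> str:
    """Translate a nucleotide string codon by codon ('X' for unknown codons)."""
    return "".join(CODON_TABLE.get(nt[i:i + 3], "X") for i in range(0, len(nt) - 2, 3))


def find_orfs(genome: str, min_aa: int = 30):
    """Forward-strand ORFs (ATG to stop) as (start, end, protein), 1-based."""
    genome = genome.upper().replace("U", "T")
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(genome) - 2, 3):
            codon = genome[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and CODON_TABLE.get(codon) == "*":
                protein = translate(genome[start:i])
                if len(protein) >= min_aa:
                    orfs.append((start + 1, i + 3, protein))
                start = None
    return sorted(orfs)


def nglyc_sites(protein: str):
    """1-based positions of N in N-X-S/T sequons, where X is not proline."""
    return [m.start() + 1 for m in re.finditer(r"N(?=[^P][ST])", protein)]


# Toy sequence encoding M-N-G-T followed by a stop codon.
toy = "CC" + "ATG" + "AACGGTACT" + "TAA" + "G"
for start, end, protein in find_orfs(toy, min_aa=1):
    print(start, end, protein, nglyc_sites(protein))
```

A sequon match is only a candidate site; whether it is actually glycosylated depends on context that this pattern search does not model.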

Conclusions
This study detected CoVs in six of 128 samples from wild and domestic birds in Guangdong Province. The complete genomes of three CoVs, CP_XN11, MD_XN18, and SG_DWY40, were obtained. MD_XN18, detected in migratory mallards, belongs to the GI-13 lineage and shares high genomic similarity with chicken IBV strains, which may provide direct evidence that migratory mallards can spread avian IBVs. SG_DWY40 has two genome regions that appear to have recombined from an unknown parent sequence, which may have changed the functional domains of the ORF3a protein and led to the expression of a novel putative protein, NS3.5.

Sample collection and preprocessing
Stool samples were collected in Guangzhou and Jiangmen cities from 2021 to 2023. Bird stool samples were collected from migratory bird aggregation sites and zoos. Poultry stool samples were collected from free-ranging poultry in suburban villages. After collection, the samples were frozen at -20°C and transported to the laboratory within 24 h for total RNA extraction. The samples were thawed, mixed with an equal volume of PBS, and shaken for 1 min. The tubes were then centrifuged at 3600 × g at 4°C for 10 min.

RNA extraction and positive screening for CoVs
To identify coronavirus-positive samples, total RNA was extracted from the samples by using TRIzol RNAiso Plus 9109 (Takara, Japan). Then, based on the method recommended by Festa (Drzewniokova et al. 2021), the PrimeScript™ One-Step RT-PCR Kit V. 2 (Takara, Japan) was used for reverse transcription PCR, and Premix Taq™ (TaKaRa Taq™ V. 2.0 plus dye, Takara, Japan) was used for the second round of PCR. The first-round PCR primers were Hu.F and Hu.R (Hu et al. 2018). The second-round primers were Poon.F (Woo et al. 2005) and Chu DKW (Chu et al. 2006). The primers used in this research are listed in Supplemental Table A1.

Species identification
Species information was obtained for samples after morphological or DNA identification. Samples for which no

Fig. 1 Sampling locations in two cities (Jiangmen and Guangzhou) of Guangdong Province, China. The map approval number is GS(2016)1598

Fig. 3 Recombination and protein analysis of the SG_DWY40 genome. A Similarity plots of the genome sequences of SG_DWY40, Anser fabalis coronavirus NCN2 (in blue) and goose coronavirus CB17 (in orange) are shown. The predicted genome structure of SG_DWY40 is shown above the similarity plot. The recombination regions with unknown parent sequences are marked by red boxes with their nucleotide positions above. B The conserved domains of the predicted proteins in the three viruses are shown

Table 1
Comparison of nucleotide and amino acid homology between the discovered viruses and closely related viruses