  • Short Communication
  • Open access

Detection and genomic characterization of coronaviruses among migratory birds in Guangdong Province, China

Abstract

The recent severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic highlights the significant threat that coronaviruses (CoVs) pose to public health. With their extensive cross-continental movements, migratory birds have the potential to serve as reservoirs and vectors for CoVs. This study aimed to investigate the prevalence of CoVs in birds in densely populated areas of Guangdong Province, China. Of the 128 samples collected from birds, six tested positive for CoVs (4.7%, 95% CI: 1.7–9.9%), and three complete viral genomes were obtained through viral metagenomics and PCR. Phylogenetic analysis revealed that two CoVs (MD_XN18 and SG_DWY40) belonged to the Gammacoronavirus genus, while one (CP_XN11) belonged to the Deltacoronavirus genus. Homology analysis revealed that the MD_XN18 strain discovered in mallards shares 95.6–97.4% sequence similarity with chicken infectious bronchitis viruses (IBVs), providing direct evidence that migratory mallards can transmit avian IBVs. Recombination analysis suggested that two genomic regions of SG_DWY40 could originate from unknown sources through recombination, potentially leading to the expression of a novel viral protein, provisionally named NS3.5. These findings underscore the ongoing transmission and evolution of CoVs among birds in cities in Guangdong Province, emphasizing the need for continued monitoring and research.

Main text

Coronaviruses (CoVs) have come to be regarded as major threats to humans and animals following several zoonotic epidemics that originated from animal reservoirs (Shi and Hu 2008; Woo et al. 2009; Forni et al. 2017). For example, SARS-CoV-2, which likely originated from bats, has infected more than 775 million people and killed close to 7 million people worldwide (WHO n.d.; Decaro and Lorusso 2020; Lu et al. 2020; Huang et al. 2020). Thus, monitoring potentially pathogenic CoVs in animals is important for predicting and preventing viral epidemics.

CoVs are enveloped, positive-sense, single-stranded RNA viruses that belong to the family Coronaviridae, which is divided into three subfamilies. One of these, Orthocoronavirinae, includes four genera: Alphacoronavirus, Betacoronavirus, Gammacoronavirus, and Deltacoronavirus (Woo et al. 2012; Gorbalenya et al. 2006). CoVs from Alphacoronavirus and Betacoronavirus mainly infect bats, humans, and other mammals. Birds are among the major hosts of Gammacoronavirus and Deltacoronavirus. The representative Gammacoronavirus in birds is infectious bronchitis virus (IBV), which can damage the respiratory, reproductive, and urinary systems of chickens, resulting in economic losses to the poultry industry (Sjaak de Wit et al. 2011).

Recently, multiple novel CoVs have been discovered in birds, including gammacoronaviruses such as goose coronavirus CB17 (Papineau et al. 2019) and duck coronavirus 2714 (Zhuang et al. 2015), and deltacoronaviruses such as bulbul coronavirus HKU11 (Woo et al. 2009) and white eye coronavirus HKU16 (Woo et al. 2012). These findings indicate that some undiscovered CoVs may circulate through the migration of birds. Wild birds are both natural and intermediate hosts of multiple viruses and play important roles in long-distance viral transmission through their migratory habits (Rahman et al. 2021). However, only a few studies have investigated CoVs in wild birds in Guangdong Province. We performed this study to discover CoVs carried by wild birds, rare birds, and free-range poultry in Guangdong Province, China.

Bird and poultry stool sample collection

This study collected 128 stool samples, which included 87 samples from Guangzhou and 41 samples from Jiangmen (Fig. 1, Supplemental Figure A1, Supplemental Table A3). Through reverse transcription PCR, we detected six coronavirus-positive samples (6/128, 4.7%, 95% CI: 1.7–9.9%), three from Guangzhou (3/87, 3.4%, 95% CI: 0.7–9.7%) and three from Jiangmen (3/41, 7.3%, 95% CI: 1.5–19.9%).
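
The text does not state which method produced the 95% confidence intervals. As a minimal sketch, assuming an exact (Clopper-Pearson) binomial interval, the reported prevalence figures can be checked with the standard library alone. The function names below are illustrative and are not taken from the study.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, target, increasing):
    """Solve f(p) = target on [0, 1] for a monotone function f."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided binomial interval for x positives out of n samples.

    Assumption: the study's intervals were computed this way.
    """
    # Lower bound: P(X >= x | p) = alpha / 2, which increases with p.
    lower = 0.0 if x == 0 else _bisect(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2, increasing=True
    )
    # Upper bound: P(X <= x | p) = alpha / 2, which decreases with p.
    upper = 1.0 if x == n else _bisect(
        lambda p: binom_cdf(x, n, p), alpha / 2, increasing=False
    )
    return lower, upper


# Counts reported in the text: overall, Guangzhou, Jiangmen.
for x, n in [(6, 128), (3, 87), (3, 41)]:
    lo, hi = clopper_pearson(x, n)
    print(f"{x}/{n}: {100 * x / n:.1f}% (95% CI {100 * lo:.1f}-{100 * hi:.1f}%)")
```

The exact interval is conservative for small counts such as these, which is a common reason to prefer it over a normal approximation in low-prevalence surveys.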

Fig. 1
Sampling locations in two cities (Jiangmen and Guangzhou) of Guangdong Province, China. The map approval number is GS (2016)1598

Viral metagenomics

Among the six positive samples, only three passed RNA quality control (Supplemental Table A4) and were subjected to viral metagenomics. In the XN11 library, a total of 122,975 reads were mapped to the night heron coronavirus HKU19-6918 template (JQ065047). In the XN18 library, 1,304,623 reads were assembled into a complete sequence with the highest similarity to the IBV ck/CH/LHB/130578 (KP118890) sequence. In the DWY40 library, 1,408 reads were mapped to the Anser fabalis coronavirus NCN2 template (MW436465). The full genomes of the three CoVs were then obtained after confirmation by Sanger sequencing with overlapping primers and remapping of the library reads (Supplemental Figure A4). The three viruses were named Chinese pond heron coronavirus XN11 (CP_XN11, GenBank No. OR208630), Mallard infectious bronchitis virus XN18 (MD_XN18, GenBank No. OR208629), and Swan goose coronavirus DWY40 (SG_DWY40, GenBank No. OR346994). The sequence read archive (SRA) data and genomes of all three viruses were submitted to NCBI (Supplemental Table A4).

Phylogenetic analysis

Figure 2 shows that both MD_XN18 and SG_DWY40 clustered within the genus Gammacoronavirus. CP_XN11 clustered with the night heron coronavirus HKU19 strain and belongs to the genus Deltacoronavirus. MD_XN18 clustered with multiple avian IBVs rather than with duck CoVs. Further analysis demonstrated that MD_XN18 clustered with IBV Moroccan-G/83, which is a representative strain of the IBV GI-13 genotype (Supplemental Figure A3). Therefore, the strain obtained in this study belongs to the GI-13 genotype.

Fig. 2
The phylogenetic tree was constructed from the whole viral genomic sequences of gammacoronaviruses (marked with green, upper branch) and deltacoronaviruses (marked with blue, lower branch). The viruses discovered in this study are marked in red. The viruses isolated from similar hosts are marked by black lines with a host sketch on the right

The Chinese pond heron coronavirus XN11

CP_XN11 is a new Deltacoronavirus strain discovered in China. Deltacoronaviruses were first identified in 2009 and were subsequently detected worldwide in multiple hosts, including pigs, quails, sparrows, houbaras, and falcons. These findings suggest that deltacoronaviruses have the potential to be transmitted between avian hosts and from avian to mammalian hosts (Lau et al. 2018). Currently, deltacoronavirus sequences remain rarer than those of other CoV genera. In this study, we identified CP_XN11 in the Chinese pond heron, which is a migratory bird in East Asia. The genome of CP_XN11 is 26,077 base pairs (bp) long, with a GC content of 38.50%. It includes a 5' untranslated region (UTR) of 481 bp and a 3' UTR of 201 bp (Supplemental Figure A2 and Supplemental Table A5). The complete genome, genes, and proteins of CP_XN11 shared 95.2–98.8% similarity with those of the night heron coronavirus HKU19 strain (Table 1).
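
The GC content reported for each genome is a simple base-composition ratio. The sketch below shows one way to compute it, assuming that ambiguous bases (such as N) are excluded from the denominator; the text does not say how ambiguous positions were handled, and the function name is illustrative.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T/U)."""
    s = seq.upper()
    gc = s.count("G") + s.count("C")
    at = s.count("A") + s.count("T") + s.count("U")
    total = gc + at  # assumption: ambiguity codes are ignored
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T/U bases")
    return 100.0 * gc / total


# Toy sequence for illustration only (not from the study).
print(f"{gc_content('ATGCGCNNAT'):.2f}%")
```

Applied to a full genome string read from a FASTA record, the same function would yield a value directly comparable to the percentages quoted in the text.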

Table 1 Comparison of nucleotide and amino acid homology between the discovered viruses and closely related viruses

Mallard infectious bronchitis virus XN18

IBV causes a highly contagious disease that has led to significant economic losses in the poultry industry worldwide (Zhao et al. 2023). Migratory birds are suspected to be wild reservoirs that spread IBV strains worldwide; however, direct evidence of this phenomenon has rarely been reported. Recently, Hemnani et al. found partial IBV-like sequences of 406 nt in Anseriformes in Portugal (Hemnani et al. 2022). In this study, we obtained the complete genome sequence of MD_XN18, which is 27,645 bp long, with a GC content of 38.19%, a 5' UTR of 519 bp, and a 3' UTR of 257 bp (Supplemental Figure A2 and Supplemental Table A5). The complete genome, genes, and proteins of MD_XN18 were more similar to those of chicken IBVs than to those of duck CoVs and shared 95–99% similarity with those of the Gammacoronavirus sp. DLSL21 strain, which was discovered in chickens in Jilin Province, Northeast China (Table 1). Further analysis revealed approximately 199 amino acid differences in viral structural proteins between the MD_XN18 and Gammacoronavirus sp. DLSL21 strains, which may have contributed to the cross-species transmission of avian IBV to mallards (Supplemental Table A6). The phylogenetic tree showed that MD_XN18 belongs to the GI-13 genotype, which is one of the three most widely circulating IBV genotypes in China over the past 30 years (Fan et al. 2022). This finding may provide direct evidence that migratory mallards can spread avian IBVs.

Swan goose coronavirus DWY40

Viruses belonging to the Canada goose coronavirus CB17 species have been found in Canada geese, snow geese, tundra swans, greater white-fronted geese, Indian spot-billed ducks, and bean geese (Papineau et al. 2019; Zhu et al. 2021). This study reports the first case of a Canada goose coronavirus CB17 virus infecting the swan goose (Anser cygnoides) in Asia. The complete genome sequence of SG_DWY40 is 28,474 bp long, with a GC content of 38.20%, a 5' UTR of 520 bp, and a 3' UTR of 626 bp (Supplemental Figure A2 and Supplemental Table A5). The complete genome, genes, and proteins of SG_DWY40 shared 93.6–99% similarity with those of the Anser fabalis coronavirus NCN2 strain (MW436465), except for the S gene and S protein. The S gene and S protein of SG_DWY40 shared 90.1% and 92.6% similarity with those of the goose coronavirus CB17 strain but shared only 77.2% and 69.8% similarity with those of the Anser fabalis coronavirus NCN2 strain (Table 1).

The results of the recombination analysis based on the SG_DWY40 genome are shown in Fig. 3A. The 1–20,000 nt sequence was most similar to that of Anser fabalis coronavirus NCN2, whereas the 23,238–23,514 nt sequence was most similar to that of Canada goose coronavirus CB17. Interestingly, two recombination events with unknown sources may have occurred at nucleotides 19,997–21,238 and 23,514–24,122 in the SG_DWY40 genome, affecting the expression of the S protein and of the proteins encoded between the S gene and the E gene.

Fig. 3
Recombination and protein analysis of the SG_DWY40 genome. A Similarity plots of the genome sequences of SG_DWY40, Anser fabalis coronavirus NCN2 (in blue) and goose coronavirus CB17 (in orange) are shown. The predicted genome structure of SG_DWY40 is shown above the similarity plot. The recombinant regions with unknown parent sequences are marked by a red box, with their nucleotide positions above. B The conserved domains of the predicted proteins in the three viruses are shown

Thus, the putative viral proteins were identified with ORF Finder, and their functional domains, such as transmembrane domains, signal sequences, and N-linked glycosylation sites, were subsequently predicted. The results are shown in Fig. 3B. We found that two genome regions near the S gene of SG_DWY40 may originate from unknown parent sequences. Interestingly, two putative proteins of SG_DWY40 may be expressed differently from those of other Canada goose CoVs. One is the ORF3a protein, which can play a role in host responses, viral replication, virus pathogenicity, and host-virus interactions during coronavirus infection (Si et al. 2023). Compared with those of Canada goose coronavirus CB17, the S and ORF3b proteins of SG_DWY40 have similar functional domains. However, the ORF3a protein of SG_DWY40 is smaller than that of other Canada goose CB17 viruses and contains an extra N-linked glycosylation site and transmembrane domain. The other is the putative NS3.5 protein, a novel putative viral protein predicted to be encoded between the ORF3a and ORF4a coding regions in the SG_DWY40 genome. NS3.5 is 148 amino acids long, with two N-linked glycosylation sites and one transmembrane domain (Fig. 3B). The existence and functions of the putative NS3.5 protein need further confirmation.
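
To illustrate how a putative protein such as NS3.5 can emerge from an ORF scan, the sketch below searches the three forward reading frames for stretches that run from the first ATG to the next in-frame stop codon. This is a simplified stand-in and not the NCBI ORF Finder algorithm: it ignores the reverse strand, alternative start codons, and nested ORFs, and the `min_aa` threshold is an assumed parameter.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_aa=50):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    Coordinates are 1-based and inclusive; `end` includes the stop codon.
    `min_aa` is the minimum number of codons before the stop (assumed).
    """
    s = seq.upper().replace("U", "T")
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(s) - 2, 3):
            codon = s[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon since the last stop
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_aa:
                    orfs.append((start + 1, i + 3, frame + 1))
                start = None
    return sorted(orfs)


# Toy sequence for illustration only: one short ORF in frame 3.
toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "G"
print(find_orfs(toy, min_aa=3))
```

An ORF reported this way is only a candidate. As the text notes for NS3.5, expression of a predicted product needs experimental confirmation.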

Conclusions

This study detected CoVs in 6 of 128 samples from wild and domestic birds in Guangdong Province and obtained the complete genomes of three CoVs: CP_XN11, MD_XN18, and SG_DWY40. MD_XN18 belongs to the GI-13 viral lineage, was found in migratory mallards, and shares high genomic similarity with chicken IBV strains, providing direct evidence that migratory mallards can spread avian IBVs. SG_DWY40 has two genome regions that appear to have been acquired by recombination from unknown parent sequences, which may alter the functional domains of the ORF3a protein and lead to the expression of a novel putative protein, NS3.5.

Methods

Sample collection and preprocessing

Stool samples were collected in Guangzhou and Jiangmen cities from 2021 to 2023. Bird stool samples were collected from migratory bird aggregation sites and zoos. Poultry stool samples were collected from free-ranging poultry in suburban villages. After collection, the samples were frozen at -20°C and transported to the laboratory within 24 h, before total RNA was extracted. The samples were thawed, mixed with an equal volume of PBS, and shaken for 1 min. The tubes were then centrifuged at 3600 × g at 4°C for 10 min.

RNA extraction and positive screening for CoVs

To identify coronavirus-positive samples, total RNA was extracted from the samples using TRIzol RNAiso Plus 9109 (Takara, Japan). Then, based on the method recommended by Festa (Drzewniokova et al. 2021), the PrimeScript™ One-Step RT‒PCR Kit V. 2 (Takara, Japan) was used for reverse transcription PCR, and Premix Taq™ (TaKaRa Taq™ V. 2.0 plus dye, Takara, Japan) was used for the second round of PCR. The first-round PCR primers were Hu. F and Hu. R (Hu et al. 2018). The primers used for the second round were Poon. F (Woo et al. 2005) and Chu DKW (Chu et al. 2006). The primers used in this research are listed in Supplemental Table A1.

Species identification

Species information was obtained for samples after morphological or DNA identification. Samples for which no species information was obtained were excluded from the study, and this fraction of samples did not affect the results.

A tissue DNA extraction kit (OMEGA, Guangzhou) was used to extract DNA. Nested primers were used to amplify the avian mitochondrial cytochrome c oxidase subunit I gene (mt-COI), and sequencing was performed to obtain species information (Supplemental Table A1).

Viral metagenomics and bioinformatics analyses

Total RNA was extracted from each positive sample using the VAMNE Magnetic Pathogen DNA/RNA Kit (Vazyme, China). The libraries were prepared using the VAHTS DNA & RNA Library Prep Kit for MGI (Vazyme, China). The next-generation sequencing (NGS) MGI platform was used for metagenomics to obtain complete viral genome sequences.

Sequencing reads were processed with Trimmomatic to remove adapter contamination and filter low-quality reads. The filtered reads were assembled with MEGAHIT to obtain high-quality contigs. The CoV contigs were identified using DIAMOND and BLAST. The complete genome sequence was then generated based on these contigs. Bowtie 2 was used to map the reads onto the template to obtain the full-length sequence. Primer 6 was used to design overlapping primers to confirm each contig. The primers used in this research are listed in Supplemental Tables A1 and A2.

Genomic and phylogenetic analysis

NCBI ORF Finder was used to predict the open reading frames of the sequences, which were then aligned with the reference sequence to confirm the start position of each gene and to obtain the corresponding gene sequences. MAFFT (V.4.475) was used with default parameters to align the complete genome or partial gene sequences of the viruses. MegAlign (V.7.1.0) in DNAStar was used for sequence alignment and homology analyses. IBS (version 1.0) was used to plot the genome maps. MEGA X (V.10.1.8; Kumar et al. 2018) was used to construct maximum likelihood (ML) phylogenetic trees with the general time reversible model (GTR + G + I). Simplot software (V.3.51) and RDP4 (V.4.39) were used for recombination analysis of the sequences. The criteria for recombination were a p value < 10^-10 for the three methods in RDP4 and identical recombination sites in Simplot.
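
The idea behind a similarity plot can be sketched as a sliding-window comparison of the query against each aligned reference. The version below uses raw percent identity, whereas Simplot normally applies a nucleotide distance model, so it is only an approximation of the analysis described here. The window and step sizes are assumed values and are not reported in the text.

```python
def window_identity(query, ref, window=200, step=20):
    """Percent identity per sliding window for two aligned sequences.

    Both inputs must come from the same alignment (equal length, gaps as '-').
    Columns that are gaps in both sequences are skipped.
    Returns a list of (window_midpoint, percent_identity) tuples.
    """
    if len(query) != len(ref):
        raise ValueError("sequences must be aligned to the same length")
    q_all, r_all = query.upper(), ref.upper()
    points = []
    for start in range(0, len(q_all) - window + 1, step):
        pairs = [
            (a, b)
            for a, b in zip(q_all[start:start + window], r_all[start:start + window])
            if not (a == "-" and b == "-")
        ]
        if not pairs:
            continue
        matches = sum(a == b for a, b in pairs)
        points.append((start + window // 2, 100.0 * matches / len(pairs)))
    return points


# Toy alignment for illustration only: identity drops in the second half.
query = "A" * 10 + "C" * 10
ref = "A" * 20
print(window_identity(query, ref, window=10, step=10))
```

Plotting these points for two candidate parents on the same axes shows where the curves cross. Such crossover positions are the candidate breakpoints that a tool like RDP4 then tests statistically.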

For each of the putative viral protein sequences, we used TMHMM v2.0 (http://www.cbs.dtu.dk/services/TMHMM/) to predict the transmembrane domains, SignalP v4.0 (http://www.cbs.dtu.dk/services/SignalP/) to determine signal sequences, and NetNGlyc v1.0 (http://www.cbs.dtu.dk/services/NetNGlyc/) to identify N-linked glycosylation sites.
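
NetNGlyc scores asparagine residues that sit in the N-X-S/T sequon, where X is any residue except proline. The sketch below finds only those candidate sequons with a regular expression. It does not reproduce NetNGlyc's neural-network scoring, so it will list sites that the predictor might reject, and the function name is illustrative.

```python
import re

# Lookahead so that overlapping sequons (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein):
    """Return 1-based positions of Asn residues in N-X-S/T motifs (X != Pro)."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Toy peptide for illustration only (not a sequence from the study).
print(find_sequons("MNATNPSNGT"))
```

A motif scan like this is useful as a quick sanity check on a predicted protein before submitting it to the dedicated prediction servers.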

Availability of data and materials

Not applicable.

Acknowledgements

Not applicable.

Funding

This research was funded by the Key Project of Agricultural and Social Development Science and Technology Projects of Guangzhou, grant number 202103000008; the Guangdong Modern Agricultural Industry Technology System Innovation Team Construction Project 2023, grant number BKS209152; and the Guangdong Basic and Applied Basic Research Foundation, grant number 2022A1515110357.

Author information

Contributions

Writing-original draft preparation, Y.H. and Y.L.; writing-review and editing, W.Z. and L.X.; funding acquisition, Q.L.; methodology, Z.W.; software, R.S. and X.P.; conceptualization, S.C. and J.L.; visualization, J.M. All authors have read and agreed to the published version of the manuscript.

Corresponding authors

Correspondence to Jun Ma or Juntao Li.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Handling editor: Zhong Peng.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Lian, Y., Huang, Y., Xie, L. et al. Detection and genomic characterization of coronaviruses among migratory birds in Guangdong Province, China. Animal Diseases 4, 26 (2024). https://doi.org/10.1186/s44149-024-00129-8

  • DOI: https://doi.org/10.1186/s44149-024-00129-8
