Pathology and genetic connectedness of the mangrove crab (Aratus pisonii) – a foundation for understanding mangrove disease ecology

Mangrove forests are productive ecosystems, acting as a sink for CO2, a habitat for a diverse array of terrestrial and marine species, and a natural barrier to coastline erosion. The species that reside within mangrove ecosystems have important roles to play, including litter decomposition and the recycling of nutrients. Crustacea are important detritivores in such ecosystems, and understanding their limitations (i.e. disease) is an important endeavour when considering the larger ecological services provided. Histology and metagenomics were used to identify viral (Nudiviridae, Alphaflexiviridae), bacterial (Paracoccus sp., 'Candidatus Gracilibacteria sp.', and Pseudoalteromonas sp.), protozoan, fungal, and metazoan diversity that compose the symbiome of the mangrove crab, Aratus pisonii. The symbiotic groups were observed at varying prevalence under histology: nudivirus (6.5%), putative gut epithelial virus (3.2%), ciliated protozoa (35.5%), gonad fungus (3.2%), gill ectoparasitic metazoan (6.5%). Metagenomic analysis of one specimen exhibiting a nudivirus infection provided the complete host mitochondrial genome (15,642 bp), nudivirus genome (108,981 bp), and the genome of a Cassava common mosaic virus isolate (6387 bp). Our phylogenetic analyses group the novel nudivirus within the genus Gammanudivirus, and protein similarity searches indicate that Carcinus maenas nudivirus is the most similar to the new isolate. The mitochondrial genome was used to mine short fragments used in population genetic studies to gauge the diversity of this host species across the USA, the Caribbean, and Central and South America. This study reports several new symbionts based on their pathology, taxonomy, and genomics (where available) and discusses what effect they may have on the crab population. The role of mangrove crabs from a OneHealth perspective was explored, since their pathobiome includes cassava-infecting viruses.
Finally, given that this species is abundant in mangrove forests and now boasts a well-described pathogen profile, we posit that A. pisonii is a valuable model system for understanding mangrove disease ecology.


Introduction
Mangrove forests are some of the world's most productive ecosystems (Sandilyan and Kathiresan 2012), growing at the vegetation boundary between land and sea and lining ~70% of tropical coasts (Conde et al. 2000; Rogers et al. 2021). This unique ecosystem provides refuge for a diverse array of invertebrate species (Sandilyan and Kathiresan 2012). Across the broad diversity of inhabitants, mangrove crabs are an important keystone group (Smith III et al. 1991; Conde et al. 2000). Aratus pisonii, the neotropical mangrove crab, inhabits the supralittoral zone of multiple mangrove species (Rhizophora mangle; Avicennia germinans; Laguncularia racemosa; Pelliciera rhizophorae) (Díaz and Conde 1989; Conde et al. 2000). Its distribution ranges from Eastern Florida to Northern Brazil, including the Caribbean, as well as between Nicaragua and Peru on the Pacific Coast. The habitat range is thought to be expanding in response to a warming climate (Riley et al. 2014).
Data pertaining to mangrove ecology often lack the important factor of disease prevalence and distribution, including an understanding of how parasites and pathogens contribute to the natural functioning of an ecological system (i.e., disease ecology). Such concepts are vital since changes in disease occurrence or prevalence can profoundly alter the species composition of an ecosystem. For example, the absence of a disease can allow certain species to outcompete others (Strauss et al. 2019), or disease outbreaks can cause trophic cascades and change community composition (Behrens and Lafferty 2004). To date, the mangrove crab, A. pisonii, has been associated with few parasitic groups, restricted to the fungal observations of Mattson (1988), where eccrinaceous fungi (Trichomycetes) were observed in the hindgut of A. pisonii from Tampa Bay, Florida. These fungal observations were quantitatively associated with the crab's detritivorous and herbivorous diet. Considering this species is a keystone element of mangrove forests, with an ecological role disproportionately large relative to its abundance, it is pertinent to explore any associated diseases, such as those caused by viral, bacterial, and other microparasitic groups.
The availability of novel genetic and genomic technologies has allowed us to better map and describe the associated pathobiome of animals and plants. An added benefit of such technology is the additional genetic information gained from the host organism. Mitochondrial genetic data have been collected for A. pisonii across its range, outlining some haplotype variation that deserves further exploration (Riley and Griffen 2017; NCBI). Of benefit to pathological understanding, techniques such as histology provide a method to generally screen for parasite groups, allowing one to find a pathology of interest and then follow up with more accurate tools to provide taxonomic detail (Bojko et al. 2017; Warren et al. 2022).
Using histopathological and metagenomic technology, we provide an up-to-date understanding of the virology, bacteriology, and general parasitology of A. pisonii from the Florida Keys, while providing an additional series of genetic resources and community analysis for this host species, including the mitochondrial genome. The availability of such data brings this species forward as an intriguing disease ecology model for understanding the role of disease in mangrove ecosystems.

Histopathology
Five different symbiotic groups were identified histologically, including gill ectoparasitic organisms. Ectoparasitic organisms were observed between the gill lamellae of 2/31 (6.5%) hosts, with distinguishable muscle, nerve and gonad tissue (Fig. 1A). Ciliated protozoans were also observed between the gill lamellae of the host (11/31; 35.5%), consisting of basophilic staining ciliates with U-shaped nuclei (Fig. 1B); in some cases the ciliates were stalked. A single animal was infected with a fungal parasite observable in the male gonad (1/31; 3.2%) (Fig. 1C). High magnification images of the infection revealed a network of hyphal-like strands extending into the host testes (Fig. 1D). A novel nudivirus was noted to produce a basophilic viroplasm that resulted in nuclear hypertrophy in affected hepatopancreatocytes in 2/31 (6.5%) hosts. The infected nuclei also displayed an eosinophilic staining occlusion (resembling a thin red shard) and margination of host chromatin (Fig. 1E). The host hepatopancreas housing this virus was further analysed using metagenomics, as outlined below. Finally, an eosinophilic shard, similar to that seen in the hepatopancreas of the crabs infected with the nudivirus, was also present in the nuclei of host gut epithelia in one crab (1/31; 3.2%); however, the crab with this infection did not have the nudiviral pathology. Gut material was not preserved in ethanol and no further confirmation of this putative viral pathology could be conducted with the material available.
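The prevalence values above are simple proportions of the 31 crabs screened. As a minimal sketch, the counts reported in this section can be converted to percentages with a 95% Wilson score interval, which gives a sense of the uncertainty at this sample size. The interval method and the function name `wilson_interval` are our own illustrative choices and are not part of the original analysis.

```python
import math

def wilson_interval(infected, sampled, z=1.96):
    """Return (prevalence, lower, upper) as proportions (Wilson score interval)."""
    if sampled <= 0:
        raise ValueError("sampled must be positive")
    p = infected / sampled
    denom = 1 + z**2 / sampled
    centre = (p + z**2 / (2 * sampled)) / denom
    half = z * math.sqrt(p * (1 - p) / sampled + z**2 / (4 * sampled**2)) / denom
    return p, max(0.0, centre - half), min(1.0, centre + half)

# Infected counts taken from the histology results (n = 31 crabs).
counts = {
    "gill ectoparasitic metazoan": 2,
    "ciliated protozoa": 11,
    "gonad fungus": 1,
    "nudivirus": 2,
    "putative gut epithelial virus": 1,
}

for symbiont, infected in counts.items():
    p, lower, upper = wilson_interval(infected, 31)
    print(f"{symbiont}: {100 * p:.1f}% (95% CI {100 * lower:.1f}-{100 * upper:.1f}%)")
```

With only 31 hosts, the intervals for the rarer symbionts are wide, so the point estimates should be read as approximate.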

Cassava common mosaic virus (Alphaflexiviridae) and Nudiviridae in a mangrove crab
One contiguous sequence mined from the metagenomic data (acquired from the host hepatopancreas DNA extract) (1678 bp, X7 coverage) represented the 18S gene of Manihot esculenta (Cassava) (XR_006349510; coverage: 99%, similarity: 99.4%, e-value: 0.0), the primary host of Cassava common mosaic virus (CsCMV). In addition, a second contiguous DNA sequence corresponded to the complete genome of the RNA virus CsCMV (6387 bp, X49 coverage) (Fig. 2; Table 1). The sequence of the new isolate included all the expected proteins and showed high similarity at both the protein and nucleotide level (Table 1). Phylogenetic comparison to other CsCMV isolates places this new isolate as an early diverging strain when comparing complete genome data (Fig. 2). Phylogenetic comparison of the replicase and capsid proteins among strains confirmed on both occasions that the most similar strain was CsCMV strain Arg127 (KT002439), from Argentina. The novel isolate is stored in GenBank under accession: OM927720.
The metagenomic data also included a complete, circular genome corresponding to the Nudiviridae (108,981 bp, X158 coverage). The genome included 91 protein coding genes, with all expected conserved genes (Table 2). Two proteins showed some similarity to proteins encoded by decapods, including a serine/threonine kinase (ApNV_020) (Homarus americanus, 30.1%, 4e-10) and 'GWK47_016212' (ApNV_043) (Chionoecetes opilio, [...]). A multi-gene phylogeny consisting of 17 single copy conserved genes, including the new virus and other representative Nudiviridae, determined that ApNV groups within the Gammanudivirus genus most closely to PmNV, HgNV, and CmNV (bootstrap confidence: 100) (Fig. 4). A single-gene tree using all available DNA polymerase protein sequences for the Nudiviridae also placed the new virus in the same topological position. The novel isolate is stored in GenBank under accession: ON061174.
Fig. 2 Genomic annotation and similarity assessment of a novel Cassava Common Mosaic Virus (CsCMV) strain, mined from the Aratus pisonii metagenome data. The viral genome was identified from DNA data, suggesting it was reverse transcribed in the host cell. The genome represents all five of the open reading frames common to CsCMV, along with a short PolyA tail. Single nucleotide polymorphism comparison across the genome is presented, ranging from green (dissimilar) to black (similar). Whole genome nucleotide phylogenetics, replicase protein phylogenetics and capsid protein phylogenetics are present, as well as a pairwise comparison of both the replicase and capsid proteins across the strains. Graphics developed in CLC genomics workbench V. 20. Phylogenetic trees developed in IQtree. Similarity tables developed using Sequence Demarcation Tool V. 1.2

Metaxa2 results: eukaryotic and bacterial symbiont diversity
The Metaxa2 analysis, which involved mining the assembled metagenomic data from the host hepatopancreas for SSU and LSU sequences, recovered no eukaryotic sequences other than those of the host (Aratus) or plant. Eight relatively low-coverage sequences indicated several bacterial associations in the host hepatopancreas (Table 3). Most results showed greatest similarity to bacterial clones from other metagenomic studies, including: the microbiome of Palinurid phyllosoma (Payne et al. 2008); bacterial symbionts of Artemia spp. in Israel (Tkavc et al. 2011); the gut microbiome of Eriocheir sinensis (Chen et al. 2015); and bacteria from ground water (Luef et al. 2015). Paracoccus sp. (88.6% similarity), 'Candidatus Gracilibacteria sp.' (90.5% similarity), and Pseudoalteromonas sp. (96.5% similarity) were the most discernible taxonomically identified genera (Table 3). The sequence data are stored under accession numbers: ON063416-ON063422.

Genetic data for Aratus pisonii
The mitochondrial genome of A. pisonii is a closed circular molecule of 15,642 bp (X98 coverage) (accession: OM935816). The mitochondrial genome encodes 13 protein coding genes, 22 tRNAs and 2 rRNAs. A single control region is also present (CoRe). Aratus pisonii exhibits the SesGO gene order shown in Fig. 5 (Basso et al. 2017). The protein coding regions encompass seven NADH dehydrogenase subunits (nad1-nad6, nad4L), three cytochrome c oxidase subunits (cox1-cox3), two ATP synthase subunits (atp6 and atp8) and one cytochrome b (cob). All the protein coding genes, and a high proportion of the non-coding RNA genes (ncRNA), showed a high level of similarity with other crabs belonging to the family Sesarmidae (Additional file 1).
The two haplotype networks generated in this study indicate that high levels of genetic divergence exist amongst A. pisonii populations throughout the Americas and the Caribbean (Fig. 6). The star-like groupings present across both networks are evidence of haplotype structuring (Fig. 6B). The cox1 network has star-like groupings restricted to individual geographic regions, which relate to rare haplotypes projecting from more frequent ones.
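The pattern of rare haplotypes projecting from more frequent ones can be summarised numerically. The following is a minimal sketch, assuming aligned fragments of equal length, that counts distinct haplotypes and computes Nei's haplotype diversity, h = n/(n − 1) × (1 − Σ p_i²). The toy sequences and the function name `haplotype_diversity` are hypothetical illustrations and are not the GenBank data or the software used to build the networks in this study.

```python
from collections import Counter

def haplotype_diversity(sequences):
    """Return (number of haplotypes, Nei's haplotype diversity h)."""
    n = len(sequences)
    if n < 2:
        raise ValueError("need at least two sequences")
    freqs = Counter(sequences)  # identical aligned sequences = one haplotype
    sum_sq = sum((count / n) ** 2 for count in freqs.values())
    return len(freqs), n / (n - 1) * (1 - sum_sq)

# Hypothetical aligned cox1 fragments: one frequent haplotype, two rare ones.
toy = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACGTACGT", "ACCTACGT"]
n_haplotypes, h = haplotype_diversity(toy)
print(n_haplotypes, round(h, 3))
```

In this toy set, the frequent haplotype would sit at the centre of a star-like grouping, with each rare haplotype one mutation away from it.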

Discussion
The population structure of A. pisonii, inferred from the haplotype networks, suggests the formation of four regional haplotypes: North America, Central America, the Caribbean, and South America. This genetic population structure may result from the high dispersive potential of the mangrove tree crab and barriers to gene flow that arise from the mangrove and marine environment. The genetic structure seen here is supported by a previous phylogeographic study of the mangrove crab (Buranelli and Mantelatto 2019). We provide histological confirmation of several parasites from a North American (Florida) population, including: a metazoan ectoparasite, a ciliated protozoan, a fungal parasite, and two viral pathologies. Metagenomic data further identified multiple bacterial species, a novel nudivirus genome, and the presence of a reverse transcribed (complete) Alphaflexiviridae genome similar to a strain of CsCMV.
Fig. 4 Maximum-likelihood phylogenetic trees of the Nudiviridae, including Aratus pisonii nudivirus (ApNV). The radial plot (A) represents all available Nudiviridae DNA polymerase protein sequences and highlights the potential for two sub-families suggested in recent literature. B ApNV groups in the Gammanudivirus most closely with Homarus gammarus nudivirus (United Kingdom) and Carcinus maenas nudivirus (invasive host, Canada). Lonomia obliqua nucleopolyhedrovirus (Baculoviridae) is used as an outgroup to root the concatenated protein tree, which was developed from the following proteins: 38 k, ac81, DNApol, Helicase, lef-4, lef-5, lef-8, lef-9, p74, pif-1, pif-2, pif-3, pif-4, pif-5, pif-6, vp39 and vp91. The tree was inferred using IQ-Tree.
[...] 64). B 133 cox1 nucleotide sequences from GenBank and the specimen sequenced in this study (total = 134). The specimen sequenced in this study is highlighted using an arrow.

The Aratus pisonii symbiome
Prior to this study, fungi were the only symbiotic group identified from A. pisonii (Mattson 1988). [...] from crab (Bateman et al. 2021), and one from an amphipod (Allain et al. 2020). We now provide the seventh crustacean nudivirus from A. pisonii, a mangrove crab from the Florida Keys, USA. In each of these hosts, nudiviruses have caused nuclear hypertrophy in hepatopancreatic epithelia, infecting multiple cell-types and often causing cells to slough off into the hepatopancreatic lumen, resulting in degradation of the organ (Bateman et al. 2021). The same outcome is possible for heavily infected mangrove crabs. Since the affected organ functions as a digestive gland for the animal, infection may impede the capacity for digestion of leaf litter and other consumed material. ApNV shares greatest amino acid similarity with the other crab virus, CmNV. This suggests that the two crab-infecting nudiviruses share a closer evolutionary history relative to other viruses in this group. However, our multi-gene and single-gene phylogenetic analyses suggest that HgNV shares a more similar evolutionary history with CmNV than ApNV does. The three viruses group together with high support, but ApNV is more of an outlier relative to the other two viruses, which group together and have a shorter branch length distance (Fig. 4). CmNV encodes 98 predicted genes and HgNV encodes 97 predicted genes, whereas ApNV encodes fewer predicted genes (91). All three viruses encode all conserved genes for the nudivirus family; however, this difference in the number of genes might represent the largest difference between ApNV and the other two most closely related nudiviruses. The missing genes are all hypothetical predicted genes in the other genomes, and their function is unknown. Aratus pisonii is a tropical semi-terrestrial species, whereas both C. maenas and H. gammarus are cold-water, coastal, European species. These environmental differences may relate to the genomic variation we see between ApNV and CmNV/HgNV. As more crab viruses are sequenced, we may see greater evidence for genome diversity among similar host groups.
Secondly, an alphaflexivirus was identified using the metagenomic approach. An oddity here is that alphaflexiviruses are positive strand RNA viruses (Kreuze et al. 2020) and we identified a DNA genome, suggesting that some reverse transcription may have taken place. The genome we identified was ~98% similar to CsCMV (Collavino et al. 2021). The collected metagenomic data also included a fragment of the 18S rRNA gene of Cassava, the plant host of this virus. Together, these data suggest that the discovery of this virus in the crab hepatopancreas is most likely a result of the Cassava plant material identified in the metagenomic data; however, since novel hosts of this virus outside of the plant and whitefly are important to study, we suggest that further work be done to determine if the mangrove crab is a true sink for this virus.
The other groups we identified included bacterial, protozoan, fungal and metazoan symbionts. For the bacteria, taxonomic detail was achieved using a metagenomic approach, which was limited by a lack of similar sequences in some cases but allowed for the identification of several genera common to the crustacean gut microbiome (Payne et al. 2008; Tkavc et al. 2011; Chen et al. 2015) or aquatic/terrestrial habitat (Luef et al. 2015). Paracoccus sp., 'Candidatus Gracilibacteria sp.', and Pseudoalteromonas sp. were determined to genus level in the A. pisonii hepatopancreatic microbiome. The bacterium Paracoccus sp. has been associated with pyridine degradation, and the presence of this bacterium may have a mutualistic benefit for the crab host (Wang et al. 2018). Other bacteria in the host microbiome have been found to produce bioactive compounds (e.g. Pseudoalteromonas sp.; Bowman 2007), and may also share a mutualistic relationship. The crustacean hepatopancreas is known to host a range of symbiotic or parasitic species, and more experimental information would be needed to determine the role of the above bacteria in the A. pisonii hepatopancreas (Bojko et al. 2018; Bojko et al. 2022).
The fungus observed in the host testes was not identified to a lower taxon; however, its presence in the gonad of the male host may have resulted in reproductive impairment. The fungus was noted to occupy the entire histological section of testis, and no spermatozoa were observable in the infected organ. We are unfamiliar with other fungal species that parasitise the gonad of crustacean hosts and this may be a relatively novel finding requiring taxonomic detail.
The final two groups, gill protozoans (tentatively Ciliophora) and ectoparasitic metazoans (tentatively Arthropoda), appear more commensal in their symbiotic role with the crab. In no case was either group associated with melanisation or apparent damage to gill structure under histology.
In all, the symbiome of the crab consists of a diverse array of viruses, bacteria, protozoa, fungi, and metazoans, which are likely to impact the health and breeding success of their host. These symbionts are likely to have an impact on the mangrove disease ecology, limiting the population growth of the crab. Further study into the specifics of the disease ecology relationships can now be addressed, using the discoveries presented in this study.

Conclusions
This research has provided the first in-depth exploration of the parasites associated with the mangrove crab A. pisonii, increasing the number of known viral, bacterial, protozoan, fungal and metazoan parasites of this host species. The additional host haplotype data, alongside new genetic resources for this host (mitochondrial genome), reveal a high level of local connectivity between populations in certain regions, but limited connection between others, suggesting that parasite assemblages in one population/location may differ from those in more distant locations.
Our description of a novel nudivirus increases the known members of the Gammanudivirus genus to seven, providing further taxonomic clarification via multi-gene phylogenetics. In addition, we found a reverse transcribed alphaflexivirus in the pathobiome of this host. Since this virus was detected in a DNA state, it seems unlikely that it was infecting the crab and more likely to be associated with a plant consumed by the crab. The natural host range of CsCMV has not been studied to date, but it is quite plausible the virus infects related species in the family Euphorbiaceae, which may grow in or near mangrove forests. Wild reservoirs of this virus are an important area to study further and this species should be explored using metatranscriptomic techniques.
Finally, the detailed host haplotype network and discovery of several parasites provide a good foundation for this mangrove crab to become a useful disease ecology model for a semi-marine, semi-terrestrial, tropical species, to understand the role of disease in mangrove ecosystems.

Collection, identification, and histopathology
Mangrove crabs (n = 31), A. pisonii, were collected from Long Key, Florida Keys, USA (Lat 24.82°N, Long 80.81°W) in January 2018. A boat was navigated through tight mangrove channels and the trees were shaken to knock crabs loose. The animals were anaesthetised prior to dissection by placing them in a −20°C freezer for 10 min. A pea-sized amount of hepatopancreas, muscle and gill was biopsied from each animal and submerged in 99% molecular grade ethanol. The same organs, with the addition of the gut, gonad, cuticle, and heart, were placed into separate labelled cassettes and submerged in Davidson's saltwater fixative for histological analysis.
Davidson's-fixed specimens were given a short decalcification using 'Rapid Decalcifier II' for 20 min before being processed into liquid paraffin wax. The tissues from each cassette were arranged and set into a wax block and left to solidify. Blocks were trimmed and sectioned using a microtome and water bath, allowing the sections to dry onto glass slides prior to staining with haematoxylin and alcoholic eosin and rehydration. The slides were coverslipped, read, and imaged using standard light microscopy and an integrated Leica camera.

Metagenomics
DNA was extracted from a single ethanol-fixed A. pisonii hepatopancreas biopsy exhibiting viral infection via histopathology. Proteinase K in Lifton's Buffer solution was used to digest the ethanol-fixed hepatopancreas, prior to using a Zymo DNA extraction kit following the manufacturer's protocol. The DNA extract was frozen at −80°C and then transported on dry ice to Novogene, California, for sequencing. The library was loaded onto an Illumina NovaSeq 6000 using the 150 bp NovaSeq 6000 SP reagent kit (300 cycles) for paired-end metagenomic sequencing (< 10 Gb of data). The resulting raw reads included 3,007,453 forward and 3,219,600 reverse reads. The raw reads were trimmed and quality checked using Trimmomatic V. 0.36 (LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36) (Bolger et al. 2014). Assembly was carried out using SPAdes V. 3.15.3 with default parameters (Bankevich et al. 2012), resulting in an N50 of 1162 bp for the assembled metagenome. This resulted in three complete genomes: a novel member of the Nudiviridae (circular, 108,981 bp), a novel Alphaflexiviridae strain (linear, 6403 bp) and the host mitochondrial genome (circular, 15,642 bp). Genome coverage and continuity were confirmed by mapping all paired and unpaired reads in CLC genomics v.20 (Qiagen).
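The N50 reported for the assembly is the contig length at which contigs of that length or longer hold at least half of the assembled bases. A minimal sketch of that calculation is given below. The contig lengths are hypothetical toy values, not the SPAdes output from this study, and the function name `n50` is our own.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths (in bp)."""
    if not lengths:
        raise ValueError("no contig lengths supplied")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length

# Hypothetical contig lengths (bp), for illustration only.
toy_contigs = [2, 4, 6, 8, 10]
print(n50(toy_contigs))
```

A low N50 alongside complete viral and mitochondrial genomes is plausible for a metagenome, where a few high-coverage genomes assemble fully while most host-derived contigs remain short.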
Further exploration for bacterial and eukaryotic symbionts present in the mangrove crab pathobiome was conducted using Metaxa2 (Bengtsson-Palme et al. 2015). Metaxa2 was applied to the assembled data (> 500 bp) to detect the presence of mitochondrial and bacterial SSU and LSU sequences, as well as eukaryotic SSU and LSU sequences. Each sequence was put through NCBI tools to check for chimeric sequences, which were removed during submission, and the remaining sequences were compared to existing data on GenBank using BLASTn.
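The length threshold applied before the Metaxa2 search can be illustrated with a short FASTA filter. This is a sketch under stated assumptions: the tool actually used to apply the > 500 bp cut-off is not named in the text, and the helper names `read_fasta` and `filter_by_length` and the toy records are hypothetical.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def filter_by_length(records, min_len=500):
    """Keep records strictly longer than min_len, matching the '> 500 bp' wording."""
    return [(header, seq) for header, seq in records if len(seq) > min_len]

# Hypothetical contigs: one long enough to keep, one too short.
toy_lines = [">contig_1", "A" * 300, "C" * 300, ">contig_2", "G" * 120]
kept = filter_by_length(read_fasta(toy_lines))
print([header for header, _ in kept])
```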

Mitochondrial and viral genome annotation
The mitogenome of the host was collected to provide additional data on host variation and support for future studies exploring this particular species and model disease system. It was annotated using MITOS (Bernt et al. 2013). The location of the cox1 gene was determined and the genome reorientated with the cox1 gene at the start of the genome. The annotation was manually checked to identify any obscure elements or misannotation via MITOS. The nucleotide and protein similarity data were obtained using BLASTn and BLASTp (NCBI). The nudivirus genome and alphaflexivirus genome were annotated using GeneMarkS (Besemer et al. 2001). The annotated nudivirus and mitochondrial genomes were graphically represented using CIRCA (http://omgenomics.com/circa/).
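Reorienting a circular genome so that it begins at a chosen gene amounts to rotating the sequence string. The sketch below assumes the gene lies on the forward strand and that its 0-based start coordinate is already known from the annotation. The toy sequence, the coordinate, and the function name `rotate_circular` are hypothetical.

```python
def rotate_circular(sequence, start):
    """Rotate a circular sequence so that 0-based index `start` becomes position 0."""
    if not 0 <= start < len(sequence):
        raise ValueError("start must be a valid index into the sequence")
    return sequence[start:] + sequence[:start]

# Hypothetical circular genome in which the gene of interest begins at index 4.
toy_genome = "TTTTATGCCC"
print(rotate_circular(toy_genome, 4))
```

If the gene were annotated on the reverse strand, the sequence would also need to be reverse-complemented, which this sketch does not do. Feature coordinates in the annotation must be shifted by the same offset.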

Phylogenetics
Twelve brachyuran cox1, cox2, and cox3 sequences were obtained by conducting a BLASTn similarity search with the A. pisonii mitogenome nucleotide sequence. The sequences were trimmed and individually aligned in MAFFT (XSEDE 7.402), available through the CIPRES science gateway (Miller et al. 2012), before being manually concatenated. Maximum-Likelihood (ML) phylogenetic analysis was performed in IQ-Tree, which selected the most appropriate evolutionary model (GTR + F + I + G4) according to the Bayesian information criterion (BIC), with 1000 bootstrap replicates (Minh et al. 2015). The resulting tree was annotated using FigTree V. 1.4.3 (http://tree.bio.ed.ac.uk/software/figtree/).
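Model selection by BIC weighs the fit of each candidate model against its number of free parameters, using BIC = k ln(n) − 2 ln(L), and prefers the model with the lowest score. The sketch below applies that formula to made-up values. The log-likelihoods, parameter counts, and alignment length are hypothetical and do not reproduce the IQ-Tree output from this study.

```python
import math

def bic(log_likelihood, n_params, n_sites):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_sites) - 2.0 * log_likelihood

# Hypothetical (log-likelihood, free parameters) for three candidate models.
candidates = {
    "JC": (-5200.0, 21),
    "HKY+G4": (-5050.0, 26),
    "GTR+F+I+G4": (-5000.0, 31),
}
n_sites = 1500  # hypothetical alignment length

scores = {name: bic(lnl, k, n_sites) for name, (lnl, k) in candidates.items()}
best = min(scores, key=scores.get)  # lowest BIC is preferred
print(best)
```

In this toy case the richer model wins because its gain in likelihood outweighs the penalty for its extra parameters.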
The novel Alphaflexiviridae genome (nucleotide), replicase and capsid proteins were each compared to five available genomes from this species complex (KT002435, U23414, MN243731, MN428639, KT002439) and Turtle Grass X virus (MH077559) (outgroup). The genes were separated into FASTA files and aligned using MAFFT (XSEDE 7.402) (Miller et al. 2012) before conducting Maximum-Likelihood (ML) phylogenetic analysis in IQ-Tree (replicase -1000 bootstraps, BIC: TN + F + I) (capsid -1000 bootstraps, BIC: JTT). The resulting trees were annotated using FigTree V. 1.4.3. The whole genome tree was inferred using a neighbour-joining approach in CLC Genomics v.20 (Qiagen). Capsid and replicase amino acid sequences were also compared using the sequence demarcation tool V. 1.2 (Muhire et al. 2014). An alignment of the six Cassava common mosaic virus (CsCMV) genomes was conducted using MUSCLE in CLC Genomics V. 20 (Qiagen), including a similarity colour chart.
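Pairwise similarity tables of the kind described above rest on a percent identity calculated for each pair of aligned sequences. The sketch below computes identity over alignment columns where neither sequence has a gap. It is not a reimplementation of the Sequence Demarcation Tool, and the toy protein fragments and the function name `pairwise_identity` are hypothetical.

```python
def pairwise_identity(seq_a, seq_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared

# Hypothetical aligned capsid fragments, for illustration only.
print(pairwise_identity("MKT-LV", "MKSALV"))
```

Applying the function to every pair of strains and arranging the results in a square matrix gives the layout of a similarity table.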