A GENOMIC APPROACH TO THE COMPLEX RELATIONSHIP BETWEEN AN APICOMPLEXAN ENDOSYMBIONT AND ITS HOST

Nephromyces, a genus in the phylum Apicomplexa, has recently been described as having a mutualistic relationship with its hosts, tunicates in the Molgulidae family (Saffo et al. 2010). If true, Nephromyces would be the only known example of a mutualistic apicomplexan genus. In addition to the possible switch to mutualism, Nephromyces is one of a few apicomplexan groups containing bacterial endosymbionts. To test the hypothesis that endosymbiotic bacteria facilitated the transition of Nephromyces from parasitism, the metabolic capabilities of Nephromyces and its bacterial endosymbionts need to be determined. The transition from obligate parasite to endosymbiont is predicted to involve different selective pressures leading to widespread genomic changes. Identifying these changes will lead to a better understanding of the dynamics between the different biological players in this system. Using data from Illumina HiSeq, we have assembled and annotated the transcriptomes of Nephromyces and Cardiosporidium cionae. Using data from a combination of platforms (Illumina MiSeq, Illumina HiSeq, and Pacific Biosciences), we have partially assembled a pan-genome for Nephromyces and have assembled the genomes of its bacterial endosymbionts. Using amplicon sequencing, we have estimated the genetic diversity and prevalence of multispecies infections of Nephromyces and its bacterial endosymbionts in its host Molgula manhattensis. In addition to the implementation of next-generation sequencing technologies, this work is also based on laboratory cultures and species isolation experiments. With these data we are able to describe the transcriptomes of Nephromyces and Cardiosporidium as well as the genomes of all three bacterial endosymbionts, providing a basic overview of the metabolism of this system.
Nephromyces and Cardiosporidium both encode a complete purine degradation pathway, which enables them to break uric acid down into pyruvate and glycine. Nephromyces is additionally able to create malate from uric acid. This could represent the primary route of carbon, nitrogen, and energy acquisition in Nephromyces. The genomes of the bacterial endosymbionts are severely reduced, but relatively enriched for vitamin and amino acid biosynthesis (at least in the Betaproteobacteria and Bacteroidetes symbionts). It is likely that the bacterial endosymbionts supplement the limited uric acid diet of Nephromyces with vitamins and amino acids. Our amplicon data reveal that nearly all M. manhattensis are infected with multiple species of Nephromyces. The community of Nephromyces forms a tightly integrated system of metabolic interdependencies based on the different bacterial endosymbionts.

Obligate parasites face different challenges than free-living organisms, resulting in different evolutionary pressures and unusual life histories as well as dramatic genomic changes in parasitic lineages (Janouskovec & Keeling 2016).
One of the problems faced by parasites is the host's immune system. The need to evade the host's immune system results in a complex evolutionary arms race between host and parasite, and is a core component of the Red Queen hypothesis (van Valen 1973). Beyond the need to evade the host's immune system, intracellular parasites have ready access to an abundance of pre-formed metabolites. Access to these pre-formed metabolites leads to one of the most common and pronounced consequences of parasite genome evolution: the loss of many basic biosynthesis pathways that are critical in free-living organisms (Janouskovec & Keeling 2016). This loss of biosynthesis capabilities is particularly pronounced in Apicomplexa. Amino acid biosynthesis, vitamin and cofactor biosynthesis, purine synthesis, purine degradation, and fatty acid biosynthesis have all been lost in Apicomplexa (Woo et al. 2015). Because these losses are observed throughout the phylum, they were believed to have occurred early in apicomplexan evolution. However, it has recently been proposed that these losses are a continuous, gradual process (Zarowiecki & Berriman 2015). In either case, loss of biosynthetic pathways creates a dependence on the host not only for primary carbon and nitrogen, but also for the host's premade metabolites, which must be salvaged. This has led to a number of intricate and elegant strategies for host manipulation.
As a consequence of the extraction of nutrients and metabolites, a parasite's growth and reproduction impose a cost on its host. The total impact on the host by the parasite is known as virulence (Read 1994). If a parasite's virulence is too great the host will die, either directly due to the parasite or as a consequence of being weakened; e.g., the host is too weak to find food, the host becomes easy prey for predators, or the weakened immune system makes the host vulnerable to other pathogens. If the host dies before the parasite can complete its life cycle, or before it can infect a new host, then the parasite's fitness falls to zero. This leads to a complex balance between parasite growth and transmission to a new host. Factors involved in transmission and virulence include host genotype, parasite genotype, host health, and parasite load, as well as external factors such as other parasites infecting the same host simultaneously (Frank 1996a). The interplay of these factors creates a dynamic relationship between host and parasite and has led to a wide variety of strategies (Alizon et al. 2009). One common strategy, adopted by many parasite species, is to lower their virulence to the host (Cressler et al. 2016). Parasites often achieve this by lowering their reproduction levels. Lower parasite density means lower virulence to the host, and as long as there is still good transmission to other hosts, this low-density strategy is often successful. A different strategy is exemplified by Plasmodium. Plasmodium sporozoites invade the host's liver cells and proliferate through schizogony while inhibiting cell death (thereby avoiding immunity) until parasite levels are high enough to cause cell death, releasing merozoites into the bloodstream. The high density of parasites in the blood overwhelms the immune system and creates a high likelihood that a mosquito feeding on the host will ingest infected blood, maximizing transmission success. After this synchronized release, merozoites invade red blood cells and repeat the cycle of proliferation and release. This cycle causes the episodic fevers seen in malaria patients. The episodic overwhelming of the host immune system causes the high virulence found with Plasmodium infections. Plasmodium falciparum is particularly virulent, even among Plasmodium species. Some of the virulence of P. falciparum has been attributed to the large number of other human parasites found in the same locations. Where other parasites are present, the likelihood of multiple simultaneous infections increases, and the virulence of all pathogens present becomes cumulative. Rather than a long, sustained infection with a low probability of transmission over a longer time, and with the possibility of host death from other pathogens, P. falciparum has adopted a high-density, high-virulence infection combined with periods hidden from the immune system, which maximizes transmission over a short period of time.
Although transmission strategies are diverse, a disease's epidemiology predictably reflects the type of strategy a parasite is using.
Parasites with high virulence are characterized by low prevalence in a population or by low prevalence with sporadic outbreaks (Frank 1996b). Parasites using this strategy may reach high cellular densities in an effort to maximize transmission before the host dies. Alternatively, high sustained prevalence in a host population indicates low virulence. Parasites using this strategy often maintain low cell densities with lower transmission rates over a longer period of time. Parasite virulence and transmission strategies are not dichotomous, but rather form a continuum, with parasites that have the highest sustained prevalence being the least virulent and parasites with the lowest prevalence being the most virulent. Of course, such predictions about virulence apply only to parasites that have coevolved with their host, and do not apply to parasites infecting a new or incidental host. In these instances, virulence is often extremely high and often results in host death before transmission, leading to a self-limiting infection pattern, e.g., Ebola in humans.
The relationship between prevalence and virulence is important for this work because the proposed mutualistic relationship between Nephromyces and its hosts, Molgula tunicates, was based solely on the nearly 100% year-round infection prevalence. Nephromyces also reaches cell densities over an order of magnitude higher than Cardiosporidium does in Ciona.
In two such closely related organisms with closely related hosts with similar epidemiology in other respects, the difference in relative cell densities is striking. Typically, the higher the parasite load the greater the virulence, but paradoxically Nephromyces can remain avirulent and reach extremely high cell densities. Rather than focus on the proposed mutualistic relationship between Nephromyces and its host, which remains unclear, this work will focus on the apparent paradox in Nephromyces' epidemiology.
In order to consider the unusual epidemiology of Nephromyces, it is necessary to examine its other life history traits. The phylum Apicomplexa has a tremendous amount of variation in hosts, cell types infected, transmission methods, host manipulation strategies, life cycles, reproduction, and morphology (Roos 2005). Even with so much diversity, Nephromyces stands out as unusual for an apicomplexan. One of the most unusual aspects of Nephromyces' biology is where it lives. Nephromyces is found only in the renal sac, where it completes its entire life cycle (Saffo 1982). Specifically, Nephromyces is found in the lumen of the renal sac and, unlike other apicomplexans, is extracellular, with no part of its life cycle inside or joined to its host's cells. The renal sac is a large, ductless structure present only in tunicates in the Molgulidae family (Goodbody 1965). The function of the renal sac has not been determined, and despite its name, the renal sac does not function as a typical kidney. The renal sac was named for its large deposits of uric acid and calcium oxalate, compounds that are the major constituents of kidney stones. Localized deposits of uric acid are not exclusive to Molgula tunicates, and many ascidians have crystallized uric acid deposits located in various tissues, but the deposits in Molgula are by far the largest (Lambert et al. 1998).
Another unusual aspect of Nephromyces is the presence of bacterial endosymbionts. Even though it is not unusual for eukaryotes to have bacterial endosymbionts, it is unusual in the phylum Apicomplexa. The only other apicomplexan known to harbor a bacterial endosymbiont is C. cionae. Bacterial endosymbionts are a common way for an organism to add novel functionality to its metabolism, and the acquisition and maintenance of bacterial endosymbionts is a major driver of eukaryotic evolution. Prominent examples include the alphaproteobacterium that became the mitochondrion and the cyanobacterium that gave rise to the chloroplast (John & Whatley 1975; Mereschkowsky 1905).
More recent bacterial endosymbionts provide their hosts with a wide variety of metabolic capabilities including vitamin and cofactor biosynthesis, amino acid biosynthesis, methanogenesis, photosynthesis, and protection from parasitoids (Moran et al. 2005; Gijzen et al. 1991; Hansen et al. 2012).
These functions allow their hosts to colonize new habitats and take advantage of novel food sources.
One of the consequences of a parasitic lifestyle is the loss of biosynthetic capabilities, and bacterial endosymbionts can supplement a host's metabolism. It was hypothesized that Nephromyces' bacterial endosymbionts were an important factor in Nephromyces' colonization of the renal sac and its paradoxical epidemiology. Therefore, it was necessary to determine how Nephromyces' bacterial endosymbionts were contributing to the metabolism of their host. One hypothesis suggested that the bacterial endosymbionts of Nephromyces are capable of degrading the abundant uric acid in the renal sac. The degradation of uric acid was also proposed as the host benefit that made Nephromyces mutualistic instead of parasitic.
A previous study had found three different bacterial endosymbionts in Nephromyces: an alphaproteobacterium, a betaproteobacterium, and a member of the Bacteroidetes. This study also detailed how these different bacterial endosymbionts were never found together in the same Nephromyces cell. No explanation was given for how a single-celled species could maintain three different endosymbionts without the endosymbionts ever occurring together. What this study failed to recognize is that there were multiple species of Nephromyces inside the same renal sac, and that each Nephromyces species contained a single type of bacterial endosymbiont (Chapter 4).
Organisms harboring multiple endosymbionts are not uncommon (Bennett & Moran 2013; Moran et al. 2008; Gruwell et al. 2010). Many organisms that are dependent on bacterial endosymbionts contain two or three different endosymbionts. Multiple endosymbionts are often required due to the evolutionary consequences of a free-living bacterium becoming an endosymbiont (Wernegreen 2015, 2017; McCutcheon & Moran 2011; Moran 1996). One driver of bacterial endosymbiont evolution is a tiny population size relative to free-living bacteria. Another is the bottleneck that occurs when only a few bacteria, or just a single bacterium, are vertically transmitted to subsequent host generations. Small population size, coupled with extreme bottlenecks repeated every host generation, produces profound effects from genetic drift and results in an accelerated Muller's ratchet (Moran 1996). One of the consequences of the accelerated Muller's ratchet on bacterial endosymbionts is a severe reduction of all non-essential genes. Some of the genes commonly lost are DNA repair genes (Kuwahara et al. 2007). The loss of DNA repair genes combined with the effects of genetic drift leads to high mutation rates, a low ratio of synonymous to non-synonymous mutations, and an AT bias. Over time this results in endosymbiont genomes that are small, gene poor, and AT rich (Moran 2002).
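The AT bias described above is usually summarized as the fraction of A and T bases in a genome or contig. The following is a minimal Python sketch of that calculation; the function name and the toy sequences are illustrative assumptions and are not taken from the assemblies in this work.

```python
def at_content(sequence: str) -> float:
    """Return the fraction of A and T among unambiguous bases (A, C, G, T).

    Ambiguous characters such as N are ignored rather than counted.
    """
    seq = sequence.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    total = at + gc
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return at / total


# Hypothetical toy sequences, for illustration only.
toy_gc_rich = "GCGCATGCCGTAGGCC"
toy_at_rich = "ATTATAAGCTTAATAT"

if __name__ == "__main__":
    for name, seq in [("GC-rich toy", toy_gc_rich), ("AT-rich toy", toy_at_rich)]:
        print(f"{name}: AT content = {at_content(seq):.3f}")
```

Applied per contig, a value like this is one simple way to compare a reduced endosymbiont genome with that of a free-living relative, although composition alone does not establish an endosymbiotic origin.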
The genomic instabilities of bacterial endosymbionts can quickly make them a burden and a liability to the host. As bacterial endosymbionts decrease in function, the host must support their symbionts to a greater and greater degree. Nephromyces, a derived apicomplexan genus of uncertain phylogenetic placement, appears to be an exception to both of these traits.
Nephromyces was misclassified as a fungus for more than 100 years, based on long hyphal-like cell structures, flagellated spores interpreted by some as chytrid zoospores, and cell walls made of chitin (Giard 1888). It was not until the application of molecular methods that Nephromyces was confirmed as a member of the derived apicomplexans. Although some analyses have tentatively placed it sister to adeleids, coccidia, or piroplasmida, the precise phylogenetic position of Nephromyces remains unresolved (Janouškovec et al. 2015). Nephromyces species are monoxenous (infecting a single host) and are found exclusively in the Molgulidae family of tunicates (Saffo & Davis 1982). In a phylum composed of obligate parasites, the feature that distinguishes Nephromyces is its apparent mutualistic relationship with its tunicate hosts. The mutualistic relationship has been inferred based primarily on the nearly 100% infection rate and lack of clearance from the host. A shift in lifestyle from obligate parasite to mutualistic symbiont is quite rare, and completely unknown from deep within a eukaryotic lineage with such a long evolutionary history of parasitism. One common consequence of a parasitic lifestyle is a loss of genes essential to free-living organisms (Greganova et al. 2013; Janouškovec et al. 2015; Zarowiecki & Berriman 2015; Petersen et al. 2015). In an intracellular environment, if precursor molecules can be scavenged, there is less selective pressure to maintain biosynthesis pathways, and many are consequently lost (Keeling 2004; Sakharkar et al. 2004; Morrison et al. 2007). In phyla such as Apicomplexa, these losses can be extreme, and over half of the genes found in their photosynthetic sister group, the chromerids, have been lost in apicomplexans (Woo et al. 2015).
With so many basic metabolic functions lost, and with such dependence on the host, it is difficult to see how the relationship between host and parasite could change to a mutualistic interaction. However, one way for an organism to rapidly change its metabolic capabilities is to take on a bacterial symbiont.
Nephromyces has done just that, leading to the hypothesis that bacterial endosymbionts inside Nephromyces perform some of the metabolic functions lost in Apicomplexa, and potentially contribute something beneficial to the tunicate host (Saffo 1990; (Urakawa et al. 2005), and photosynthesis , to name a few.
A tempting hypothesis for the functional role of Nephromyces' bacterial endosymbionts is the breakdown of purines to urea in the purine degradation pathway (Saffo 1990). In support of this hypothesis, Nephromyces-infected tunicates have quite high levels of the enzyme urate oxidase, which catalyzes conversion of uric acid to 5-hydroxyisourate, but the enzyme is undetectable in uninfected tunicates (Mahler et al. 1955). Coupled with the fact that all known apicomplexans and tunicates have lost the purine degradation pathway, these data were suggestive of a bacterial contribution to purine degradation. In an as yet unexplained quirk of tunicate biology, many tunicate species have localized deposits of uric acid (Lambert et al. 1998; Goodbody 1965). Storage as a form of excretion, nitrogen storage for future release, and structural support are among the proposed functions of tunicate urate deposits (Goodbody 1965; Lambert et al. 1998). Tunicates in the Molgulidae family have the largest uric acid deposits, which are localized to a specialized, ductless structure called a renal sac.
These uric acid deposits occur regardless of infection status, indicating a tunicate origin of these purine deposits. Despite the name, the renal sac has many features (most notably, the absence of any ducts or macroscopic openings) atypical for an excretory organ, and its biological function has yet to be determined.
Nephromyces infects molgulid tunicates after the post-metamorphic onset of host feeding and completes its entire life cycle within the renal sac. Four factors led to the conclusion that the bacterial endosymbionts within Nephromyces are the source of urate oxidase activity in this system: 1) the colonization by Nephromyces of a structure with high concentrations of urate, 2) the absence of urate oxidase activity in the molgulid hosts (Saffo 1991), 3) the high urate oxidase activity found in Nephromyces (including its bacterial symbionts; Saffo 1991), and 4) the lack of obvious ultrastructural evidence of peroxisomes in Nephromyces (Saffo 1990).
It is logical to think that the addition of bacterial endosymbionts to Nephromyces might have been key to colonizing this novel purine-rich niche, and is how Nephromyces escaped the "evolutionary dead end" of a parasitic lifestyle.
In order to test this directly, and examine the metabolic relationships between the tunicate host, Nephromyces, and its bacterial endosymbionts, we sequenced the community transcriptome. To identify possible evolutionary or physiological changes involved in coevolution of Nephromyces with its molgulid hosts, we also sequenced the transcriptome of a sister taxon of Nephromyces, Cardiosporidium cionae, an apicomplexan parasite found in the blood of a broad range of non-molgulid ascidian hosts, including Ciona intestinalis, Styela clava, Halocynthia roretzi, and Ascidiella aspersa (Dong et al. 2006). Interestingly, Cardiosporidium cionae also harbors bacterial endosymbionts, which allows for a more direct comparison between Nephromyces and Cardiosporidium.
Here for the formation of peroxisomes in these organisms. Although direct evidence is still absent, both studies point to Lige et al. (2009) and their identification of peroxisome-like vesicles in T. gondii for possible microscopic support.
Our data demonstrate that Nephromyces encodes a complete purine degradation pathway and a number of proteins predicted to be targeted to, or involved in, peroxisome biogenesis, maintenance and protein import, providing novel support of peroxisomes in Apicomplexa. Additionally, we propose the functional significance of purine degradation in Nephromyces, and reject the hypothesis that bacterial endosymbionts facilitated an escape from parasitism by providing genes in the purine degradation pathway.

Molgula manhattensis collection and laboratory culture
Molgula manhattensis tunicates were collected from a dock in Greenwich Bay, Rhode Island (41°39'22.7"N 71°26'53.9"W) in July 2014. For transcriptomic analysis, a single renal sac was separated from one tunicate, and all extraneous tissue was removed. The intact renal sac was placed in liquid nitrogen for 5 min and then stored at -80 °C for later RNA extraction. Gonads were dissected from five sexually mature M. manhattensis collected from the same population in Greenwich Bay, Rhode Island, in August 2014. Eggs and sperm were mixed with sterile seawater and divided evenly between two petri dishes. Plates were incubated at room temperature for two days with daily 100% water changes.
Tunicate larvae attached to the bottom and sides of the petri dishes by day three.
By day four, larvae had metamorphosed into adults and were actively feeding.
Plates were moved to an incubator at 18 °C with a 24 hr dark cycle to limit growth of contaminants. Tunicates were fed by 100% water exchange with cultures of Isochrysis galbana and Chaetoceros gracilis three days a week. After several weeks, tunicates were moved to aerated beakers to meet their increased nutrient and gas exchange requirements. The feeding regimen remained the same except that food volume was increased with tunicate growth. Tunicates were grown for six months until they were ~10 mm across. Each renal sac was placed into a 1.5 ml Eppendorf tube and flash frozen in liquid nitrogen. PCR screens confirmed Nephromyces was absent from lab-raised individuals. Lab-grown tunicates were split into two groups. Renal sacs were harvested from three tunicates to use as transcriptome controls. A second group was infected with Nephromyces oocysts and raised for genomic analysis. Oocysts were collected from a wild M. manhattensis and serially diluted by 50x to limit co-infections from multiple species. Assembly completeness was assessed with BUSCO v3 against the eukaryotic reference data sets (Simão et al. 2015).

Genomic DNA Extraction
The renal sacs from 8 lab-grown M. manhattensis individuals were dissected and their renal fluid was pooled in a 1.5 ml Eppendorf tube. Contents were centrifuged at 8,000 x g for 5 min to pellet Nephromyces cells, and following centrifugation the renal fluid was discarded. Five hundred microliters of CTAB buffer with 5 µl of proteinase K and ceramic beads were added to the pelleted Nephromyces cells. The sample was placed in a bead beater for 3 min and then on a rotator for 1.5 hr at room temperature. Five hundred microliters of chloroform was added, and the sample was mixed gently and centrifuged for 5 min. The top layer was removed, and 2x the sample volume of ice-cold 100% EtOH and 10% of the sample volume of 3 M sodium acetate were added to the sample, which was incubated at -20 °C overnight. The sample was centrifuged at 16,000 x g for 30 min and the liquid was removed. Ice-cold 70% EtOH was added and the sample was centrifuged at 16,000 x g for 15 min. The liquid was removed and the sample was air dried for 2 min. DNA was resuspended in 50 µl of deionized water.
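The ethanol precipitation step above scales its reagent volumes to the recovered sample volume (2x the sample volume of 100% EtOH and 10% of the sample volume of 3 M sodium acetate). The following is a minimal Python sketch of that arithmetic; the function name and the 400 µl example volume are illustrative assumptions, since the recovered aqueous volume is not stated in the protocol.

```python
def precipitation_volumes(sample_ul: float) -> dict:
    """Reagent volumes (in microliters) for ethanol precipitation.

    Follows the ratios stated in the protocol text:
    2x sample volume of 100% ethanol and 10% sample volume of 3 M sodium acetate.
    """
    if sample_ul <= 0:
        raise ValueError("sample volume must be positive")
    return {
        "ethanol_100pct_ul": 2 * sample_ul,
        "sodium_acetate_3M_ul": sample_ul / 10,
    }


if __name__ == "__main__":
    # Hypothetical recovered aqueous phase of 400 µl (assumed value).
    for reagent, volume in precipitation_volumes(400).items():
        print(f"{reagent}: {volume:.0f} µl")
```

Note that the combined volume must still fit in the tube being used, so larger aqueous phases may need to be split before precipitation.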

Illumina Sequencing
A NanoDrop 2000c (Thermo Scientific) was used to assess DNA purity and concentration, and an agarose gel was run to assess genomic DNA fragmentation. Following quality control, an Illumina library was constructed.
Library prep and sequencing were done at the URI Genomics and Sequencing Center (URIGSC). The completed library was sequenced on the Illumina MiSeq platform at the URIGSC and the HiSeq platform at the University of Baltimore sequencing center on three lanes.

Pacific Biosciences Sequencing
DNA was extracted from the pooled contents of 150 M. manhattensis renal sacs (processed in batches of 10 and then pooled) using the same extraction protocol as for Illumina sequencing. DNA was sequenced using three SMRT cells on the Pacific Biosciences platform at the University of Baltimore sequencing center.

Illumina sequence data assembly
Reads from one MiSeq lane and three HiSeq lanes, all from the same library, were trimmed using Trimmomatic (Bolger et al. 2014) and then assembled using the SPAdes assembler (Bankevich et al. 2012) on the URI server BlueWaves.

Pacific Biosciences sequence data assembly
Pacific Biosciences reads were error corrected using pbsuite/15.8.24 (English et al. 2012) on the Brown University server, Oscar. Reads were then assembled using Canu (Koren et al. 2014).
Nephromyces and Cardiosporidium encode more peroxisome-associated proteins than Plasmodium, and nearly the same complement of genes encoded by Toxoplasma (Table 2).
There are a few notable differences between Toxoplasma and Nephromyces/Cardiosporidium, including the absence of PEX3, PEX16, VLACS,
Table 1. Genomic context of the annotated purine degradation genes and malate synthase in the Nephromyces genomic assembly. The phylogenetic affiliation of neighboring genes on each contig was identified by top hit against the NCBI nr database using BLASTp. Every contig encoding a target gene included other apicomplexan genes, and genes that did not hit apicomplexans had no strong affinity for other organisms.

Discussion
The recent scrutiny by Moog et al. (2017) and Ludewig-Klingner et al. (2018) has built a case for the presence of peroxisomes in some apicomplexan lineages. While some apicomplexans may have lost peroxisomes, it seems likely that this loss is not a universally shared trait in the phylum. Despite the extensive search for peroxisome-associated functions in apicomplexans, no genes involved in purine degradation were found in other sequenced apicomplexan genomes, with the lone exception of allantoicase in Plasmodium (Gardner et al. 2002). Our in silico predictions indicate a complete purine degradation pathway in Nephromyces and Cardiosporidium. In addition to highly expressed transcripts for the genes involved, all of the identified purine degradation genes and MLS have been located on genomic contigs from Nephromyces. Based on neighboring genes and the presence of introns in the Nephromyces genes matching the expressed transcripts, these contigs almost certainly originate from the Nephromyces genome (Table 1). Additionally, none of the purine degradation transcripts attributed to Nephromyces were detected in uninfected tunicates (Table 3).
Phylogenetic trees of purine degradation genes are poorly supported at an interphylum level, indicating a rapid evolutionary rate. Whereas most genes are phylogenetically uninformative across the spectrum of eukaryotes, these gene trees have strong support for monophyly of purine degradation genes from Nephromyces and Cardiosporidium with chromerids (Figure 1). The combination of gene trees, expression only when Nephromyces is present, and preliminary genomic assemblies strongly suggests that these genes have been present since the divergence of Apicomplexa and Chromerida and have been vertically transmitted.
Thus, these genes have been subsequently lost across apicomplexans, possibly multiple times. Although the exact placement of Nephromyces and
Based on transcript abundance, purine degradation in Nephromyces peroxisomes appears to be heavily utilized. Only 0.13% of genes had a higher transcription rate than urate oxidase in our data from wild-collected Nephromyces, and the other genes in the purine degradation pathway are among the most highly expressed transcripts in both wild and lab-grown Nephromyces samples (Table 3). This result aligns with the previously reported high levels of urate oxidase protein in the renal sac of infected Molgula.
Table 3. Expression percentile ranking of purine degradation genes, from total expressed transcripts in Nephromyces (Neph), Cardiosporidium (Cardio) and Molgula (Mm). The wild Nephromyces and Molgula manhattensis data originate from the same RNA extraction and were bioinformatically separated. Data were also generated from laboratory-grown tunicates artificially infected with Nephromyces (Lab grown Neph 1 & 2). Cardiosporidium fractions represent 1) unfiltered pericardial fluid, 2) the 25% and 3) 30% fractions extracted from a sucrose gradient, and may contain different proportions of Cardiosporidium life stages. The three uninfected Molgula manhattensis were raised from gametes in the lab and never exposed to Nephromyces infection. The (-) denotes the transcript was not recovered in that dataset, whereas (N/A) indicates the transcript was assembled but the transcripts per million (TPM) was <1.
Although the percentile ranking between these two organisms cannot be directly compared, such high xanthine dehydrogenase expression in Nephromyces is surprising. It seems unlikely that so much xanthine dehydrogenase production is needed to convert only endogenous purines of Nephromyces. However, xanthine is only detected in the renal sac in small quantities, not nearly as abundant as uric acid, and xanthine dehydrogenase activity is restricted to the renal wall, not the renal lumen (Nolfi 1970). One possible explanation is that Nephromyces exports its xanthine dehydrogenase into the renal wall in order to drive the production of xanthine from hypoxanthine before the purine salvage enzymes adenine phosphoribosyltransferase and hypoxanthine-guanine phosphoribosyltransferase can salvage hypoxanthine into adenine and guanine.
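The expression ranking reported in Table 3 can be illustrated with a small calculation. The following Python sketch computes, for one gene, the fraction of genes with a higher TPM and the corresponding percentile; the exact ranking procedure used for Table 3 is not described in the text, so this definition, the function names, and the toy TPM values are all assumptions for illustration.

```python
def fraction_expressed_higher(tpm_by_gene: dict, target: str) -> float:
    """Fraction of genes whose TPM is strictly higher than the target gene's TPM."""
    target_tpm = tpm_by_gene[target]
    higher = sum(1 for tpm in tpm_by_gene.values() if tpm > target_tpm)
    return higher / len(tpm_by_gene)


def expression_percentile(tpm_by_gene: dict, target: str) -> float:
    """Percentile rank of the target gene (100 means no gene is expressed higher)."""
    return 100.0 * (1.0 - fraction_expressed_higher(tpm_by_gene, target))


# Hypothetical TPM values, not taken from the data in this work.
toy_tpm = {
    "gene_a": 5.0,
    "gene_b": 50.0,
    "urate_oxidase": 900.0,
    "gene_c": 1200.0,
}

if __name__ == "__main__":
    frac = fraction_expressed_higher(toy_tpm, "urate_oxidase")
    pct = expression_percentile(toy_tpm, "urate_oxidase")
    print(f"fraction of genes expressed higher: {frac:.2f}")
    print(f"expression percentile: {pct:.1f}")
```

Under this definition, a statement such as "only 0.13% of genes had a higher transcription rate" corresponds to a fraction of 0.0013. Because each percentile is computed within one dataset, values from different organisms are not directly comparable, as noted above.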
High expression of purine degradation genes in Nephromyces is clear, but the purpose is uncertain. It indicates that purine degradation is an important pathway for Nephromyces, but its functional significance is not immediately obvious. Pathway analysis predicts that Nephromyces is able to convert xanthine into urea and ureidoglycolate; however, neither compound is biologically useful without further conversion. We propose that the products of purine degradation in Nephromyces are converted to glyoxylate.
One possible route is the conversion of ureidoglycolate into glyoxylate. Glyoxylate is a common substrate for a number of enzymes including glyoxylate oxidase, which catalyzes the reaction of glyoxylate with water and oxygen to form oxalate and hydrogen peroxide (Kasai et al. 1963). Notably, no copy of glyoxylate oxidase has been identified in Nephromyces, which is surprising given that another common component of the renal sac is calcium oxalate. We have not identified any genes suggesting that Nephromyces or its bacterial endosymbionts can produce or process oxalate.
Calcium oxalate is also found in uninfected hosts, indicating that the tunicate is the source. Another enzyme that uses glyoxylate as a substrate, and which is present in Nephromyces/Cardiosporidium, is serine-pyruvate transaminase (AGXT), which can be localized to peroxisomes or mitochondria and catalyzes the conversion of glyoxylate to glycine and pyruvate (Takada & Noguchi 1985). An alternative enzyme for processing glyoxylate is malate synthase (MLS), which is also targeted to the peroxisome and is missing from apicomplexans, including Cardiosporidium, but is found in Nephromyces (Figure 1). Both AGXT and MLS (in Nephromyces) show similarly high expression to the purine degradation genes (Table 3), which is consistent with our proposed uric acid to glyoxylate pathway. In particular, AGXT is among the most highly
The light blue arrow represents the highly expressed amidohydrolase (red box) predicted to convert ureidoglycolate into glyoxylate. Enzymes on the left side are localized to peroxisomes, the right side to the cytosol, with the green vertical line representing the peroxisomal membrane. The predicted pathway is able to convert uric acid into glyoxylate, and subsequent conversion by serine-pyruvate transaminase (AGXT) or malate synthase creates glycine and pyruvate or malate, respectively. The * by AGXT indicates ambiguous predicted localization, to either peroxisomes or mitochondria.
provides an explanation for the exceptionally high expression of the purine degradation pathway. Third, it gives Nephromyces access to a primary carbon, nitrogen, and energy source at no cost to its host. And finally, this change in primary carbon, nitrogen, and energy source could conceivably reduce the impact of Nephromyces on its host, allowing Nephromyces densities to increase while decreasing virulence. Reduction in virulence would have been a necessary first step toward mutualism.
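The proposed route from uric acid to glycine, pyruvate, and malate can be written down as a small directed graph and queried for which end products are reachable. The following Python sketch does this using only the conversions named in the text. The steps between 5-hydroxyisourate and ureidoglycolate are collapsed into a single placeholder edge (an assumption made for brevity), and the dictionary layout and function names are illustrative rather than part of the analysis in this work.

```python
from collections import deque

MLS = "malate synthase (MLS)"

# substrate -> list of (enzyme, products); only conversions named in the text.
PATHWAY = {
    "uric acid": [("urate oxidase", ["5-hydroxyisourate"])],
    # Placeholder edge: intermediate steps are collapsed (assumption).
    "5-hydroxyisourate": [("collapsed intermediate steps", ["ureidoglycolate", "urea"])],
    "ureidoglycolate": [("amidohydrolase", ["glyoxylate"])],
    "glyoxylate": [
        ("serine-pyruvate transaminase (AGXT)", ["glycine", "pyruvate"]),
        (MLS, ["malate"]),
    ],
}


def without_enzyme(pathway: dict, enzyme: str) -> dict:
    """Return a copy of the pathway with every step catalyzed by `enzyme` removed."""
    return {
        substrate: [(name, products) for name, products in steps if name != enzyme]
        for substrate, steps in pathway.items()
    }


def reachable_metabolites(pathway: dict, start: str) -> set:
    """Breadth-first search for all metabolites that can be produced from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for _enzyme, products in pathway.get(current, []):
            for product in products:
                if product not in seen:
                    seen.add(product)
                    queue.append(product)
    seen.discard(start)
    return seen


if __name__ == "__main__":
    print("With MLS:", sorted(reachable_metabolites(PATHWAY, "uric acid")))
    no_mls = without_enzyme(PATHWAY, MLS)
    print("Without MLS:", sorted(reachable_metabolites(no_mls, "uric acid")))
```

Removing the malate synthase step mirrors the situation described for Cardiosporidium, where glycine and pyruvate remain reachable from uric acid but malate does not.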
Uric acid as a primary carbon and energy source is not completely unknown. Bacterial species able to grow solely on uric acid have been found in chicken hutches (Thong-On et al. 2012), and some species of fungi are able to grow on media containing only uric acid. However, this is a novel substrate for an apicomplexan to grow on, and while it is unlikely that Nephromyces could survive on uric acid alone, it is a promising base for both carbon and nitrogen acquisition.
It is possible that the Nephromyces bacterial endosymbionts (Sabree et al. 2009; Potrikus & Breznak 1980) are contributing to the proposed purine to glucose pathway, but that is not currently supported by our data.
As the adaptive significance of uric acid deposits in tunicates, and particularly in Molgula, is unknown, it is difficult to speculate on the effects of uric acid degradation by Nephromyces on the host. If these renal sac deposits are a form of excretion by storage, as has been hypothesized (Goodbody 1965), then having a symbiont that is capable of digesting uric acid may be beneficial simply by breaking down an otherwise indigestible metabolite and converting uric acid into urea.
Alternatively, once the uric acid has been broken down, the tunicate may benefit from metabolites derived from uric acid that were previously unavailable to it. If Nephromyces is overexpressing xanthine dehydrogenase in order to outcompete adenine phosphoribosyltransferase and hypoxanthine-guanine phosphoribosyltransferase, diverting hypoxanthine from purine salvage to purine degradation, there could be a potential cost to the host under purine-limited conditions.
Our data demonstrate that both the proposed mutualistic Nephromyces and parasitic Cardiosporidium encode the genes for purine degradation, which have been lost in other apicomplexans sequenced to date. Additionally, these genes [...]
Besides Nephromyces, the phylum Apicomplexa is composed entirely of obligate metazoan parasites. As a result of an estimated 800 million years of evolution as obligate parasites [6], many of the genomic patterns associated with parasitism have been described from the apicomplexan lineages. Gene expansions can be seen in the Plasmodium var protein family, which is involved in host manipulation and evasion, and in the expansion of rhoptry, microneme, and dense granule proteins. The list of core biosynthetic pathways lost in apicomplexans includes purine biosynthesis, purine degradation, biosynthesis of many amino acids, and vitamin biosynthesis. These losses make the parasite dependent on the host, not only for primary carbon and nitrogen, but also for any metabolites it can no longer generate by either de novo synthesis or conversion. High demand on the host for these metabolites to fuel parasite growth increases the cost of infection, thereby increasing virulence. Parasites must maintain a delicate balance between transmission, virulence, and host immune system evasion.
The trade-offs in this balance have been described in detail [7][8][9][10][11], but one common solution many parasites adopt is maintaining low relative abundance inside the host. Higher parasite abundance will increase the cost to that host, and increase virulence. If the parasites kill the host before completing their lifecycle or before transmission to a new host, their fitness falls to zero. Similarly, if parasites have a high prevalence in a population and high lethality, they risk decimating their host population. High, sustained infection prevalence is a good indicator of low virulence, and low virulence is often achieved by self-limited reproduction by the parasites. In this respect, Nephromyces stands out as a very atypical parasite. Nephromyces has a nearly 100% infection rate, sustained almost year-round. Unexpectedly, given typical host-parasite dynamics, Nephromyces also reaches very high cell densities. These atypical epidemiological factors were the basis for the conclusion that Nephromyces must be mutualistic. In order to reach high cell densities while maintaining low virulence, Nephromyces was predicted to produce something of high value to the host, to offset the cost associated with maintaining such high densities of an obligate parasite.
The unusual epidemiology of Nephromyces becomes more apparent when contrasted with its parasitic sister taxon, Cardiosporidium cionae.
Cardiosporidium, first described in 1907 by Van Gaver and Stephan and later described by Ciancio et al. (2008), is a blood parasite found in solitary non-molgulid ascidian hosts, including Ciona intestinalis. Cardiosporidium quickly reaches and maintains ~95% infection prevalence by late July. In contrast to Nephromyces, Cardiosporidium cell densities remain low, with orders of magnitude difference in cell densities (based on DNA extraction quantities as a proxy for cell density). Virulence in both of these apicomplexans is thought to be low based on histological work by Ciancio et al. (2008) and Nelson (1982). Low virulence is also predicted from the high, sustained infection prevalence. The contrast in cell densities between Nephromyces and Cardiosporidium, along with the lack of apparent virulence, indicates an unusual relationship between Nephromyces and its host.
In addition to being sister taxa, Nephromyces and Cardiosporidium share a number of other traits. Both organisms are monoxenous, with ascidians as the only host; both have infective stages that are transmitted through seawater and localize within the pericardium of the host; and each harbors a monophyletic bacterial endosymbiont species that has been maintained since Nephromyces and Cardiosporidium diverged (Figure 3). These similarities make Cardiosporidium an ideal organism to compare with Nephromyces in order to resolve the genomic changes behind the transition from obligate parasitism to a mutualistic host-symbiont relationship.
However, there are some key differences, besides the epidemiological factors, between Nephromyces and Cardiosporidium, including host species.
Cardiosporidium infects several genera of tunicates including Ciona, Halocynthia, Styela, Ascidiella and possibly others [12,13], while Nephromyces is restricted to the Molgulidae family of tunicates [14]. Interestingly, Cardiosporidium has not been found in any Molgulidae tunicates. Another key difference is that Cardiosporidium is an intracellular blood parasite, while Nephromyces is extracellular (another unusual trait for an apicomplexan). Additionally, Nephromyces is found exclusively in a Molgulidae-specific structure called the renal sac, a large ductless structure of unknown function [15,16].
Despite its name, the renal sac does not seem to function as a typical renal organ, but was named for the large deposits of crystallized uric acid and calcium oxalate within it. Many ascidians have localized deposits of uric acid, but tunicates in the Molgulidae family have the largest [17,18]. While the function of these deposits in the tunicate remains unclear, previous work demonstrated that Nephromyces is able to degrade uric acid because it retains the ancestral purine degradation genes lost in all other apicomplexans (Chapter 2). Transcriptome data and pathway analysis suggest that uric acid may be the primary source of carbon and nitrogen for Nephromyces (Chapter 2). Uric acid is an atypical source of carbon and nitrogen, but it is not unheard of. There are several species of bacteria and fungi that can be cultured on media containing only uric acid [19][20][21]. Because of the unusual environment inside the renal sac, and because Nephromyces is extracellular, this organism may not have access to all the required pre-formed metabolites. However, the genus Nephromyces is reported to maintain three different bacterial endosymbionts [22].
In addition to the monophyletic alphaproteobacteria, Nephromyces also harbors two other bacterial endosymbionts: a Betaproteobacteria and a Bacteroidetes. Acquisition and maintenance of bacterial endosymbionts is a common way for eukaryotes to gain new metabolic pathways and capabilities.
The functional capabilities of bacterial endosymbionts exploited by eukaryotic hosts include, for example, amino acid and vitamin metabolism [23], nitrogen metabolism [24], defense [25], chemotrophic energy production [26], and photosynthesis [27]. While bacterial endosymbionts are common in many protist lineages, they are rare in the phylum Apicomplexa. The only known apicomplexans to contain bacterial endosymbionts are Cardiosporidium and Nephromyces. This limited distribution to apicomplexans with ascidian hosts may be due to an unknown aspect of ascidian biology.
Previous speculation that Nephromyces' bacterial endosymbionts are responsible for the observed high levels of purine degradation has recently been rejected (Chapter 2). However, the bacterial endosymbionts are likely instrumental to Nephromyces' ability to colonize the renal sac.
In order to examine the claim of mutualism, characterize the relationships involved in this tripartite endosymbiosis, and determine how Nephromyces achieves low virulence with high cell density, we sequenced the transcriptomes of Molgula manhattensis, Ciona intestinalis, Nephromyces, Cardiosporidium, and their bacterial endosymbionts. Additionally, to better understand the dynamics of the renal sac and the interplay between Nephromyces and its bacterial endosymbionts, we sequenced and partially assembled the Nephromyces genome and the genomes of all three types of the bacterial endosymbionts.

RNA Sequencing
The contents of a single renal sac from an individual Molgula manhattensis [...]
The large number of transcripts identified from Nephromyces was due to multiple-species infection of a single host. Clustering sequences together resulted in 26,938 transcripts at 90% identity, 23,850 at 80%, 21,762 at 70%, 19,540 at 60%, and 16,668 at 50%. Due to the multispecies nature of Nephromyces infections, the transcriptome is a pan-genome assembly rather than a precise single-species dataset, but we estimate that there are between 8,000 and 12,000 unique transcripts in Nephromyces. Kyoto Encyclopedia of Genes and Genomes (KEGG) annotation assigned predicted functions to 13,336 transcripts in the full dataset. BUSCO was used to assess the completeness of the Nephromyces transcriptome, resulting in 81.8% complete and 6.3% partial transcripts.
RNA extractions for Cardiosporidium samples yielded 164 ng/µl for non-sucrose gradient separated blood, 48 ng/µl from cells taken from the 25% layer, and 24 ng/µl from cells taken from the 30% layer. These resulted in 97,417,356 reads from the non-sucrose separated sample, 115,085,369 reads from cells at the 25% layer, and 116,393,114 reads from the 30% layer. Separated by species, 3,877 transcripts were from C. intestinalis, 16,663 transcripts were from Cardiosporidium, and 1,689 transcripts were from the bacterial endosymbiont of Cardiosporidium. KEGG annotation assigned predicted functions to 9,775 transcripts for Cardiosporidium, and BUSCO analysis reports 69.7% complete and 11.9% partial coverage. BUSCO analysis for the bacterial endosymbiont transcriptome resulted in 14.8% complete and 19.6% partial against bacteria_odb9.
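The BUSCO figures reported above for the two apicomplexan transcriptomes can be put side by side with simple arithmetic. The following Python sketch only sums the complete and partial percentages quoted in the text; the dictionary layout and the function name are illustrative assumptions and were not part of the original analysis.

```python
# BUSCO percentages as reported in the text (complete, partial).
busco = {
    "Nephromyces": {"complete": 81.8, "partial": 6.3},
    "Cardiosporidium": {"complete": 69.7, "partial": 11.9},
}


def recovered(scores):
    """Percentage of BUSCO genes recovered at least partially."""
    return round(scores["complete"] + scores["partial"], 1)


for name, scores in busco.items():
    missing = round(100.0 - recovered(scores), 1)
    print(f"{name}: recovered {recovered(scores)}%, missing {missing}%")

# Difference in complete BUSCOs between the two transcriptomes,
# expressed in percentage points.
gap_complete = round(
    busco["Nephromyces"]["complete"] - busco["Cardiosporidium"]["complete"], 1
)
print(f"Gap in complete BUSCOs: {gap_complete} percentage points")
```

Comparing complete plus partial scores, rather than complete scores alone, is a design choice made here for illustration; either view indicates that the Cardiosporidium transcriptome is somewhat less complete.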

Nephromyces genome
The Nephromyces genome assembly remains highly fragmented and consists of 1176 contigs greater than 5kb with a maximum length of 287,191 bp and an average length of 36 kb.
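Summary statistics of this kind can be computed from a list of contig lengths. The sketch below is a minimal illustration: the function name and the toy contig lengths are hypothetical, and N50 is included as a commonly used fragmentation measure even though it is not reported in the text. The last lines give a rough size estimate from the reported contig count and rounded mean length.

```python
def assembly_stats(lengths, min_len=5000):
    """Summarize contigs strictly longer than min_len (in bp)."""
    kept = sorted((n for n in lengths if n > min_len), reverse=True)
    if not kept:
        return {"contigs": 0, "max": 0, "mean": 0.0, "total": 0, "n50": 0}
    total = sum(kept)
    running = 0
    n50 = 0
    # N50: length of the contig at which the cumulative sum of
    # length-sorted contigs first reaches half of the total.
    for n in kept:
        running += n
        if running * 2 >= total:
            n50 = n
            break
    return {
        "contigs": len(kept),
        "max": kept[0],
        "mean": total / len(kept),
        "total": total,
        "n50": n50,
    }


# Hypothetical contig lengths, for illustration only.
toy_lengths = [287191, 120000, 60000, 36000, 12000, 6000, 4000, 900]
stats = assembly_stats(toy_lengths)
print(stats)

# Rough total for the reported assembly: 1176 contigs at a mean of ~36 kb.
approx_total_bp = 1176 * 36_000
print(f"Approximate assembled length: {approx_total_bp / 1e6:.1f} Mb")
```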

α-proteobacteria genome (Nαe)
Two different alphaproteobacteria endosymbionts were recovered from our genomic data and assembled into a draft genome. The presence of two closely related alphaproteobacteria genomes, both with high AT bias (25% GC content) and regions of low complexity, has limited our ability to assemble these genomes completely.

Biosynthesis of Amino Acids
Nephromyces is predicted to be able to synthesize 10 amino acids (alanine, asparagine, aspartic acid, cysteine, glutamine, glutamic acid, glycine, methionine, serine, and threonine). Cardiosporidium is predicted to be able to synthesize the same amino acids with the exception of cysteine. In addition to the complete pathways, Nephromyces/Cardiosporidium have partial pathways for the synthesis of phenylalanine from phenylpyruvate and can produce tyrosine from phenylalanine. Both Nephromyces/Cardiosporidium encode branched-chain amino acid aminotransferase, which adds the final amine group to valine, leucine, and isoleucine.

Vitamin and cofactor synthesis
Nephromyces has the predicted biosynthetic capabilities to produce riboflavin, acetyl CoA, nicotinate, folate, retinol, vitamin E, heme, and ubiquinone.
Cardiosporidium has similar predicted biosynthetic capabilities, but is only capable of synthesizing 6-geranylgeranyl-2,3-dimethylbenzene-1,4-diol and lacks the final two enzymes in the production of vitamin E. Both Nephromyces/Cardiosporidium encode a copy of lipoyl synthase [...]
[...] With such severe reduction in carbohydrate metabolism, pyruvate appears to be the only carbon source the alphaproteobacteria is capable of processing. The bacteroidetes endosymbiont (Nbe) has a carbohydrate metabolism reduced similarly to that of the α-proteobacteria; however, the reduction is [...]
Figure 3. Metabolic pathway capabilities of Nephromyces (orange) and its bacterial endosymbionts (alpha = teal, beta = purple, bacteroidetes = green). Solid colored boxes indicate a complete pathway, light shaded boxes indicate a partial pathway, and white boxes indicate that the pathway is not present.
[...] (Figure 4).

Orthology
Nephromyces had 21,762 genes when clustered at 70%; of these, 20,881 were assigned to orthogroups. 39.7% of orthogroups contained Nephromyces, and there were eight genus-specific orthogroups containing 21 genes (Figure 5). 3,455 orthogroups were shared by Nephromyces, Apicomplexa, and Chromerids. 421 orthogroups were shared between Nephromyces and Apicomplexa (Figure 6). 218 orthogroups were shared between Nephromyces and Chromerids that were not found in Apicomplexa. Cardiosporidium had 7,395 genes, of which 6,977 were assigned to orthogroups. 31.5% of orthogroups contained Cardiosporidium, and there was one species-specific orthogroup containing five genes. 2,778 orthogroups were shared between Cardiosporidium, Apicomplexa, and Chromerids. 236 orthogroups were shared between Cardiosporidium and Apicomplexa and not found in Chromerids. 219 orthogroups were shared between Cardiosporidium and Chromerids that were not found in Apicomplexa. Nephromyces and Cardiosporidium had 3,106 shared orthogroups.
[...] orthogroups were found in Cardiosporidium and not in Nephromyces.
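The gene counts above imply that roughly 95% of the genes in each dataset were placed in an orthogroup. The short Python sketch below makes that arithmetic explicit, using only the assigned and total gene counts quoted in the text; the variable and function names are illustrative assumptions.

```python
# (genes assigned to orthogroups, total genes), as reported in the text.
counts = {
    "Nephromyces": (20881, 21762),
    "Cardiosporidium": (6977, 7395),
}


def assignment_summary(assigned, total):
    """Return (percent of genes assigned, number of unassigned genes)."""
    return 100.0 * assigned / total, total - assigned


summary = {name: assignment_summary(*pair) for name, pair in counts.items()}
for name, (pct, unassigned) in summary.items():
    print(f"{name}: {pct:.1f}% assigned, {unassigned} unassigned")
```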

Discussion
Our initial aim in sequencing and comparing the transcriptomes of Nephromyces/Cardiosporidium was to better characterize mutualism in Nephromyces. A mutualistic relationship was described based on the unusual epidemiology of Nephromyces, but high infection prevalence is not proof of mutual benefit. While our current efforts do not conclusively characterize Nephromyces' relation to its host, the data do provide insight into Nephromyces' atypical lifestyle. By comparing Nephromyces (a proposed mutualist) to Cardiosporidium (a blood parasite), we are able to separate the evolutionary effects of a changing relationship from the evolutionary effects of an ascidian host.
Since all of the sequencing on Nephromyces has been on samples containing multiple Nephromyces species, our Nephromyces data is therefore a pan-transcriptome/genome. This approach limits our results and conclusions. This is particularly evident in our genomic assemblies, which are highly fragmented and almost certainly poly-species chimeric. In addition to problems with assembling the genomes of closely related organisms, we are also unable to estimate gene family expansions or reductions in Nephromyces. While these limitations are significant, the pan-transcriptome/genome does reflect the natural biology of Nephromyces. All of the renal sacs sampled from the host M. manhattensis to date have contained multiple Nephromyces infections. Efforts to culture single species isolates in the lab have met with limited success. This indicates that sustained Nephromyces infection is dependent on contributions from the community of species and endosymbionts.
Notably, by sequencing multiple Nephromyces species we were able to recover the transcriptomes/genomes of all three of Nephromyces' endosymbionts.
Recovering multiple bacterial endosymbiont types provides key insights into how this system works, and outweighs the disadvantages of a pan-transcriptome/genome approach.
The transcriptomes for Nephromyces/Cardiosporidium are largely complete (estimated by BUSCO), but Cardiosporidium is estimated to be about 10% less complete than Nephromyces. We have taken this difference into account when comparing these two data sets. Both Nephromyces/Cardiosporidium encode an estimated eight thousand genes, which is a large number of genes for an apicomplexan. This estimate is similar to the number of genes in the most gene-rich apicomplexan, Toxoplasma, which also encodes eight thousand genes. This is interesting given the phylogenetic placement of Nephromyces/Cardiosporidium in the Hematozoa. Hematozoa, which contains the Plasmodiidae and Piroplasmida lineages, have some of the smallest genomes, with the fewest genes, of any sequenced apicomplexans. This high gene number in both Nephromyces/Cardiosporidium may indicate that greater biosynthetic and metabolic capabilities are necessary for living in an ascidian host.
The two most striking observations made over the course of this exploration involve purine metabolism. The first is purine degradation; both Nephromyces/Cardiosporidium have the metabolic capabilities to convert xanthine into glyoxylate. Glyoxylate can then be converted, with serine-pyruvate aminotransferase (AGXT), into glycine and pyruvate; Nephromyces additionally encodes malate synthase (MLS), which combines glyoxylate and acetyl-CoA into malate. This is the proposed primary route of carbon, nitrogen, and energy acquisition for Nephromyces (Chapter 2). This pathway is absent in all other sequenced apicomplexans, but it appears the enzymes in this pathway were retained from the last common ancestor of Apicomplexa, and are not the result of a more recent horizontal gene transfer.
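The difference between the two organisms described here can be sketched as a small reachability calculation. The Python example below is a deliberately simplified model: the multi-step purine degradation pathway is collapsed into a single hypothetical step, co-substrates other than acetyl-CoA are omitted, and the data structures and function name are assumptions made for illustration only.

```python
# Reactions as (enzyme label, substrates, products). The first entry
# collapses the multi-step purine degradation pathway into one step.
REACTIONS = [
    ("purine_degradation", {"uric acid"}, {"glyoxylate"}),
    ("AGXT", {"glyoxylate"}, {"glycine", "pyruvate"}),
    ("MLS", {"glyoxylate", "acetyl-CoA"}, {"malate"}),
]

# Enzyme complements as described in the text: MLS is found in
# Nephromyces but not in Cardiosporidium.
ENZYMES = {
    "Nephromyces": {"purine_degradation", "AGXT", "MLS"},
    "Cardiosporidium": {"purine_degradation", "AGXT"},
}


def reachable(organism, start):
    """Metabolites obtainable from `start` with the organism's enzymes."""
    pool = set(start)
    changed = True
    while changed:
        changed = False
        for enzyme, substrates, products in REACTIONS:
            usable = enzyme in ENZYMES[organism] and substrates <= pool
            if usable and not products <= pool:
                pool |= products
                changed = True
    return pool


start = {"uric acid", "acetyl-CoA"}
for name in ENZYMES:
    print(name, sorted(reachable(name, start) - start))
```

In this toy model both organisms reach glycine and pyruvate from uric acid, while only Nephromyces also reaches malate.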
The second striking observation in Nephromyces/Cardiosporidium purine metabolism is de novo purine biosynthesis. While some apicomplexan lineages have one or two genes to synthesize inosine monophosphate (IMP) from immediate precursors, Nephromyces/Cardiosporidium are predicted to encode the entire de novo purine synthesis pathway from 5-phosphoribosyl diphosphate (PRPP). As the inability to synthesize purines has been widely targeted for drug development against other apicomplexan species, its presence in Nephromyces/Cardiosporidium is surprising. The presence of both purine degradation and purine synthesis could be critical to Nephromyces' unusual epidemiology. By obtaining the bulk of the required carbon, nitrogen, and energy from a metabolic waste product (i.e. tunicates lack the enzymatic ability to degrade purines past uric acid), Nephromyces is able to limit its impact on the host while still reaching high cellular densities. De novo synthesis of purines indicates that neither Nephromyces nor Cardiosporidium is dependent on the host for IMP. In fact, these purine degradation and biosynthesis pathways may have been the integral factors that allowed Nephromyces to leave the intracellular environment and colonize the renal sac.
Another critical factor in Nephromyces' ability to survive in the renal sac is likely its bacterial endosymbionts. The α-proteobacteria endosymbionts found in Nephromyces and Cardiosporidium are monophyletic, which indicates they have been maintained and vertically transmitted since the divergence of Nephromyces and Cardiosporidium. In addition to the α-proteobacteria, Nephromyces has acquired a β-proteobacteria and a Bacteroidetes endosymbiont. The α-proteobacteria and Bacteroidetes symbionts show a marked reduction in carbon metabolism, with Nαe encoding only genes for the citric acid cycle.
Bacteroidetes is only capable of processing three carbon compounds and encodes a partial citric acid cycle. Such pronounced reduction suggests that Nephromyces provides its symbionts a limited 'diet'. In both symbionts, carbon metabolism may be dependent on pyruvate, which is one of the products of AGXT. Related to this limited carbon metabolism, all three of the bacterial endosymbionts paradoxically encode complete fatty acid biosynthesis, but lack fatty acid degradation. Presumably, the fatty acid biosynthesis is for the construction of membranes, but without fatty acid degradation these symbionts are incapable of processing fatty acids as a carbon source. Both Nαe and Nβe lack complete pathways for the creation of glycerophospholipids, and both contain phospholipid ABC transporters. This could indicate a dependence on Nephromyces for phospholipids.
None of the three endosymbionts of Nephromyces contain any genes involved in purine degradation. The absence of this entire pathway in the genomes of the endosymbionts is further support that the high levels of uric oxidase detected are from Nephromyces and not from any of its bacterial endosymbionts. Similarly, Nbe and Nαe do not encode any genes involved in de novo purine biosynthesis, including genes for the conversion from IMP to adenine and guanine. Nβe can likely synthesize purines from PRPP through the histidine biosynthesis pathway and contains the genes to synthesize adenine and guanine from IMP. The total lack of purine biosynthesis genes in both Nbe and Nαe makes these symbionts dependent on Nephromyces for both adenine and guanine. If Nephromyces were incapable of de novo purine biosynthesis, then the entire renal sac community would be dependent for all purines on either the tunicate host or Nβe. This would be a significant burden on the host and would markedly increase the cost of infection, which does not align with Nephromyces' strategy of low virulence and high-density infection of its host. This argument adds support to the prediction that Nephromyces/Cardiosporidium are able to synthesize purines.
Given their reduced genomes and correspondingly reduced metabolic capabilities, Nβe and Nbe encode a proportionally high number of genes for synthesizing amino acids, vitamins, and cofactors. Together Nβe and Nbe could provide Nephromyces with all but one essential amino acid (tryptophan). Nβe is predicted to synthesize leucine, isoleucine, and valine up to the last step, where the final amine group is added. Conversely, Nephromyces only encodes the last step in the synthesis of these three amino acids. Similarly, Nbe encodes a partial biosynthetic pathway for phenylalanine, which seems to be complemented by Nephromyces. Nαe is capable of synthesizing three amino acids, only one of which (lysine, an essential amino acid) is not encoded by Nephromyces. Vitamin and cofactor biosynthesis in Nαe is also limited to heme, ubiquinone, and lipoic acid, with lipoic acid being the only product Nephromyces may be incapable of synthesizing itself.
With such limited vitamin and amino acid metabolism encoded in the Nαe genome it is unlikely that Nephromyces is maintaining Nαe just for lysine biosynthesis, but from the data we are unable to propose what particular function Nαe serves Nephromyces. In addition, all of the limited species infections of Nephromyces we have been able to culture so far have an Nαe type of symbiont.
While neither the frequency of Nαe nor the limited-species cultures are strong support for the Nαe symbiont being essential, together they do suggest that Nαe may have an important role in Nephromyces metabolism.
As more apicomplexan genomes are sequenced it is becoming apparent that while they do share a large core subset of proteins, the differential losses and expansions are very lineage specific. This is likely due to adaptations required for specific host biology. Each lineage displays a characteristic patchwork of different gene losses and expansions. Many of these lineages contain orthologs with the Chromerids that are not found in other apicomplexan lineages. Nephromyces and Cardiosporidium have retained both purine biosynthesis and degradation, which has been lost in all other apicomplexan lineages. There may be something particular to ascidian biology that necessitates retaining and expressing these purine metabolism pathways. Nephromyces and Cardiosporidium share the vast majority of their genes with each other, as well as encoding the majority of the commonly shared apicomplexan genes. With so many metabolic similarities between Cardiosporidium and Nephromyces, we were unable to detect any clear differences related to Nephromyces' proposed mutualistic relationship.
While there are no obvious differences between Nephromyces and Cardiosporidium, we are severely limited by a lack of lineage specific proteomic work. In other apicomplexans the gene families that modulate host immunity are highly lineage specific. In Nephromyces/Cardiosporidium, the mechanisms of host manipulation are entirely unknown. Without a greater understanding of how both Cardiosporidium and Nephromyces interact with their host at a proteomic level, we are unable to conclusively say that there is a difference between these two organisms.

Molgula manhattensis collection
Molgula manhattensis tunicates were collected from a dock in Greenwich Bay, Rhode Island (41°39'22.7"N 71°26'53.9"W) in July 2014. A single renal sac was separated from one tunicate, and all extraneous tissue removed. The intact renal sac was placed in liquid nitrogen for 5 min and then stored at -80°C.

Cardiosporidium cionae collection, isolation, and concentration
Ciona intestinalis were collected from Matunuck Marina, RI (41.3890° N,71.5201° W), in August 2017. Tunics were removed and the body wall was opened to allow access to the heart. A sterile syringe was used to remove cardiac blood as cleanly as possible. Blood was kept at 4° C until Cardiosporidium infection was verified using Giemsa stain to visualize Cardiosporidium. Heavily infected samples were pooled together and centrifuged at 500g for 5 minutes.
The resulting supernatant was removed and the samples were frozen in liquid nitrogen and stored at -80° C. Samples with high rates of infection were enriched for Cardiosporidium using sucrose gradients [30,31]. Gradients of 20, 25, 30, 35, and 40% sucrose solutions in phosphate buffer were layered together. Approximately [...]
[...] Center for Computation and Visualization [32]. Reads assembled into 145,674 and 109,446 contigs from M. manhattensis and C. intestinalis, respectively. Protein sequences were predicted using TransDecoder [32]. BLASTp was used to identify bacterial sequences among the assembled transcripts by searching against NCBI's RefSeq, and these were binned. The remaining eukaryotic sequences were separated with BLASTp against a custom database of alveolate and ascidian transcriptomes. Trimmed reads were mapped back to each of the six bins (Nephromyces, M. manhattensis, Nephromyces bacteria, Cardiosporidium, C. intestinalis, and Cardiosporidium bacteria) and then reassembled independently in Trinity. The Nephromyces transcriptome was composed of multiple Nephromyces species, and CD-HIT was used to cluster transcripts based on 50 percent identity. Transcriptome completeness was assessed with BUSCO v3 against the eukaryotic and bacterial reference data sets [33]. Transcripts were annotated using InterProScan [34].
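The binning step described above amounts to assigning each transcript to the source of its best database hit. The following Python sketch illustrates that idea under stated assumptions: the input is taken to be 12-column tabular BLAST output (query, subject, percent identity, ..., e-value, bitscore), the best hit is chosen by bitscore, and the subject-to-bin lookup table, the sequence names, and the function names are all hypothetical. It is not the script used in this work.

```python
import csv
import io


def best_hits(blast_tsv):
    """Return {query: (subject, bitscore)}, keeping the top bitscore per query."""
    best = {}
    for row in csv.reader(io.StringIO(blast_tsv), delimiter="\t"):
        if not row:
            continue
        query, subject, bitscore = row[0], row[1], float(row[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


def bin_transcripts(blast_tsv, subject_to_bin):
    """Group query IDs by the bin of their best-scoring subject."""
    bins = {}
    for query, (subject, _score) in best_hits(blast_tsv).items():
        label = subject_to_bin.get(subject, "unassigned")
        bins.setdefault(label, []).append(query)
    return bins


# Hypothetical hits: tx1 matches two subjects, tx3 matches an unknown one.
rows = [
    ("tx1", "apiA", "80.0", "200", "40", "0", "1", "200", "1", "200", "1e-50", "250"),
    ("tx1", "ascA", "60.0", "180", "72", "0", "1", "180", "1", "180", "1e-20", "120"),
    ("tx2", "ascA", "90.0", "300", "30", "0", "1", "300", "1", "300", "1e-90", "400"),
    ("tx3", "unkX", "50.0", "100", "50", "0", "1", "100", "1", "100", "1e-5", "60"),
]
blast_tsv = "\n".join("\t".join(r) for r in rows)
subject_to_bin = {"apiA": "Nephromyces", "ascA": "M. manhattensis"}
bins = bin_transcripts(blast_tsv, subject_to_bin)
print(bins)
```

Transcripts whose best hit has no entry in the lookup table are kept in an "unassigned" bin rather than discarded, so they can be inspected separately.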

DNA Extraction
The renal sacs from 8 lab-grown M. manhattensis individuals were dissected and their renal fluid was pooled in a 1.5 ml Eppendorf tube. Contents were centrifuged at 8000g for 5 min to pellet Nephromyces cells, and following centrifugation the renal fluid was discarded. 500 µl of CTAB buffer with 5 µl of proteinase K and ceramic beads were added to the pelleted Nephromyces cells.
The sample was placed in a bead beater for 3 min and then on a rotator for 1.5 hrs at room temperature. 500 µl of chloroform was added, mixed gently, and centrifuged for 5 min. The top layer was removed, and twice the sample volume of ice-cold 100% EtOH and 10% sample volume of 3M sodium acetate were added to the sample, which was incubated at -20°C overnight. The sample was centrifuged at 16000g for 30 min and the liquid was removed. Ice-cold 70% EtOH was added and the sample was centrifuged at 16000g for 15 min. The liquid was removed and the sample was air dried for 2 min. DNA was re-eluted in 50 µl of deionized water.
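The precipitation step scales with the volume of aqueous phase recovered, so the reagent volumes can be worked out per sample. The small Python helper below applies the two ratios given in the protocol (two sample volumes of 100% EtOH and 10% sample volume of 3M sodium acetate); the function name and the 400 µl example volume are hypothetical.

```python
def precipitation_volumes(sample_ul):
    """Reagent volumes (in µl) for ethanol precipitation of one sample."""
    return {
        "ethanol_100pct_ul": 2.0 * sample_ul,      # twice the sample volume
        "sodium_acetate_3M_ul": 0.10 * sample_ul,  # 10% of the sample volume
    }


# Hypothetical recovered aqueous phase of 400 µl.
volumes = precipitation_volumes(400.0)
print(volumes)
```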

Illumina Sequencing
A NanoDrop (2000c, Thermo Scientific) was used to assess DNA purity and concentration, and a genomic gel was run to assess DNA fragmentation.
Following quality control, an Illumina library was constructed. Library prep and sequencing were done at the URI Genomics and Sequencing Center (URI GSC).
The completed library was sequenced on the Illumina MiSeq platform at the URI GSC and the HiSeq platform at the University of Baltimore sequencing center on three lanes.

Pacific Biosciences Sequencing
Using the contents of 150 M. manhattensis renal sacs (done in batches of 10 then pooled), the same DNA extraction protocol was performed as for Illumina sequencing. DNA was sequenced using three SMRT cells on the Pacific Biosciences platform at the University of Baltimore sequencing center.

Illumina assembly
One MiSeq lane and three lanes of HiSeq, all from the same library, were trimmed using Trimmomatic [35] and then assembled using the SPAdes [36] assembler on the URI server BlueWaves.

Pacific Biosciences assembly
Pacific Biosciences reads were error corrected using pbsuite /15.8.24 [37] on the Brown University server, Oscar. Reads were then assembled using Canu [38]. Contigs generated by Canu were combined with Illumina MiSeq/HiSeq short reads with Abyss v2.02 [39]. Nephromyces contigs were identified by mapping Nephromyces transcriptome reads using Bowtie2. Contigs with greater than 90x coverage as assessed with bedtools [40] were binned as Nephromyces.
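The coverage-based binning described above can be illustrated with a few lines of Python. The sketch assumes a per-base depth table with three columns (contig, position, depth), summarizes each contig by its mean depth, and keeps contigs above the 90x threshold; the use of the mean, the function names, and the toy values are assumptions for illustration and may differ from how coverage was summarized in this work.

```python
from collections import defaultdict


def mean_depth(per_base_rows):
    """Mean read depth per contig from (contig, position, depth) rows."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for contig, _pos, depth in per_base_rows:
        totals[contig] += depth
        counts[contig] += 1
    return {contig: totals[contig] / counts[contig] for contig in totals}


def bin_by_coverage(per_base_rows, threshold=90.0):
    """Names of contigs whose mean depth is strictly above the threshold."""
    depths = mean_depth(per_base_rows)
    return sorted(c for c, d in depths.items() if d > threshold)


# Hypothetical per-base depths for three short contigs.
rows = [
    ("c1", 1, 100), ("c1", 2, 120),
    ("c2", 1, 10), ("c2", 2, 30),
    ("c3", 1, 90), ("c3", 2, 90),
]
high_coverage = bin_by_coverage(rows)
print(high_coverage)
```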

Bacterial endosymbiont genome assembly
Using the contigs from the Abyss assembly, bacterial contigs were initially identified by hexamers using VizBin [41]. Transcriptomic reads that were identified as bacterial were mapped using Bowtie2 [42]. Bacterial contigs were separated based on a 90x coverage threshold with bbmap. Binned bacterial contigs were preliminarily annotated with Prokka [43]. Resulting annotations were run through KEGG GhostKoala to assign and separate by taxonomy. Taxon-separated contig bins were merged and scaffolded using PBJelly from the PBSuite of tools [37]. Trimmed Illumina MiSeq and HiSeq reads were remapped to the resulting contigs using Bowtie2 to ensure accurate assembly. Final assembled bacterial genomes were re-annotated with Prokka with a genus-specific database.

Bacterial phylogeny
16S rRNA sequences from Nephromyces bacterial endosymbiont genomes, predicted by RNAmmer, and 16S rRNA sequences from the Cardiosporidium transcriptome were used in the phylogenetic analysis. All 16S rRNA Rickettsiales, Sphingobacteriaceae, and Alcaligenaceae sequences with a minimum length of 1300 bp available on NCBI's RefSeq were downloaded separately. Sequences were aligned with MAFFT [44] with G-INS-i and trimmed to length in Geneious 6.
Maximum likelihood trees of the alignments were generated with RAxML v 8.2.0 using the GTRCAT model run for 10000 generations with 100 generation burn-in [45].
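The minimum-length criterion applied to the reference 16S rRNA sequences above can be expressed as a simple FASTA filter. The Python sketch below is self-contained and uses toy sequences; the function names and the in-memory FASTA text are illustrative assumptions, not the procedure used to retrieve sequences from NCBI.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def filter_min_length(records, min_len=1300):
    """Keep records whose sequence is at least min_len bases long."""
    return [(h, s) for h, s in records if len(s) >= min_len]


# Toy input: one 1300 bp sequence and one 1299 bp sequence.
fasta_text = f">long\n{'ACGT' * 325}\n>short\n{'ACGT' * 324}ACG\n"
kept = filter_min_length(read_fasta(fasta_text))
print([header for header, _seq in kept])
```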

Introduction
Nephromyces is a genus of Apicomplexa with a symbiotic relationship with its hosts, tunicates in the family Molgulidae. First described in 1874 by de Lacaze-Duthiers, Nephromyces was given several "identities" until it was finally placed in Apicomplexa using molecular phylogenetics. Part of the confusion over its taxonomic affinity arose because Nephromyces inhabits the renal sac, a structure unique to the molgulid tunicates. While the function of the renal sac is not understood, it contains high levels of uric acid and calcium oxalate. Based on the metabolic capacity of Nephromyces, it appears to use uric acid for primary carbon and nitrogen acquisition (Chapter 2). In order to supplement a diet of uric acid, Nephromyces relies on bacterial endosymbionts for the biosynthesis of metabolites from pathways missing from its genome (Chapter 3). Three different types of bacterial endosymbionts have been found in the genus Nephromyces: an alphaproteobacteria in Rickettsia, a betaproteobacteria in Bordetella, and a bacteroidetes in the family Sphingobacteriaceae (Chapter 3). Despite genomic data indicating that these different types of bacteria are not functionally equivalent, no species of Nephromyces has been shown to have more than one type of bacterial endosymbiont.

Based on preliminary genomic and transcriptomic sequencing of Nephromyces, it became apparent that there was a surprising amount of genetic diversity in the genus. In addition to the high levels of genetic diversity, Nephromyces also had a high incidence of multispecies infections within individual renal sacs. Attempts to culture single-species infections and limited-species (3-5 isolates) infections in the lab were met with mixed success, but even limited-species populations did poorly compared to cultures that contained species numbers that better approximated wild samples.
To quantify the biological diversity and the incidence of multispecies Nephromyces infections found in molgulid tunicates, we used an amplicon sequencing approach. Because polymorphic 18S rDNA sequences have been reported in Plasmodium (Li et al. 1997), we targeted the mitochondrial cytochrome oxidase I (CO1) gene as well as the 18S rRNA gene. To account for the endosymbiont diversity within the Nephromyces population, we also targeted the bacterial 16S rRNA gene. Genomic data indicate that members of Rickettsia, Bordetella, and Sphingobacteriaceae are endosymbionts of Nephromyces isolates, but their diversity is unknown.

Methods
Fifty Molgula manhattensis tunicates were collected from a single floating dock located in Greenwich Bay, RI (41° 39' 11.009" N, 71° 27' 5.843" W), over a period of 4 weeks in the summer of 2016. Renal sacs were dissected out of the animals, and their contents were collected with a micropipette and placed in 1.5 ml Eppendorf tubes. Dissecting tools were sterilized in a 10% bleach solution for 15 min and then rinsed between tunicates. Sample tubes were immediately frozen in liquid nitrogen for five minutes and subsequently stored at -80° C.
DNA was extracted using the method described in Chapter 2. Extracted DNA was stored at -20° C. The 18S rRNA primers and CO1 primers were designed to target Nephromyces based on available genomic data (Chapter 3). Universal 16S rRNA primers were taken from Klindworth et al. (2013). The addition of well-specific adaptors, library preparation, and sequencing were done at the URI genomic sequencing center on the Illumina MiSeq platform.
Sequence data were de-multiplexed prior to analysis. BBDuk from the BBMap suite of tools was used to bin reads based on CO1 primers.
The universal 18S rRNA and 16S rRNA primers were too conserved for reliable binning based on primers, so reads were screened against the PR2 database using NCBI's Magic-BLAST. Sequences with 85% identity and 35% coverage were classified as 18S sequences and binned into a new file composed of 18S reads. Adaptors and primer sequences were removed from the forward and reverse reads of each of the three read sets using BBDuk.
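The identity and coverage screen described above can be sketched as a simple filter over alignment hits. This is a minimal illustration, not the pipeline that was run. The `Hit` record and its field names are hypothetical, coverage is assumed to mean the percentage of the read covered by the alignment, and both thresholds are assumed to be minimum values.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Hit:
    """One read-to-reference alignment (hypothetical record layout)."""
    read_id: str
    percent_identity: float  # 0-100
    aligned_length: int      # read bases covered by the alignment
    read_length: int         # full length of the read


def query_coverage(hit: Hit) -> float:
    """Percentage of the read covered by the alignment (assumed definition)."""
    if hit.read_length <= 0:
        raise ValueError("read_length must be positive")
    return 100.0 * hit.aligned_length / hit.read_length


def select_18s_reads(
    hits: Iterable[Hit], min_identity: float = 85.0, min_coverage: float = 35.0
) -> List[str]:
    """Return IDs of reads with at least one hit passing both thresholds."""
    kept: List[str] = []
    seen = set()
    for hit in hits:
        passes = (
            hit.percent_identity >= min_identity
            and query_coverage(hit) >= min_coverage
        )
        if passes and hit.read_id not in seen:
            seen.add(hit.read_id)
            kept.append(hit.read_id)
    return kept


# Hypothetical hits, for illustration only.
hits = [
    Hit("read1", 97.0, 240, 250),  # passes both thresholds
    Hit("read2", 80.0, 250, 250),  # identity too low
    Hit("read3", 92.0, 50, 250),   # coverage too low (20%)
]
print(select_18s_reads(hits))
```

Reads returned by the filter would be written to the 18S bin; the remaining reads are left for the other markers.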
Cleaned and binned read sets were individually processed in R using DADA2 with the pool="pseudo" setting. Assembled 18S and 16S sequences were assigned taxonomies with the PR2 database (Guillou et al. 2013).
CO1 sequences were assigned taxonomy using BLASTx against NCBI's refseq_protein database. All 18S and CO1 sequences that were not apicomplexan were removed from the count table, taxonomy table, and sequence files. Remaining sequences were aligned with MAFFT [...] to 16S rRNA sequences from the three known bacterial endosymbionts found in Nephromyces. Reference sequences were trimmed to the amplicon sequence length, and CD-HIT was used to cluster sequences at 85% sequence identity. All bacterial sequences that did not cluster were deemed contamination and removed from the count table, taxonomy table, and sequence files.
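The logic of retaining only those 16S sequences that fall within 85% identity of a known endosymbiont reference can be sketched as follows. This is not CD-HIT's algorithm, which uses greedy incremental clustering with its own identity definition. The sketch instead assumes that query and reference sequences are already aligned to equal length, and it counts a gap opposite a base as a mismatch. All names and sequences are hypothetical.

```python
from typing import Dict


def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns, skipping gap-gap columns."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no information for this pair
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def assign_to_references(
    queries: Dict[str, str], references: Dict[str, str], min_identity: float = 85.0
) -> Dict[str, str]:
    """Map each query to its best reference if identity >= min_identity.

    Queries that match no reference at the threshold are omitted, which
    mirrors discarding unclustered sequences as contamination.
    """
    kept: Dict[str, str] = {}
    for name, seq in queries.items():
        best_identity, best_ref = 0.0, None
        for ref_name, ref_seq in references.items():
            identity = percent_identity(seq, ref_seq)
            if identity > best_identity:
                best_identity, best_ref = identity, ref_name
        if best_ref is not None and best_identity >= min_identity:
            kept[name] = best_ref
    return kept


# Hypothetical toy sequences, for illustration only.
references = {"ref_alpha": "ACGTACGTAC"}
queries = {"asv1": "ACGTACGTAC", "asv2": "ACGTACGTTT", "asv3": "ACGTACGTAA"}
print(assign_to_references(queries, references))
```

In this toy example asv2 shares only 8 of 10 positions with the reference and is dropped, while asv1 and asv3 are retained.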
Sequences from the 18S and CO1 clusters and from the 16S bins corresponding to endosymbiont type were processed individually in R. Figures were made in R using ggplot2 (Wickham 2016).

Results
The amplicon sequencing run resulted in 25,895,690 [...] (Figure 7). The most common ASVs were seen in 59.57% of samples and the least common in 2.12%; when clustered at 98% identity, these values are 86.17% and 2.12%, respectively.
There were a total of 188 CO1 ASVs, with an average of 52.8 ASVs per tunicate, a maximum of 101, and a minimum of 16. When clustered at the 98% identity level there were 26 clusters in total, with an average of 6.24 per tunicate, a maximum of 10, and a minimum of 2 (Figure 8).
The most common ASVs were found in 53% of tunicates sampled and the rarest ASVs in 2%. After clustering at 98%, the most common clusters were in 75.5% of tunicates and the rarest in 2%. [...] (4), and Rickettsia and Sphingobacteriaceae in 8% (2).
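The two kinds of summary statistic reported above (the percentage of tunicates in which an ASV occurs, and the number of ASVs per tunicate) can be computed from a sample-by-ASV count table. The sketch below shows one way to do so, on the assumption that an ASV is "present" in a sample when its read count is greater than zero. The table layout, function names, and toy counts are hypothetical and are not taken from the study data.

```python
from statistics import mean
from typing import Dict

# sample name -> {ASV name -> read count}
CountTable = Dict[str, Dict[str, int]]


def prevalence(table: CountTable) -> Dict[str, float]:
    """Percentage of samples in which each ASV has a non-zero count."""
    n_samples = len(table)
    asvs = {asv for counts in table.values() for asv in counts}
    return {
        asv: 100.0
        * sum(1 for counts in table.values() if counts.get(asv, 0) > 0)
        / n_samples
        for asv in asvs
    }


def richness_summary(table: CountTable) -> Dict[str, float]:
    """Mean, maximum, and minimum number of ASVs present per sample."""
    richness = [
        sum(1 for count in counts.values() if count > 0)
        for counts in table.values()
    ]
    return {"mean": mean(richness), "max": max(richness), "min": min(richness)}


# Toy count table, for illustration only.
toy = {
    "tunicate_1": {"asv_a": 10, "asv_b": 3},
    "tunicate_2": {"asv_a": 5},
    "tunicate_3": {"asv_a": 1, "asv_b": 0, "asv_c": 7},
    "tunicate_4": {"asv_c": 2},
}
print(prevalence(toy))
print(richness_summary(toy))
```

The same functions apply unchanged to a table of 98% identity clusters, provided the ASV counts are first summed within each cluster.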

Discussion
The high numbers of ASVs obtained in this study reveal that Nephromyces [...] Alternatively, it may be disadvantageous to carry multiple endosymbionts when multispecies infections are universal in molgulid renal sacs. Given the extreme Muller's ratchet known to occur in bacterial endosymbionts (Moran 1996), it may be evolutionarily cheaper to maintain one bacterial endosymbiont and rely on co-occurring Nephromyces species with other types of bacterial endosymbiont. Such a system could not evolve unless there was a high probability of a multispecies infection in any given host.
In a system with so much co-dependence there is the potential for "cheaters" to develop, and indeed we have found Nephromyces species that do not seem to contain any bacterial endosymbiont and are presumed to parasitize the system. The absence of bacterial endosymbionts is based on single-species isolates and fluorescent in situ hybridization (FISH) microscopy (Figure 6), but because a lack of signal in FISH microscopy is not definitive proof of absence, this has not yet been confirmed. Cardiosporidium has access to far smaller quantities of uric acid than Nephromyces and therefore must rely on additional sources of nutrition from the host. The storage and concentration of uric acid in the renal sac by Molgula has enabled Nephromyces to develop its unusual uric acid-based metabolism.
It has been proposed that the mutualistic benefit to its host is the processing of indigestible uric acid. This may be the case, but it is likely an oversimplification. First, it is unclear whether Molgula ever recovers anything from the uric acid imported into the renal sac. It is possible that valuable metabolites such as amino acids or vitamins are exported out of the renal sac, but this has not been demonstrated. Second, the purpose of the renal sac has not been established. Sequestration of uric acid in the renal sac may have developed over time as a way of ridding Molgula of an apicomplexan blood parasite, by providing the parasite with a metabolic waste product as an alternative to infecting blood cells. In this case, Molgula benefits from losing a parasite, but this relationship is hardly mutualistic; it is more of a clever host defense mechanism. Third, Nephromyces has been shown to express xanthine dehydrogenase at high levels (99th percentile of all gene expression). Hypoxanthine is the branch point between purine recycling and purine degradation. Xanthine dehydrogenase converts hypoxanthine to xanthine and xanthine to uric acid; this represents the beginning of purine degradation. The competing enzyme, hypoxanthine-guanine phosphoribosyltransferase (HGPRT), converts hypoxanthine to inosine monophosphate (IMP), which can then be converted to adenine or guanine nucleotides. Since the uric acid within the renal sac has been shown to originate from the tunicate, it is curious that Nephromyces would have such high expression of xanthine dehydrogenase. A possible explanation is that this high expression is a form of host manipulation. By outcompeting host HGPRT, Nephromyces forces greater production of uric acid than may be ideal for the host.
This is potentially a cost to the host, particularly in times when purines are scarce.
Almost as surprising as an apicomplexan with a uric acid-based metabolism is the origin of the genes in the purine degradation pathway.
All other sequenced apicomplexans have lost the purine degradation pathway. It was thought that the high levels of urate oxidase measured inside the renal sac originated from the Nephromyces bacterial endosymbionts. Our data conclusively show that the genes involved in purine degradation are encoded in the Nephromyces and Cardiosporidium genomes. Additionally, these genes are not the result of a gene transfer event, but were already present when Apicomplexa split from the chromerids. If the bacterial endosymbionts are not being utilized for purine degradation, as had previously been assumed, then they must be contributing in another way.
Using 16S rRNA we have determined that the α-proteobacteria in Nephromyces and Cardiosporidium are monophyletic and were therefore present when the Nephromyces and Cardiosporidium lineages split. This indicates that there may be some aspect of ascidian biology that makes maintaining a bacterial endosymbiont worthwhile, particularly as bacterial endosymbionts are not found in other apicomplexan lineages.
We have yet to determine exactly what the critical function of the α-proteobacteria is; however, because it is maintained both in Cardiosporidium, which is intracellular, and in Nephromyces, which is extracellular in the renal sac, the function does not seem to be exclusively connected to renal sac biology.
We do not have any genomic data on the Cardiosporidium α-proteobacteria endosymbiont at this time, so any comparison between the Nephromyces and Cardiosporidium α-proteobacteria endosymbionts is preliminary. The genome of the α-proteobacteria in Cardiosporidium will need to be sequenced and assembled for a more robust analysis. Based on preliminary RNAseq data, the Cardiosporidium α-proteobacteria appears to have more biosynthetic capabilities than that of Nephromyces. These include the biosynthesis of several essential amino acids and vitamins not present in the α-proteobacteria of Nephromyces.
In addition to α-proteobacteria in the genus Rickettsia, the genus Nephromyces also maintains a β-proteobacteria in the genus Bordetella and a Bacteroidetes bacterial endosymbiont in the family Sphingobacteriaceae. We have assembled the complete genomes of the Nephromyces Sphingobacteriaceae endosymbiont (NBe) and the Nephromyces Bordetella endosymbiont (Nβe). NBe and Nβe have the biosynthetic capabilities for a number of amino acids and vitamins that are not encoded in the Nephromyces genome. Providing Nephromyces with amino acids and vitamins eliminates the need for Nephromyces to scavenge those metabolites from the host. Presumably, this reduces the dependence of Nephromyces on the host and also provides a reliable source of these metabolites, which may not be available in the renal sac.
Despite the three types of bacterial endosymbionts (Nαe, NBe, Nβe) being inside different Nephromyces species, we do see some patterns similar to the dual endosymbiont example in glassy-winged sharpshooters. While we do not see single-pathway integration, where one symbiont produces fabF and the other produces the remainder of the pathway, the lack of overlapping functions seems to indicate that, despite the endosymbionts being in different Nephromyces species, the close proximity in the renal sac is sufficient to allow for metabolite exchange between bacterial endosymbionts in co-occurring Nephromyces species. This result is completely unexpected and represents an unusual evolutionary quirk for the community inside the renal sac. If the renal sac community is in close enough proximity to allow for the development of non-overlapping functions in bacterial endosymbionts in different species, it suggests that co-occurring Nephromyces species are frequently exchanging metabolites. Based on our isolation and culturing experiments, we hypothesize that Nephromyces may be incapable of existing in isolation, without other Nephromyces species that contain a different type of bacterial endosymbiont than their own. We have not found a Nephromyces species containing two different types of bacterial endosymbionts; however, we cannot conclusively say that Nephromyces species with dual endosymbionts do not exist.
It remains unclear why a system dependent on other Nephromyces species would develop when maintaining multiple endosymbionts would eliminate the need for competing Nephromyces species and guarantee that any host Nephromyces infected could be colonized independently of other species. Perhaps the cost of maintaining multiple endosymbionts is greater than the cost of sharing.
Indeed, we have found that some Nephromyces species do not maintain any endosymbiont and presumably parasitize the community, i.e. rely on the products of other Nephromyces species' bacterial endosymbionts. In order for such a system to develop, any given Nephromyces species must colonize a renal sac where there will be complementary Nephromyces species. The rate at which this happens needs to be greater than the cost of maintaining two endosymbionts; otherwise we would presumably see dual endosymbiont Nephromyces species. [...] Nephromyces is not observed in Cardiosporidium and is likely connected to the unusual renal sac community dynamics. With so many species, each with a lineage of vertically inherited bacterial endosymbionts, we predict that even bacterial endosymbionts of the same type may differ widely in their metabolic capabilities. Presumably bacterial endosymbionts, even within the same taxon, could contribute different metabolites to the renal sac community. Given the tremendous amount of diversity, our sequencing of just a few of the different bacterial endosymbionts is insufficient to develop a complete picture of the intricacies of this system.
Adding to our uncertainties, we do not currently know if there are any genetic barriers preventing reproduction between different Nephromyces species.
Sexual reproduction occurs inside the renal sac in the presence of multiple other Nephromyces species, so interbreeding between species seems likely unless there are strong genetic barriers between species. Indeed, the proximity and the interdependence of the system in general seem to indicate a great deal of interspecific breeding.
As our genomic assemblies of Nephromyces are incomplete, due in large part to the difficulties in assembling a metagenome of closely related species, we are not able to say how the unusual epidemiology, environment, and community composition of Nephromyces have affected its genome. Given the difficulties with sequencing and assembling Nephromyces, the genome of Cardiosporidium is a more attractive target. There are plans to sequence the genome, but currently we [...] shown in other apicomplexans to be important for invasion, immune evasion, and host manipulation. Based on OrthoFinder analysis we find surprisingly few lineage-specific genes that might be involved in dealing with an ascidian immune system. We also find few genes in either Nephromyces or Cardiosporidium without orthologs in other apicomplexans. It is possible that these genes without orthologs are involved in the specific challenges imposed by the intracellular ascidian life cycle in Cardiosporidium and by the renal sac in Nephromyces. However, because these genes do not have known orthologs studied in other species, we are unable to determine their function bioinformatically.
This work represents a step toward fully understanding the complexities of this unusual system, but leaves many questions unresolved. First, sequencing the genomes of both Cardiosporidium and Cardiosporidium's bacterial endosymbiont would allow for a more robust comparison to the Nephromyces Rickettsia endosymbiont. This would provide a better understanding of the evolutionary history of both Nephromyces and its endosymbionts. Second, the biochemical pathways were based on bioinformatics with minimal confirmation at the protein level, so the presented pathways need to be confirmed. Another step would be to show that uric acid is central to the metabolism of Nephromyces.
A potential method to demonstrate this pathway is to inject isotope-labeled uric acid into the renal sac and then use a new method for identifying the proteins from a specific organism in a metaproteomic sample (Kleiner et al. 2018). If this could be adapted to this system, we could potentially confirm uric acid as the primary carbon and nitrogen source for Nephromyces and determine the metabolites exchanged with the bacterial endosymbiont. This could also show whether any of the carbon or nitrogen from uric acid makes its way across the renal wall back to the tunicate. If useful metabolites are exported or leaked out of the renal sac, this would be the best support yet that the relationship between Nephromyces and Molgula is in fact mutualistic.