STRUCTURE, FUNCTION AND EVOLUTION OF PHOSPHOPROTEIN P0 AND ITS UNIQUE INSERT IN TETRAHYMENA THERMOPHILA

Phosphoprotein P0 is a highly conserved ribosomal protein that forms the central scaffold of the large ribosomal subunit’s “stalk complex”, which is necessary for recruiting protein elongation factors to the ribosome. Evidence in the literature suggests that P0 may be involved in diseases such as malaria and systemic lupus erythematosus. We are interested in the possibility that the P0 of the ciliated protozoan Tetrahymena thermophila may be useful as a model system for vaccine research and drug development. In addition, the P0s of T. thermophila and other ciliated protozoans contain a unique 15 to 17 amino acid insert in the N-terminal region. This project sought to further characterize the T. thermophila P0 (TtP0) and its unique insert through structural and functional bioinformatics studies. In order to visualize the three-dimensional structure of TtP0, we created a homology model of the N-terminal region of TtP0 and its insert from available P0 structure and sequence data. When the insert was modeled “in-context” in the presence of a previously published crystal structure of the T. thermophila ribosomal RNA, we discovered a surprising association between the insert and a highly variable portion of the rRNA, termed expansion segment 7, or ES7. When we investigated whether this association could occur in other ciliates, we found very little data for the ES7 sequence in other species, meaning that further analysis of the conservation of this association is not possible at this time. Still, the presence of an association in T. thermophila may indicate that the insert has a functional role unique to the ciliates, perhaps in the regulation of P0 function by phosphorylation. We also investigated whether the highly conserved nature of P0 meant that it could be useful for phylogenetic and evolutionary studies.
By studying P0 sequences from ciliates and other closely related clades, we could determine if P0 provides any information on the early evolution of eukaryotic species. We collected P0 sequences representing all of the eukaryotic supergroups, and used them to create phylogenetic alignments and trees based on the whole molecules, as well as the individual functional domains. Overall, we found that the trees did not resolve very well at the basal branches, but terminal branches had much stronger support. The trees also successfully separated the ciliate P0 sequences into groups matching the previously established taxonomy for the ciliates. Finally, we found evidence that the N-terminal domain of P0, called the L10 region, is much more evolutionarily stable than the C-terminal 60S region. Thus, the variability of the 60S region appears to contribute to the diversity of ciliate and eukaryotic P0 sequences. Once additional P0 sequences become available for underrepresented clades, they could be used to provide stronger support for the weaker branches of the tree. Both studies provide a starting framework for further computational-based work on P0, such as homology modeling of P0s from other ciliates or simulations of insert phosphorylation. These studies may also serve as a starting point for in vitro or in vivo experiments on the protein and its ciliate-specific insert.


We hypothesized that this insert may have a function unique to T. thermophila, such as regulation of stalk complex function via phosphorylation of the insert. Almost no mention of this insert exists in the literature, and while the T. thermophila ribosome has had its structure analyzed by x-ray crystallography, it lacks a structure for the insert and provides limited data on the rest of Tetrahymena's P0 (TtP0). In order to investigate the possible structure and function of the insert in TtP0, we performed several in silico analyses. The TtP0 sequence was used with several phosphorylation site prediction tools to detect the likelihood of phosphorylation in the insert. The TtP0 sequence was also combined with existing P0 structure and sequence data to produce a homology model of the N-terminal region of TtP0, including the insert. When the insert was modeled in the context of the T. thermophila 26S rRNA, the insert associated with a portion of the rRNA that we identified as expansion segment 7 (ES7). This suggests a potential interaction between ES7 and the insert. When the ES7 region of T. thermophila and that of three other ciliated protist species were compared, we found little evidence that the insert-ES7 interaction could occur in other ciliates, although more definitive analysis will require the availability of more sequenced genomes from ciliated protists. Overall, this study lays the groundwork for future in vitro studies to verify the presence of the insert-ES7 interaction in T. thermophila, and

INTRODUCTION
Our laboratory recently reported that a 15-17 amino acid insert is present in the N-terminal region of the large subunit ribosomal protein, phosphoprotein P0, of Tetrahymena thermophila (TtP0) and other ciliated protists, based on analysis of genomic data (Schumacher et al, 2010a; Schumacher and Hufnagel, MS in preparation). This insert was not present in the prokaryotes or other eukaryotes examined.
The insert was later also noted by Klinge et al (2011) in their crystallographic study of the T. thermophila large ribosomal subunit, but no mention was made of its sequence, structure or possible function. In this paper, we used homology modeling and other analyses to extend previous studies by providing a possible functional organization for the L10 region of TtP0, including the ciliate-specific insert. Developing a model of the three-dimensional shape of P0 through homology modeling allows the shape of key P0 functional domains to be understood. Assuming that "form follows function", structural data combined with prediction programs can help in developing hypotheses about P0 function that can then be tested experimentally. Here, we report that our homology modeling analysis provides evidence that the insert forms a flexible loop that may have a novel regulatory function via an interaction with the ES7 region of 26S ribosomal RNA. We further report that the ciliate-specific insert of T. thermophila contains a potential serine phosphorylation site for casein kinase II, based on analysis of the predicted protein sequence of TtP0 using phosphorylation site prediction tools (Pagni et al, 2007; Blom et al, 1999).
Recently, the structures of the small and large subunits of the T. thermophila ribosome were solved by X-ray crystallography (Rabl et al, 2011, PDB code 2ZXM; Klinge et al, 2011, PDB codes 4A1C, 4A1D). The 60S ribosomal subunit structure contained structural coordinates for most of the ribosomal proteins, with the notable exception of the stalk proteins, which include TtP0. The crystallographic data for P0 was not clear enough to resolve its atomic structure, and instead, Klinge et al (2011) reported TtP0
Phosphoprotein P0 (P0) is a component of the 60S subunit of the eukaryotic ribosome (Figure 1). P0 is able to combine with other phosphoproteins, P1 and P2, to form a "stalk" complex that interacts with extra-ribosomal elongation factors, namely EF-1α and EF2 in eukaryotes (EF-Tu and EF-G in prokaryotes) (Uchiumi et al, 2002).
This stalk complex is part of the "GTPase-associated center", which is defined by the GTP-dependent binding of the elongation factors. The protein composition of the ribosomal stalk varies between the three domains of life, but a single P0 molecule (called L10 in eubacteria) is always present in the stalk, acting as a scaffold for other phosphoproteins, usually P1 and P2 (Gordiyenko et al, 2010). The N-terminal domain of the stalk interacts with the ribosomal protein L11P (L12 in eubacteria) (Nomura et al, 2006). The stalk also forms two contacts with the 26S (23S) ribosomal RNA. The loops containing these sites have been termed the "thiostrepton loop" (H42-H44 on Klinge et al Tetrahymena model, position 1070 in E. coli) and the "sarcin-ricin loop" (H95 on the Klinge et al Tetrahymena model, position 2660 in E. coli), for the antibiotics and ribotoxins that interact with those loops in both prokaryotes and eukaryotes (Uchiumi et al, 2002).
There are several eukaryote-specific inserts in the ribosomal RNA, called expansion segments; two of these, ES7 and ES39, are located in the proximity of the stalk complex (Ben-Shem et al, 2011). Together, these expansion segments account for a large yet variable portion of the eukaryotic-specific RNA that interacts with conserved and eukaryotic-specific proteins (Wilson and Cate, 2012; Ben-Shem et al, 2011; Klinge et al, 2011).
The general structure of P0 based on these experiments can be found in Figure 1. The P0 sequence can be divided into three functional domains, two of which (L10 and 60S) have been identified in PFAM (http://pfam.sanger.ac.uk/) as conserved regions (Remacha et al, 1995). The L10 domain [PF00466] is located near the N-terminus of P0. This domain is the rRNA binding region of the stalk, and is present in all three domains of life. The L10 region has been visualized as a five-strand beta sheet with five alpha helices surrounding it. In addition, this region contains the 15-17 aa ciliate-specific insert that is the focus of this paper.
2010). It has been proposed that this region may form contacts with ribosomal protein L11, the 23S rRNA and EF2, indicating a possible role in GTPase turnover and elongation factor discrimination (Justice et al, 1999; Santos et al, 2004; Naganuma et al, 2010; Kravchenko et al, 2010; Nomura et al, 2012). In addition, a phosphorylation site has been identified in the P0 of yeast, rat and the buds of Populus family plants (Ballesta et al, 1999; Liu et al, 2010). According to these studies, phosphorylation takes place at a serine or threonine residue located a few residues before the C-terminal peptide. However, radioisotope labeling and electrophoresis studies of the P proteins of Tetrahymena pyriformis (a species distinct from but related to T. thermophila) failed to produce evidence of phosphorylation (c.f. Sandermann, Kruger and Kristiansen, 1979).
The conserved serine or threonine residue is noted to be absent in Tetrahymena pyriformis P proteins, which was suggested to explain the lack of phosphorylation (Ballesta et al, 1999). The 60S region contains alpha helices that protrude from the ribosome (two in eukaryotes, three in archaebacteria) and provide space for P1 and P2 to bind as heterodimers. The portion of the 60S region beyond the P1/P2 binding sites has yet to be crystallized successfully, likely due to its predicted flexible nature. However, NMR structures of several C-terminal peptides from human P proteins are available (Soares et al, 2004).
In addition to its role in protein synthesis, there is evidence suggesting that P0 may have extra-ribosomal functions. Immunocytochemical evidence suggests that P0 can localize to the cell surface in mammals, yeast and single-celled apicomplexan parasites such as Plasmodium falciparum and Toxoplasma gondii, organisms that cause malaria and toxoplasmosis respectively (Singh et al, 2002; Sehgal et al, 2003). It was reported that when P. falciparum parasites were exposed to monoclonal anti-P0 antibodies, their ability to infect mice was blocked, indicating that P0 may play a role in host cell invasion (Rajeshwari et al, 2004). Furthermore, when mice were injected with a highly conserved C-terminal domain from the P0 of P. falciparum, they were protected from malaria parasite invasion. In addition, when mice were immunized with a plasmid coding for an antigen derived from the P0 of Leishmania infantum (a non-apicomplexan parasite), they developed immunity to Leishmania major. Based on these findings, P0 has been proposed as a candidate for vaccine research and drug development for the control of protistan parasitic infections (Iborra et al, 2003).
Antibodies against P0 and the other P proteins have also been implicated in human diseases, and could be used to detect certain diseases before the appearance of symptoms. For example, elevated levels of antibodies against the L10 region and the C-terminal peptide have been found in some systemic lupus erythematosus patients (Heinlen et al, 2010; Uchiumi et al, 1991). A recent study suggested that elevated levels of these antibodies are also involved in autoimmune hepatitis, which may indicate a common targeting mechanism for both diseases (Calich et al, 2013).
Our lab is investigating the potential of the ciliate Tetrahymena thermophila, a eukaryotic microorganism related to the apicomplexans by virtue of their shared membership in the alveolate clade of protists, as a model for vaccine research. Our group has recently obtained immunocytochemical evidence for the location of P0 at the surface of T. thermophila (Schumacher et al, 2010a, b, ms in preparation).

Sequence Alignments:
The selected sequences (104 in total) were aligned using the MCoffee web server (Notredame et al, 2002), which combines the output of several different alignment programs into a single consensus sequence alignment. To further refine our alignments by using P0 structural data, the MCoffee alignment was combined with the coordinates of possible template structures to create an Expresso alignment using default parameters (Notredame et al, 2002). A smaller selection of eukaryotic sequences (24 in total) was aligned with ClustalW, in order to emphasize the presence of the ciliate insert (Larkin et al, 2007).

TtP0 Motif Analysis:
The TtP0 protein sequence was run through the Motif Scan tool at MyHits to detect possible motifs and functional sites on P0 and in the region of the predicted insert (Pagni et al, 2007). The protein sequence was also analyzed using NetPhos 2.0 (Blom, Gammeltoft and Brunak, 1999) and DISPHOS (Iakoucheva et al, 2004) to predict the phosphorylation potential of all serine, threonine or tyrosine residues on the protein.
To predict specific kinases that might phosphorylate these same amino acids, NetPhosK with standard parameters (Blom et al, 2004) was also used on the TtP0 amino acid sequence.
focus on modeling the "L10 core" region in our study (Figure 1). This region also contains the insert, which was de novo modeled after completion of the backbone modeling.

Backbone Modeling:
After superimposition, we determined the appropriate template for homology modeling, based on the quality of the structural data, as well as the homology of the template sequence to the target sequence. Of the six possible templates, we chose to make models from the yeast P0 (chain q of 3U5I), because S. cerevisiae was the nearest relative to T. thermophila among the species considered, and because the yeast P proteins are well characterized in the literature. After building 25 models, we visually inspected them to find areas of variability in the structures, which were limited to loops between the secondary structure elements. The alpha-helices and beta-strands remained very consistent amongst all 25 models. After inspection, we manually refined the alignment between the Tetrahymena and yeast P0 sequences in Discovery Studio. When these adjustments were completed, we built 25 additional homology models with the same protocol as above, using the refined alignment and yeast P0 as a template.

Backbone Refinement:
The best-scoring (lowest energy) model from this second run was then refined using the "side-chain refinement" tool in Discovery Studio, based on the CHI-ROTOR algorithm and CHARMm minimization (Spassov, Yan and Flook, 2007). After refining the side-chains, the entire protein was refined using the Minimization tool of Discovery Studio. The TtP0 backbone model was minimized using multiple runs of the Steepest Descent and Conjugate Gradient protocols using a CHARMm forcefield and "Generalized Born" implicit solvent model at all steps. Minimization was repeated until the Potential Energy reached a plateau.
Following each minimization, a score was calculated for the model using the "Verify Protein Profiles-3D" option in Discovery Studio. The score did not change much with each minimization, and stayed within the predicted range given by Profiles-3D. After all minimizations were complete, the refined backbone was assessed with Discovery Studio's "Protein Health" protocol to verify the validity of the model.
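The "minimize until the potential energy reaches a plateau" criterion described above can be illustrated with a minimal Python sketch. This is not the Discovery Studio protocol or the CHARMm forcefield: the quadratic `toy_energy` function, the step size and the tolerance are all assumptions chosen only to show the stopping rule for a steepest descent run.

```python
def minimize_until_plateau(energy, gradient, x0, step=0.1, tol=1e-6, max_rounds=10000):
    """Steepest descent on a toy energy function.

    Stops when the energy change between two consecutive steps falls
    below `tol`, i.e. when the energy has reached a plateau.
    """
    x = list(x0)
    e_prev = energy(x)
    history = [e_prev]
    for _ in range(max_rounds):
        g = gradient(x)
        # Move against the gradient (steepest descent direction).
        x = [xi - step * gi for xi, gi in zip(x, g)]
        e = energy(x)
        history.append(e)
        if abs(e_prev - e) < tol:
            break
        e_prev = e
    return x, history


# Hypothetical stand-in for a forcefield energy: a quadratic well centred at 1.0.
def toy_energy(x):
    return sum((xi - 1.0) ** 2 for xi in x)


def toy_gradient(x):
    return [2.0 * (xi - 1.0) for xi in x]


final_x, energies = minimize_until_plateau(toy_energy, toy_gradient, [3.0, -2.0])
print(f"steps: {len(energies) - 1}, final energy: {energies[-1]:.2e}")
```

In a real workflow the plateau would be judged on the forcefield's potential energy across repeated minimization runs; the sketch only captures the stopping logic.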

Insert Modeling and Refinement:
The refined and minimized backbone of the TtP0 homology model was used as the template for de novo modeling of the ciliate insert. Twenty-five de novo models were created, using the "Build Homology Models" function under the same parameters used for the backbone. The lowest-energy model with the insert was refined using the "Smart Minimizer" minimization option, with 1000 steps of Conjugate Gradient minimization and standard parameters. Side chains were refined as above, and a second round of "Smart Minimizer" minimization was used to confirm that the model's energy was at a plateau. The "Protein Health" option was used again at this point to assess the lowest-energy model and insert.
To investigate the influence of nearby components in the 26S ribosome complex, the minimized TtP0 homology model was placed into a file containing bases 551-599, 1226-1256 and 1306-1319 of the T. thermophila 26S rRNA (PDB code 4A1D), as well as chains F, K and X of 4A1D. The insert was minimized separately, both in and out of context of the nearby rRNA and protein chains, using the "Loop Refinement" protocol, based on the LOOPER program (Spassov, Flook and Yan, 2008) with CHARMm minimization ("CHARMm Polar H" forcefield). One hundred variations of the loop (residues 69-84) were created in the presence (in context) of the rRNA/protein chains and one hundred variations were created in the absence (out of context) of the rRNA/protein chains. The quality of the models was verified at all stages using the "Verify Protein Profiles-3D" and "Protein Health Report" options in Discovery Studio. The twenty lowest-energy conformations of the insert were chosen to demonstrate the variability of the loop conformations. For the "in-context" models, the potential for non-bonded interactions between the insert and the rRNA/protein chains was investigated using Discovery Studio's "Monitor Intermolecular H-Bonds" and "Monitor Distance" tools.
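Choosing the twenty lowest-energy conformations from one hundred loop variations amounts to ranking the models by energy. The Python sketch below illustrates that step only; the model names and energy values are randomly generated placeholders, not output from the loop refinement described above.

```python
import random


def lowest_energy_models(energies, n=20):
    """Return (model_id, energy) pairs for the n lowest-energy models, sorted ascending."""
    ranked = sorted(energies.items(), key=lambda item: item[1])
    return ranked[:n]


# Hypothetical energies for 100 loop conformations (arbitrary units, fixed seed).
rng = random.Random(0)
hypothetical_energies = {
    f"loop_{i:03d}": rng.uniform(-5000.0, -4000.0) for i in range(100)
}

selected = lowest_energy_models(hypothetical_energies, n=20)
for model_id, energy in selected[:3]:
    print(model_id, round(energy, 1))
```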

Insert Interactions:
Using the secondary structure data provided in Klinge et al (2011) along with the data gathered from the "in-context" modeling, bases potentially interacting with the insert were observed at the tip of the ES7B region of the T. thermophila 26S rRNA.
After identifying ES7 as a potential interacting partner, we compared the sequence Alignments.

Alignments of P0:
To determine the location of the ciliate insert compared to conserved regions in P0, several alignments of the predicted protein sequences of T. thermophila P0 (TtP0) and its orthologues in other organisms were prepared. These included an MCoffee alignment using 104 P0 and L10 sequences from the three domains of life (Figure 2A) and a ClustalW alignment of 24 eukaryotes (Figure 2B). This included 12 sequences from ciliated protists, representing three of the eleven recognized ciliate classes.
MCoffee (without PDB structural information) and Expresso (with PDB structural information) alignments were evaluated for use in the homology modeling. Inclusion of 3D structural data improved the quality of the alignment (the alignment score rose from 73 to 87, out of 99). As expected, the P0 and L10 sequences exhibited strong homology in the center of the protein sequences, and weaker homology at the N-terminus, C-terminus and the site of the insert. Within the insert region, the inserts of the ciliates aligned on one side of a homologous five-amino-acid peptide, while the kinetoplastid inserts aligned on the opposite side (not shown). All other eukaryotes showed large gaps in this region, except for the homologous peptide.
NetPhos 2.0 was used to evaluate the phosphorylation potential (PP) of the serine, threonine and tyrosine residues in order to predict generic phosphorylation sites. A higher score indicates a stronger confidence that the residue represents a phosphorylation site, with values above 0.5 considered supportive of phosphorylation.
In all, sixteen residues scored above the 0.5 cutoff. Two of these residues, Ser78 (PP=0.985) and Tyr80 (PP=0.722), were located in the insert region (Figure 4). One of these (Ser78) was also identified by MyHits. DISPHOS, a tool with a similar function to NetPhos 2.0, predicted six possible sites: three threonine residues and three tyrosine residues. Three of these hits were found in the insert, at Thr76, Tyr80 and Tyr83.
Unlike the other tools, NetPhosK predicts specific kinases that act on a serine, threonine or tyrosine residue, using a PP score like that of NetPhos 2.0. NetPhosK found 25 possible matches to kinases at 20 different residues, with 5 residues having more than one possible matching kinase. None of these residues were located in the insert. Of all the residues studied, only one, Tyr211, was predicted as a phosphorylation site by all three tools. A summary of the possible phosphorylation sites in TtP0 is included in Table 2.
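The way the three predictors are combined (apply the 0.5 cutoff, then look for residues supported by more than one tool) can be sketched in Python as follows. The Ser78 and Tyr80 scores and the site names are taken from the text; the per-tool lists are deliberately partial, the `Xaa1` entry is a hypothetical below-cutoff residue, and Tyr211 is added by name because its scores are not reproduced here.

```python
CUTOFF = 0.5  # scores above this value are treated as supportive of phosphorylation


def sites_above_cutoff(scores, cutoff=CUTOFF):
    """Return the set of residues whose phosphorylation potential exceeds the cutoff."""
    return {residue for residue, score in scores.items() if score > cutoff}


def support_count(residue, predictions):
    """Count how many tools predict the given residue."""
    return sum(residue in sites for sites in predictions.values())


# Partial NetPhos 2.0 scores: Ser78 and Tyr80 are from the text, Xaa1 is hypothetical.
netphos_scores = {"Ser78": 0.985, "Tyr80": 0.722, "Xaa1": 0.30}

predictions = {
    # Tyr211 is reported as predicted by all three tools; its scores are not given.
    "NetPhos 2.0": sites_above_cutoff(netphos_scores) | {"Tyr211"},
    "DISPHOS": {"Thr76", "Tyr80", "Tyr83", "Tyr211"},
    "NetPhosK": {"Tyr211"},
}

consensus = set.intersection(*predictions.values())
print("Predicted by all tools:", sorted(consensus))
print("Tools supporting Tyr80:", support_count("Tyr80", predictions))
```

Using set intersection keeps the consensus step independent of each tool's scoring scale, which matters because the tools do not report directly comparable scores.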
Other modifications: Three N-myristoylation sites were predicted by MyHits, at residues 147-152, 270-275 and 284-289. One amidation site at residues 181-184 was also predicted. None of these predicted sites are located within the insert region.

Homology modeling of TtP0:
Because of the flexible nature of the C-terminal region of the stalk, the modeling studies were limited to the L10-containing N-terminal domain of TtP0 (residues 7-125, 203-218), with and without the insert. The quality of the models was verified by generating Protein Health reports, which included a check of the main chain conformations against a Ramachandran plot. In the lowest-energy backbone model, 99 of the 108 non-terminal (not glycine or proline) residues (91.7%) were in allowed regions, eight were in marginal regions, and one was in a disallowed region.
After the insert was modeled in and before its conformation was minimized, only eleven of the 121 non-terminal residues (9.1%) were in marginal regions; three of these (ARG84, ALA88, LYS90) were located in the insert. For the out-of-context model of the insert, thirteen of 121 residues were in marginal regions, including three in the insert (THR75, TYR79 and TYR82), and one (LYS89) was in a disallowed region. The in-context model of the insert included no disallowed residues and many of the same marginal residues as the out-of-context model, with LYS78 and ASP80 instead of TYR79 and TYR82.
Overall, modeling the insert in the presence of the ribosome ("in context") produced a markedly different result from modeling without the ribosome ("out of context"). Out of context, the insert took on a variety of possible orientations and forms, and even formed coils in a few cases (Figure 5, A and B). In context, we observed more constrained conformations than in the out-of-context models, along with what appeared to be a close association between the insert and a portion of the rRNA later identified as ES7 (Figure 5, C, D and E). While some variation was still observed, this was restricted to loop models with higher (less negative) energy scores.

Interactions between ribosomal RNA and insert of TtP0:
Once a good fit between TtP0 and the rest of the 60S subunit was achieved for both yeast-derived and Tetrahymena-derived ribosomes, the models were investigated further using Discovery Studio. We wanted to determine what intermolecular forces, if any, could be responsible for the close association we observed between the insert and ES7B.
Using the "Monitor" function of Discovery Studio, we identified a number of hydrogen bonds between atoms of the insert and ES7, with measured bond distances between 2 and 3 Angstroms. A diagram and summary of the H bonds predicted by the 10 lowest-energy models are shown in Figure 6. Among these models, H bonds were observed between atoms from bases C584 and A585, and residues ARG83, GLN84 and GLY86. While the exact atoms in the H bonds varied between different insert conformations, certain atoms, such as the HN of GLY86 and the O2 and N3 of C584, made H bonds in several of the models. The number of times these atoms were involved in H bonds may indicate their functional importance.
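A distance-only version of this analysis can be sketched in Python: flag protein-RNA atom pairs separated by 2 to 3 Angstroms in each model, then count how often each protein atom recurs across models. This is not the Discovery Studio "Monitor" implementation, and it ignores the angular criteria that a full hydrogen bond definition would include. All coordinates are hypothetical, as is the `GLN84:NE2` atom label; only GLY86 HN and C584 O2/N3 are named in the text.

```python
import math
from collections import Counter


def hbond_candidates(protein_atoms, rna_atoms, min_d=2.0, max_d=3.0):
    """Return (protein_atom, rna_atom, distance) for pairs within the distance window."""
    pairs = []
    for p_name, p_xyz in protein_atoms.items():
        for r_name, r_xyz in rna_atoms.items():
            d = math.dist(p_xyz, r_xyz)  # Euclidean distance in Angstroms
            if min_d <= d <= max_d:
                pairs.append((p_name, r_name, round(d, 2)))
    return pairs


# Two hypothetical loop conformations; coordinates are invented for illustration.
models = [
    {
        "protein": {"GLY86:HN": (0.0, 0.0, 0.0), "GLN84:NE2": (10.0, 0.0, 0.0)},
        "rna": {"C584:O2": (2.1, 0.0, 0.0), "C584:N3": (10.0, 2.9, 0.0)},
    },
    {
        "protein": {"GLY86:HN": (0.0, 0.0, 0.0), "GLN84:NE2": (10.0, 0.0, 0.0)},
        "rna": {"C584:O2": (0.0, 2.5, 0.0), "C584:N3": (10.0, 5.0, 0.0)},
    },
]

# Tally how often each protein atom takes part in a candidate H bond.
atom_counts = Counter()
for model in models:
    for p_name, r_name, d in hbond_candidates(model["protein"], model["rna"]):
        atom_counts[p_name] += 1

print(atom_counts.most_common())
```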
We hypothesized that if the interaction between ES7B and the insert is part of a ciliate-specific regulatory mechanism for the stalk, then the residues or bases involved should be conserved in other ciliate species. To assess the validity of this hypothesis, we compared the available sequences of ES7 and the inserts of several ciliate species to those of T. thermophila. In total, we found five ciliate species whose published 26S rRNA sequences contained ES7. These species are T. thermophila, T. among closely related species. The inserts were compared by visual inspection (Table 1), and the ES7 sequences were compared by a ClustalW sequence alignment (Figure 7).
The Tetrahymena species showed strongly homologous sequences for ES7.
The bases we observed interacting with the insert in the homology model (C584 and A585) were found to be conserved in T. pyriformis. While no P0 sequence has been published for T. pyriformis, the predicted P0 sequences for three other Tetrahymena species are available (Table 1). These predicted insert sequences are highly conserved, and the residues that interact with ES7 were observed to be present in all four species.
However, the insert and ES7 sequences in O. trifallax both differed from those of the two Tetrahymena species. Furthermore, the P. tetraurelia insert and ES7 sequences showed the largest divergences from those of T. thermophila. Also, the aligned P. tetraurelia ES7 sequence had a large number of gaps. Since the rest of the P. tetraurelia 26S rRNA sequence, outside the expansion segment, is highly homologous to the same regions in other ciliates, it is clear that ES7 in Paramecium is significantly shorter than in the other ciliates.
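The two observations made from the ES7 alignment (whether specific columns are conserved, and how gap-rich one sequence is relative to the others) can be quantified with a short Python sketch. The aligned sequences and species labels below are invented placeholders, not the ciliate ES7 sequences; a real analysis would read the ClustalW output instead.

```python
def gap_fraction(aligned_seq, gap_char="-"):
    """Fraction of alignment columns that are gaps in one aligned sequence."""
    return aligned_seq.count(gap_char) / len(aligned_seq)


def column_conserved(alignment, col, gap_char="-"):
    """True if every sequence carries the same non-gap character at this column."""
    chars = {seq[col] for seq in alignment.values()}
    return len(chars) == 1 and gap_char not in chars


# Hypothetical toy alignment of equal-length sequences (not real ES7 data).
toy_alignment = {
    "species_A": "GGCUCAGGUA",
    "species_B": "GGCUCAGGUA",
    "species_C": "GGAUCA-GUA",
    "species_D": "GG--CA---A",
}

for name, seq in toy_alignment.items():
    print(name, f"gap fraction = {gap_fraction(seq):.1f}")
print("column 4 conserved:", column_conserved(toy_alignment, 4))
print("column 2 conserved:", column_conserved(toy_alignment, 2))
```

A high gap fraction confined to the expansion segment, with the flanking rRNA aligning cleanly, is the pattern that would support a shortened ES7 rather than a poor alignment.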

Homology modeling of TtP0:
To explore the structure and function of a previously identified ciliate-specific P0 insert, we created a homology model of the L10 region, including the insert, of
Our homology model consists of one of the three structural domains of TtP0, and contains about one-third of its amino acid residues. Other crystal structures of the stalk complex could be used as templates for constructing additional models of the middle domain and the P1/P2 helices. This was outside the region of interest in the present study, but it will be visited in a future study. We decided to focus on the L10 region in the current study for two reasons. 1) The crystal structure of the T. thermophila ribosome lacks coordinates for any residues beyond the L10 region; without these experimentally derived coordinates as a check, building the other sections of TtP0 might decrease the homology model's overall quality. 2) Modeling the other regions could distract from the main focus of our study, the structure and function of the ciliate insert.

Phosphorylation of TtP0:
Unlike other eukaryotes, T. thermophila lacks a conserved serine that is usually located a few residues before the conserved C-terminal peptide (Ballesta et al, 1999).
We attempted to determine, through modeling approaches, whether phosphorylation of the insert of TtP0 would have a significant effect on its conformation and interaction with the 26S rRNA. However, the nonstandard nature of phosphorylated amino acids makes them difficult to parameterize and incorporate into homology models using the methods outlined in this paper. Another approach will likely be necessary to test whether the L10 insert is phosphorylated, and how phosphorylation could affect the conformation of the loop and its potential interaction with the 26S rRNA. An in vivo approach may eventually be necessary to confirm whether the insert is in fact phosphorylated, and if so, whether this could serve to regulate the interaction of TtP0 with the 26S rRNA.

The ciliate insert/ES7B interaction:
After refining the model of the ciliate insert in the presence of a portion of the 26S rRNA, we were intrigued to find that the original range of orientations for the insert loop appeared to be restricted by protein-RNA interactions. Hydrogen bonding occurred between residues toward the C-terminal end of the TtP0 insert (specifically Gln84, Phe85 and Gly86) and bases C584 and A585 of the rRNA. Both of these bases are located on the tip of a loop of Expansion Segment 7B (ES7B), a eukaryote-specific region of the 26S rRNA in the large ribosomal subunit (Figure 5).

A NetPhos 2.0 search for generic serine, threonine and tyrosine phosphorylation sites. All potential sites are scored between 0 and 1, with scores of 0.5 or greater representing likely phosphorylation sites. Two potential phosphorylation sites within the insert, at Ser78 and Tyr80, are marked with an asterisk.

Below: A table indicating the H-bonding atoms for the ten lowest-energy loop conformations. Numberings for the bases and residues are relative to the portion of the P0 and ES7 that was included in the modeling, rather than their placement in the actual ribosomes.

Figure 7: Alignment of ciliate ES7B
A portion of an unedited ClustalW alignment of four ciliate LSU rRNA sequences, showing the varying nature of ES7B among the ciliates. The region of ES7 with a potential interaction with the insert is highlighted in gray.

Summary of predicted phosphorylation of serine, threonine and tyrosine residues of TtP0 from NetPhos 2.0, DISPHOS and NetPhosK. Residues within the ciliate-specific insert are highlighted in gray. Residues that were predicted to be "likely
Organism with insert Sequence Source:
Table 2

ABSTRACT
The large subunit ribosomal protein, phosphoprotein P0, is a necessary component for protein elongation factor recruitment. Orthologues of P0 are present in both prokaryotic and eukaryotic species, and the protein is thought to be one of the most highly conserved ribosomal proteins. In this study, we investigated whether P0 could serve as a good target for phylogenetic studies by itself, and whether analysis of the phylogeny of P0 would reveal events during early eukaryotic evolution, as well as the evolution of the Ciliophora. P0 and L10 protein sequences from organisms representing the major eukaryotic supergroups were aligned and used to build phylogenetic trees based on the entire protein, as well as the individual functional protein domains of P0. We found that P0 could provide support for higher-level taxa, but failed to provide strong support for the earliest roots of the trees. The ciliates could be resolved into previously defined Classes, but the monophyly of the Alveolata Group was not supported in all of the trees. Domain trees of P0 seemed to indicate that the C-terminal 60S region may contribute significantly to P0 diversity, while the N-terminal L10 region appeared to be more conserved in eukaryotes. We also discuss how the phenomenon of long-branch attraction may have factored into our results, as well as how it could be avoided in future phylogenetic studies on P0.

INTRODUCTION
Ribosomal phosphoprotein P0 (P0) is a component of the 60S subunit of the eukaryotic ribosome. P0 is able to form a "stalk" complex, with the phosphoproteins P1 and P2, that interacts with extra-ribosomal elongation factors (EF-1α and EF2; EF-Tu and EF-G in prokaryotes) as part of the "GTPase-associated center" (Uchiumi et al, 2002). P0 is present in organisms from all three domains of life; the P0 analog in eubacteria is known as L10, while the archaebacterial equivalent is also called P0.
While the exact composition of the ribosomal stalk varies between the three domains of life, the stalk always contains a single copy of L10/P0, acting as a scaffold for other phosphoproteins, usually P1 and P2 (L7/L12 in eubacteria) (Gordiyenko et al, 2010).
The P0 sequence can be divided into three functional domains, two of which have been identified in PFAM (http://pfam.sanger.ac.uk/) as conserved domains (Remacha et al, 1995 regions, is not present in eubacteria, and is unclassified in PFAM. There is little known about its function, although it has been hypothesized that the region is involved in binding to EF2 (Santos et al, 2004; Justice et al, 1999). P0 is also thought to be one of the 29 most highly conserved eukaryotic ribosomal proteins that form the core of the universal eukaryotic ancestor (Harris et al, 2003). Generally, phylogenetic studies on eukaryotes have been based on the sequence of small subunit ribosomal RNA (c.f. Cavalier-Smith, 1987; Doolittle, 1987; Woese, 1987; Zillig, 1987) or more recently, concatenated alignments of highly conserved genes (c.f. Parfrey et al, 2009; Katz et al, 2012). One of these concatenated gene studies recently focused on ribosomal proteins, but only small subunit proteins were utilized (Leigh and Chang, 2012). Because of its highly conserved nature, P0 may provide a valuable addition to these phylogenetic studies. In an early phylogenetic analysis,  showed that L10/P0 sequences could be used to distinguish between eubacteria, archaebacteria and eukaryotes. More recently, Pucciarelli et al (2005) concluded, from a study on the P0 sequences of a limited number of organisms, that P0 could be useful for investigating "the phylogenetic origin of early eukaryotes". Today, many more sequenced eukaryotic genomes are available; therefore, a much more comprehensive and detailed analysis of the evolution of L10/P0 is possible, with a greatly improved opportunity for discovering new information about early eukaryotic lineages.
Tetrahymena thermophila is a unicellular eukaryotic microorganism that belongs to a Phylum of protists known as the Ciliophora (ciliated protists, aka ciliates).
Along with the apicomplexan parasites (e.g. Plasmodium, Toxoplasma, Eimeria), the Ciliophora belong to a protistan clade known as the Alveolata, which in turn has been proposed to be part of the "SAR" (Stramenopiles, Alveolata, Rhizaria) Supergroup of eukaryotes (Adl et al, 2012). The ciliates are unique in that they contain two different kinds of nuclei with two differing genomes, a vegetative, transcriptionally active macronucleus (MAC) and a genetic, transcriptionally silent micronucleus (MIC) (c.f. Karrer, 2000). So far, only macronuclear genes have been utilized in phylogenetic studies, because gene predictions have only been carried out on macronuclear genome sequences, and because gene expression is almost exclusively limited to the macronucleus.
The amino acid sequence of the P0 ortholog of T. thermophila (TtP0) was originally obtained through preliminary genome sequence analysis and verified through PCR-based methods by Pucciarelli et al (2005). Further characterization by gene sequence analysis and immunocytochemistry was more recently carried out in our laboratory (Schumacher et al, 2009; Schumacher et al, 2010a, b, c; ms in preparation). Clustal W-based sequence alignments of TtP0 with P0 sequences from ciliates and other organisms revealed that an additional 15-17 amino acid-long insert is present in the L10 region of T. thermophila and other ciliates. This insert was not found in any prokaryotes or in any other eukaryotes.
Alignments that included a larger sample of eukaryotes showed that a smaller, apparently unrelated insert is present in the same location in members of the Kinetoplastida, an Order of excavate protists. The ciliate-specific insert was also noted more recently in T. thermophila by Klinge et al (2011).
Through homology modeling experiments, evidence was provided that the insert of T. thermophila may interact with expansion segment 7 (ES7) of the 26S ribosomal RNA of T. thermophila (Pagano et al, ms in preparation). This evidence for a functional role of the insert suggests that it may be useful for phylogenetic studies on the early diversification and systematics of the Ciliophora.
In the present study, we wanted to obtain more information about the early evolution of eukaryotic P0, as well as about the evolution of the L10 insert in the ciliate lineage. We created sequence alignments and phylogenetic trees using L10 and P0 protein sequences from a wide variety of eukaryotes and prokaryotes.
Trees were created from complete and modified P0/L10 sequences, as well as from each of the three functional domains, L10, 60S and MID. These trees were then compared to the taxonomic classifications proposed by Adl et al (2012), and to phylogenetic trees based on other methods, such as concatenated sequences of conserved genes and small ribosomal proteins. We provide evidence that P0 may be useful for assigning ciliates to different clades, and that the later branches of P0's evolution are consistent with other phylogenetic studies. The early stages of P0's evolution in eukaryotes are still ambiguous after this study.

P0 homologue identification:
The TtP0  codons that needed to be removed (Larkin et al, 2007).

P0 and L10 sequence alignments:
The TtP0 amino acid sequence was aligned against P0/L10 sequences from 90 (eukaryotes only) or 100 (eukaryotes, archaebacteria and eubacteria) organisms using MCoffee, run under default parameters (Notredame, Higgins and Heringa, 2000). Due to a problem in MCoffee where the first input sequence (usually TtP0) was assigned a lower homology score than it should normally have, a duplicate TtP0 sequence was included in the alignment. The TtP0 duplicates always appeared at the same location in the trees, thus providing one type of control during tree building. Based on these alignments, poorly aligned terminal regions were removed from all 101 sequences, and the remaining amino acids were realigned in MCoffee. The amino acids corresponding to positions 1-5 and 274-324 of TtP0 were removed. After the N- and C-termini were trimmed, we also removed the inserts from the ciliate and kinetoplastid P0s, and realigned the 101 sequences. For both of these edited alignments, the P0 of T. thermophila was arbitrarily chosen as the reference point for trimming the termini and inserts.
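The trimming step described above, in which a reference sequence defines which alignment columns are kept, can be illustrated with a minimal Python sketch. This is not the procedure actually used in the study: the function names and the toy sequences are hypothetical, and the sketch only slices columns, whereas the trimmed sequences in the study were subsequently realigned in MCoffee.

```python
def reference_columns(aligned_ref, gap="-"):
    """Map 1-based residue positions of the reference to 0-based alignment columns."""
    mapping = {}
    residue = 0
    for col, char in enumerate(aligned_ref):
        if char != gap:
            residue += 1
            mapping[residue] = col
    return mapping


def trim_alignment(alignment, ref_name, keep_start, keep_end, gap="-"):
    """Keep the columns spanning reference residues keep_start..keep_end (inclusive)."""
    cols = reference_columns(alignment[ref_name], gap)
    first, last = cols[keep_start], cols[keep_end]
    return {name: seq[first:last + 1] for name, seq in alignment.items()}


# Toy alignment (hypothetical sequences, not real P0 data).
toy = {
    "ref":  "MA-KLTPEQ--RS",
    "seqB": "M-GKLSPEQAARS",
    "seqC": "--AKLTPDQ-AR-",
}

# Keep reference residues 3 to 8; everything outside that window is discarded.
trimmed = trim_alignment(toy, "ref", keep_start=3, keep_end=8)
for name, seq in trimmed.items():
    print(name, seq)
```

With TtP0 as the reference, removing positions 1-5 and 274-324 would correspond to calling the hypothetical `trim_alignment` with `keep_start=6` and `keep_end=273`.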

Phylogenetic tree building:
The alignments described above were used to create phylogenetic trees, using both Maximum Likelihood (ML) and Fitch-Margoliash (FM, a method based on distance matrices) algorithms. The RAxML web server at CIPRES was used to construct 1000 bootstrapped ML trees under a Protein CAT model and JTT matrix, followed by a majority-rule consensus tree (Miller, Pfeiffer and Schwartz, 2010;Stamatakis, 2014). The alignments were also used to make 1000 Fitch-Margoliash trees (from distance matrices) and a majority-rule consensus tree, using the PROTDIST, FITCH and CONSENSE programs available in the PHYLIP software package (Felsenstein, 2005). The consensus trees were displayed and rooted using the Interactive Tree of Life website (Letunic and Bork, 2006). The archaebacterium Pyrococcus horikoshii was chosen as the root for all of the consensus trees, based on its evolutionary distance from the eukaryotes and the presence of a 60S domain.
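To make the idea of bootstrap support and a majority-rule consensus concrete, the following Python sketch counts how often each clade appears in a set of replicate trees and keeps the clades found in more than 50% of them. It is an illustration only and is not how RAxML or CONSENSE are implemented: the trees are hypothetical nested tuples of leaf names, and they are treated as rooted, whereas consensus programs generally work with bipartitions.

```python
from collections import Counter


def clades(tree):
    """Return (leaf set, list of clades) for a rooted tree given as nested tuples."""
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found.extend(child_clades)
    found.append(leaves)
    return leaves, found


def clade_support(replicate_trees):
    """Percentage of replicate trees in which each clade (set of leaves) appears."""
    counts = Counter()
    for tree in replicate_trees:
        all_leaves, found = clades(tree)
        # The clade containing every leaf is trivial, so it is not counted.
        counts.update(c for c in set(found) if c != all_leaves)
    n = len(replicate_trees)
    return {clade: 100.0 * k / n for clade, k in counts.items()}


def majority_rule(replicate_trees, threshold=50.0):
    """Keep only clades whose support exceeds the threshold (in percent)."""
    support = clade_support(replicate_trees)
    return {c: s for c, s in support.items() if s > threshold}


# Four hypothetical bootstrap replicates over the taxa A, B, C and D.
replicates = [
    (("A", "B"), ("C", "D")),
    (("A", "B"), ("C", "D")),
    ((("A", "B"), "C"), "D"),
    (("A", "C"), ("B", "D")),
]
support = clade_support(replicates)
consensus = majority_rule(replicates)
```

In this toy example the clade {A, B} appears in three of four replicates (75%) and is retained, while {C, D} appears in exactly half and is dropped, which mirrors the way branches with less than 50% support are treated as unsupported in the trees discussed here.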

P0/L10 sequence diversity:
The P0 and L10 sequences of 101 different organisms (92 eukaryotes, 4 archaebacteria and 5 eubacteria) were used as the basis for phylogenetic alignment.
A full list of the species used and the classes to which they belong is given in Table 3.
All five of the major eukaryotic supergroups identified by Adl et al (2012)

Trees derived from complete and trimmed L10 and P0 sequences:
After an MCoffee alignment of the complete P0 and L10 sequences was performed, we observed that the N- and C-termini contained many gaps and were poorly aligned compared to the rest of the protein.
To gauge the effect of these poorly aligned terminals on the resulting trees, we removed them from the sequences and realigned the remaining sequence data to produce a "trimmed terminals" alignment. Finally, the region containing the ciliate and kinetoplastid inserts was removed, along with the terminals, to determine what effect the presence or absence of the insert had on the quality of the trees. This produced a third "trimmed terminals and insert" alignment. All three alignments were used to build 1000 ML and FM trees (Figures 8-13).

Maximum Likelihood Trees:
Complete P0 tree: Within this tree, the apicomplexans formed a monophyletic group with reasonable support (between 58 and 100 percent) for its terminal nodes. The Excavata also split up across the tree. As noted above, the kinetoplastids were found on the same branch as the spirotrich ciliates, whereas Giardia lamblia and Giardia intestinalis were grouped with the slime molds of the supergroup Amoebozoa. Finally, the remaining excavates (Trichomonas vaginalis and Histomonas meleagridis) were located on the same branch as the eubacterial L10 sequences, which were situated on an exceptionally long branch.
The rest of the eukaryotic P0s generated monophyletic branches. The opisthokonts (including the fungi) and the Archaeplastida formed monophyletic branches far from the archaebacterial root of the tree. As with the other groups, support values for the more terminal branches were reasonably strong, with values ranging from 61% to 100%, whereas many of the basal branches exhibited less than 50% support (i.e. no support value shown), indicating less certain placement on the tree.
Trimmed Terminal P0 tree: Several differences were observed between these trees and the trees derived from whole P0 sequences. Notably, the alveolates were monophyletic, with the ciliates contained within a clade that included the apicomplexans (Figure 9). The alveolate clade consisted of two subgroups: the Oligohymenophorea and Spirotrichea in one group, and the Heterotrichea and Apicomplexans in the other. The excavate kinetoplastids also moved to a different location than in the previous trees. Rather than grouping with the spirotrichs, they formed a branch with B. natans and two cryptophytes, G. avonlea and Guillardia theta. The remaining excavates (G. lamblia, T. vaginalis and H. meleagridis) formed two neighboring branches, located closer to the archaebacterial root. The eubacterial sequences were found on a longer branch, close to the L10-like nucleomorph sequences near the base of the tree. The stramenopiles also moved, farther away from the heterotrichs and towards the opisthokonts; they still formed a single clade as in the previous tree. Finally, the Archaeplastida and Opisthokonta remained in the same location as they did on the whole P0 trees, and support values for these clades were consistent between the trees.
Trimmed Terminals and Insert tree: Overall, the basal branches of this tree appeared shorter than in the other trees, which is likely an effect of removing most of the poorly aligned amino acids from the input sequence alignment (Figure 10).
Trimmed Terminals tree: Unlike the ML tree, the alveolates are not monophyletic, forming three separate branches in this tree (Figure 12). Oligohymenophoreans form their own branch earlier in the tree, followed by the spirotrichs and finally the heterotrichs. The spirotrichs are very weakly associated with the kinetoplastids and the C. mesostigmatica nucleomorph sequence. Heterotrichs and apicomplexans associate closely in the FM tree, albeit with somewhat weak bootstrap support (28%). Support values within the apicomplexan clade have improved from the Whole P0 tree, with a range from 65% to 100%. The P0 of B. natans is now associated with Goniomonas rather than the stramenopiles, which form their own clade. Other clades are present in similar positions compared to the previous trees.
Trimmed Terminals and Insert tree: Once again, the heterotrichs and apicomplexans closely associate in this tree with a support value of 27% (Figure 13). This is much weaker support than in the ML version of the tree, which has a 52% support value for the heterotrich-apicomplexan branch. The Oligohymenophorea and spirotrichs associate with a 33% support value; both associate very weakly with the kinetoplastids (9% support). This relationship is slightly different than in the ML tree (Figure 10), where the kinetoplastids are more closely associated with the spirotrichs than with the Oligohymenophorea.

Maximum Likelihood trees derived from individual P0 domains:
To examine the phylogeny of different functional regions of P0, and to uncover the effect that each functional domain of P0 may have had on the protein's overall phylogeny, we divided the eukaryotic P0s into three parts, and created trees for each of the domains, based on 1000 iterations (Figures 14, 15 and 16). Only ML trees were prepared, because the distance matrix (FM) trees did not appear to be as useful in our earlier work with three-domain trees. Eubacterial L10 sequences were excluded from these trees because the Eubacteria only contain the L10 region, and would not contribute any meaningful data to the trees of the MID and 60S domains. As with the three-domain trees, bootstrap support values above 50% were observed more often for terminal branches than for more basal branches, while branch lengths were much longer than those seen in the three-domain trees.
L10 Domain: The ciliate groups resolved into two uneven parts (see Figure 14). The heterotrichs and spirotrichs were located closer to the other alveolates than the Oligohymenophorea, which formed a group with E. dispar (Amoebozoa). The kinetoplastids were located on a long branch at the top of the tree, near B. natans and in proximity to other excavates and the stramenopiles. One other notable change was the interruption of the opisthokonts by a long branch containing the Dictyostelia species and the other Amoebozoa representatives except for E. dispar.
MID Domain tree: The spirotrich and heterotrich ciliates were closely associated, while the Oligohymenophorea were more distantly situated, forming a branch with the kinetoplastids (Figure 15). The apicomplexans were distant from all three classes of ciliates, forming a branch near the root of the tree and showing slightly more fragmentation than in other trees. As for the other excavates, the Giardia species formed a very long branch near the apicomplexans, spirotrichs and heterotrichs. T. vaginalis and H. meleagridis were on the same branch as G. avonlea and G. theta. Stramenopiles, Archaeplastida and opisthokonts were all monophyletic, as in the other trees.
60S Domain: Unlike in the trees from other regions, O. trifallax split off from the spirotrichs to form a group with the Oligohymenophorea, kinetoplastids and Amoebozoa (Figure 16). On a nearby branch, the other spirotrichs, the heterotrichs and stramenopiles were grouped together. Many of the branches in this larger group are longer than other branches in the tree. The kinetoplastids were associated more closely with the ciliates than the apicomplexans, as in the whole P0 tree. Also, the apicomplexans were located closer to the fungi, and P. marinus formed a long branch near the Viridiplantae. The excavate clade was quite fragmented in this tree, forming long branches in three separate regions of the tree.

Topology of the Maximum likelihood three-domain trees:
Overall, the terminal branches had strong support values, suggesting that, for the ciliates, P0 may be useful for distinguishing species from each other and for identifying the class to which each species belongs. Also, while unicellular eukaryotes were inconsistently positioned, the Viridiplantae (green plants) and Opisthokonta (both single-celled fungi and multicellular animals) consistently grouped near the top of the tree, far away from the archaebacterial root. This observation is made stronger by the large number of opisthokonts sampled. Even though P0 is a highly conserved protein, it still appears to provide limited information about the early evolution of the eukaryotes, as there was poor support for the basal nodes, less than 50% in most cases.
The poor support for these nodes makes it difficult to identify in which eukaryotic clades the P0 is more closely related to the ancestral prokaryotic L10.

Phylogeny of the ciliates (ML):
We only examined P0s from three of the eleven classes of ciliates proposed by ; thus, any conclusions drawn for the phylogeny of the ciliates would be preliminary. However, the three ciliate classes studied (Oligohymenophorea, Spirotrichea and Heterotrichea) consistently formed separate clades, supporting the class distinctions established by Lynn and Small (1997), with strong bootstrap support in all trees. However, of the three trees based on the entire P0 sequences, only the Trimmed Terminals tree (Figure 9) showed some support for the monophyletic association of the apicomplexans and ciliates, in keeping with other evidence that supports a clade called the Alveolata (Adl et al, 2012).
However, the bootstrap value was below 50% for the node linking the ciliates to the apicomplexans, and some ciliates appeared to associate more strongly with the apicomplexans, while others did not. Perhaps removing the ciliate-specific insert from appears to be quite unstable in these three-domain trees. Therefore, additional rhizarial sequences may be necessary to stabilize the branch to which B. natans belongs.

Topology of the Fitch-Margoliash three-domain trees:
For each of the three-domain data sets, 1000 bootstrapped trees were prepared

Phylogeny of the ciliates (FM):
The FM (as well as the ML) trees provide some evidence that the heterotrichs are more closely associated with the apicomplexans than the spirotrichs or the Oligohymenophorea. This relationship holds even when the insert is removed from the heterotrichs, though the branch is not strongly supported in either tree. Lynn and Small (1997)  Excavata. This lack of monophyly for the SAR Supergroup is complicated and possibly explained by the weak basal branches, as well as the lack of rhizarial P0s in the trees. As with the ML trees, the inclusion of additional P0 sequences for the SAR could help to resolve the question concerning the monophyly of the clade.

Phylogeny of the P0 functional domains (ML only):
Overall, the topologies of the single-domain trees appeared to be quite different from those of the three-domain trees. One of the major differences is that many of the branch lengths were significantly longer in the single-domain trees. This may be due to the smaller lengths of the individual domains. Since branch lengths reflect the average number of substitutions per amino acid position, having fewer residues to measure increases the contribution of each residue substitution to the branch length. Long branches, however, can lead to the phenomenon of long-branch attraction (Bergsten, 2005), which can result in some false positioning of species or clades; comparison with the three-domain trees may help to resolve potential problems of this type.
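The effect of domain length on per-site distances can be shown with a small Python sketch. It uses the uncorrected proportion of differing sites (p-distance) as a stand-in; this is an assumption made for illustration and is simpler than the model-based distances estimated by RAxML or PROTDIST. The function name and the toy sequences are hypothetical.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Fraction of compared (ungapped) positions at which two aligned sequences differ."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no ungapped positions to compare")
    differences = sum(a != b for a, b in pairs)
    return differences / len(pairs)


# Two hypothetical aligned sequences that differ at exactly two positions.
long_a = "ACDEFGHIKLMNPQRSTVWY"
long_b = "ACDEFGHIKLMNPQRSTVAA"

# The same two differences, viewed within a short five-residue "domain".
short_a = long_a[-5:]
short_b = long_b[-5:]

full_length = p_distance(long_a, long_b)     # 2 differences over 20 sites
single_domain = p_distance(short_a, short_b)  # 2 differences over 5 sites
print(full_length, single_domain)
```

The same two substitutions give a distance of 2/20 over the full toy sequence but 2/5 over the short segment, which is the sense in which each substitution contributes more to branch length when fewer positions are compared.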
L10 region: With regard to the L10 region tree, the oligohymenophorea form a clade that is more distinct, whereas the spirotrichs and heterotrichs exhibit a closer association. Thus, the L10 region may have diverged more extensively in the Oligohymenophorea. Surprisingly, the L10 region of P. tetraurelia seems to be quite distinct from that of the Tetrahymenidae (bootstrap value of 86% for this split).
Additional P0 sequences from other species of the Peniculids may be useful in providing support for this divergence. The effect of the ciliate-specific L10 insert may help to exaggerate the branch lengths for the various ciliate groups. Further analyses in which the insert is removed and the edited L10 regions are used to build new trees may help to clarify the effect that the insert has on the tree structure. Also, there is still weak bootstrap support for a close relationship between the apicomplexans and the various classes of ciliates in this tree.
It was also noted that one group of Opisthokonta (the Supergroup that includes the multicellular animals) appeared to move to a location closer to the archaebacteria rather than further away (Figure 14). This is likely to be an artifact of the tree-building, due to the inability of the L10 region to provide significant information about the early ancestors of the modern supergroups. The placement of the Dictyostelia on a long branch within this group is suspect, and might be due to long-branch attraction. One possibly significant result is the clear separation of the kinetoplastids from the ciliates, given how closely the kinetoplastids cling to the ciliates in other trees, which may be a false positive, as with the clearly erroneous association of the eubacteria with members of the Excavata, in the complete P0 trees (Figure 8).
MID region: In this tree, the ciliate, apicomplexan and kinetoplastid clades are distinct, but they fragment and disperse to different sections of the tree. The Oligohymenophorea and Heterotrichea are closer together, and the Spirotrichea form a group with the kinetoplastids in a different section of the tree. The apicomplexans lie at the root of the tree (see Figure 15), in contrast to the rest of the trees. This difference in the apparent earliest group between the L10 and MID tree is likely due to poor basal branch support in the tree, as well as ambiguity about which group should be placed first. Overall, the rest of the groupings are similar to results from the other trees, but the fragmented nature of the tree appears to reflect and may be derived from the sequence diversity of the MID region.
thus the 60S region tree provides the best support for the SAR clade, out of the three regional trees.
Many of the terminal (Genus or species level) branches of the 60S tree (Figure 16) are long, especially those of the ciliates. This indicates that more substitutions or changes have occurred in this region. This large amount of change may be due to the presence of repetitive sequences of amino acids (like alanine and glutamic acid) in the 60S region. Such repetitive sequences would make replication errors more likely. The 60S region of repetitive sequence has been termed the "hinge", because it is thought to be a flexible portion of the protein necessary for interacting with the elongation factors (Gonzalo and Reboud, 2003). It is worth noting that in our sequence alignments, the 60S regions of different P0s aligned poorly, which was why the hinge sequences were removed in the Trimmed Terminal trees. The variability of the 60S region may be a large contributor to the diversification of ciliate P0s and of eukaryotic P0s in general.
The 60S region may also hold clues to how P0 evolved from L10. In their phylogenetic study of the stalk proteins, Shimmin et al (1989) suggested that this region shares homology with the stalk protein P1/L12 (described earlier), and proposed a model of P0 evolution where P0 arose from the fusion of ancestral L10 and L12 genes. A comparison of the 60S region with ribosomal proteins like P1 and P2 could be the basis of a future study, since the P1 and P2 gene/protein sequences of T. thermophila and many other eukaryotes have not been identified or characterized yet.

Long-branch attraction in L10 and P0:
In all trees, we observed some branches where organisms known to be evolutionarily distant were grouped together, such as the eubacteria and a couple of the excavates (see Figure 8). Long branches between prokaryotes and eukaryotes are expected, given their long history of divergence from each other. However, this divergence does not account for the unusual placement of these branches, which appears to be caused by a phenomenon known as long-branch attraction. Long-branch attraction occurs when two divergent sequences have undergone enough changes that they appear more homologous than they actually are, causing tree-building programs to falsely group them together (Bergsten, 2005). Bergsten reviewed four methods for tree building (maximum likelihood, maximum parsimony, distance matrix and Bayesian inference), and found that ML trees were less vulnerable to long-branch attraction. It was also noted that protein sequences were less likely than gene sequences to form false branches, due to the larger number of possible amino acids versus nucleotides.
However, even though ML methods and protein sequences were used in the present study, long-branch attraction still appeared to cause false branches to appear in all of the three-domain ML trees (Figures 8, 9 and 10). There are two likely factors contributing to their appearance: the poorly supported nature of the basal branches, and the presence of poorly aligned termini in the whole P0 alignment. Removing the termini and leaving the strongly aligned portions of P0 seemed to address some of the noise, but removing the insert seemed to reintroduce some problems, such as the fracturing of the ciliate clade. Removing poorly aligned regions did not strengthen the support of basal branches, so another method is necessary to improve their resolution. The simplest method might be to add more sequences to the alignment, especially in the case of fragmented clades like the Excavata and Alveolata. The sequences used in this study represent most of the excavates and ciliates whose P0s have been sequenced, so this work will need to be revisited in the future, as sequencing projects continue. Other ways to improve the quality of the basal branches may also need to be investigated, as it is still unclear from these findings whether P0 could be utilized to trace the earliest stages of the eukaryotic tree of life.

Conclusion:
Using maximum likelihood and distance matrix methods, several phylogenetic trees were created from alignments of whole L10 and P0 sequences, as well as the individual functional domains of P0. Both methods produced trees with poorly supported basal branches and stronger terminal branches, reflecting uncertainty in the early evolution of P0. Despite the unbalanced support of the branches, the results suggest a relationship between the P0s of ciliates and kinetoplastids, although the support was not strong. Surprisingly, there was also generally poor support for a relationship between the P0s of ciliates and apicomplexans, both members of a well-established clade, the Alveolata. The postulated SAR Supergroup was also not well represented by P0's phylogeny in the current study, although this may be partially due to representation of the Rhizaria in the tree by a single species. However, support was strong for the Genera and Classes of ciliates that had been previously established through other studies, and thus P0 may be useful as a basis for classification of organisms at higher taxonomic levels. Of the two known functional regions of eukaryotic P0, the C-terminal 60S region may be the most significant contributor to the evolutionary diversification of P0, while the N-terminal L10 region seems to be the most conserved. As new genomes are sequenced and more P0 sequences become available, it should be possible to revisit the phylogeny of L10 and P0, and draw stronger conclusions about the evolutionary transition from prokaryotic L10 to eukaryotic P0.

Figure Legends:
In all trees, brackets identifying the clades of interest in this study have been provided. C (blue bracket) indicates members of the Ciliophora, A (green bracket) indicates members of the Apicomplexa, and K (red bracket) indicates members of the Kinetoplastida.

Inferred from the amino acid sequences of P0/L10 from 101 species, with P. horikoshii as an outgroup. The two S. coeruleus sequences, labeled A and B, represent two distinct P0 hits from the same genome, while the two T. thermophila sequences are identical due to a quirk in MCoffee.

Fig. 9: The maximum likelihood consensus tree of the 60S region of eukaryotic P0s. Inferred from the amino acid sequences of P0 from 91 species, with P. horikoshii as an outgroup. The two S. coeruleus sequences, labeled A and B, represent two distinct P0 hits from the same genome, while the two T. thermophila sequences are identical due to a quirk in MCoffee.