BIOGEOGRAPHY AND NESTED PATTERNS OF GENETIC DIVERSITY IN THE DIATOM THALASSIOSIRA ROTULA

Diatoms are some of the most biogeochemically important organisms on our planet, responsible for up to 40% of primary production in the ocean. Diatoms are also the most diverse group of algae, exhibiting high levels of cryptic speciation, and remarkable clonal diversity nested within species and populations. As single cells within a dynamic fluid environment, diatoms possess the potential to disperse throughout the globe. Yet, despite the global dispersal potential of these ecologically important organisms, little is understood about the structure and distribution of diatom diversity throughout the globe, or its ecological significance. The goal of this work was to explore nested patterns of diversity in the diatom Thalassiosira rotula by mapping its distribution over global geographic space, capturing the variations in genetic composition over time, and inferring its ecological significance. A series of molecular markers was used to target different scales of genetic subdivision from the cryptic species to the population to the individual. Ribosomal genes (rDNA) were used to define lineages nested within the species, and clarify the relationship between T. rotula and its cryptic sister species, Thalassiosira gravida. Microsatellite markers were developed for T. rotula, and offered a high-resolution glimpse into the population-level subdivisions within the species. Microsatellite analysis was robust enough to identify individuals within these populations. Ribosomal DNA analysis confirmed that T. rotula and T. gravida are, indeed, separate species. These genes also demonstrated that T. rotula is subdivided into three lineages diverged by 0.6 ± 0.3% at the internal transcribed spacer gene. Each lineage exhibits a unique geographic distribution, and harbors high levels of physiological diversity. Over 400 global samples of T. rotula collected from 8 locations in 2010 offered a ‘snapshot’ of diversity nested within the species.
Microsatellites revealed high levels of population divergence among these global samples. Variation over time at a single site was as great as variation between samples collected thousands of kilometers apart. Within a single site, Narragansett Bay, changes in population structure over the course of three years demonstrated that populations are composed of many clonal lineages, harbor high adaptive potential, and may respond quickly to changes in the marine environment and local ecology. The structure of diversity among global samples was used to infer the importance of dispersal versus environment in isolating populations and maintaining diversity. At all levels of molecular diversity, geographic distance demonstrated only a weak relationship to genetic distance. This suggests that global connectivity among populations is not limited by geographic distance. Nonetheless, T. rotula does not represent a single panmictic population. Significant genetic structure was detected on all levels of molecular diversity examined. A series of Mantel tests revealed that temperature and chlorophyll a were most highly correlated with genetic relatedness among global populations, despite our exploration of other variables including nutrient concentration, irradiance, salinity, and cell abundance. This suggests that certain populations may be better adapted to thrive in competitive high-chlorophyll phytoplankton ‘bloom’ periods, whereas others may only thrive during non-bloom periods. Taken together, these data suggest that environmental and ecological selection may heavily influence the genetic structure of diatom populations, and help to explain the extraordinary diversity harbored within and among diatom species.

INTRODUCTION
With an estimated 100,000 species hypothesized to exist, diatoms are the most diverse group of algae on the planet (Mann and Vanormelingen 2013). Molecular clock estimates place the origin of diatoms at 250 mya (Sorhannus 2007). As free-floating, single-celled organisms, diatoms are argued to experience high dispersal throughout the global ocean, and therefore high levels of genetic connectivity, which may limit their potential for genetic isolation (Fenchel and Bland 2004; Finlay 2002).
Despite this paradigm, mounting evidence demonstrates that individual diatom species exhibit genetic structure at local (Rynearson and Armbrust 2005), regional, and global scales. The mechanisms driving genetic isolation and structuring genetic diversity within diatom species are poorly understood. By exploring the extent of genetic diversity and connectivity among populations of a single globally distributed diatom species, this study aims to investigate the ecological factors correlated with diatom diversity structure and genetic connectivity among diatom populations. Doing so broadens our understanding of the evolutionary processes that have led to the vast diversity of diatoms on our planet.
We can contemplate the uniqueness of diversity structure in the ocean by comparing distribution and life history patterns of organisms between terrestrial and marine environments. Primary production is equally distributed between the land and the ocean (Field 1998), yet the patterns of evolution and diversity structure over space may be driven by very different mechanisms. For one, terrestrial organisms are dominated by small population sizes and limited dispersal (Carr 2003). Populations are distributed over two spatial dimensions, and organisms are associated with longer generation times (Steele 1985). In contrast, marine primary producers are phytoplankton, generally smaller than 200 µm, and constantly drifting with the tides and currents. This leads to a high potential for dispersal among these organisms (Finlay 2002). Phytoplankton are also associated with massive effective population sizes (Fenchel and Bland 2004; Hochberg 1988), and can form blooms that cover vast areas of the ocean. Their distributions occupy three spatial dimensions, and they can evolve rapidly because of their short generation times, dividing roughly once per day.
Because of these characteristics, the evolution and diversity structure of phytoplankton may be driven by very different mechanisms than primary producers on land.
For the most part, our understanding of diatom diversity extends only to local and regional scales. The species studied to date demonstrate some common characteristics. First, it has been shown that 'super species,' or those previously thought to occur throughout the globe, can comprise many different species that are genetically distinct and reproductively isolated, despite the fact that they appear identical under a microscope (Amato 2007; Fenchel 2005). Next, all diatom species studied to date exhibit high levels of intraspecific diversity and large population sizes, with clonal diversity greater than 90% (Collins 2013). For example, a single bloom of the diatom Ditylum brightwellii was composed of thousands of clonal lineages (Rynearson and Armbrust 2005). Next, populations demonstrate stability over time. For instance, in fjords of Sweden, genetic signals from resting spores of the diatom Skeletonema marinoi demonstrate that a single population can persist in a local environment for over one hundred years. Finally, diatom populations can be highly diverged among environments that exhibit high physical connectivity and close proximity. For instance, the diatom D. brightwellii exhibits significantly diverged populations in Puget Sound and the neighboring Strait of Juan de Fuca.
Our understanding of diatom diversity on global spatial scales is very limited. In fact, only one study has explored diversity within a diatom species on a global scale; this study focused on the species Pseudo-nitzschia pungens. This species comprises three separate lineages that are associated with unique geographic distributions. The authors used high-resolution genetic markers to explore diversity within one of these lineages, and found that genetic distance exhibited a strong relationship to geographic distance. They concluded that, for this species, dispersal is limited, and geographic distance is a strong barrier to genetic connectivity among populations. While the link between geographic distance and genetic isolation was apparent in this species, the underlying ecological mechanisms driving the isolation-by-distance (IBD) pattern remain unclear. Also, whether this pattern of IBD is a species-specific phenomenon or common across diatom species is not understood.
In the global study of Pseudo-nitzschia pungens population structure, samples were collected over ten years and pooled over seasons and space to represent regions. Sampling of diatoms is often a challenge, especially when addressing global spatial scales. Pooling samples over seasons and years is often unavoidable, but prevents an understanding of the intersection between genetic diversity and environmental diversity of the dynamic ocean, which varies widely over space and time. Because of this, we have little understanding of the relative contribution of global dispersal versus environmental selection in structuring populations and influencing the evolution of diatoms.
This dissertation broadens our understanding of diatom diversity structure on global spatial scales through the lens of the ecologically important diatom species, Thalassiosira rotula. T. rotula thrives in ecosystems throughout the globe and is found in every ocean basin and across hemispheres. Throughout this dissertation, I explore nested patterns of diversity within samples of T. rotula collected at global spatial scales. I employ a sampling scheme that avoids pooling samples over time and space. Analyzing each discrete sample, each with its own environmental metadata, allows me to explore the ecological correlates of global diversity structure in this high-dispersal and globally distributed species. Doing so allows me to tease apart the key environmental correlates of diversity structure, expanding our understanding of the ecological mechanisms that influence the isolation and connectivity among diatom populations.
This dissertation explores nested patterns of diversity and their ecological correlates from several different angles. Chapter 1 addresses the following questions: 1) Does the species Thalassiosira rotula exhibit genetic structure on the lineage level? 2) What is the relationship between differences in genetic diversity, physiological diversity, and genome size within the species? Chapter 2 uses high-resolution molecular markers to measure the extent of diversity nested within these lineages. Chapter 2 explores the following questions: 1) How are populations
The degree of inter- and intra-specific divergence between T. gravida and T. rotula suggests they should continue to be treated as separate species. The phylogenetic distinction of the three closely related T. rotula lineages was unclear.
On the one hand, the lineages showed no physiological differences, no consistent genome size differences and no significant changes in the ITS1 secondary structure, suggesting there are no barriers to interbreeding among lineages. In contrast, analysis of intra-individual variation in the multicopy ITS1 as well as molecular clock estimates of divergence suggest these lineages have not interbred for significant periods of time. Given the current data, these lineages should be considered a single species. Furthermore, these T. rotula lineages may be ecologically relevant, given their differential abundance over large spatial scales.

INTRODUCTION
Photosynthetic organisms in terrestrial and marine habitats contribute to global primary production in roughly equal proportions (Field 1998; Laisk et al. 2009). For most terrestrial photoautotrophs, species distributions occupy two spatial dimensions and vary over relatively long time scales (Bolker 1999; Burger 1981; Sanmartan 2004; Simpson 1974). In marine habitats, most primary producers are unicellular phytoplankton, smaller than 200 µm, that drift with tides and currents. Marine phytoplankton differ from their terrestrial counterparts in that these tiny organisms have higher dispersal potentials, the ability to occupy three spatial dimensions, and species distributions that can vary over time scales of days to weeks (Carr et al. 2003; Simon 2009).
Within the phytoplankton, diatoms are a particularly important class of algae.
These commonly occurring organisms generate over 20% of global primary production, and thus play a key role in driving global biogeochemical cycles. They are found in almost all aquatic habitats and comprise an estimated 200,000 species (Mann 1996), yet arose only in the early Mesozoic (~250 mya) (Sorhannus 2007).
Testing the biological species concept using diatoms poses a significant challenge because it has proven difficult to consistently control sexual reproduction in the laboratory. For those species where sexual reproduction can be controlled, there appears to be a relationship between reproductive incompatibility and compensating base changes in the stem regions of ITS2 secondary structures (Amato et al. 2007; Behnke et al. 2004; Coleman 2000; Coleman 2003). Although this same relationship has not been explicitly observed for the ITS1, it has been argued that the two genes do not evolve independently of one another, and thus their secondary structures are similarly conserved (Coleman and Vacquier 2002). In laboratory experiments with the diatom genus Sellaphora, interbreeding was successful between individuals with 7.3% divergence at the ITS1 and ITS2, but not between those with 10% divergence (Behnke et al. 2004). Similarly, reproductive isolation has been observed between species of the genus Pseudo-nitzschia that diverged by 2.4% at the ITS1 and ITS2.
Morphologically cryptic diatom species have also been identified by 0.5% sequence divergence at the 28S rDNA (Sarno et al. 2007; Zingone et al. 2005). Furthermore, diatom lineages have been shown to exhibit differences in genome size (Von Dassow 2008), suggesting that polyploidization may play a role in driving cryptic diatom speciation, as is commonly observed in plants (Wood 2009).
The identification of morphologically cryptic species has led to the question of whether cosmopolitan species are truly globally distributed or whether these morphospecies are instead divided into multiple species with distinct biogeographic ranges. For example, the diatom Skeletonema costatum was once thought to be a "super" species based on its ability to thrive and even dominate phytoplankton communities in an exceptionally broad range of environments (Smayda 1958). It was recently shown to consist of several different species (Sarno et al. 2007; Zingone et al. 2005) that may each have unique geographic distributions. Similarly, geographic differentiation has been shown for the harmful algal bloom-forming genus Pseudo-nitzschia (Hubbard 2008) (Behrenfeld et al. 2006; Cermeno et al. 2010).
It has been challenging to describe general patterns of species division in diatoms because previous studies used different methods, focused on different species, and often sampled few isolates from a restricted spatial scale. We focused on identifying genetic subdivision in the diatom morphospecies Thalassiosira rotula by simultaneously examining variation in rDNA sequences, physiology, and genome size from isolates collected from around the globe. Thalassiosira rotula is a commonly occurring diatom that can dominate phytoplankton assemblages across diverse marine habitats and hydrographic environments (e.g., Brockmann et al. 1977; Bursa 1961; Cassie 1960; Gran 1930; Matsudaira 1964; Pratt 1959; Rytter Hasle 1976; Smayda 1957; Smayda 1958).
Here, cells were collected along a transect in the Eastern North Pacific and their rDNA (18S, ITS1, 28S) compared with isolates collected from the Pacific, the Atlantic, and Mediterranean Sea to determine the geographic distribution of rDNA sequence variants. Growth rates among isolates were used to determine the relationship between molecular and physiological diversity.
Variation in genome size among isolates was measured as recent work indicated that differences in DNA content may identify cryptic species (Von Dassow 2008). Morphological studies suggested that T. rotula (Meunier 1910) and T. gravida (Cleve 1896) are likely a single species (Sar et al. 2011; Syvertsen 1977). To test this hypothesis, we determined rDNA sequence variation among culture collection isolates of T. rotula and T. gravida. By examining isolates collected from around the world, we were able to identify sufficient genetic differences between T. rotula and T. gravida to warrant their continued description as distinct species and to identify genetic subdivision within T. rotula and its correspondence to geographic location, physiological variation and differences in genome size.

Isolates
Cells of the diatom morphospecies Thalassiosira rotula/gravida were collected from 8 locations in the Eastern and Western Pacific and Western Atlantic between 2007 and 2009 (Table 1). Culture collection isolates of T. rotula and T. gravida from an additional 7 locations were also obtained (Table 1). For all field samples (sites 1-4, 7, 8, and 13), surface water was passed through a 20 µm mesh net.
At least 17 single cells or short chains were isolated from the >20 µm size fraction using a stereomicroscope (Olympus SZ61), rinsed in sterile seawater three times, and transferred to 1 mL sterile Sargasso seawater amended with f/20 nutrients

Ribosomal DNA sequencing and analysis
To quantify genetic variation among isolates, three regions of the ribosomal DNA (rDNA) were sequenced: the small subunit (18S), the D1 hypervariable region of the large subunit (28S), and the internal transcribed spacer region I (ITS1). To amplify each rDNA region, a reaction mixture containing ~5 ng DNA, 0.1 mmol L⁻¹ dNTPs (Bioline), 0.05 U µL⁻¹ Accuzyme DNA polymerase (Bioline), 1X buffer (Bioline), and 0.5 µmol L⁻¹ each of forward and reverse primers was used. Using polymerase chain reaction (PCR), the ITS1 was amplified using a newly designed primer specific to T. rotula and T. gravida, RotIIR: GTCACAGTCCAGCTCGCCACCAG, and primer 1645F. The PCR consisted of a 2 min denaturation step at 95°C, 36 cycles of 94°C for 30 s, 62°C for 30 s, and 72°C for 1 min, followed by a 10 min extension at 72°C. The ITS1 was amplified from 106 isolates, including 10 from each field sample. The 18S was amplified from 16 of those isolates using universal 18SA and 18SB primers (Medlin et al. 1988). The PCR consisted of a 2 min denaturation step at 95°C, 33 cycles of 94°C for 20 s, 55°C for 60 s, and 72°C for 2 min, and 10 min at 72°C. The D1 region of the 28S was also amplified from 16 isolates using the forward primer, 28SF: ACCCGCTGAATTTAAGCATA, and reverse primer, 28SR: ACGAACGATTTGCACGTCAG (Auwera and Wachter 1998), at the same thermocycling conditions as the 18S PCR. All sequencing was performed on an ABI 3130xl (Applied Biosystems). Both strands of the ITS1 were sequenced to completion using primers 1645F and RotIIR. For 18S rDNA genes, both strands were sequenced to completion using primers 18SA and 18SB (Medlin et al. 1988), 18SC2, 18SE2, and 18SF3, and 18SD (Armbrust and Galindo 2001a). The 28S was sequenced using primers 28SF and 28SR.
All sequences are available in GenBank (accession numbers JX069320-JX069349 and JX074825-074930). Sequences were assembled using SeqMan II 3.61 (DNASTAR Inc.), and aligned using Clustal W in Mega4; boundaries of the ITS1 were determined through alignment with GenBank accession EF208798.
Significant differences in rDNA sequence variation between T. rotula and T. gravida were determined using analysis of molecular variance (AMOVA) with 1000 permutations. To examine relationships among T. rotula ITS1 variants, a network was generated using the median joining algorithm in Network 4.5.1.6 (Fluxus Technology Ltd.) and tested for statistical significance using AMOVA with 1000 permutations.
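The permutation logic behind such significance tests can be sketched briefly. The example below is not Arlequin's AMOVA; it is a simplified stand-in that compares mean between-group and within-group distances and shuffles group labels 1000 times. The distance matrix, group labels, and function names are hypothetical and chosen only for illustration.

```python
import random
from itertools import combinations


def between_minus_within(dist, labels):
    """Mean between-group distance minus mean within-group distance."""
    between, within = [], []
    for i, j in combinations(range(len(labels)), 2):
        (within if labels[i] == labels[j] else between).append(dist[i][j])
    return sum(between) / len(between) - sum(within) / len(within)


def permutation_p_value(dist, labels, n_perm=1000, seed=0):
    """One-sided p-value from shuffling group labels n_perm times."""
    rng = random.Random(seed)
    observed = between_minus_within(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if between_minus_within(dist, shuffled) >= observed:
            hits += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical data: 8 isolates in two groups, with small distances within
# groups and larger distances between groups.
labels = ["A"] * 4 + ["B"] * 4
dist = [
    [0.0 if i == j else (0.005 if labels[i] == labels[j] else 0.07)
     for j in range(8)]
    for i in range(8)
]
observed, p_value = permutation_p_value(dist, labels)
print(f"observed difference = {observed:.3f}, p = {p_value:.3f}")
```

With so few isolates the smallest attainable p-value is limited by the number of distinct label arrangements, which is one reason larger sample sizes matter for tests of this kind.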
Significant differences in within-lineage diversity were determined using the population differentiation algorithm run with 1000 steps of the Markov Chain, 1000 dememorization steps, and an alpha of 0.05. All AMOVA and population differentiation analyses were conducted using Arlequin v. 3.11 (Excoffier 2005) (Martin et al. 2010).

Physiological Variation
Isolates of T. rotula collected from the Mediterranean Sea (CCMP1647, site 11), from the coast of Vancouver Island (VIA, site 2), and from the Seto Inland Sea (SIS, site 6) were made axenic to remove contaminating bacteria. Sterile glass tubes were prepared with 4.5 mL sterile f/2, 10 µL sterile bacterial test medium (Bacto TM) (5 g L⁻¹ Bacto-peptone and 5 g L⁻¹ malt extract), ~1000 cells of each isolate and different volumes of sterile antibiotic mix (50-400 µL) containing 0.1 g L⁻¹ Penicillin G (Potassium salt), 0.0025 g L⁻¹ dihydrostreptomycin sulfate, and 0.005 g L⁻¹ gentamicin. At 72, 96 and 120 hours, 5-50 µL of culture were subcultured into 4 mL of sterile f/2 media at 14°C with a 12:12 light:dark cycle and routinely tested for bacterial contamination using Bacto TM.
Cultures were allowed to acclimate until no differences in maximum growth rate were observed between transfers. Maximum acclimated growth rates were determined following Rynearson and Armbrust  and Brand   (Enderlein 1961a), and the Tukey multiple comparison test  were used to determine the significance of differences among isolates at different temperatures and light levels, and between axenic and xenic growth rates. For each isolate, the diameter of 30 cells was measured using an E800 microscope at 20X (Nikon). One-way ANOVA was used to determine the significance of differences between cell size across all temperature and light conditions. Alpha was set to 0.05 for all statistical tests.
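As an illustration of how a specific growth rate (day⁻¹) is commonly estimated, the sketch below fits a least-squares slope to the natural log of abundance against time. This is an assumption about the general approach, not a reproduction of the cited protocols; the function name and the synthetic cell counts are hypothetical.

```python
import math


def specific_growth_rate(days, abundance):
    """Least-squares slope of ln(abundance) against time, in day^-1."""
    logs = [math.log(a) for a in abundance]
    n = len(days)
    mean_t = sum(days) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(days, logs))
    sxx = sum((t - mean_t) ** 2 for t in days)
    return sxy / sxx


# Synthetic exponential growth at 0.66 day^-1 (hypothetical values).
days = [0, 1, 2, 3]
cells = [1000 * math.exp(0.66 * t) for t in days]
mu = specific_growth_rate(days, cells)
doubling_time = math.log(2) / mu  # days per doubling
print(f"mu = {mu:.2f} day^-1, doubling time = {doubling_time:.2f} days")
```

In practice the fit would be restricted to the exponential phase of each culture, and the same calculation applies whether abundance is measured as cell counts or as in vivo fluorescence.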

Relative genome size
Five isolates were chosen to examine relative differences in genome size using
To investigate relationships between cell size and genome content, cell volume of each isolate analyzed using flow cytometry was determined by measuring cell diameter and length for 30 cells per isolate using an E800 microscope at 20X (Nikon).
The significance of differences in average cell volume among strains was determined using ANOVA.
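A cell volume can be derived from diameter and length measurements if a geometric shape is assumed. The sketch below assumes a simple cylinder, which is a common approximation for centric diatoms but is not stated in the text; the function name and the measurements are hypothetical.

```python
import math
from statistics import mean


def cylinder_volume(diameter_um, length_um):
    """Volume of a cylinder in cubic micrometers (assumed cell geometry)."""
    radius = diameter_um / 2.0
    return math.pi * radius ** 2 * length_um


# Hypothetical (diameter, length) measurements in micrometers for one isolate.
measurements = [(20.0, 12.0), (22.0, 11.5), (19.5, 12.5)]
volumes = [cylinder_volume(d, h) for d, h in measurements]
print(f"mean cell volume = {mean(volumes):.1f} cubic micrometers")
```

Per-isolate lists of volumes computed this way are the kind of input a one-way ANOVA across strains would take.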

rDNA variation
Culture collection isolates of T. gravida and T. rotula were not significantly different (p>0.05) at the 18S (0.1% divergence), but diverged significantly (p <0.05) from each other at the ITS1 (7 ± 0.3%) and 28S (0.8 ± 0.03%). Of the 97 field isolates analyzed, 10 had 28S and ITS1 sequences that were identical (100%) to T. gravida culture collection isolates identified by taxonomists (Table 1b) and thus were designated as T. gravida. rDNA sequences of the remaining field isolates were identical (100%) at the 28S and 99-100% similar at the ITS1 to T. rotula culture collection isolates identified by taxonomists (Table 1a) and thus were designated as T. rotula.
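Percent divergence values such as these can be illustrated with an uncorrected pairwise distance (p-distance) between two aligned sequences. The sketch below is a minimal version that skips alignment gaps; it is not necessarily the distance model used in the analysis, and the short sequences are invented for illustration.

```python
def percent_divergence(seq_a, seq_b, gap="-"):
    """Uncorrected p-distance (%) between two aligned sequences.

    Positions where either sequence has a gap are excluded.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * differences / compared


# Invented 10-base alignments: one substitution, with and without a gap.
print(percent_divergence("ACGTACGTAC", "ACGTACGAAC"))  # 1 of 10 sites differ
print(percent_divergence("ACGT-CGTAC", "ACGTACGAAC"))  # 1 of 9 compared sites
```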
Of 92 global T. rotula isolates, 23 unique ITS1 sequences were detected with an average sequence divergence of 0.6 ± 0.3%. Twenty sequences were relatively rare, identified in four or fewer isolates. Of these, sixteen sequences were identified just once, three sequences were identified twice, and one sequence was identified in four isolates. Three sequences (Table 3, sequences 1-3) were relatively abundant (identified in >10 isolates). The median joining network of all sequences was significant (AMOVA p<0.001), and was used to define three distinct lineages corresponding to the three most abundant sequences, and closely branching but rare sequences (Table 3, Figure 1). Lineage 1 comprised six sequences and was dominated by sequence 1, identified in 17 isolates (Table 3). All lineage 1 isolates originated from the coastal North Pacific (Figure 2, sites 1-3 and 5). Lineage 2 comprised three sequences and was dominated by sequence 2, identified in 11 isolates (Table 3). Lineage 2 isolates were sampled from Puget Sound (Figure 2, site 4) with the exception of six isolates sampled from coastal North Pacific waters (Figure 2, site 1). Lineage 3 was sampled from waters throughout the global ocean (Figure 2, sites 6-12) and consisted of 14 sequence types. Lineage 3 was dominated by sequence 3, identified in 38 isolates (Table 3).
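The counts reported above can be cross-checked with a short tally. The numbers below are taken directly from the preceding paragraph; only the variable names are introduced here.

```python
# Rare sequences: {number of isolates carrying a sequence: number of sequences}
rare = {1: 16, 2: 3, 4: 1}
# Isolates carrying the three abundant sequences (sequences 1-3).
abundant = [17, 11, 38]
# Number of sequence types assigned to lineages 1, 2, and 3.
sequences_per_lineage = [6, 3, 14]

n_sequences = sum(rare.values()) + len(abundant)
n_isolates = sum(k * v for k, v in rare.items()) + sum(abundant)

print(n_sequences)                 # 23 unique ITS1 sequences
print(n_isolates)                  # 92 isolates
print(sum(sequences_per_lineage))  # 23, matching the unique sequence count
```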
The predicted folding structure of the T. rotula ITS1 revealed no compensatory
Using a dated phylogenetic analysis, divergence times were estimated for T. gravida, the three T. rotula lineages, and several other Thalassiosira species (Figure 4). Divergence between T. gravida and T. rotula was approximately 3.28 Mya.
(Martin et al. 2010). When added to the network analysis, intra-individual variants did not alter the initial network structure (data not shown) and lineage groupings 1-3 remained significant (p<0.001).

Physiological variation
Triplicate maximum acclimated growth rates of axenic and xenic cultures of
There was no significant clustering of growth rate with ITS1 lineage at any of the six light and temperature conditions (p>0.05). Instead, relative growth rate among isolates differed with treatment, illustrated by extensive crossing of growth curves as environmental conditions changed (Figures 5a and b). At high light intensity (112 µmol photons m⁻² s⁻¹), specific growth rates ranged from no growth to 0.92 ± 0.04 day⁻¹ (Figure 5a). At low light intensity (50 µmol photons m⁻² s⁻¹), specific growth rates ranged from 0.22 ± 0.01 to 0.66 ± 0.01 day⁻¹ (Figure 5b). At both light intensities, the CV was significantly larger at 4°C than at 10 or 17.5°C (p<0.05). Growth rate varied significantly among isolates at each temperature (p<0.05). There were no significant relationships between growth rate and cell diameter at any culturing condition (p>0.05).
The physiological response of isolates to light intensity was dependent on temperature. For example, at 4°C, there were no significant differences (p>0.05) in growth rates between high and low light intensities for all isolates except SIS (Figures 5a and 5b). In contrast, at 17.5°C growth rates differed significantly (p<0.05) between high and low light intensities for all isolates except VIB (Figures 5a and 5b).
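The coefficient of variation (CV) used above to compare spread in growth rates across temperatures is the standard deviation expressed as a percentage of the mean. The sketch below shows the calculation on hypothetical growth rates; the values are not data from this study.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical specific growth rates (day^-1) among isolates.
rates_cold = [0.10, 0.25, 0.40]
rates_warm = [0.60, 0.66, 0.72]
cv_cold = coefficient_of_variation(rates_cold)
cv_warm = coefficient_of_variation(rates_warm)
print(f"CV cold = {cv_cold:.1f}%, CV warm = {cv_warm:.1f}%")
```

Because the CV is scaled by the mean, it allows variability to be compared between conditions in which average growth rates differ substantially.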

Variation in relative genome size
Genome size was measured in five isolates. Two of these isolates represented lineage 1: VIA (ST1) and VIB (ST1). Three isolates represented lineage 3: SIS, CCMP3264, and CCAP1085_21 (ST3). No live isolates were available for lineage 2.
The samples from all isolates contained cells in both G1 and G2 phases of the cell cycle, as indicated by bimodal distributions of integrated DNA fluorescence, with peaks separated by a factor of 2 in fluorescence intensity (Figure 6). G1 and G2 distributions provided an internal standard of our ability to detect changes in genome
There was no relationship between genome size and cell size measurements.
Average cell volume of all isolates was 21.52 ± 5.56 µm³. There were no significant differences in cell volume between different isolates (p>0.05). However, VIB exhibited the largest average cell volume (28.65 ± 13.22 µm³) and the largest range in cell size.

DISCUSSION
To determine subdivision within the T. rotula morphospecies, it was first necessary to examine genetic divergence between T. rotula and T. gravida. Previous studies identified morphological plasticity in the characteristics used to define each species and argued for a single species designation (Sar et al. 2011;Syvertsen 1977).
Here, we found that culture collection isolates of the two species differed by at least 7% at the ITS1 and 0.8% at the 28S rDNA. This level of divergence is comparable to that observed between different species of the diatom Skeletonema (0.5% 28S divergence) and between Pseudo-nitzschia species (7.2% ITS1 divergence) that were confirmed using mating experiments (Amato et al. 2007). ITS1 sequence variation indicated that T. rotula and T. gravida diverged approximately 3.28 Mya. Furthermore, the predicted ITS1 secondary structures of the two species differed considerably, a characteristic related to reproductive incompatibility in protists (Behnke et al. 2004; Coleman 2000; Coleman 2003; Coleman 2007; Coleman and Vacquier 2002). Although the links between reproductive isolation and ITS1 folding structure are not as well understood as those for the ITS2, it has been suggested that the ITS1 and ITS2 molecules co-evolve to maintain important biochemical interactions necessary for processing the mature ribosome (Armbrust and Galindo 2001b; Coleman 2000). Here, differences in the predicted T. gravida and T. rotula folding structures indicate that significant evolution has occurred at the ITS1 (Armbrust and Galindo 2001b; Coleman 2000). Differences in rDNA sequences and predicted ITS1 secondary structures between culture collection isolates of T. rotula and T. gravida suggest that the original species designations are correct.
The majority of field isolates were not significantly different from T. rotula culture collection isolates at the 18S and 28S rDNA. This pool of isolates could be divided into three distinct ITS1 lineages which diverged from T. rotula culture collection isolates by 0-0.6%, an order of magnitude less than their divergence with T. gravida. This level of variation is comparable to that identified within other diatom species; for example, Pseudo-nitzschia pungens clades diverged by 0.5% at the ITS1 (Enderlein 1961b).
Several lines of evidence suggest that the three lineages may be able to interbreed. First, their predicted ITS1 secondary structures were identical to each other, suggesting that mutations in the ITS1 have not resulted in significant structural changes to this important molecule (Armbrust and Galindo 2001b). In addition, there were no compensatory base changes (CBCs) in the ITS1 stem regions of the three lineages, suggesting that this gene is conserved. Overall, the lack of CBCs and conservation of secondary structure among lineages suggest that they may retain the ability to interbreed. Second, there were no consistent differences in genome size among lineages, another indication that they may be able to interbreed. Importantly, differences in genome size were observed within and not between lineages. Within lineage 1, genome size differed by roughly two fold and within lineage 3, by 30%.
Genome duplication, or polyploidization, is common in plants, and has been shown to result in rapid reproductive isolation (Hegarty and Hiscock 2007;Hegarty and Hiscock 2008;Otto 2003;Otto 2007;Wood 2009).
Although several lines of evidence suggest that interbreeding could occur, additional data suggest that these lineages may not be actively interbreeding. To look for signatures of recombination, multiple copies of the ITS1 were sequenced from individuals representing each lineage. If lineages were interbreeding, one might expect to find, for example, some copies or recombinants of a lineage 1 sequence in a lineage 2 individual (Butlin 1987). No signature of recombination could be detected among lineages, suggesting that interbreeding, if it occurs, is infrequent and below the threshold of detection. Here, we used a single genetic marker to examine recombination; future analysis of recombination would be improved by surveying a greater number of genes. It is also worth noting that diatoms divide primarily asexually, and that sexual cycles have been examined for only a handful of the large number of described species (Levins 1969;Templeton 1981). Sexual recombination in the field has been observed (Servedio 2003), but rarely, and estimates of the incidence of sexual recombination in diatoms vary widely, from once per year to once every 40 years. In addition to an inability to detect recombination, it appears that gene flow between lineages has been reduced for significant time periods. A dated phylogenetic analysis indicated that lineage 3 diverged from T. gravida 3.28 Mya. Lineages 1 and 2 diverged later, at 0.68 Mya.
Because divergence calculations can vary depending on the outgroups, genes, or calibration points used in analysis (Sorhannus 2007), divergence times should be interpreted cautiously. Even if these estimates are off by orders of magnitude, the estimated time since last interbreeding is significant.
In the marine environment, interbreeding could cease to occur through such mechanisms as isolation by distance, physical barriers to gene flow, competitive exclusion, environmental adaptation, or genetic and phenological characteristics that prevent gametes from fusing in the field (Palumbi 1994).
Here, it appears that isolation by distance was an unlikely mechanism promoting differentiation among T. rotula lineages. For example, genetic distance among lineages was not related to geographic distance. In fact, T. rotula lineage 3 had a cosmopolitan distribution ranging from the Mediterranean Sea and the N. Atlantic to the N. Pacific. This distribution is comparable to that observed in lineages of the pennate diatom P. pungens and contrasts with many terrestrial plant species, where genetic distance among lineages often correlates with geographic distance (e.g. Chiang and Schaal 1999;Sork et al. 1999). The observation that lineages can be broadly distributed in both centric and pennate diatoms suggests that dispersal likely plays a significant role in regulating gene flow. Lack of isolation by distance observed here suggests that there are no physical barriers impeding broad dispersal.
Instead of acting as barriers, physical features, such as water recirculation, may act to reduce gene flow, allowing different lineages to arise and be maintained. For example, hydrographic features have been hypothesized to drive genetic divergence in diatoms in coastal fjords of the NE Pacific where recirculating water may retain cells inside the fjord, allowing them to remain in and adapt to a particular location. Interestingly, lineage 2 was observed within a recirculating coastal fjord in the NE Pacific, and exhibited significant divergence from lineage 1 sampled outside of the fjord, suggesting that water recirculation may influence genetic subdivision in multiple species of phytoplankton.
Competitive exclusion and environmental adaptation may be additional mechanisms initiating and supporting lineage divergence. Here, all but one location were dominated by just a single lineage. In those locations, the probability that other lineages were present but not detected was low. For example, a lineage representing 10% of the population would have a 99% probability of being detected in our 40 isolates collected from the N. Atlantic (Narragansett Bay and Martha's Vineyard), suggesting that these sites were likely dominated by a single lineage. This may be due to competitive exclusion of lineages not adapted to the environment in Narragansett Bay and the coastal N. Atlantic. There may, however, be more complex dynamics in play. For example, all three lineages were sampled from the Queen Charlotte Islands in the NE Pacific. This location may act as a hub of intermixing between water masses and may provide an environment heterogeneous enough to support the ecological niches of all three lineages.
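The detection probability quoted above can be reproduced with a simple binomial argument. The following is a minimal sketch, assuming that isolates are drawn independently and at random from the population; this sampling model and the function name are assumptions of the illustration, not a description of the calculation actually performed.

```python
def detection_probability(frequency: float, n_isolates: int) -> float:
    """Probability that at least one of n_isolates belongs to a lineage
    present at the given population frequency (independent random draws)."""
    if not 0.0 <= frequency <= 1.0:
        raise ValueError("frequency must lie between 0 and 1")
    if n_isolates < 0:
        raise ValueError("n_isolates must be non-negative")
    # P(detect) = 1 - P(every isolate misses the lineage)
    return 1.0 - (1.0 - frequency) ** n_isolates


# A lineage at 10% frequency, sampled with 40 isolates (values from the text).
p_detect = detection_probability(0.10, 40)
print(f"Detection probability: {p_detect:.3f}")
```

Under these assumptions the result is roughly 0.985, consistent with the approximately 99% stated in the text.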
To explore potential signatures of environmental adaptation, we compared the physiological response of each lineage to a range of light intensities and temperatures, two important environmental variables known to affect phytoplankton growth (Bonin et al. 2007;Fleishman et al. 2009;Leal and Fleishman 2002). As established by Brand (Brand 1981), differences in acclimated growth rates can be used to identify underlying genetic variation among isolates grown in a single environmental condition. Analyzing the growth rates of isolates under different conditions allows for comparisons of genotypic versus environmental effects (Falconer 1952 although it is not a required part of its life cycle and the frequency of resting spore formation in the field and the rate of germination success are unknown Karentz and Smayda 1984;. An intriguing hypothesis is that environmental adaptation to different phenological triggers for spore formation and germination could foster the initiation and maintenance of distinct lineages.
Finally, interbreeding could cease to occur among lineages if prezygotic barriers to gene flow no longer allow for sexual recombination among lineages.
Prezygotic barriers to gene flow include changes to gametes that prevent them from recognizing each other (Palumbi 1994). In diatoms, the Sig1 gene has been hypothesized to play an important role in gamete recognition (Armbrust and Galindo 2001a). This gene has undergone rapid evolution among strains of the diatom T. weissflogii, including distinct protein changes that may alter the interaction between gametes in the field (Armbrust and Galindo 2001a). This type of prezygotic barrier to gene flow may lead to speciation via reinforcement of postzygotic differentiation (Butlin 1987;Servedio 2003). Previous attempts to amplify the Sig1 gene in T. rotula have failed (Armbrust and Galindo 2001a), but future transcriptional or genome-wide sequencing analyses may shed light on the potential for a prezygotic barrier to gene flow between lineages observed here. A prezygotic barrier to gene flow could well explain the conflicting data we obtained regarding interbreeding among lineages, where genome size and RNA secondary structure indicated no barriers to interbreeding but active recombination may no longer occur. In this scenario, prezygotic barriers to gene flow would prevent sexual recombination among lineages but the rDNA would not have yet diverged to levels that compare with more distantly related species.

CONCLUSIONS
Our data suggest that genetic divergence between T. rotula and T. gravida is significant and that they should continue to be described as distinct species, with future investigations to more fully describe differences between them. From a taxonomic standpoint, the divergence among T. rotula lineages is less clear. On one hand, it appears as if interbreeding between lineages has not occurred for long periods of time, suggesting that they could represent recently diverged cryptic species. On the other hand, lineages exhibited no differences in ITS1 secondary structure or genome size and exhibited no clear physiological partitioning, suggesting that they are conspecific. Prezygotic barriers to gene flow and differential adaptation to environmental factors involved in vegetative growth and/or spore formation are possible mechanisms that explain these conflicting results. Given the current data, the T. rotula lineages should be considered a single species. Future studies to tease apart their relationships will benefit from analyses of genetic variation beyond the rDNA.
The high-dispersal marine environment is clearly not a barrier to speciation in diatoms, a group of organisms with an estimated 100,000-200,000 species, second only to angiosperms as the most diverse primary producers on the planet (Mann 1996;Round 1990). Confirming species identity in closely related diatom species is particularly difficult because testing for reproductive compatibility in diatoms is often not feasible in the lab and, ultimately, may not reflect recombination in the field (Mann 1999).
Differences in the abundance of T. rotula lineages in the field suggest that past evolutionary events promoting their subdivision have led to lineage designations that are likely ecologically relevant, highlighting that in diatoms, a close interplay of ecology and evolution may regulate their impact on global biogeochemical cycles.

COMPETING INTERESTS
This work is associated with no competing interests, financial or otherwise.

AUTHORS' CONTRIBUTIONS
TR and KW conceived and designed the present study; KW, RO, and DR performed the experiments; KW and TR analyzed molecular data, RO and KW analyzed genome size data, DR, KW and TR analyzed physiological data; KW and TR wrote the manuscript. All authors read and approved the final manuscript.

Table 1. Sites and Isolates
Description of site and isolates collected, including isolation success and genes sequenced from each site. Site numbers refer to map featured in Figure 1. Isolation number or isolation success number was not available for culture collection strains, nor for those isolated in Japan. Culture collection accession numbers are indicated in the "Origin" column. a) All T. rotula isolates, including their classification as lineages 1, 2, or 3, and whether or not they were analyzed for physiological diversity. b) Description of T. gravida isolates.

Table 2. Genbank Accession Numbers
Thalassiosira sp. and Genbank accession numbers used in phylogenetic analysis.

Species
Accession no.
Martha's Vineyard, Narragansett Bay, Japan, Mediterranean, Scotland 6,7,8, 9,10,1 2 Mediterranean and Narragansett Bay A/C) indicate ambiguous signals in sequencing, and thus the possible presence of two alleles at that position.

Figure 1. Network Analysis
Network analysis representing the most parsimonious relationship between sequence variants, separated by single base pair mutations (dotted lines). From 92 isolates, three lineages were defined (those with more than 10 isolates/sequence type, plus their most closely associated sequence types). Each color represents a lineage, each circle represents a sequence variant, and its size indicates the number of isolates comprising each sequence variant. Numbers inside each node indicate the number of isolates comprising that sequence type. All other nodes represent 1, 2, or 4 isolates, scaled to size.

Figure 2. Sample Map
Map of global sample locations from where T. rotula isolates were collected. Numbers correspond to location and sample information given in Table 1.

Bayesian analysis of divergence times among Thalassiosira sp. based on rDNA ITS1 sequence alignment. Time estimates are derived from a relaxed molecular clock calibrated using Sorhannus [12]. Branch numbers represent time of divergence (Mya). Lineages 1 and 2 diverged from one another approximately 0.22 Mya. The tree topology matches that of Sorhannus [12], with the divergence of T. pseudonana from all other T. sp at 30 Mya. Placement of T. weissflogii and T. guillardii differs from Sorhannus [12], which may be due to differences in ITS1 and 18S mutation rates.

Clonal diversity within sites was greater than 99%, and divergence between sites exhibited FST values upwards of 0.14. The presence of genetically distinct populations demonstrates that significant levels of genetic isolation can occur in the ocean despite the enormous dispersal potential of these planktonic organisms. Isolation-by-distance measures demonstrated that genetic distance was weakly related to geographic distance (r² = 0.2), suggesting that distance is not a strong barrier to connectivity between diatom populations. A series of Mantel tests was used to examine correlations between pairwise genetic differences among sites, and differences in environmental and ecological factors including temperature, salinity, T. rotula cell abundance, and chlorophyll a. The combination of chlorophyll a and temperature accounted for 53% of genetic variability among sites (p = 0.02), suggesting that environmental variables play a role in structuring genetic variation among diatom populations. In fact, these data suggest that environmental selection may play a more important role than dispersal potential in structuring diversity and controlling gene flow among global populations of T. rotula.

INTRODUCTION
The relative influence of dispersal and environmental selection on the genetic connectivity of marine diatom populations has been widely debated for decades (Fenchel and Bland 2004;Finlay 2002;Medlin 2007). On local and regional scales, diatoms possess enormous diversity, often morphologically cryptic, both within and among species (Beszteri et al. 2005). Molecular studies examining the extent and structure of diatom population diversity have, for the most part, targeted local and regional spatial scales, and species studied to date demonstrate some common characteristics. First, all exhibit high levels of genetic diversity and large population sizes, with He values upwards of 0.9, and clonal diversity greater than 90% (Rynearson and Armbrust 2005). For example, a single bloom of the diatom Ditylum brightwellii was composed of thousands of clonal lineages. It has been suggested that these populations are, in fact, separate species (Koester 2011). Early allozyme work shows a similar pattern in Narragansett Bay, where two populations of Skeletonema costatum were identified in the spring and fall, respectively. Diatom species studied to date exhibit some differences in population structure as well. For example, in the centric diatom species D. brightwellii and S. marinoi, significantly diverged populations have been sampled on small geographic scales (e.g. < 500 km). These populations exhibited high levels of divergence indicative of limited gene flow despite the high physical connectivity between sites.
In contrast, little population structure and high gene flow were observed among samples of the pennate diatom species Pseudo-nitzschia pungens in the Southern Bight of the North Sea (Casteleyn 2009a).
The characteristics of diatom population structure over global spatial scales are virtually unknown. For marine algae, it has been widely argued that high dispersal may eliminate genetic distance as a barrier to genetic connectivity (Finlay 2002;Foissner 2006;Medlin 2007). Only a single study has examined global population structure in a diatom species. In this study, Pseudo-nitzschia pungens populations exhibited evidence for allopatric isolation, where geographic distance was suspected to be a strong barrier to gene flow.
The goal of this work was to determine population-genetic structure across global geographic space in order to better understand those factors of the ocean system that may influence the generation of diversity and distribution of populations throughout the globe, and over time.

Field samples and isolates
Four hundred forty-nine isolates of T. rotula were collected throughout 2010 as part of a simultaneous global sampling effort targeting twenty locations throughout the Atlantic and Pacific Ocean basins, and across hemispheres (Table 1, Figure 1). The majority of sampling sites were chosen based on their alignment with established time series and thus contained significant metadata (Table 2). For sites not associated with time series measurements, ancillary data (e.g. temperature and salinity) were collected via in-situ sensors (e.g. YSI, Xylem). Whole seawater was rush-shipped from these twenty locations to the University of Rhode Island, USA at one to five time points per site throughout the year, specifically targeting the winter-spring phytoplankton bloom periods in the northern and southern hemispheres.
Whole seawater was shipped in 1 L dark Nalgene bottles in small coolers containing ice packs and sample information. In seawater samples where T. rotula could be identified morphologically, single cells or chains were isolated, cultures maintained, and DNA extracted according to previously described methods.
In brief, surface seawater was concentrated over a 20 µm mesh net. For each field sample where T. rotula was present, up to 96 single cells or chains were isolated from this >20 µm size fraction using a stereo microscope (Olympus SZ61), rinsed in sterile seawater three times, and transferred to 1 mL sterile Sargasso seawater amended with f/20 nutrients. Isolates were incubated at 4, 10, 14, or 20°C, according to the closest surface seawater temperature (SST) at isolation, and under a 12:12-h light:dark cycle of 90-120 µmol photons m⁻² s⁻¹. Upon reaching approximately 1000 cells/mL (1 to 3 weeks depending on growth rate), and upon confirmation that the cultures were free of algal contamination, cells were filtered and DNA extracted.
In addition to the collection of field isolates, whole seawater samples were processed for community analysis in two ways. First, concentrated samples were fixed using a 2% acid Lugol's solution, and stored in 20 mL scintillation vials. Second, whole seawater was filtered onto three 0.2 µm polyester filters (Millipore), at 100 mL per filter, and stored at -80°C.
Because shipping time took anywhere from 1 to 4 days, a simple experiment was conducted to test whether or not genetic diversity in a 1 L bottle changed due to the shipping process. Two liters of whole seawater were collected from Narragansett Bay on January 26, 2010 ( isolates was genotyped and analyzed based on the procedures described below.

Microsatellite discovery and optimization
Six microsatellite loci (TR1, TR3, TR7, TR10, TR8, TR27) were isolated from the T. rotula genome using two methods. Loci TR1 and TR3 were isolated using the microsatellite enrichment method described in Hamilton et al. (1999) and adapted by Spies et al. (2005). Msatcommander software (Faircloth 2008) was used to digitally identify repeat regions within the resulting contig and raw-read 454 datasets using the following search parameters: mono- to hexanucleotide repeat lengths of 10 or greater. Microsatellites were amplified initially using unlabeled forward and reverse primers (Table 3). Markers were further optimized for fragment analysis, resulting in the final PCR conditions used to genotype field isolates (Table 3). Six loci were chosen for downstream analysis of field isolates and amplicons sequenced to confirm identity.

Amplification of microsatellite and ITS1 regions in field DNA
To quantify genetic variation among isolates at both the population and lineage levels, isolates were genotyped at six microsatellite loci, and sequenced at the ITS1.
Six loci were amplified from the gDNA of 449 T. rotula isolates using the following PCR reactions: Locus TR27 was amplified using the three-primer method described in Blacket et al., 2012(Blacket et al. 2012 BSA, and 0.03 U µl -1 Mango Taq Polymerase (Bioline). Thermal profiles of all microsatellite reactions consisted of an initial 5min denaturation step at 94°C, followed by a set of 30-35 cycles with annealing conditions optimized to each primer pair with fluorescent labels (Table 3), a 10 cycle probe-annealing step for TR27 only consisting of 94°C for 20 s, 54°C/61°C for 20s and 72°C for 30s, and finally (for all loci) a 10 min extension at 72°C. Alleles were scored using an ABI 3130xl (TR1,TR3, TR7) or an ABI 3730xl (TR8, TR10, TR27), and analyzed using the software Gene Mapper 5 ® (Life Technologies).
To amplify the ITS1 rDNA region, a reaction mixture was prepared containing ~5 ng DNA, 0.1 mmol L⁻¹ dNTPs (Bioline), 0.05 U µL⁻¹ Bio X Act DNA polymerase (Bioline), 1X buffer (Bioline), and 0.5 µmol L⁻¹ each of primers 1645F and RotIIR. The PCR consisted of a 2 min denaturation step at 95°C, 36 cycles of 94°C for 30 s, 62°C for 30 s, and 72°C for 1 min, followed by a 10 min extension at 72°C. The ITS1 was amplified from 350 isolates, targeting a subset of each field sample. All sequencing was performed on an ABI 3130xl (Applied Biosystems), and sequences are available in Genbank.

Analysis of microsatellite alleles
Summary statistics for each locus and each site were generated using the Excel  (Hochberg 1988).

Statistical analysis of population structure
Using microsatellite genotypes and allele frequencies, the presence of hierarchical population structure could be tested among samples over space and time.
The significance of genetic differentiation among sites was determined using the exact G test and Fisher's exact probability test in GENEPOP v4.2. The extent of pairwise differentiation between sites, FST, was calculated in GenAlEx 6.5. Pairwise tests between sample sites were conducted separately for each locus, as well as among all loci. All p-values were Bonferroni corrected. Analysis of molecular variance (AMOVA) was calculated using GenAlEx 6.5, testing for the presence of genetic structure at global and basin scales. Principal Coordinates Analysis (PCoA) was used to ordinate pairwise FST in 2D space among sites with sample size greater than 10 individuals. The Bayesian clustering program STRUCTURE v2.2 was used to examine hierarchical resolution of population structure in the global dataset. STRUCTURE is a Bayesian analysis that infers the presence of distinct populations, assigns individuals to populations, and estimates population allele frequencies in situations where many individuals are migrants or admixed. In the STRUCTURE program, populations (K) from 2 to 13 were tested in 10 independent runs using the admixture model and correlated allele frequencies, with locations as prior, a burn-in of 100,000, and 100,000 repetitions. A maximum K of 13 was tested to reflect the total number of samples. Estimates of the number of populations (K) were based on Pritchard et al. StructureHarvester v.6.93 (Earl 2012) was used to determine the optimal K, and generate input files for downstream analyses. The Greedy algorithm in CLUMPP v.1.11 (Jensen et al. 2005) was used to evaluate agreement between independent STRUCTURE runs, and to arrange cluster labels. DISTRUCT v1.1 was used to visualize results of STRUCTURE.
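The Bonferroni correction applied to the pairwise tests can be sketched as follows. This is an illustration only: the number of tests is assumed to equal the number of pairwise comparisons among 13 samples, the p-values are invented for the example, and the function name is hypothetical rather than taken from GENEPOP or GenAlEx.

```python
from itertools import combinations


def bonferroni(p_values, n_tests, alpha=0.05):
    """Return Bonferroni-adjusted p-values (capped at 1) and, for each,
    whether it remains significant at the family-wise level alpha."""
    adjusted = [min(1.0, p * n_tests) for p in p_values]
    significant = [p_adj <= alpha for p_adj in adjusted]
    return adjusted, significant


# Assumption: one test per pair of samples, with 13 samples in total.
n_samples = 13
n_pairs = len(list(combinations(range(n_samples), 2)))

# Hypothetical raw p-values for three of the pairwise comparisons.
raw_p = [0.0005, 0.01, 0.2]
adjusted, significant = bonferroni(raw_p, n_tests=n_pairs)

print(f"Pairwise comparisons: {n_pairs}")
print(f"Per-test threshold: {0.05 / n_pairs:.5f}")
for p, p_adj, sig in zip(raw_p, adjusted, significant):
    print(f"raw p = {p}, adjusted p = {p_adj:.3f}, significant = {sig}")
```

With many pairwise comparisons the per-test threshold becomes strict, so only strongly differentiated sample pairs remain significant after correction.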

Phylogenetic Analysis
To determine genetic structure and the extent of genetic diversity on the lineage level, 257 ITS1 sequences were obtained and analyzed following the methods described in Whittaker et al. SeqMan II 3.61 (DNASTAR, Inc.) was used to curate raw sequencing data, and sequences were aligned using Clustal W in Mega4. The boundaries of the ITS1 were determined through alignment with Genbank accession EF208798, and T. rotula sequence types were compared with previously identified ITS1 lineages within the species T. rotula. To examine relationships among T. rotula ITS1 variants, a network was generated using the median joining algorithm in Network 4.5.1.6 (Fluxus Technology Ltd.).

Seascape Genetics
Correlations between pairwise genetic distance and pairwise Euclidean distance of environmental variables among sites were examined. These calculations relied upon extensive metadata associated with each sample. Metadata were generously provided by the institutions responsible for the time series associated with each site, or by individuals who collected the samples.
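The correlation between a genetic distance matrix and an environmental distance matrix is commonly assessed with a permutation-based Mantel test. The sketch below is a minimal pure-Python version under stated assumptions: the site values and the genetic distances are synthetic, environmental variables are not standardized, the test is one-sided, and all function names are hypothetical. It is not the software or the exact procedure used in this study.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Off-diagonal upper-triangle entries after reordering sites."""
    n = len(order)
    return [matrix[order[i]][order[j]]
            for i in range(n) for j in range(i + 1, n)]


def mantel(dist_a, dist_b, permutations=999, seed=1):
    """One-sided Mantel test: permute the site labels of dist_b and count
    how often the permuted correlation reaches the observed one."""
    identity = list(range(len(dist_a)))
    a = upper_triangle(dist_a, identity)
    r_observed = pearson(a, upper_triangle(dist_b, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)
        if pearson(a, upper_triangle(dist_b, order)) >= r_observed:
            hits += 1
    return r_observed, (hits + 1) / (permutations + 1)


def euclidean_matrix(rows):
    """Pairwise Euclidean distances between rows of site measurements."""
    return [[math.dist(p, q) for q in rows] for p in rows]


# Hypothetical (temperature in degrees C, chlorophyll a) values for six sites.
sites = [(2.0, 1.5), (4.5, 3.0), (10.0, 0.8),
         (14.0, 6.0), (18.0, 2.2), (20.0, 9.0)]
env_dist = euclidean_matrix(sites)

# Synthetic, symmetric "genetic distance" matrix loosely tied to environment.
n = len(sites)
gen_dist = [[0.0 if i == j else 0.005 * env_dist[i][j] + 0.01 * ((i + j) % 3)
             for j in range(n)] for i in range(n)]

r_obs, p_value = mantel(gen_dist, env_dist)
print(f"Mantel r = {r_obs:.2f}, permutation p = {p_value:.3f}")
```

Permuting whole site labels, rather than individual matrix entries, preserves the dependence among distances that share a site, which is why the test is preferred over an ordinary correlation test on pairwise values.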

Population diversity of T. rotula
To test the effect of long-distance shipping on T. rotula genetic diversity and population structure, two samples were collected simultaneously from Narragansett Bay. One sample was processed immediately and the other was shipped in a cooler for 4 days (samples NBa and NBb, respectively). NBa and NBb did not differ in their isolation success rates (both at 100%), but differences in sample size reflect differences in the number of positively identified T. rotula during isolation. This suggests that cells in the shipped sample were healthy, as the probability of successful isolation (growth) remained the same in NBa and NBb. In those isolates that survived, and were positively identified as T. rotula, diversity statistics did not differ significantly between shipped and immediately processed samples. Shipping samples at a transit time of four days had no effect on population diversity, and did not introduce any bias in genetic diversity or population structure.
In total, 449 T. rotula isolates were obtained from across the globe at eight locations and thirteen discrete time points. The species identity of all isolates was confirmed via T-RFLP screening (Figure 2), and 257 of these were further confirmed via ITS1 sequencing. Four identical six-locus genotypes were observed more than once in the 449 individuals genotyped. Samples also exhibited high levels of genetic diversity, as measured by expected and observed heterozygosity. Across all loci, expected heterozygosity was as high as 0.83, and even higher for individual loci (Figure 3, Tables 3, 4 and 5).
Observed heterozygosity was lower than expected for all sites, at all loci (Tables 4 and 5). In fact, significant departures from Hardy-Weinberg equilibrium in the form of heterozygote deficiencies were observed for most site and locus combinations (Table 6). We tested whether microsatellites showed the same delineation as ITS1 sequences using principal coordinates analysis (PCoA); this test showed no differentiation between lineages when individuals from lineages 1 and 3 were analyzed together (Figure 5). Because of this, we chose not to analyze population structure separately for lineages 1 and 3. PCoA axes 1 and 2 explained 44% and 20% of variation, respectively (Figure 5). AMOVA demonstrated that populations were not significantly structured across ocean basins (Table 7). Genetic variability was significantly structured among samples, among individuals, and within individuals (p = 0.001) (Table 7).
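A heterozygote deficiency of the kind described above can be quantified by comparing observed and expected heterozygosity at a locus. The sketch below is illustrative only: the genotypes are invented allele sizes, expected heterozygosity is computed without a small-sample correction, and the function names are hypothetical rather than those of any package used in this study.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one diploid locus.

    genotypes: list of (allele_1, allele_2) tuples, one per individual.
    Expected heterozygosity is 1 - sum(p_i^2) over allele frequencies p_i.
    """
    n = len(genotypes)
    h_observed = sum(a != b for a, b in genotypes) / n
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total_alleles = 2 * n
    h_expected = 1.0 - sum((c / total_alleles) ** 2 for c in counts.values())
    return h_observed, h_expected


def inbreeding_coefficient(h_observed, h_expected):
    """F_IS = 1 - Ho / He; positive values indicate heterozygote deficiency."""
    return 1.0 - h_observed / h_expected


# Hypothetical microsatellite genotypes (fragment sizes in base pairs).
genotypes = [(210, 210), (210, 214), (214, 214),
             (218, 218), (210, 218), (214, 214)]

h_obs, h_exp = heterozygosity(genotypes)
f_is = inbreeding_coefficient(h_obs, h_exp)
print(f"Ho = {h_obs:.3f}, He = {h_exp:.3f}, F_IS = {f_is:.3f}")
```

In this invented sample only two of six individuals are heterozygous, so observed heterozygosity falls well below the value expected from the allele frequencies and the inbreeding coefficient is positive.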
Not all isolates from each water sample were equally or significantly diverged (   shortest continuous ocean distance between sites (p=0.007, r 2 =0.20); the result was significant, but the r 2 value was so low as to be uninformative ( Figure 8). This relationship was not observed when samples from either ocean basin were analyzed separately (data not shown).

Environmental variation and genetic structure
Environmental and ecological variables (salinity, temperature, chlorophyll a, and cell abundance) differed considerably across sample sites and over time. For instance, T. rotula cell abundance ranged from 100 cells L -1 to 55,000 cells L -1 . At a single site, Narragansett Bay, cell abundance increased from ~2,000 cells/L to over 55,000 cells/L over the course of three sampling periods distributed over two months ( Figure 9). Environmental conditions varied widely among samples.  Figure 10). The combination of temperature and chlorophyll a concentration explained 53% of the variance in global population structure (p = 0.02). Temperature alone explained 27%, and chlorophyll a concentration alone explained 36% of the variance in global population structure (Table 8).  ). This contrast may be due to differences in environmental variability and physical dynamics of the study sites. For example, the Mariager fjord in Denmark is associated with a relatively lower flushing rate than observed in Narragansett Bay, RI coastal Roscoff and Washington. The residence time of Mariager fjord is eight months . This is in contrast to Narragansett Bay, which is associated with a very short residence time between 10 and 40 days (Pilson 1985a;Pilson 1985b). Roscoff is a coastal site within the Western English Channel, experiences high connectivity to both the North Sea and Atlantic water masses, and is dominated by persistent tidal flushing (Robinson et al. 1986;Southward et al. 2004). The Washington coast is a region that experiences dynamic upwelling, and our sample site was associated with a highly exposed coast versus the more protected fjord site. Narragansett Bay, France, and Washington experience high levels of physical connectivity to surrounding waters, characteristics that may support the population differentiation over time observed here.

DISCUSSION
Different populations may be introduced from connected waters, and be specifically adapted and tightly coupled with environmental conditions that can vary widely over short periods of time.
The observation of significantly diverged populations over short periods of time in T. rotula may also relate to characteristics of the life history of the species.
For instance, T. rotula and other diatoms like S. marinoi form resting spores, or dormant cells that remain viable for many decades in the sediment. The triggers of resting spore formation and germination are little understood for many species, but relatively well understood for S. marinoi; it has been hypothesized that the accumulation of cysts in the sediment may support the persistence of a single fjord population, preventing the introduction of new populations to the system (Godhe and Härnström 2010).
Although T. rotula is known to create resting spores, the success of T. rotula resting spore germination, triggers of spore formation, and level of accumulation in different sediments are virtually unknown. Resting spore dynamics, and the diversity of spores in sediments throughout the globe, may influence the variability of population structure over time. Hypothetically, sediments may harbor high levels of resting spores, and act as a 'seed bank' for diverse populations whose accumulation in the water column is triggered by favorable environmental conditions.
Due to the difficulty of obtaining diatom samples, individuals from culture collections or random sampling events have traditionally been pooled to represent locations of interest, despite sampling times being separated by decades, or distributed across seasons. The act of pooling samples over time or regional space relies on the assumption that spatial variability is significantly greater than variability over time.
Evidence from T. rotula strongly refutes this assumption. Pooling samples over many decades may ignore the large variations in genetic composition embedded in time, and lead to false conclusions about the structure of populations, and factors driving that structure. The methods used here provide a viable option for obtaining simultaneous (non-pooled) samples over large spatial scales, because shipping time (4 days) was shown to have no effect on T. rotula population structure. By analyzing discrete samples separately, I was able to examine changes in genetic composition over time.
Pooling samples over space and time may result in erroneous clustering of individuals into a priori groups, leading to assumptions about genetic divergence that mask underlying genetic substructure; this may result in false measurements of heterozygosity, such as those described by the Wahlund effect.
Because so little is known regarding the spatial and temporal variability of diatoms, pooling of samples should be avoided wherever possible.
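The Wahlund effect mentioned above can be illustrated numerically. The sketch below assumes a single biallelic locus, equally sized samples, and Hardy-Weinberg proportions within each sample; the allele frequencies are invented and the function name is hypothetical.

```python
def wahlund(allele_freqs):
    """Compare heterozygosity within samples to that expected after pooling.

    allele_freqs: frequency of one allele at a biallelic locus in each of
    several equally sized samples, each assumed to be in Hardy-Weinberg
    proportions.
    Returns (mean heterozygosity actually present, heterozygosity expected
    from the pooled allele frequency).
    """
    k = len(allele_freqs)
    h_within = sum(2.0 * p * (1.0 - p) for p in allele_freqs) / k
    p_pooled = sum(allele_freqs) / k
    h_pooled_expected = 2.0 * p_pooled * (1.0 - p_pooled)
    return h_within, h_pooled_expected


# Two hypothetical samples (e.g. two time points) with contrasting frequencies.
h_within, h_pooled = wahlund([0.2, 0.8])
deficit = h_pooled - h_within
print(f"Heterozygotes present: {h_within:.2f}")
print(f"Expected if pooled sample were one population: {h_pooled:.2f}")
print(f"Apparent heterozygote deficit: {deficit:.2f}")
```

Each sample is internally at equilibrium, yet the pooled data appear to lack heterozygotes. The apparent deficit equals twice the variance in allele frequency among samples, so it grows with the degree of hidden substructure.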

Dispersal and Geography
Phylogeographic inference from the ribosomal ITS1 gene has revealed that T. rotula can be subdivided into three lineages that exhibit differences in their geographic distributions. Lineage 1 was observed predominantly in the Eastern Pacific, Lineage 2 within Puget Sound, and Lineage 3 in both the Atlantic and Pacific Ocean basins. A molecular clock estimated the divergence time between Lineage 1 and Lineage 3 at around 0.7 Mya. This period is similar to the split between P. pungens lineages, which also exhibit differences in geographic distribution, one being cosmopolitan and the other restricted to the NE Pacific. It has been hypothesized that punctuated glaciation, and fluctuations in sea level during the Pleistocene, contributed to this pattern of lineage divergence, and the associated geographic structure in P. pungens. Although triggers of programmed cell death (PCD) in either species have not been examined, a greater tendency towards PCD in P. pungens under adverse conditions could prevent their long-distance dispersal, making distance a greater barrier to gene flow.
A more likely explanation of differences between the species' population structure may be differences in their ability to form resting spores, or dormant cells, that may provide a stepping-stone for dispersal in those species that can form them.
Life stages of cell quiescence and resting spore formation may affect the ability of cells to traverse large distances of the ocean. Resting spores may remain viable in environmentally unfavorable conditions, and the ability to form viable resting spores may be essential in facilitating dispersal in these organisms. Pseudo-nitzschia pungens does not form resting spores, but T. rotula does; this may lead to differences in their dispersal potential, explaining why geographic distance is a greater barrier to gene flow in P. pungens than in T. rotula. Starkly different patterns of population structure in the two species point to the need to expand our understanding of diatom population diversity over larger spatial scales and across more species.

Ecological drivers of population structure
Thalassiosira rotula, as a species, does not represent a single panmictic population. In fact, despite evidence of connectivity over broad spatial scales, significant population structure within T. rotula over time and space demonstrates that 'everything is not everywhere.' But does the environment select? Distinct populations within global samples of T. rotula demonstrate that mechanisms exist to support their genetic isolation, and reduce gene flow. So, although geographic distance is not a strong barrier to gene flow amongst populations, other mechanisms for selection must exist to support the patterns of population structure observed. One mechanism supporting this pattern may be environmental selection.
On the global scale, in T. rotula, environment was strongly correlated with patterns of genetic relatedness among samples. Specifically, the combination of temperature and chlorophyll a was significantly correlated with pairwise genetic distances, explaining 53% of genetic variability (p = 0.02). Two possible hypotheses explaining the correlation between environment and genetic relatedness are 1) sexual selection based on environmental triggers of reproduction or 2) isolation of populations due to environmental selection.
To maintain the population structure observed here, sexual reproduction amongst isolates within a population must occur more frequently than between isolates from different populations. Diatoms exhibit both clonal and sexual life cycles, and the frequency of sex in the field is little understood for the majority of species; observations of sex in the field are rare, and laboratory breeding experiments are often unsuccessful. Environmental conditions can trigger sexual reproduction in diatoms. In addition, the genes responsible for gamete recognition (Sig1 in particular) have been shown to evolve rapidly, exhibiting high levels of inter- and intra-specific divergence within the Thalassiosira genus (Armbrust 1999; Armbrust and Galindo 2001). Rapid evolution of gamete recognition genes may regulate the breeding success of individuals from unique populations, thus promoting their isolation.
Here, we observed the genetic signature of sexual reproduction in T. rotula.
Isolates across sites and loci exhibited high levels of heterozygote deficiencies, as has been observed in other diatom species (Casteleyn 2009a; Evans 2009). The consistency of this pattern across loci and samples suggests that it is a real feature, and not solely a signature of null alleles. Exclusive clonality and asexual reproduction would result in an excess of heterozygotes (Balloux et al. 2003). Thus, heterozygote deficiency may be an imprint of frequent sexual reproduction in the species. Heterozygote deficiency has been observed across diatom species, and yet this common phenomenon is not fully understood. Thalassiosira rotula exhibited extremely high levels of clonal diversity; only four identical six-locus genotypes were observed more than once in 449 individuals genotyped. Clonal diversity of 99% suggests that the environment is heterogeneous enough to prevent any one clone from becoming dominant. In addition, sexual reproduction must be frequent in the field, and a force responsible for generating and maintaining diversity.
vary as widely over time as they do in space are structured and maintained by the environment. These data strongly suggest that it is the environment, rather than dispersal, that isolates and poses barriers to gene flow within T. rotula.

Table 3. Microsatellite Loci
List of microsatellite loci, repeat motifs, and their optimal conditions for amplification. Italicized primer segment of locus TR27 refers to the forward fluorescent tag used in the three-primer amplification method. Touchdown cycles were used to improve amplification of loci in low-concentration field DNA, and may not be necessary for high-concentration DNA samples. Optimal temperature refers to the annealing temperature that balances the need for high specificity and suitable amplification. Modification to the 5' end of the forward primer refers to a fluorescent probe, allowing for detection of amplicons in fragment analysis.
Admixture is apparent when membership coefficient for individuals is < 1 into two or more groups.

Figure 7. Detection of K Groups
Graphical method allowing for the detection of K, including a) the mean log of K (L(K)) ± SD over 10 runs for each value of K (2-13) and b) Delta K, as described in Evanno et al. [43]. K = 11 was chosen as the most likely number of populations.
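The Delta K statistic named in this caption can be sketched as follows. This is a minimal illustration, assuming the commonly used form in which the absolute second-order difference of the mean L(K) is divided by the standard deviation of L(K) across replicate runs; the function name and the toy L(K) values are hypothetical and are not the values plotted in the figure.

```python
from statistics import mean, stdev


def delta_k(log_likelihoods):
    """Compute Delta K for each K that has both neighbours K-1 and K+1.

    log_likelihoods maps K -> list of L(K) values from replicate runs.
    Delta K = |mean L(K+1) - 2 * mean L(K) + mean L(K-1)| / SD of L(K).
    """
    result = {}
    for k in sorted(log_likelihoods):
        if (k - 1) not in log_likelihoods or (k + 1) not in log_likelihoods:
            continue  # the end points of the K range have no Delta K
        runs = log_likelihoods[k]
        sd = stdev(runs)
        if sd == 0:
            continue  # Delta K is undefined when replicate runs are identical
        second_diff = abs(
            mean(log_likelihoods[k + 1])
            - 2 * mean(runs)
            + mean(log_likelihoods[k - 1])
        )
        result[k] = second_diff / sd
    return result


# Hypothetical L(K) values from three replicate runs per K.
toy = {
    2: [-1000.0, -1002.0, -998.0],
    3: [-900.0, -901.0, -899.0],
    4: [-890.0, -892.0, -888.0],
    5: [-885.0, -880.0, -890.0],
}
dk = delta_k(toy)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

The K with the largest Delta K marks the sharpest break in the slope of L(K), which is the criterion used to select the most likely number of groups.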

INTRODUCTION
Diatoms contribute an estimated 40% of global carbon production. As biogeochemical powerhouses, the factors affecting their production and abundance over space and time are of particular interest. The genetic diversity of diatom communities, its seasonal variability, and its sensitivity to long-term climate trends are thought to influence the health and productivity of marine ecosystems (Diaz et al. 2006; Duffy 2002; Folke et al. 2004; Stachowicz et al. 2002; Worm et al. 2006). The diversity and composition of diatom species in the field have traditionally been monitored using microscopy. More recent use of molecular tools has provided a glimpse into the fine-scale genetic underpinnings of diatom diversity, on the population and even clonal levels (Casteleyn 2009). These finer-scale genetic subdivisions may be tightly coupled with environmental variability over time (Doney et al. 2004; Schmidt et al. 2008).
Thus, clarifying the scale and triggers of change in genetic composition over time may enhance our understanding of the interactions between diatom diversity, evolution, and environmental variability. Blooms occur when cell growth rates outpace loss rates; growth rates can vary with changes in nutrient, temperature, and light regimes, physical forcing, grazing pressure, or viral infection (Behrenfeld 2010; Cloern 1996).
The timing of these physical and environmental triggers of diatom growth rate relates to the temporal variation of cell abundance in the marine environment (Cloern 1996).
Differences in growth rate among individual strains have clear genetic underpinnings, as shown in genotype-by-environment experiments. This classical physiological work is supported by ever-increasing evidence from diatom transcriptomic data, demonstrating the diversity of molecular and metabolic responses of diatoms to stress factors such as nutrient limitation across species and individuals (Bowler et al. 2008; Dyhrman et al. 2012; Hwang et al. 2008; Mock et al. 2008; Sapriel et al. 2009). Whereas these genetic underpinnings of growth rate may be apparent in the laboratory, the extent to which they influence patterns of population-genetic diversity in the field is little understood. Exploring the temporal variations in diatom population-genetic structure provides a link to understanding the interactions between selection pressures of the ocean system and the physiological underpinnings of diatom diversity.
Although the structure and variability of diatom populations over time are little understood, a few key studies have used molecular tools to provide important insight.
The first molecular study of diatoms used allozymes to demonstrate that the diatom Skeletonema costatum was subdivided into two populations differentially present during spring and fall in Narragansett Bay, RI, USA. Whereas these allozyme 'populations' are now thought to be reproductively isolated species, this work was the first to demonstrate two characteristics of diatom species that are gaining support. First, diatoms exhibit a high potential for cryptic diversity within 'super species' previously assumed to be cosmopolitan in their distribution.
Second, diatoms exhibit evidence for sympatric isolation, where genetic isolation can occur within the same location, driven by selective pressures of the environment. For example, populations of the diatom Ditylum brightwellii co-occurring in space have been detected at different times of the year, although data on genome size differences between these populations suggest that they may be different species. If both D. brightwellii and S. costatum seasonal populations are, in fact, different species, this suggests that reproductive isolation of diatoms may occur in sympatry, although alternative paths of speciation cannot be ruled out.
Understanding the extent and distribution of diatom diversity over time, and in relation to variations in the marine environment, is particularly significant in the context of a changing climate. Evidence suggests that a rapidly changing climate is shifting primary productivity throughout the global ocean; these changes in ocean productivity are most likely due to shifts in the distribution of phytoplankton populations over space and time (Hays et al. 2005; Karl et al. 2001). The resiliency of phytoplankton populations to climate change ultimately depends on the extent and distribution of genetic diversity (Duffy 2002; Folke et al. 2004; Ptacnik et al. 2008; Stachowicz et al. 2002; Worm et al. 2006), which dictates both their metabolic and evolutionary potential to respond to environmental stress (Bell and Collins 2008). Unfortunately, little is known about the ways in which the molecular diversity of phytoplankton interacts with changes in climate, either through short-term metabolic acclimation, or through adaptation over evolutionary time scales.
This study explores the evolutionary and adaptive potential of the species Thalassiosira rotula, using isolates collected over time in Narragansett Bay, RI. We compare clonal diversity calculated from multilocus genotypes with physiological variability amongst isolates to infer the flexibility of the species in response to environmental variability. The extent and distribution of diversity, as well as the strength of genetic connectivity between populations within a single diatom species, may play a significant role in determining its ability to adapt to environmental change.
The goal of this work was to explore changes in population-genetic diversity over time in Thalassiosira rotula, a bloom-forming member of the phytoplankton community within the highly productive, well-studied estuary, Narragansett Bay, Rhode Island. Isolates from within Narragansett were compared with isolates collected from an adjacent coastal site (off of Martha's Vineyard, MA) over three years in order to better understand changes in genetic composition over time and in relation to local spatial connectivity and environmental variability.

Field samples and isolates
To examine changes in population-genetic diversity over time, 442 isolates of T. rotula were collected (Figure 1). Whole seawater samples were collected in 1 L dark Nalgene bottles from a single station in Narragansett Bay, RI ('Station 2') and a single station off the coast of Martha's Vineyard (MVCO) (Figure 1). Both sites are part of time series that collect data on phytoplankton diversity, chlorophyll a, and physical parameters. Because samples aligned with time series measurements, isolates were associated with metadata including species-specific phytoplankton abundance, temperature, salinity, and nutrient concentrations. In seawater samples where T. rotula could be identified morphologically, single cells or chains were isolated, cultures maintained, and DNA extracted. In brief, surface seawater was concentrated over a 20 µm mesh net. For each field sample where T. rotula was present, up to 96 single cells or chains were isolated from this >20 µm size fraction using a stereo microscope (Olympus SZ61), rinsed in sterile seawater three times, and transferred to 1 mL sterile Sargasso seawater amended with f/20 nutrients. Isolates were incubated at 4, 10, 14, or 20°C, according to the closest surface seawater temperature (SST) at isolation, on a 12:12-h light:dark cycle at 90-120 µmol photons m⁻² s⁻¹. Upon reaching approximately 1000 cells mL⁻¹ (1 to 3 weeks depending on growth rate), and upon confirmation that the cultures were free of algal contamination, cells were filtered and DNA was extracted and stored at -80°C until further analysis.
In addition to the collection of field isolates, whole seawater samples were processed for whole community analysis in two ways: First, concentrated samples were fixed using a 2% acid Lugol's solution, and stored in 20 mL scintillation vials.
Second, whole seawater was filtered onto three 0.2 µm polyester filters (Millipore), at 100 mL per filter, and stored at -80°C.
An experiment was conducted to test whether genetic diversity in a 1 L bottle changed due to sample shipping, and is explained in depth in Chapter 3 (Whittaker, in prep). In short, two liters of water were collected from Narragansett Bay on January 26, 2010. One liter was immediately processed (as described above), and the second liter was shipped to Washington, USA and back to Rhode Island, USA, with a transit time of four days. Upon returning to Rhode Island, 48 single cells were isolated from the shipped sample. These two samples are named NBf and NBg in this study, to keep all sample names temporally consistent. The same samples are named NBa and NBb in Chapter 2, again, for internal consistency with time.

Strain physiology and culture conditions
Twelve strains of T. rotula were collected in August 2013, when sea surface temperature (SST) was 22.3°C, and their growth rates were determined at five temperatures (4, 10, 17.5, 22.3, and 30°C) under 24 h light at 120 µmol photons m⁻² s⁻¹. Cultures were allowed to acclimate until no differences in maximum growth rate were observed between transfers (about 10 generations). Maximum acclimated growth rates were determined following Rynearson and Armbrust. Briefly, the in vivo fluorescence of semi-continuous batch cultures was measured daily using a 10AU Field Fluorometer equipped with the in vivo chlorophyll optical kit (Turner Designs). The maximum acclimated growth rate for each isolate was determined by regressing the change in the log of fluorescence over time and testing the equality of slopes from at least three serial cultures (analysis of variance, α = 0.05). If slopes of serial growth curves were homogeneous, the average regression coefficient was used to estimate the common slope, which represented the average acclimated growth rate. Analysis of variance (ANOVA), Scheffé's test (Enderlein 1961), and the Tukey multiple comparison test were used to determine the significance of differences among all isolates, and among isolates at different temperatures. Alpha was set to 0.05 for all statistical tests.
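The growth-rate estimate described above can be sketched as a least-squares regression of log fluorescence on time. This is a minimal illustration under stated assumptions: the natural log is assumed (the text says only "log"), the fluorescence readings are synthetic, and the function names are hypothetical. The test for equality of slopes among serial cultures is not implemented here; the sketch simply averages the slopes, as would be done once homogeneity has been established.

```python
import math


def growth_rate(days, fluorescence):
    """Specific growth rate (per day): slope of ln(fluorescence) versus time."""
    y = [math.log(f) for f in fluorescence]
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (v - mean_y) for x, v in zip(days, y))
    return sxy / sxx


def average_acclimated_rate(serial_cultures):
    """Average slope over serial cultures, given as (days, fluorescence) pairs."""
    slopes = [growth_rate(d, f) for d, f in serial_cultures]
    return sum(slopes) / len(slopes)


# Synthetic readings for three serial transfers of one hypothetical isolate.
days = [0, 1, 2, 3, 4]
cultures = [
    (days, [2.0 * math.exp(0.50 * d) for d in days]),
    (days, [1.5 * math.exp(0.52 * d) for d in days]),
    (days, [2.5 * math.exp(0.48 * d) for d in days]),
]
mu = average_acclimated_rate(cultures)
print(mu, mu / math.log(2))  # growth rate per day and doublings per day
```

Dividing the slope by ln 2 converts the specific growth rate into doublings per day, which is often easier to compare across temperatures.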

Species identification and genotyping
To confirm species-level identity of Narragansett Bay and Martha's Vineyard isolates, the T-RFLP procedure was performed, as described in Whittaker and Rynearson, 2014 (in prep). To quantify genetic variation among isolates at both the population and lineage levels, isolates were genotyped at six microsatellite loci (TR1, TR3, TR7, TR8, TR10, TR27). All six loci were amplified in the gDNA of T. rotula isolates using the PCR conditions described in Whittaker and Rynearson (in prep).

Analysis of microsatellite alleles
Summary statistics for each locus at each site and time point were generated using the Excel Microsatellite Toolkit v. 3.1.1 and Genepop v4.2. These statistics included observed and expected heterozygosity for each locus, and across all loci, as well as polymorphism information content (PIC) values for each locus, and across all loci. Total number of alleles per locus and per site was also calculated. The number of unique genotypes in the 589-isolate sample set (also known as the G:N ratio) for each time point, and each site, was calculated by identifying matching multilocus genotypes (MLGs). Deviations from Hardy-Weinberg equilibrium (HWE) were calculated in Genepop v4.2. Inbreeding coefficients (F_IS) were generated in GenAlEx version 6.5. Problems with multiple tests were alleviated by calculating corrected p-values via the Bonferroni technique.
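The summary statistics listed above can be illustrated with a short sketch. It is not the procedure implemented in the Excel Microsatellite Toolkit, Genepop, or GenAlEx, which apply their own sample-size corrections; it assumes uncorrected expected heterozygosity (1 minus the sum of squared allele frequencies), F_IS = 1 - H_obs/H_exp, and a simple Bonferroni multiplication. All function names and genotypes are hypothetical.

```python
from collections import Counter


def locus_stats(genotypes):
    """Return (H_obs, H_exp, F_IS) for one diploid locus.

    genotypes is a list of (allele_a, allele_b) tuples, one per isolate.
    """
    n = len(genotypes)
    h_obs = sum(1 for a, b in genotypes if a != b) / n
    counts = Counter(allele for g in genotypes for allele in g)
    total = 2 * n
    h_exp = 1.0 - sum((c / total) ** 2 for c in counts.values())
    f_is = 1.0 - h_obs / h_exp if h_exp > 0 else 0.0
    return h_obs, h_exp, f_is


def genotype_to_isolate_ratio(multilocus_genotypes):
    """G:N ratio: unique multilocus genotypes divided by isolates genotyped."""
    return len(set(multilocus_genotypes)) / len(multilocus_genotypes)


def bonferroni(p_values):
    """Bonferroni-corrected p-values, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical allele sizes (bp) at one locus in five isolates.
locus = [(100, 100), (100, 100), (104, 104), (104, 104), (100, 104)]
print(locus_stats(locus))

# Hypothetical multilocus genotypes; two isolates share a genotype.
mlgs = [((100, 104), (210, 212)), ((100, 100), (210, 210)),
        ((100, 100), (210, 210)), ((104, 104), (212, 214))]
print(genotype_to_isolate_ratio(mlgs))
print(bonferroni([0.01, 0.04, 0.5]))
```

A positive F_IS, as in the toy locus, indicates fewer heterozygotes than expected under Hardy-Weinberg equilibrium.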

Statistical analysis of population structure
The significance of differentiation among time points and between sites was determined using the exact G test and Fisher's exact probability test in Genepop v4.2. Pairwise tests between samples were conducted separately for each locus, and across all loci. Genepop and GenAlEx were used to calculate pairwise F_ST among samples. Further estimates of population structure, including G'_ST and G''_ST, were calculated in GenAlEx 6.5 (Peakall and Smouse 2006, 2012). All p-values were corrected using the Bonferroni method. Analysis of molecular variance (AMOVA) was calculated using GenAlEx. STRUCTURE analyses were based on Pritchard et al. The Greedy algorithm in CLUMPP v.1.11 (Jakobsson and Rosenberg 2007) was used to evaluate agreement between independent STRUCTURE runs, and to arrange cluster labels. DISTRUCT v1.1 was used to visualize results of STRUCTURE. Population structure over time, between sites, and between sites over time was further visualized using principal coordinates analysis (PCoA) in GenAlEx 6.5. These analyses were used to assess changes over time within Narragansett Bay, and to determine the extent and scale of population connectivity between Narragansett Bay and its close geographic neighbor, MVCO.
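To show why a standardized measure such as G'_ST is reported alongside F_ST for highly polymorphic microsatellites, the sketch below computes Nei's G_ST and Hedrick's standardized G'_ST for a single locus. It assumes uncorrected heterozygosities and equal weighting of samples, so it will not reproduce GenAlEx output exactly, and it does not cover G''_ST. The function names and allele data are hypothetical.

```python
from collections import Counter


def allele_freqs(alleles):
    """Allele frequencies from a flat list of allele calls in one sample."""
    counts = Counter(alleles)
    n = len(alleles)
    return {a: c / n for a, c in counts.items()}


def gst_statistics(samples):
    """Return (G_ST, G'_ST) for one locus.

    samples is a list of allele lists, one list per sample.
    G_ST  = (H_T - H_S) / H_T
    G'_ST = G_ST / G_ST(max), with G_ST(max) = (k-1)(1-H_S) / (k-1+H_S)
    """
    k = len(samples)
    freqs = [allele_freqs(s) for s in samples]
    # Mean within-sample expected heterozygosity.
    h_s = sum(1.0 - sum(p * p for p in f.values()) for f in freqs) / k
    # Total heterozygosity from allele frequencies averaged over samples.
    alleles = set().union(*freqs)
    mean_p = {a: sum(f.get(a, 0.0) for f in freqs) / k for a in alleles}
    h_t = 1.0 - sum(p * p for p in mean_p.values())
    if h_t == 0.0:
        return 0.0, 0.0  # monomorphic locus: no differentiation measurable
    g_st = (h_t - h_s) / h_t
    g_st_max = (k - 1) * (1.0 - h_s) / (k - 1 + h_s)
    return g_st, g_st / g_st_max


# Two hypothetical samples that share no alleles at all.
sample_a = [1, 2] * 5
sample_b = [3, 4] * 5
print(gst_statistics([sample_a, sample_b]))
```

In this toy case the two samples are completely differentiated, yet G_ST is only about 0.33 because within-sample heterozygosity is high; the standardized G'_ST reaches 1.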

Environmental and Ecological Correlations
Correlations between pairwise genetic distance and environmental distance among samples were tested. Metadata were provided by the Phytoplankton of Narragansett Bay time series and Martha's Vineyard Coastal Observatory (MVCO).
Parameters examined for all samples included the following: sea surface temperature (SST), salinity, chlorophyll a, temperature, and irradiance. Nutrient data were available for ten out of fourteen Narragansett Bay samples, and were included in an additional analysis. Principal components analysis (PCA) was used to ordinate environmental distances among samples. In addition, cell abundance of 245 phytoplankton species, measured by the Phytoplankton of Narragansett Bay time series for each sample, was used to compare community composition with intraspecific genetic composition of T. rotula. Community diversity metrics, including species richness, the Shannon Index, Chao1, and the Simpson Index, were calculated in EstimateS (Colwell 2010).
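The community diversity metrics named above can be sketched from a vector of per-species counts. This is an illustration only and is not the EstimateS implementation: it assumes integer counts of individuals, uses the Gini-Simpson form (1 minus the sum of squared proportions) because the text does not state which Simpson variant was reported, and uses the classic Chao1 estimator with the bias-corrected form when no doubletons are present. The function names and counts are hypothetical.

```python
import math


def species_richness(counts):
    """Number of species observed at least once."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts):
    """Shannon index H' = -sum(p * ln p) over observed species."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson_index(counts):
    """Gini-Simpson index, 1 - sum(p^2)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def chao1(counts):
    """Chao1 richness estimate from singleton (F1) and doubleton (F2) counts."""
    s_obs = species_richness(counts)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    if f2 > 0:
        return s_obs + f1 * f1 / (2.0 * f2)
    return s_obs + f1 * (f1 - 1) / 2.0  # bias-corrected form when F2 = 0


# Hypothetical counts for four species in one sample.
even = [10, 10, 10, 10]
uneven = [5, 1, 1, 2]
print(species_richness(even), shannon_index(even), simpson_index(even), chao1(even))
print(species_richness(uneven), chao1(uneven))
```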
A series of nonparametric Mantel tests was performed to test correlations between genetic distance and variables of environmental distance among samples.
The statistical test known as BIOENV
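The Mantel tests referred to above can be sketched as a permutation test on two distance matrices. This is a minimal sketch, assuming a one-sided test, a Pearson correlation between the upper triangles (a rank-based variant would rank-transform the distances first), and 999 permutations; the function names and the toy matrices are hypothetical and do not represent the genetic or environmental data of this study.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """One-sided Mantel test; returns (observed r, permutation p-value).

    dist_a and dist_b are square symmetric matrices (lists of lists)
    over the same samples in the same order.
    """
    n = len(dist_a)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    a = [dist_a[i][j] for i, j in pairs]

    def corr(order):
        # Relabel the samples of the second matrix according to `order`.
        b = [dist_b[order[i]][order[j]] for i, j in pairs]
        return pearson(a, b)

    order = list(range(n))
    observed = corr(order)
    rng = random.Random(seed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        rng.shuffle(order)
        if corr(order) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)


# Toy data: six samples along a gradient; the second matrix is the
# first scaled by two, so the matrices are perfectly correlated.
coords = [0, 1, 3, 6, 10, 15]
genetic = [[abs(p - q) for q in coords] for p in coords]
environment = [[2 * abs(p - q) for q in coords] for p in coords]
r, p_value = mantel(genetic, environment)
print(r, p_value)
```

Permuting sample labels, rather than individual matrix entries, preserves the dependence among pairwise distances that share a sample, which is why the test is appropriate for distance matrices.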

Isolate collection and strain physiology
Twelve isolates, collected in August 2013 at 22.3ºC, were used to examine the extent of physiological adaptation to their temperature of isolation. Growth rates of strains incubated at five temperatures (4ºC, 10ºC, 17.5ºC, 22.3ºC, and 30ºC) exhibited significant differences (Figure 2).

Environmental and Ecological Correlations
Significant correlations were found between environmental and genetic variability over time. Environmental and ecological conditions varied considerably at both sites throughout the sampling period. Over the course of 3 years, samples were collected at temperatures ranging from 1°C to 19.5°C in Narragansett Bay, and from 2°C to 14.1°C in Martha's Vineyard. Chlorophyll a was also quite variable, ranging from 0.6 µg L⁻¹ to 22.1 µg L⁻¹ in Narragansett Bay, and from 0.1 µg L⁻¹ to 3.98 µg L⁻¹ in Martha's Vineyard. The Euclidean ordination of differences in salinity, chlorophyll a, irradiance, and temperature is shown using PCA (Figure 7). BIOENV statistical analysis determined that 59% of variability in pairwise genetic distances between sites can be explained by the combination of irradiance and chlorophyll a differences (p = 0.01). This correlation was significant with an R² of 0.36 (p = 0.04) (Figure 8). From this pattern of succession we can infer that a) waters off the coast of Martha's Vineyard and Narragansett Bay appear to be highly connected and undifferentiated; b) shifts in populations can occur rapidly over time within these sites; and c) population structure is unrelated to seasonal variability, and instead varies with environmental or ecological changes over shorter periods of time.

Potential sources of populations
While inferring population sources is a major challenge for many high-dispersal marine organisms (Roman and Darling 2007), there are several feasible scenarios that would allow for different populations to become established over short periods of time. First, cells of any given population may be below detection, or rare in the water column, until environmental or ecological conditions trigger their rapid growth, and our detection of them (Caron and Countway 2009). Second, dormant cells in the form of resting spores or cysts may remain viable in the sediment; these may be a source for populations that arise only when cues for their growth are triggered, or when conditions of the surface water are favorable. Third, high connectivity with coastal environments, and high dispersal potential, may allow for colonization and establishment of new populations in each site when conditions are favorable such that they can outcompete and replace incumbent populations. Reduced diversity may be a signature of colonization events, where recently introduced populations exhibit lower diversity than source populations (Davies et al. 1999). Instead, favorable conditions may select for the growth of individuals from different populations existing in the rare biosphere or sediment at each site. In fact, it has been observed in the diatom Skeletonema marinoi that genetic diversity in the water column is tightly coupled with that of the sediment.

Historical T. rotula dynamics in Narragansett Bay
Population structure in T. rotula observed over three years of sampling was insightful when placed in the context of long-term patterns of abundance in Narragansett Bay, thanks to a historical record of phytoplankton species abundance dating back to the 1950s. Since this time, T. rotula has demonstrated a bimodal distribution of abundance with temperature, consistent over many decades. Whereas the timing of T. rotula presence in the bay is somewhat sporadic from year to year, and low to medium abundances of the species occur throughout the year, the highest abundances of the species occur at 4ºC and 17ºC, with a dip in abundance at 10ºC (Krawiec, 1978). Isolates in this study were collected across a broad range of temperatures in the Bay, from 1ºC to 19.5ºC. T. rotula populations were not subdivided according to temperature in Narragansett Bay, suggesting that the bimodal peak in abundance with temperature does not reflect the differential growth of two separate populations, as has been demonstrated for Skeletonema costatum in the bay. In addition, it has been shown from T. rotula autecology experiments that growth rates are consistently higher at 10ºC than at 4ºC; this demonstrates that a dip in abundance at 10ºC is not necessarily related to a reduction in cell-specific growth rate at this temperature. Taken together, these data suggest that the pattern of bimodal abundance of T. rotula in Narragansett Bay does not relate to genetic or physiological characteristics of the species. Instead, factors such as grazing or physical mixing may drive the low accumulation of cells in the bay at 10ºC, and the high abundance of cells at 4ºC and 17.5ºC. In fact, grazing by protists has been shown to remove up to 200% of daily phytoplankton production in Narragansett Bay (Lawrence and Menden-Deuer 2012). These data also suggest that cell abundance of T. rotula in the field may not relate directly to differences in growth rate, or the physiological differences of populations observed in this study. In fact, no relationship between cell abundance and population structure was observed here.
Instead, cell abundance in the bay may relate more strongly to predation or stratification in the bay, which can dictate predator-prey encounter rates (Gerritsen and Strickler 1977). In addition, cell abundance in the field at different temperatures may not directly relate to the physiological adaptation of individuals or populations to those environmental conditions.

Physiological and genetic coupling
To test the relative physiological adaptation of populations or individuals to different environments, we measured the growth rate of individuals at five different temperatures, reflecting the temperature range of Narragansett Bay. Twelve isolates collected at 22.3ºC were grown at 4ºC, 10ºC, 17.5ºC, 22.3ºC, and 30ºC. These data confirm previous autecology work showing that T. rotula grows more slowly at 4ºC than at all other temperatures measured. No difference in growth rate was observed between isolates grown at 10ºC and 17.5ºC, suggesting that the difference in these temperature conditions may not trigger rapid increases in growth rate in these isolates, which were collected at higher temperatures (22.3ºC). In fact, individuals grew on average 60% faster at their temperature of isolation (22.3ºC) than at other temperatures, and none grew at 30ºC. This suggests that these isolates may be particularly adapted to the temperature from which they were collected, and that this condition may be an environmental trigger of the increased growth rates allowing for their detection in the field.
Isolates exhibited dramatically different patterns in growth among individuals (Figure 3). In fact, every isolate exhibited a unique physiological footprint across the temperatures tested. This physiological diversity reflects the genetic and clonal diversity observed within T. rotula isolates. For instance, we observed 99.6% unique clones amongst samples collected here; out of 589 isolates genotyped, only 2 were sampled with matching multi-locus genotypes. Thus, T. rotula clonal lineages harbored within each population are associated with high physiological diversity. It has been hypothesized that the rapid and frequent environmental change in the ocean contributes to high levels of clonal diversity, preventing the dominance of any single clonal lineage; this holds true only if clonal diversity is related to physiological diversity, as observed here. The magnitude of variation in growth rate amongst T. rotula isolates, related to clonal diversity, may explain the rapid changes in the composition of populations in the field; this has also been demonstrated for the diatom D. brightwellii. Narragansett Bay isolates show a clear coupling between the scales of genetic and physiological diversity; this suggests that populations harbor enormous adaptive potential, offering ecological flexibility in the highly variable marine environment.

Inter vs. intraspecific diversity
Species abundance data for all Narragansett Bay samples allowed for comparisons between species-level community composition of the Bay and the genetic composition of T. rotula over time. It has been hypothesized that the species-level diversity of field communities may correlate with the genetic diversity within species; this relationship is thought to exist due to the similar selective pressures structuring diversity and driving evolution on both the inter and intra-specific levels (Vellend 2005;Vellend and Geber 2005). In addition, processes such as mutation, drift, and migration may be acting in parallel on the intraspecific and species levels. Here, we observed a significant correlation between species richness and intraspecific heterozygosity (genetic diversity). This correlation suggests that parallel factors are responsible for structuring and maintaining the diversity of species in the field community, as well as the genotypic diversity within species. The heterogeneity of the marine environment, especially in coastal temperate ecosystems like Narragansett Bay, most likely provides ecological niches highly variable over space and time that are filled on different scales by diverse species and genotypes. The fact we observed this correlation in a planktonic organism, constantly impacted by a dynamic fluid environment, suggests that the mechanisms maintaining diversity and driving evolution may be consistent both within species, and on the whole-community level.

Ecological correlates of genetic structure
Although changes in genetic composition over time showed no correlation with seasons, years, or space, a strong relationship was observed between chlorophyll a concentration and population structure. In fact, chlorophyll concentration explained 53% of the variance in divergence among samples (p = 0.01). Chlorophyll is related to productivity (Behrenfeld et al. 2005; Cloern et al. 1995), and changes in its concentration in the field can reflect a generalized response of phytoplankton to an influx in nutrients, temperature, and stratification, as well as loss factors such as grazing and mixing (Marra et al. 1990; Riemann et al. 1989). Here, rapid growth rates, accompanied by high physiological variability, were most likely responsible for the rapid response and succession of populations. These populations harbor high levels of physiological diversity, which most likely relates to their relative production in the field. This is particularly significant in terms of the timing and productivity of phytoplankton blooms. The formation of phytoplankton blooms is, at times, sporadic, and the triggers of bloom formation are little understood. T. rotula demonstrates that populations can respond rapidly to changes in the local environment, whether that be interactions with other species, a release of nutrient stress, or changes in abiotic factors. Here, population structure over time was unrelated to seasonal changes. This work shows that population succession may be tightly coupled with ecological interactions in the field, dictated just as much by resource competition and competitive exclusion as by the fluctuation of abiotic factors in the environment.
membership probability of that individual into any number of K groups. Admixture is apparent when membership coefficient for individuals is < 1 into two or more groups.

Figure 7. Isolation-by-Environment
Isolation-by-environment graph shows regression of pairwise genetic distance (F_ST) with pairwise environmental distance (measured as Euclidean distance of environmental factors among sites). This graph shows the regression of F_ST with pairwise chlorophyll a differences among sites; chlorophyll a was the variable found by BIOENV to be the most significantly correlated with pairwise F_ST (Rho = 0.062, p = 0.01).

DISCUSSION
Diatoms are of biogeochemical importance to the earth's climate, playing an important role in the global cycling of carbon . Found in virtually every marine environment, their ecological success cannot be denied. The vast extent of diversity both within and between diatom species most likely contributes to their widespread ecological success and productivity in the marine environment; for instance, this diversity has been shown to be functionally significant (Sarthou et al. 2005;. We know that phytoplankton can evolve rapidly, and quickly fill new ecological niches ). In addition, as single-celled planktonic organisms, diatoms maintain the potential to disperse throughout the globe. Yet, despite the global presence and ecological importance of diatoms, virtually nothing is known about the ways in which their intraspecific diversity is distributed and structured on global spatial scales. Even less understood are the ecological factors that influence the evolution of unique populations within species, and support their genetic connectivity over space and time. This dissertation sought to address these questions by a) exploring the extent and structure of diversity nested within the diatom species T. rotula over global space and time b) identifying factors of the marine environment that have contributed to the evolution of this diversity.
Understanding the extent and distribution of diatom diversity is particularly important in the context of climate change. For one, diatoms are highly productive photosynthetic organisms that play an essential role in the global cycling of carbon (Sarthou et al. 2005). Evidence suggests that a rapidly changing climate is shifting primary productivity throughout the global ocean; these changes in ocean productivity are most likely due to shifts in the distribution of phytoplankton populations over space and time (Cermeno et al. 2010). The resiliency of phytoplankton populations to climate change ultimately depends on the extent and distribution of genetic diversity, which dictates both their metabolic and evolutionary potential to respond to environmental stress.
The results presented here suggest that T. rotula harbors high levels of clonal diversity, as has been observed in other species (Godhe and Härnström 2010). More importantly, these high levels of clonal diversity relate to high levels of physiological variation observed among individuals collected from around the globe. In samples from Narragansett Bay and off the coast of Martha's Vineyard collected over three years, T. rotula populations were highly diverse and shifted over short time scales. This suggests that the water column or sediment harbors a repository of populations that may respond differently and rapidly to changes in the environment or ecology. T. rotula exhibits high levels of genetic and physiological diversity within populations that shift rapidly over time; if these characteristics are common to other diatom species, diatoms may be well suited for adaptation to rapidly changing environmental conditions. On the other hand, in a changing climate, a reduction in environmental variability due to the loss of seasonal extremes may reduce the strength of those evolutionary forces supporting diversity within communities and species. More likely, rapid growth rates, accompanied by high physiological variability, contribute heavily to the rapid response and succession of T. rotula populations; these characteristics suggest that T. rotula, and other diatom species, harbor high evolutionary potential that has contributed to their ecological success in the past and may aid in their ability to adapt to future changes in the earth's climate.
The extent to which diatom diversity varies over time and space may play an important role in determining their bloom formation. Diatom blooms are some of the most prominent features of biogeochemical variability in open-ocean and coastal marine environments. A diatom bloom is defined as a rapid increase in cell biomass, which contributes a large drawdown of carbon from the atmosphere and a rapid flux of fixed carbon that can sink to depth when the bloom declines. Despite their massive impact on marine ecosystems, the timing of blooms is often sporadic and unpredictable, and the triggers of blooms are little understood; in general, however, diatom blooms occur when cell growth rates outpace loss rates (Lucas et al. 1999a; Lucas et al. 1999b; Smayda 2000).
Differences in growth rate among individual strains have been shown in the lab to have genetic underpinnings. In the field, exploring the temporal and spatial variations in diatom population-genetic structure provides a link to understanding the interactions between selection pressures of the ocean system and the physiological underpinnings of their diversity and bloom formation.
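The general condition that blooms occur when growth outpaces loss can be written as a simple balance. This is a minimal sketch that assumes exponential dynamics with a constant specific growth rate and a constant specific loss rate; the symbols N (cell biomass), N_0 (initial biomass), \mu (growth rate), and \ell (loss rate) are introduced here for illustration and are not taken from the cited studies.

```latex
\frac{dN}{dt} = (\mu - \ell)\,N
\quad\Longrightarrow\quad
N(t) = N_0\, e^{(\mu - \ell)\,t},
\qquad \text{so biomass increases only when } \mu > \ell .
```

Under this simplification, strains that differ in \mu would accumulate at different rates under the same loss regime, which is one way genetic differences in growth rate could translate into shifts in population composition.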
The data presented here suggest a strong link between the spatial and temporal variations in diatom population structure and their bloom formation. For example, I observed that population-genetic structure within T. rotula was significantly correlated with whole-community chlorophyll a content on local and global scales, and over time. This means that individuals collected under conditions of low chlorophyll (non-bloom conditions) were more closely related to one another than to individuals collected at high chlorophyll concentrations (bloom conditions), and vice versa. In Narragansett Bay, T. rotula populations shifted rapidly over time, suggesting that they can respond quickly to changes in the local environment, whether through interactions with other species, a release of nutrient stress, or changes in abiotic factors. This work demonstrates that population succession may be tightly coupled with ecological interactions in the field, dictated just as much by resource competition and competitive exclusion as by the fluctuation of abiotic factors in the environment. In particular, these data suggest that certain populations may be particularly adapted to bloom or non-bloom periods; this implies that the dynamics of adaptation and diversity below the species level may play an essential role in determining the timing and triggers of phytoplankton blooms, and may explain why blooms are so difficult to predict.
This work advances a decades-long debate on the relative importance of dispersal and the environment in structuring single-celled species in constant flux in the ocean environment. In terms of dispersal, for T. rotula, geographic distance plays only a minor role in limiting gene flow and structuring populations. This is in contrast to the strong isolation-by-distance pattern observed in global samples of the diatom Pseudo-nitzschia pungens. The two species may differ in their patterns of diversity for several reasons. For one, pennate and centric diatoms differ in their modes of sexual reproduction (Von Dassow and Montresor 2011).
These differences may contribute to their genetic structure, as discussed in Chapter 2. A more probable explanation is a difference in dispersal capacity between the species, namely the ability of T. rotula, but not P. pungens, to form resting spores. Resting spores are dormant cysts formed by some diatom species that remain viable for up to 100 years and often settle in the sediment (Härnström et al. 2011). Because of this, they may provide a seed bank of diversity that can act as a stepping-stone of dispersal for these organisms (Godhe et al. 2013). The ability to form resting spores may increase the potential for T. rotula to disperse throughout the globe, across less favorable environmental conditions. Because P. pungens cannot form resting spores, dispersal may be a greater barrier to gene flow among its populations.
More broadly, the differences in genetic structure observed between these two species stress the need for future research on population structure in more diatom species to uncover common patterns and shared selection pressures.
This is the first study to examine global diatom population structure alongside variations in the marine environment; thus, it is unclear whether the patterns observed in T. rotula are common to other species. For T. rotula, these data demonstrate that the environment, rather than dispersal, isolates populations and dictates gene flow among them. The structuring of T. rotula populations across time and global geographic space demonstrates that this species is neither genetically homogeneous nor fully admixed.
In T. rotula, genetic composition can vary as much over time as it does over large spatial scales; this cautions against pooling samples over seasons or years, or grouping samples over space, when exploring population-level diversity in diatoms.
In addition, these results suggest that more work is needed to understand the scale of variability in diatom populations over time. If geographic space is not a good predictor of T. rotula population structure, then care must be taken to better understand variations in population structure over time and across diverse ecological conditions. This work broadens our understanding of these interactions by attempting to tease apart variations in diatom diversity over large spatial scales, over time, and across a wide range of environmental conditions. Overall, this work demonstrates that distance may not be a great barrier to gene flow in these organisms; instead, environmental selection may support high levels of diversity within diatom species and dictate the extent of gene flow among populations.

A. Growth rates of T. rotula strains
The table below contains raw growth rate data for strains referenced in Chapters 2 and 4. The table lists the culture/strain name, average growth rate, standard error of the average growth rate, and temperature of incubation. A growth rate of "0" refers to cultures that were incubated at a given temperature but failed to grow.