POPULATION GENETIC STRUCTURE OF SARDINELLA AURITA AND SARDINELLA MADERENSIS IN THE EASTERN CENTRAL ATLANTIC REGION (CECAF) IN WEST AFRICA

Marine fishes represent a valuable resource for the global economy and food consumption, but many species experience high levels of exploitation, necessitating effective management plans. The long-term sustainability of these resources may be jeopardized by insufficient knowledge of intra-specific population structure. Restriction-site associated DNA sequencing (RAD-seq) methods are revolutionizing the field of population genomics in non-model organisms because they can generate a large number of single nucleotide polymorphisms (SNPs) even when no reference genomic information is available. Sardinella aurita and Sardinella maderensis are non-model species lacking reference genomes. Using double-digest restriction-site associated DNA (ddRAD) sequencing, we surveyed variation at 6,078 SNP loci identified in S. aurita from Mauritania, Senegal, Ghana, Togo, and Benin in West Africa and at 6,767 SNP loci identified in S. maderensis from Mauritania, Senegal, Guinea, Togo, and Benin. Sardinella aurita populations showed low levels of genetic differentiation (overall FST = 0.001; pairwise FST values ranging from -0.002 to 0.005, p > 0.05) and a lack of population structure across the geographic locations surveyed, suggesting the presence of a single panmictic population in this region. Analysis of S. maderensis samples likewise showed no genetic differentiation across the locations studied (overall FST = 0.002; pairwise FST values ranging from -0.005 to 0.016). Further research is needed to extend the geographical and temporal sampling. Overall, the results of this research will contribute to our understanding of the distribution of the two species and assist in the management of fish resources in the Eastern Central Atlantic region (CECAF) of the Food and Agriculture Organization (FAO).


CHAPTER 1: LITERATURE REVIEW
The Eastern Central Atlantic (CECAF) region of the Food and Agriculture Organization (FAO), a management structure for ensuring sustainable use of marine resources, is characterized by the importance of small pelagic resources, including round sardinella (Sardinella aurita) and flat sardinella (Sardinella maderensis). The two species are strategic products for African populations by providing support for both artisanal and industrial fisheries (FAO, 1999). Their exploitation has great economic importance since they constitute the bulk of the catch in the countries located in the CECAF region by providing the raw material for industries and supporting a large number of jobs such as capture, processing and trade activities in the coastal countries in Africa (Tacon, 2004).
The CECAF region, covering the area of West Africa from Morocco to Angola, is further divided into two zones: northern and southern. The northern zone includes coastal countries from Morocco to Guinea. The southern zone extends from Sierra Leone to Angola (Figure 2.1).
These two sardinella species are among the most abundant commercially important migratory small pelagic species in West Africa and belong to the family Clupeidae. Round sardinella, S. aurita, is found in the Eastern Atlantic from the African coast to Gibraltar and the Mediterranean Sea, and in the Western Atlantic from Cape Cod to Argentina. Flat sardinella, S. maderensis, occurs in the Mediterranean Sea and in the Eastern Atlantic from Gibraltar southward to Angola. The two species, which are distinguished by several morphological characteristics, are found together in large schools and are mostly combined in statistics and managed together as "sardinella species". Because of their migratory nature, the degree of mixing among populations from various countries throughout the geographic range is crucial to defining fishery management units. As in other pelagic species, eggs and larvae of these species are passively transported by ocean currents, which may carry them to plankton-rich coastal areas that serve as favorable nurseries until they reach the size at which they join the main adult fish stocks. Both species tolerate low salinities in estuaries. Sardinella aurita prefers clear water with a temperature below 24ºC, while S. maderensis prefers warmer waters above 24ºC. Their habitat ranges from near the surface down to 350 m at the edge of the continental shelf (Brainerd, 1991).
The abundance of S. aurita in most parts of the world is controlled by water temperature and other hydrographic parameters (Failler, 2014). In the northern zone of West Africa, this species migrates between Senegal and Morocco (Boely and Fréon, 1979). Round sardinellas are found in Senegalese waters in December-January, concentrated along the edges of the shelf between the Cape Verde Peninsula in Senegal and Guinea-Bissau until April. They reproduce mainly on the continental shelf during the upwelling season, with a major peak in May and June (Fréon, 1988). At the end of June, the hydrographic conditions of Senegalese waters become less favorable, and adults move north to reproduce in the waters surrounding the Arguin Bank in Mauritania from July to August (Boely et al., 1982). In August/September, upwelling is reduced and the fish leave Mauritania and migrate north into Moroccan waters. In the Gulf of Guinea (southern CECAF), the migration of this species is tied to the upwelling cycle (Brainerd, 1991): as the upwelling starts in July, spawning reaches its maximum and the stock spreads out from the eastern half of Côte d'Ivoire eastward as far as Togo and Benin. Sardinella maderensis, on the other hand, is less migratory than S. aurita, and its movements have been well identified in the southern area (Congo-Angola region).
The abundance of both species is known to fluctuate greatly on a decadal time scale in the CECAF region (Longhurst and Pauly, 1987), and both species are included in the International Union for Conservation of Nature (IUCN) Red List of Threatened Species.
The small-pelagic stocks in the CECAF region have undergone significant fluctuations over the last decade, which may reflect natural variability, because changes in the environment can affect recruitment (Belvèze, 1991; Cury and Roy, 1989; Zizah et al., 2001). For example, the S. aurita stock in the western Gulf of Guinea (fisheries sector in Ghana) has declined consistently in output over the years, with the canoe fishery's annual sardinella catch falling to just over 17,000 metric tons in 2012 from 120,000 metric tons a dozen years earlier (Lazar et al., 2018). Accurate fish stock identification based on genetic studies of population structure is limited, and this limitation has contributed to imperfect scientific management of these fisheries within the region.

Stock management of sardinella species
In fishery management, a unit of stock is normally regarded as a group of fish exploited in a specific area or by a specific method (Carvalho and Hauser, 1995).
The population structure of fish stocks serves as the basis of effective fisheries management: it defines the spatial boundaries of a stock in relation to its seasonal migration and its long-term stability within a defined genetic makeup. Given the nature of the marine seafood resources in the CECAF region, potential benefits can be derived if efforts are made to manage and develop these fishery resources efficiently (Brainerd, 1991). The status of the sardinella fishery is monitored and evaluated by the regional working group of the Committee for the Eastern Central Atlantic Fisheries (CECAF) of FAO, which supports improved management of small pelagic resources in West Africa by assessing the state of the stocks and promoting sustainable use of these resources. Based on the assessment, management recommendations are made for the sustainability of the stocks. According to Lazar et al. (2018), the FAO/CECAF Working Group has agreed on the existence of four stocks for these two species in the southern CECAF area within the Gulf of Guinea: 1) Northern zone (Guinea-Bissau, Guinea, Sierra Leone and Liberia); 2) Western zone (Côte d'Ivoire, Togo, Ghana and Benin); 3) Central zone (Nigeria and Cameroon); and 4) Southern zone (Gabon, the Democratic Republic of the Congo, the Congo and Angola).
This stock differentiation is an assumption based on management needs and has been defined in the absence of information to match the biological boundaries of these two species with management strategies. Setting fisheries management strategies requires an understanding of fish stock boundaries and fish managers need information on the biological differences and genetic processes of discrete local groups of a species (Palumbi, 1996). Genetic assessment can be used to determine the population structure, determine whether individuals have moved among populations recently or in the distant past, suggest the typical size of a population and, thus, the effective reproducing population (Bernatchez and Wilson 1998;Taylor et al., 2001).

The use of population genetics in fisheries stock management
The most obvious ways of assessing fish populations involve counting or measuring individual fish, but another suite of characteristics that can be very informative is their genes. Although numerous complementary techniques exist to define fish stocks, it is now well established that genetic data analyses are essential to better delineate stock structure for sustainable management (Durand et al., 2013). Population genetic studies are used to determine the evolutionary processes occurring in a population. Polymorphism in wild populations is affected by a range of evolutionary forces (Hedrick 2005): gene flow reflects migration and increases homogeneity among otherwise isolated populations, while genetic drift acting within populations increases differentiation among populations as a result of random events between generations. Natal homing of individuals and spawning aggregations also contribute to stock structure in populations (Svedang et al., 2007). Inferring the degree of genetic exchange between populations of marine fish species is key to successfully managing exploited populations. This enables the identification of conservation units and the assignment of individuals to geographic regions (Dichmont et al. 2012; Funk et al. 2012). Many exploited marine fish are characterized by little intraspecific genetic structuring even over large geographical distances (Bradbury et al. 2008; Ward et al. 1994). Genetic studies have provided valuable information on spatial population structure for aquatic species of management and conservation concern, since genetic assessment can be used to identify cryptic species, determine whether individuals have moved among populations recently or in the distant past, and evaluate bottlenecks and founder effects (Beaumont 1994; Nielson 1995).
Assessing these and other properties relies on identifying sets of genetic markers (White et al., 2005;DeHaan et al., 2006). Common genetic markers used in population identification include mitochondrial DNA sequencing, microsatellites, fragment length polymorphisms (RFLPs and AFLPs), single nucleotide polymorphisms (SNPs), and insertion-deletion polymorphisms (indels). The traditional process of marker development is costly (in time and research funding) and usually results in the generation of very few working markers. A decrease in cost of sequencing has allowed for the development of new high-throughput technologies for genotyping and population genomic studies. The genotyping-by-sequencing approach used for instance in restriction site associated DNA sequencing (RAD-Seq) combines the power of high throughput sequencing and large-scale polymorphism genotyping in one step (Baird et al. 2008;Hohenlohe et al. 2010).

Past Application of Genetic Techniques in Sardinella Species from West Africa
Population genetic studies have been carried out in other sardinella species such as S. albella in the Persian Gulf and Sea of Oman using mitochondrial DNA (Rahimi, S, P, Sh, & Rahnema, 2016), S. longiceps in Indian Ocean waters using microsatellites (Sukumaran et al., 2017), and S. aurita in the coastal waters of Florida, USA, using protein electrophoresis (Kinsey et al., 1994). However, population genetic studies of economically important species along the rest of the West African coastline are rare and have mostly used weakly polymorphic allozymes and maternally inherited mitochondrial DNA (mtDNA) (Durand et al., 2013). Atarhouch et al. (2006) demonstrated that a local Moroccan population of sardine (Sardina pilchardus) was genetically depleted, probably as a result of intensive fishing in the recent past. Chi (1998) studied allozyme variability in S. aurita from the Congo, Ghana, Ivory Coast and Venezuela and recorded three polymorphic loci but extremely low levels of diversity and low differentiation between populations. More research is needed to identify the spatial structure of the round sardinella (Sardinella aurita) and flat sardinella (Sardinella maderensis) stocks within the CECAF FAO region.

The Use of RAD-seq Technique in Population Genetic Studies
The resolution of genetic differentiation allowed by a large number of genome-wide polymorphic markers should enhance inference of neutral population structure (Benestan et al. 2015; Lamichhaney et al. 2012; Narum et al. 2013). One of the most popular high-throughput genotyping methods currently available, RAD-seq, combines restriction enzyme digestion of the genome with high-throughput sequencing. RAD-seq is particularly relevant for non-model organisms because it allows the discovery and genotyping of thousands of single nucleotide polymorphisms (SNPs) in hundreds of individuals rapidly and at low cost, regardless of genome size and prior genomic knowledge (Baird et al. 2008; Puritz et al. 2014). This methodology enables not only SNP discovery and genotyping, but also more complex analyses such as quantitative genetic, phylogeographic, and population differentiation studies. Examples of studies in fishes that have used RAD-seq include genetic marker discovery in threespine stickleback (Catchen et al., 2014), a study of the neutral structure in populations of Pacific lamprey, and resolution of fine-scale structure in Atlantic salmon (Bourret et al. 2013). Consequently, a variety of RAD-seq protocols are increasingly used to identify and genotype genome-wide markers in non-model marine species to directly inform conservation and management efforts (e.g.

INTRODUCTION
Fish is a primary source of protein for at least one billion people and can contribute as much as 80% of the animal protein consumed in the world (FAO, 2009b). The commercial value of small pelagic fishes (oil sardines and anchovies) is low in the export market, but in developing countries they contribute a substantial portion of the income from fishing because of their abundance (Sukumaran et al., 2016). The two sardinella species Sardinella aurita and S. maderensis are among the most abundant commercially important migratory small pelagic species in West Africa. These species have a wide distribution in the Atlantic and Pacific Oceans and the Mediterranean Sea. They belong to the family Clupeidae and occur in the eastern Atlantic from Gibraltar southward to Angola (Bureau & Resources, 1999). The abundance of these species is known to fluctuate greatly on a decadal time scale in the Eastern Central Atlantic region (Longhurst and Pauly 1987), and both species are included in the International Union for Conservation of Nature (IUCN) Red List of Threatened Species. The status of the small pelagic stocks is monitored by the regional working group of the Committee for the Eastern Central Atlantic Fisheries (CECAF) of the Food and Agriculture Organization (FAO) (Lazar et al., 2018). For management purposes, the FAO/CECAF Working Group has agreed on the existence of four stocks for these two species in the southern CECAF region: 1) Northern zone (Guinea-Bissau, Guinea, Sierra Leone and Liberia); 2) Western zone (Côte d'Ivoire, Togo, Ghana and Benin); 3) Central zone (Nigeria and Cameroon); and 4) Southern zone (Gabon, the Democratic Republic of the Congo, the Congo and Angola). In the absence of information to match the biological boundaries of these two species with management strategies, this stock differentiation is an assumption based on management needs.
Fish managers need information on the biological differences between discrete local groups of a species as well as an understanding of the genetic and ecological processes that influence their discreteness. The genetic structure of fish populations is important not only because of fundamental interest in biotic evolution (Tudela et al., 1999) but also for the management of fisheries (Roldan et al., 2000). Although marine fishes in general show a lack of genetic differentiation, recent studies using advanced markers have provided evidence for some level of differentiation in many marine fishes, as these markers are able to resolve signatures of selection and adaptation in response to environmental and habitat change (Wang et al., 2013; Candy et al., 2015; Brennan et al., 2016). Management strategies should aim to conserve intra-specific genetic diversity, as it has profound implications for the potential for recruitment and population recovery (Teacher et al., 2013). Hence, misdirected management actions taken without knowledge of the stock structure of marine fishes may result in an inability to recover from environmental impacts (Sukumaran et al., 2017).
Accurate fish stock identification based on genetic studies of stock structure is limited for marine pelagic species in the CECAF region. This limitation has contributed to imperfect scientific management of fisheries within the region. Previous population genetic studies in sardinellas relied mostly on weakly polymorphic allozymes and mitochondrial DNA (mtDNA) (Durand et al., 2013). Kinsey et al. (1994) studied the population structure of S. aurita in the coastal waters of Florida, USA, using protein electrophoresis, revealing low levels of genetic differentiation and a lack of population structure. Chi (1998) studied allozyme variability in S. aurita from the Congo, Ghana, Ivory Coast and Venezuela, recording three polymorphic loci but low levels of diversity and low differentiation between populations. Because of the importance of identifying stock structure in stock assessment (Whitehead, 1985), we investigated the genetic population structure of S. aurita and S. maderensis at five sites located along the Eastern Central Atlantic region using restriction site-associated DNA sequencing (RAD-seq), a sensitive genotyping tool that can identify and score thousands of genetic markers (SNPs, indels) randomly distributed across the target genome from a group of individuals using Illumina next-generation sequencing technology (Davey & Blaxter, 2010). Information from this study will be useful to the CECAF working group in managing these important fish species and in accurately delineating the stocks of these two sardinellas in the CECAF region.

Sample Collection and DNA extraction
Sample collection of S. aurita and S. maderensis was coordinated by Najih Lazar,

Library preparation and sequencing
Library preparation for double digest RAD (ddRAD) and sequencing were carried out by the Texas A&M Corpus Christi sequencing Core Center. The same protocol was used for both species, S. aurita and S. maderensis. Extracted genomic DNA was normalized to a concentration of 5 ng in 50 µl in 96-well plates. A restriction enzyme test was performed to determine which enzyme combination would provide the most suitable fragment sizes and best target a high number of loci and coverage. The most consistent enzyme combination across the samples was Hin1II and TasI. Genomic DNA was then processed into ddRAD libraries using these restriction enzymes, purchased from New England Biolabs (NEB), following the ddRAD protocol (Peterson et al., 2012). Agarose gels (1%) were run to determine the quality of the DNA, and samples that showed low molecular weight smears were subjected to SPRIselect size selection using a 0.4X ratio of SPRIselect beads to DNA volume. Fluorescent quantification of samples was carried out using AccuBlue High Sensitivity solution and a standard curve. An aliquot of 100 ng of each sample was transferred to a new working plate and concentrated to 15 µl by DNA cleanup using 30 µl AMPureXP beads (2X bead to DNA ratio), leaving the beads in after elution. Restriction digestion was performed using the NEB enzymes Hin1II and TasI, and the DNA was cleaned up using 22.5 µl PEG NaCl buffer (1.5X buffer to DNA ratio) with the beads retained from the first cleanup.
Sample concentrations were normalized and adaptors were ligated. Samples with unique barcodes were pooled, and the pools were cleaned using AMPureXP beads at a 1.5X ratio of beads to DNA. Size selection was carried out using a Blue Pippin targeting a range of 570-645 bp. Libraries were PCR amplified using a different uniquely indexed PCR primer for each pool. Each pool was split into eight separate PCR reactions. PCR products were pooled and cleaned twice using AMPureXP beads (1X ratio of beads to PCR product for both cleanups), after which the pools were run on an AATI Fragment Analyzer using the HS NGS kit to determine their fragment sizes. qPCR was performed on the pools using the KAPA Library Quant kit, and the pools were combined in equimolar ratios, adjusted as necessary for sample number. The combined libraries were then sent to the sequencing center and sequenced in one lane on an Illumina HiSeq4000.
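The volume arithmetic behind the bead cleanups and the equimolar pooling step can be sketched as follows. This is an illustrative calculation only, not the core facility's worksheet: the function names, the pool names, and the qPCR concentrations in the usage note are hypothetical, and the pooling rule (moles proportional to the number of samples in each pool) is an assumption about what "adjusting as necessary for sample number" means.

```python
def bead_volume_ul(sample_volume_ul, ratio):
    """Volume of bead suspension (or buffer) for a given ratio to sample volume."""
    return sample_volume_ul * ratio


def equimolar_pool_volumes(conc_nm, n_samples, total_volume_ul):
    """Split total_volume_ul among pools so that every sample contributes equal moles.

    conc_nm   -- pool name -> library concentration in nM (hypothetical inputs)
    n_samples -- pool name -> number of samples in that pool
    """
    # Moles required scale with sample count; volume = moles / concentration.
    weights = {name: n_samples[name] / conc_nm[name] for name in conc_nm}
    scale = total_volume_ul / sum(weights.values())
    return {name: weight * scale for name, weight in weights.items()}
```

With a 15 µl sample, bead_volume_ul(15, 2.0) gives the 30 µl of beads and bead_volume_ul(15, 1.5) gives the 22.5 µl of buffer quoted in the protocol. For two hypothetical pools of ten samples each at 10 nM and 5 nM, the second pool receives twice the volume of the first.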

De novo Assembly, Read Mapping, SNP discovery
Quality filtering of raw reads and demultiplexing based on barcode were conducted using process_radtags in the Stacks software package (Catchen et al., 2011). The dDocent pipeline (version 2.5.2) was used for read trimming and de novo assembly. For quality filtering, the program Trimmomatic v0.32 (Bolger, Lohse, & Usadel, 2014) was used to remove sequences corresponding to the Illumina adapters, to trim bases with a quality score below 20 from the beginning and end of reads, and to apply an additional 5 bp sliding window that trims bases when the average quality score drops below 10.
For de novo assembly of a reference for each species (S. aurita and S. maderensis) using the dDocent pipeline, unique sequences in each individual were identified and their coverage was counted across the entire data set. The numbers of unique sequences at coverage levels from 1 to 20 were tabulated, and a coverage cutoff of 4 was selected. Unique sequences meeting this coverage cutoff were then required to appear in at least five individuals to be used for the assembly. The unique sequences passing both cutoffs (coverage of four, five individuals) were written in FASTA format and assigned contig headers for the next steps of the de novo assembly. The forward reads from each of the contigs were extracted from the FASTA file, and the program CD-HIT (Fu et al., 2012; Li & Godzik, 2006) was used to cluster the forward reads by 80% similarity into RAD loci. Next, the assembly program Rainbow (Chong, Ruan & Wu, 2012) in the dDocent pipeline was used to recluster the CD-HIT results based on 90% similarity into groups representing alleles at the RAD loci. The longest contig for each cluster was selected as the representative reference sequence for that RAD locus. Finally, the reference sequences were clustered again at an overall sequence similarity of 90% using CD-HIT (Fu et al., 2012; Li & Godzik, 2006), and the reference assembly was output as a FASTA file. SNP calling and genotyping were also performed using the dDocent pipeline. Reads were mapped to the de novo reference FASTA file using the MEM algorithm of BWA (Li & Durbin, 2009; Li & Durbin, 2010) with the mismatch parameter lowered from 4 to 3 and the gap opening penalty lowered from 6 to 5.
The program FreeBayes (Garrison & Marth, 2012) was used to obtain raw variant calls and SNP genotypes, which were subjected to several filtering steps to reduce false positives in the SNP calls.
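The trimming rule described above can be sketched on a list of per-base Phred scores. This is a simplified illustration of end trimming followed by a sliding-window cut, not Trimmomatic's actual implementation (which, for example, also handles adapter clipping and differs in details of where the window cut is placed); the function name and defaults are assumptions that mirror the thresholds in the text.

```python
def trim_read(quals, end_q=20, window=5, window_q=10):
    """Return the (start, end) slice of a read to keep, given per-base Phred scores."""
    start, end = 0, len(quals)
    # Drop low-quality bases from the beginning of the read.
    while start < end and quals[start] < end_q:
        start += 1
    # Drop low-quality bases from the end of the read.
    while end > start and quals[end - 1] < end_q:
        end -= 1
    # Scan 5' to 3' and cut at the first window whose mean quality is too low.
    for i in range(start, end - window + 1):
        if sum(quals[i:i + window]) / window < window_q:
            end = i
            break
    return start, end
```

A read whose scores are all 30 is kept whole, whereas a read with a run of very low scores in its interior is truncated at the first window whose mean falls below 10.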
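The two cutoffs used to choose unique sequences for the reference assembly can be illustrated with a small sketch. It assumes that the coverage cutoff is applied within each individual and the second cutoff counts individuals passing it; the text does not state this explicitly, and the function below is a hypothetical stand-in rather than dDocent code.

```python
from collections import Counter, defaultdict


def select_unique_sequences(reads_by_individual, min_cov=4, min_ind=5):
    """Keep sequences seen >= min_cov times in an individual, in >= min_ind individuals.

    reads_by_individual -- individual name -> list of read sequences (strings)
    """
    n_individuals = defaultdict(int)
    for reads in reads_by_individual.values():
        # Count identical reads within this individual.
        for seq, count in Counter(reads).items():
            if count >= min_cov:
                n_individuals[seq] += 1
    return sorted(seq for seq, k in n_individuals.items() if k >= min_ind)
```

Raising either cutoff removes rare sequences, which are more likely to be sequencing errors, at the cost of discarding loci that are genuinely present in only a few individuals.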

SNP filtering
SNP filtering was performed separately for each species. Raw variants were filtered using VCFtools (Danecek et al., 2011) and custom bash scripts. First, variants that had been successfully genotyped in at least 50% of individuals, with a minimum quality score of 20 and a minor allele count of 3, were retained. Next, loci with a minimum depth of less than 5 and a minor allele frequency of less than 1% were removed. Individuals with more than 60% missing data were then removed using the script filter_missing_ind.sh. The script pop_missing_filter.sh (Puritz et al., 2014) was used to remove loci with more than 90% missing data within a population. The next filter was applied to the FreeBayes-generated VCF file using criteria such as site depth, allelic balance at heterozygous sites, proper pairing of reads, and maximum mean depth, as implemented in the dDocent_filters script. Variant calls were then decomposed into SNP and indel calls, and indels were removed using VCFtools to produce a VCF file containing only SNP calls. The Perl script rad_haplotyper.pl (Hollenbeck et al., 2017) was used to filter out possible paralogs, possible low-coverage sites, and previously missing genotypes. Loci with a minor allele frequency (maf) < 0.05 were removed in order to exclude uninformative SNPs. SNPs were then filtered to include only loci with two alleles using VCFtools. The program BAYESCAN (Foll and Gaggiotti, 2008) was used to identify individual outlier loci. The program was run with default settings apart from 30 pilot runs and a thinning interval of 100. Significance of outlier loci was determined using a q-value threshold corresponding to a false discovery rate of 0.05. The outlier loci were excluded, and the remaining neutral SNPs were used for downstream analysis.
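Two of the thresholds above, the proportion of individuals genotyped and the minor allele frequency, can be illustrated on a toy genotype table. This is a minimal sketch and not the VCFtools implementation: genotypes are assumed to be coded as counts of the alternate allele (0, 1, 2) with None for missing calls, and the function names are hypothetical.

```python
def locus_stats(genotypes):
    """Return (call_rate, minor_allele_frequency) for one biallelic locus."""
    called = [g for g in genotypes if g is not None]
    call_rate = len(called) / len(genotypes) if genotypes else 0.0
    if not called:
        return call_rate, 0.0
    # Each diploid individual carries two allele copies.
    alt_freq = sum(called) / (2 * len(called))
    return call_rate, min(alt_freq, 1 - alt_freq)


def filter_loci(matrix, min_call_rate=0.5, min_maf=0.05):
    """Return the names of loci that pass both thresholds, in input order."""
    kept = []
    for name, genotypes in matrix.items():
        call_rate, maf = locus_stats(genotypes)
        if call_rate >= min_call_rate and maf >= min_maf:
            kept.append(name)
    return kept
```

For example, a locus with a single alternate allele among 20 genotyped individuals has a minor allele frequency of 1/40 = 0.025, so it passes a 1% threshold but is removed at the final 0.05 threshold.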

Evaluation of genetic diversity between populations
Population genetic statistics (Wright's inbreeding coefficient FIS, and observed and expected heterozygosity, Ho and He) were calculated for the populations using the divBasic function in the diveRsity package in R (Keenan et al., 2013). The populations program in the Stacks software (Catchen et al. 2011) was used to estimate the number of sites in each population, the percentage of polymorphic sites, and the average frequency of the major allele (P) at those sites. Deviations from Hardy-Weinberg equilibrium were assessed using GENEPOP v4.0 (Rousset 2008).
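The relationship among these statistics can be made concrete with a short sketch. It assumes biallelic genotypes coded 0/1/2 (None for missing), uses Nei's small-sample correction for He, and averages over loci before taking FIS = 1 - Ho/He; these are illustrative choices, and the estimators used internally by divBasic may differ.

```python
def diversity_stats(loci):
    """Return (Ho, He, FIS) averaged over loci for a single population.

    loci -- list of per-locus genotype lists coded 0/1/2, with None for missing
    """
    ho_values, he_values = [], []
    for genotypes in loci:
        called = [g for g in genotypes if g is not None]
        n = len(called)
        if n < 2:
            continue  # too few genotypes to estimate heterozygosity
        p = sum(called) / (2 * n)
        ho_values.append(sum(1 for g in called if g == 1) / n)
        # Expected heterozygosity 2p(1-p), with Nei's correction 2n/(2n-1).
        he_values.append(2 * n / (2 * n - 1) * 2 * p * (1 - p))
    ho = sum(ho_values) / len(ho_values)
    he = sum(he_values) / len(he_values)
    fis = 1 - ho / he if he > 0 else float("nan")
    return ho, he, fis
```

Under this definition, FIS is negative whenever observed heterozygosity exceeds the expected value and positive when heterozygotes are deficient.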
The global estimate of genetic differentiation (FST) across all samples and loci was calculated following Weir and Cockerham (1984) using the adegenet package (Jombart, 2008) in the R statistical environment.
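For readers unfamiliar with the Weir and Cockerham (1984) estimator, the following sketch computes it for biallelic loci from genotypes coded 0/1/2. It is a didactic version that assumes no missing data and at least two individuals per population, and it is not the code path used by the R packages named in the text; the function names are hypothetical.

```python
def wc_components(pops):
    """Variance components (a, b, c) for one biallelic locus.

    pops -- list of genotype lists, one per population, coded 0/1/2
    """
    r = len(pops)
    sizes = [len(g) for g in pops]
    n_bar = sum(sizes) / r
    n_c = (r * n_bar - sum(n * n for n in sizes) / (r * n_bar)) / (r - 1)
    freqs = [sum(g) / (2 * len(g)) for g in pops]
    hets = [sum(1 for x in g if x == 1) / len(g) for g in pops]
    p_bar = sum(n * p for n, p in zip(sizes, freqs)) / (r * n_bar)
    h_bar = sum(n * h for n, h in zip(sizes, hets)) / (r * n_bar)
    s2 = sum(n * (p - p_bar) ** 2 for n, p in zip(sizes, freqs)) / ((r - 1) * n_bar)
    core = p_bar * (1 - p_bar) - (r - 1) / r * s2
    a = n_bar / n_c * (s2 - (core - h_bar / 4) / (n_bar - 1))  # among populations
    b = n_bar / (n_bar - 1) * (core - (2 * n_bar - 1) / (4 * n_bar) * h_bar)  # among individuals
    c = h_bar / 2  # within individuals
    return a, b, c


def wc_fst(loci):
    """Multi-locus estimate: sum of a over loci divided by sum of (a + b + c)."""
    num = den = 0.0
    for pops in loci:
        a, b, c = wc_components(pops)
        num += a
        den += a + b + c
    return num / den
```

Because the estimator subtracts a sampling-variance term, two samples with identical allele frequencies yield a slightly negative value rather than exactly zero. Negative estimates of this kind are conventionally interpreted as no differentiation.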

Evaluation of population structure
A UPGMA dendrogram with bootstrap support, used to visualize the genetic distance between populations, was calculated using the function aboot in the poppr package in R with 1000 bootstrap replicates (Jombart, 2008). Node labels represent bootstrap support greater than 50%. The diffCalc function in the R package diveRsity (Keenan et al., 2013) was used to calculate pairwise FST values for each pair of populations and to assess the significance of genetic differentiation using 95% confidence intervals. Higher-level population structure, that is, individuals nested within populations and populations nested within geographic regions (southern and northern CECAF), was estimated using an Analysis of Molecular Variance (AMOVA) approach (Excoffier et al. 1992) implemented in GenoDive (Meirmans and Van Tienderen, 2004).
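The UPGMA clustering step that underlies the dendrogram can be sketched in a few lines. This is a minimal illustration of the averaging rule only: it does not perform the bootstrap resampling that aboot adds, and the function name and tuple-based tree representation are assumptions made for the example.

```python
def upgma(labels, matrix):
    """Cluster labels by UPGMA; return (nested-tuple tree, root height).

    matrix -- symmetric distance matrix as a list of lists, ordered like labels
    """
    # Active clusters: id -> (subtree, number of leaves)
    clusters = {i: (label, 1) for i, label in enumerate(labels)}
    dist = {(i, j): matrix[i][j]
            for i in range(len(labels)) for j in range(i + 1, len(labels))}
    next_id = len(labels)
    height = 0.0
    while len(clusters) > 1:
        i, j = min(dist, key=dist.get)  # closest pair of clusters
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        height = dist.pop((i, j)) / 2
        for k in clusters:
            d_ik = dist.pop((min(i, k), max(i, k)))
            d_jk = dist.pop((min(j, k), max(j, k)))
            # Size-weighted average distance to the merged cluster.
            dist[(k, next_id)] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree, height
```

When all pairwise distances are nearly equal, as expected for weakly differentiated populations, the order of merges is determined by very small differences, which is one reason bootstrap support is reported for each node.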

Sardinella aurita: de novo assembly and SNP filtering
Samples from six populations of S. aurita (Mauritania, Senegal, Guinea, Ghana, Togo and Benin) were sequenced and genotyped using ddRAD sequencing. Only individuals with at least 500,000 reads were used in the analysis.

Genetic diversity in Sardinella aurita
The average observed (Ho) and expected (He) heterozygosity were similar across sampling sites, with no significant departure from Hardy-Weinberg Equilibrium (HWE). Observed heterozygosity (Ho) values ranged from 0.31 to 0.32 and expected heterozygosity (He) ranged from 0.26 to 0.27 (Table 2.2). The overall inbreeding coefficient (FIS) considering all populations was -0.562, and the mean FIS values ranged from -0.42 to -0.44 (considered as zero). The number of polymorphic loci identified in each location ranged from 5,133 in Togo to 5,675 in Mauritania. Nucleotide diversity was similar across all sampling sites and no private alleles were observed for any of the sampled populations. The variation among populations within the groups was 2%.

Genetic diversity in Sardinella maderensis
Genetic diversity values across the populations surveyed was estimated using the

Population structure in Sardinella maderensis
The overall FST of the full data set was 0.002, and pairwise FST values ranged from -0.005 for the Benin-Togo comparison to 0.016 for the Guinea-Togo comparison (Table 2.6). Genetic differentiation was not significant for any population comparison (P > 0.05). The two geographic groups (northern CECAF and southern CECAF) accounted for 1% of the total observed variation, and the remaining variation among populations was 1%. These results are consistent with the pairwise FST values, which show no genetic differentiation between the populations. The UPGMA dendrogram generated using Nei's genetic distance also shows no apparent structure in the population. As a further test of population structure in this species, the population membership probability for each individual (Figure 2.4) clearly shows a single grouping in which Mauritania, Senegal, Guinea, Benin and Togo appear as one population with strong admixture among them.

DISCUSSION
In this study, our aim was to determine the population genetic structure of S. aurita and S. maderensis across selected localities in the coast of West Africa (Mauritania to Benin) in order to understand spatial patterns of population structure of these two sardinellas. The use of hundreds or thousands of genome-wide polymorphic markers (6,078 for S. aurita and 6,767 for S. maderensis) should allow for the detection of genetic differentiation where inferences from a single or a few marker-based inferences fail (Pukk et al. (2013). This study showed no apparent genetic differentiation and population structure in sardinella species in the coast of West Africa.
The results of this work are not unexpected for highly migratory species such as S.
aurita and S. maderensis, and indicate that no barriers to gene flow exist in the region extending from Mauritania to Benin. Generally, low FST values are observed in highly migratory pelagic fishes such as sardines, sardinellas, and anchovies (Ward et al., 1994). Sukumaran et al., (2017) reported the lack of genetic subdivision between mackerel populations from India suggesting adequate gene flow and panmixia. Homogeneity among populations of Indian oil sardine (sardinella longiceps) populations, as detected using mitochondrial genes, has also been reported (Sukumaran, et al., 2016). Studies performed using a comparable number of markers utilized in this study (106,652 SNPs obtained using RADseq) also found a single panmictic population in the Japanese eel that lives in various environments including fresh, brackish and coastal waters from Japan (Gong et al., 2019). this study were comparable to those found in a previous work by (Chi, 1998) who studied genetic differentiation using allozyme markers in S. aurita from Congo, Ghana, Ivory Coast and Venezuela (pairwise FST between 0 -0.0055). Our survey of 6,767 SNP loci from S. maderensis indicating no genetic differentiation (global FST = 0.001, p<0.01) is also comparable to (French et al., 1995) who previously carried out genetic differentiation of this species using allozyme markers from Senegal, Cote divoire, Ghana and Congo and found out that genetic differentiation (FST = 0.0085) exists for S. maderensis using allozyme markers however regarded this differentiation as to not to be biologically significant. The lack of population structure observed using RAD-SNPs is also consistent with the results of Kinsey et al. (1994) in S. aurita from the Gulf of Mexico to the coast of South Carolina, who studied population structure using allozyme markers and found no evidence of population structure and regarded the population to be in a state of panmixia.
S. aurita and S. maderensis are migratory fish, and because of their pattern of migration (both north-south and inshore-offshore movements), individual stocks may be found in the territorial waters of each of these countries at different stages of their life cycles (Brainerd, 1991). In addition, the current system creates seasonal upwellings, which largely account for the distribution of the fishery resources in the region. The abundance and distribution of these species depend on the variability of coastal upwelling intensity and the associated variations in phytoplankton production (Arístegui et al., 2006). For instance, S. aurita is found in Senegalese waters in May-June, when changes in temperature cause it to migrate to Mauritania. The Ghana-Côte d'Ivoire upwelling causes the species to migrate towards Côte d'Ivoire, Togo and Benin.
The absence of significant differentiation among locations and of differences in allele frequencies among populations suggests either that gene flow between these locations neutralizes the population differentiation that genetic drift would otherwise produce, or that balancing selection is maintained across these locations (Karl and Avise, 1992). According to Palumbi and Baker (1994), pelagic fishes such as sardines are expected to show little population subdivision because of the apparent lack of physical barriers in the marine realm, which favors a high level of egg, larval, and adult dispersal and greatly facilitates extensive gene flow among populations.
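As a rough illustration of how gene flow counteracts drift, Wright's island model relates FST to the number of migrants per generation (Nm) through FST ≈ 1/(4Nm + 1). The Python sketch below applies this textbook approximation. It was not part of the analysis in this study, the function names are hypothetical, and the model's assumptions (equal-sized populations at migration-drift equilibrium, neutral markers) are unlikely to hold exactly for sardinellas, so the output should be read only as an order of magnitude.

```python
def fst_from_nm(nm):
    """Expected FST under Wright's island model for Nm migrants per generation."""
    if nm < 0:
        raise ValueError("Nm must be non-negative")
    return 1.0 / (4.0 * nm + 1.0)


def nm_from_fst(fst):
    """Invert the island-model approximation to get Nm from an FST value.

    Only defined for 0 < FST < 1; zero or negative FST estimates
    (which occur in this study's pairwise comparisons) cannot be converted.
    """
    if not 0.0 < fst < 1.0:
        raise ValueError("FST must lie strictly between 0 and 1")
    return (1.0 / fst - 1.0) / 4.0


# Illustrative input: an overall FST of 0.001, as reported for S. aurita.
nm_example = nm_from_fst(0.001)
print(round(nm_example, 2))
```

Under these assumptions, an FST of 0.001 corresponds to roughly 250 effective migrants per generation, far above the level at which drift alone would be expected to produce differentiation.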
One major concern in this study is the number of samples retained for downstream analysis. Sample numbers in each population (initial target was 20 per population) were limited by the number of individuals for which DNA of sufficient quality for library preparation was obtained. All samples from Guinea were removed from the S. aurita analysis because only seven individuals remained to represent the Guinea population for genotype calling and allele frequency estimation, and small sample sizes can lead to an overestimation of genetic differentiation (Gong et al., 2019). Nevertheless, small sample sizes can still provide reliable estimates of genetic differentiation given that the number of biallelic markers examined is high (n > 1000) (Willing et al., 2012).
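The trade-off between sample size and marker number can be illustrated with a small simulation. The sketch below is a hedged illustration, not the pipeline used in this study: it assumes two samples drawn from one panmictic population (true FST = 0), uses Hudson's FST estimator computed as a ratio of averages across loci (an assumption; the study's own estimator may differ), and all function names and parameter values (seven individuals per sample, 50 versus 5,000 loci, allele frequencies drawn uniformly between 0.05 and 0.95) are hypothetical.

```python
import random


def hudson_fst(freqs1, freqs2, n1, n2):
    """Hudson's FST as a ratio of averages over loci.

    freqs1, freqs2: sample allele frequencies per locus in each population.
    n1, n2: number of sampled allele copies (2 x diploid individuals).
    """
    num = 0.0
    den = 0.0
    for p1, p2 in zip(freqs1, freqs2):
        # Squared frequency difference, corrected for sampling variance.
        num += (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
        den += p1 * (1 - p2) + p2 * (1 - p1)
    return num / den if den > 0 else 0.0


def simulate_panmictic_fst(n_individuals, n_loci, rng):
    """Estimate FST between two samples drawn from the same population."""
    n_copies = 2 * n_individuals
    f1, f2 = [], []
    for _ in range(n_loci):
        p = rng.uniform(0.05, 0.95)  # assumed true allele frequency at this locus
        f1.append(sum(rng.random() < p for _ in range(n_copies)) / n_copies)
        f2.append(sum(rng.random() < p for _ in range(n_copies)) / n_copies)
    return hudson_fst(f1, f2, n_copies, n_copies)


rng = random.Random(42)
fst_few = simulate_panmictic_fst(7, 50, rng)      # few loci: noisy estimate
fst_many = simulate_panmictic_fst(7, 5000, rng)   # many loci: close to zero
print(f"50 loci: {fst_few:.4f}; 5000 loci: {fst_many:.4f}")
```

Because the sampling noise at each locus is averaged over all loci, the estimate from several thousand loci is expected to sit much closer to the true value of zero than the estimate from a few dozen loci, even with only seven individuals per sample.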
Because effective fishery management requires correct quantification of connectivity patterns among stocks, genetic studies are important, as they provide evidence of differentiation within species. The genetic analysis of S. aurita and S. maderensis showed no subdivision among populations sampled from Mauritania to Benin and did not support the division into separate stocks, in which Guinea is treated as a Northern stock, Mauritania and Senegal are classified as Northern CECAF, and Ghana, Togo and Benin as a Western stock. Based on these results, fish in this region should be considered as belonging to a single panmictic population.
Samples were not obtained for the central stock (Nigeria and Cameroon) or the southern stock (Gabon, the Democratic Republic of the Congo, the Congo and Angola), so this study should be expanded to include these regions. Nevertheless, these results will inform decision making on the sustainable management of these species in the Eastern Central Atlantic region and form a baseline for future analysis of these populations.
Moreover, this work has provided sequence information and assemblies for two sardinella species that can be used in the future to further develop other genotyping platforms, through mining of microsatellite markers or panels of informative SNPs.
Population genetic inferences can become more accurate with larger sample sizes, which also increase the precision of allele frequency estimates as representative of the population within a locality. Depending on the number of markers used in future studies, I therefore recommend incorporating larger sample sizes to ensure that the most informative alleles are sampled at frequencies that reflect those in the total population (Hale et al., 2012). Samples collected in different years can also assist in estimating changes in allele frequencies over time (Dunlap, 2017). I also recommend sampling in different years to further ascertain the pattern of differentiation spatially and temporally, because the dynamics of temporal genetic structure may be even more informative than spatial dynamics in marine populations (Hedgecock et al., 2007).
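The gain in precision from larger samples can be made concrete with the binomial standard error of an allele frequency estimate, sqrt(p(1 - p) / 2n) for n diploid individuals. The sketch below is illustrative only: it assumes the 2n allele copies are sampled independently from a randomly mating population, and the function name and the example values (p = 0.5, with seven versus 20 individuals) are hypothetical rather than taken from the data.

```python
import math


def allele_freq_se(p, n_individuals):
    """Binomial standard error of an allele frequency estimated from diploids."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n_individuals <= 0:
        raise ValueError("n_individuals must be positive")
    n_copies = 2 * n_individuals  # each diploid individual carries two allele copies
    return math.sqrt(p * (1.0 - p) / n_copies)


# Compare the smallest retained sample size with the initial target of 20.
for n in (7, 20):
    print(n, round(allele_freq_se(0.5, n), 3))
```

Under these assumptions, moving from seven to 20 individuals reduces the standard error of a frequency near 0.5 from about 0.13 to about 0.08.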

CONCLUSION
This study detected low levels of genetic differentiation and a lack of population structure in S. aurita and S. maderensis from the geographic locations surveyed, suggesting the existence of one population extending across Mauritania, Senegal, Ghana, Togo and Benin, in contrast to the subdivision proposed by the CECAF working group for either species.