Biology see attached

Anonymous
Asked: Oct 22nd, 2018

Unformatted Attachment Preview

Phylogenomics and Coalescent Analyses Resolve Extant Seed Plant Relationships

Zhenxiang Xi1, Joshua S. Rest2, Charles C. Davis1*

1 Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America, 2 Department of Ecology and Evolution, Stony Brook University, Stony Brook, New York, United States of America

Abstract

The extant seed plants include more than 260,000 species that belong to five main lineages: angiosperms, conifers, cycads, Ginkgo, and gnetophytes. Despite tremendous effort using molecular data, phylogenetic relationships among these five lineages remain uncertain. Here, we provide the first broad coalescent-based species tree estimation of seed plants using genome-scale nuclear and plastid data. By incorporating 305 nuclear genes and 47 plastid genes from 14 species, we identify that i) extant gymnosperms (i.e., conifers, cycads, Ginkgo, and gnetophytes) are monophyletic, ii) gnetophytes exhibit discordant placements within conifers between their nuclear and plastid genomes, and iii) cycads plus Ginkgo form a clade that is sister to all remaining extant gymnosperms. We additionally observe that the placement of Ginkgo inferred from coalescent analyses is congruent across different nucleotide rate partitions. In contrast, the standard concatenation method produces strongly supported, but incongruent placements of Ginkgo between slow- and fast-evolving sites. Specifically, fast-evolving sites yield relationships in conflict with coalescent analyses. We hypothesize that this incongruence may be related to the way in which concatenation methods treat sites with elevated nucleotide substitution rates. More empirical and simulation investigations are needed to understand this potential weakness of concatenation methods.

Citation: Xi Z, Rest JS, Davis CC (2013) Phylogenomics and Coalescent Analyses Resolve Extant Seed Plant Relationships. PLoS ONE 8(11): e80870. doi:10.1371/journal.pone.0080870

Editor: Paul V. A. Fine, University of California, Berkeley, United States of America

Received September 4, 2013; Accepted October 15, 2013; Published November 21, 2013

Copyright: © 2013 Xi et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: This study was funded by a grant from the United States National Science Foundation DEB-1120243 to C.C.D. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

* E-mail: cdavis@oeb.harvard.edu

PLOS ONE | www.plosone.org | November 2013 | Volume 8 | Issue 11 | e80870 | Phylogenomics Resolve the Placement of Ginkgo

Introduction

Seed plants originated at least 370 million years ago [1] and include more than 260,000 extant species [2], making them the most species-rich land plant clade. These species are placed in five main lineages: angiosperms, conifers, cycads, Ginkgo, and gnetophytes [3]. By far the greatest species diversity is found in the angiosperms; the remaining four lineages constitute the extant gymnosperms (Figure 1A), meaning “naked seeds”. Today’s gymnosperms are a shadow of their former glory; only ~1,000 species currently exist [2]. Nevertheless, they are of huge ecological and economic importance, especially for their timber and horticultural value.

Despite tremendous efforts to resolve phylogenetic relationships among the five extant seed plant lineages using molecular data, these relationships remain uncertain. For example, early studies identified the monophyly of extant gymnosperms [4-11], but more recent studies using duplicate gene rooting have suggested that cycads are instead more closely related to angiosperms than they are to other extant gymnosperms (Figure 1B) [3,12]. Similarly, the gnetophytes, which were previously thought to be sister to angiosperms based on morphological characters (i.e., the anthophyte hypothesis; [13,14]), are now grouped with other extant gymnosperms using molecular data. Establishing the phylogenetic placement of gnetophytes among extant gymnosperms, however, remains problematic. Recent molecular studies have suggested three conflicting hypotheses of gnetophyte relationships: the gnecup (i.e., gnetophytes sister to cupressophytes; [9,15]), gnepine (i.e., gnetophytes sister to Pinaceae; [7,8,10,16-24]), and gnetifer (i.e., gnetophytes sister to conifers; [5,25]) hypotheses (Figure 1C). In addition, early studies concatenating multiple genes placed Ginkgo alone as sister to conifers and gnetophytes within the extant gymnosperm clade [7-11,16-18,26-28]. However, more recent studies using additional genes have suggested that a clade containing cycads plus Ginkgo cannot be excluded as sister to all remaining extant gymnosperms (Figure 1D) [15,19,21-24,29,30]. In particular, attempts to include data that are less prone to saturation due to high rates of substitution (e.g., amino acid sequences and slow-evolving nucleotide sequences) have led to increasing support for the placement of cycads plus Ginkgo as sister to all remaining extant gymnosperms [15,21,23,24]. For all of these reasons, a broader comparative phylogenomic assessment of these questions is warranted to better understand the evolution of extant seed plants.

Figure 1. Conflicting phylogenetic relationships among extant gymnosperms. (A) The four main lineages of extant gymnosperms: (1) conifers (Pinus resinosa), (2) cycads (Cycas sp.), (3) Ginkgo biloba, and (4) gnetophytes (Ephedra chilensis). (B) Two main hypotheses for phylogenetic relationships of gymnosperms. (C) Three main hypotheses for the phylogenetic placement of gnetophytes. (D) Two main hypotheses for the phylogenetic placement of Ginkgo.
doi: 10.1371/journal.pone.0080870.g001

Advances in next-generation sequencing and computational phylogenomics represent tremendous opportunities for inferring species relationships using hundreds, or even thousands, of genes. Until now the reconstruction of broad seed plant phylogenies from multiple genes has relied almost entirely on concatenation methods [7-11,15-19,21,23,24,29,31-37], in which phylogenies are inferred from a single combined gene matrix [38]. These analyses assume that all genes have the same, or very similar, evolutionary histories. Theoretical and simulation studies, however, have shown that concatenation methods can yield misleading results, especially if gene trees are highly heterogeneous [39-43]. In contrast, recently developed coalescent-based methods estimate the species phylogeny from a collective set of gene trees, which permit different genes to have different evolutionary histories [44-46]. Both theoretical and empirical studies have shown that coalescent methods can better accommodate gene heterogeneity [44-48].

Here, our phylogenomic analyses of 14 species represent the first coalescent-based species tree estimation of seed plants. By incorporating hundreds of nuclear genes as well as a full complement of plastid genes, we also provide a direct comparison of phylogenetic relationships inferred from nuclear and plastid genomes.

Results and Discussion

Taxon and gene sampling of nuclear and plastid genes

Our nuclear gene taxon sampling included 12 species representing all major lineages of extant seed plants (i.e., angiosperms [Amborella trichopoda and Nuphar advena], conifers [Cryptomeria japonica, Picea glauca, Picea sitchensis, Pinus contorta, and Pinus taeda], cycads [Cycas rumphii and Zamia furfuracea], Ginkgo biloba, and gnetophytes [Gnetum gnemon and Welwitschia mirabilis]) [3]. One fern (Adiantum capillus-veneris) and one lycophyte (Selaginella moellendorffii) were included as outgroups (Table 1). Of these 14 species, the coding sequences of Selaginella were obtained from a whole-genome sequencing project, and the rest were from deeply sequenced transcriptomes that each included at least 6,000 assembled unigenes. Using a Markov clustering algorithm [49], the 234,040 protein-coding sequences (sequences with in-frame stop codons or shifted reading frames were excluded prior to clustering) from these 14 species were grouped into 14,215 gene clusters, of which 496 passed our initial criteria for establishing low-copy nuclear genes as described in the Materials and Methods section. Following this initial filter, the average numbers of sequences and species for each gene cluster were ten and eight, respectively. Additionally, of these 496 gene clusters, 305 remained following our paralogue pruning filter (see Materials and Methods), and the average number of species and sites for each gene cluster were nine and 509, respectively (Table S1). The final concatenated nuclear gene matrix included 155,295 nucleotide sites and 37.1% missing data (including gaps and undetermined characters).

Table 1. Data sources of nuclear gene sequences included in our phylogenetic analyses.

Species | Sources | No. of sequences used in clustering | No. of coding sequences used in phylogenetic analyses | Average GC-content
Adiantum capillus-veneris | [50] | 5,724 | 107 | 47.1%
Amborella trichopoda | [51] | 32,987 | 251 | 45.1%
Cryptomeria japonica | [50] | 8,224 | 184 | 44.0%
Cycas rumphii | [50] | 4,211 | 118 | 45.1%
Ginkgo biloba | [50] | 3,739 | 88 | 44.7%
Gnetum gnemon | [50] | 2,016 | 44 | 44.8%
Nuphar advena | [51] | 68,266 | 266 | 48.1%
Picea glauca | [50] | 23,693 | 288 | 44.7%
Picea sitchensis | [50] | 13,298 | 283 | 44.9%
Pinus contorta | [50] | 7,844 | 260 | 44.5%
Pinus taeda | [50] | 28,670 | 271 | 44.8%
Selaginella moellendorffii | [52] | 21,094 | 305 | 54.3%
Welwitschia mirabilis | [50] | 3,170 | 80 | 43.9%
Zamia vazquezii | [51] | 11,104 | 214 | 45.0%

Species with sequenced genome is highlighted in bold in the original table (Selaginella moellendorffii).
doi: 10.1371/journal.pone.0080870.t001

To compare the evolutionary history between nuclear and plastid genomes, we obtained the annotated plastid genomes from 12 seed plants (i.e., angiosperms [Amborella trichopoda and Nuphar advena], conifers [Cryptomeria japonica, Picea abies, Picea morrisonicola, Pinus koraiensis, and Pinus taeda], cycads [Cycas revoluta and Zamia furfuracea], Ginkgo biloba, and gnetophytes [Gnetum parvifolium and Welwitschia mirabilis]), plus one fern (Adiantum capillus-veneris) and one lycophyte (Selaginella moellendorffii) as outgroups (Table 2). These 14 species represent the same taxonomic placeholders as those in our nuclear gene analyses. The 685 protein-coding sequences from the 14 plastid genomes were grouped into 59 gene clusters, of which 47 remained following the filtering criteria described above. The average number of species and sites for these 47 gene clusters were 12 and 1,063, respectively (Table S2). The final concatenated plastid gene matrix included 49,968 nucleotide sites and 14.1% missing data.

Table 2. Data sources of plastid gene sequences included in our phylogenetic analyses.

Species | GenBank accession number | No. of sequences used in phylogenetic analyses | Average GC-content
Adiantum capillus-veneris | NC_004766 | 46 | 42.8%
Amborella trichopoda | NC_005086 | 44 | 40.1%
Cryptomeria japonica | NC_010548 | 46 | 38.0%
Cycas revoluta | NC_020319 | 47 | 40.3%
Ginkgo biloba | NC_016986 | 47 | 40.4%
Gnetum parvifolium | NC_011942 | 33 | 38.6%
Nuphar advena | NC_008788 | 44 | 40.6%
Picea abies | NC_021456 | 36 | 40.7%
Picea morrisonicola | NC_016069 | 35 | 40.7%
Pinus koraiensis | NC_004677 | 36 | 40.5%
Pinus taeda | NC_021440 | 36 | 40.4%
Selaginella moellendorffii | NC_013086 | 47 | 50.8%
Welwitschia mirabilis | NC_010654 | 32 | 37.2%
Zamia furfuracea | JQ770198-JQ770303 | 32 | 41.4%

doi: 10.1371/journal.pone.0080870.t002

Inferring Species Relationships Using Coalescent and Concatenation Methods

Species relationships were first estimated from nucleotide sequences using the recently developed coalescent method: Species Tree Estimation using Average Ranks of Coalescence (STAR) [46]. Since this method is based on summary statistics calculated across all gene trees, a small number of outlier genes that significantly deviate from the coalescent model have relatively little effect on the accurate inference of the species tree [48]. We note that while all plastid genes are generally expected to share the same history, evidence of recombination, heteroplasmy, and incomplete lineage sorting in plastid genomes suggests that this may not always apply (e.g., 53-57). Thus, we additionally analyzed plastid genes using the coalescent method. We compared the results from coalescent analyses of both nuclear and plastid genes with those from concatenation analyses using maximum likelihood (ML) as implemented in RAxML [58]. Statistical confidence was established for both methods using a multilocus bootstrapping approach [59], in which genes were resampled with replacement followed by resampling sites with replacement within each gene.

Our species trees inferred from coalescent and concatenation methods largely agree with each other (Figure 2). Similarly, analyses of nuclear and plastid genes are largely in agreement. All analyses strongly support (≥87 bootstrap percentage [BP]) the monophyly of extant gymnosperms. The lone placement that shows conflict between the nuclear and plastid gene trees is for the gnetophytes (i.e., Gnetum and Welwitschia). Our coalescent and concatenation analyses of nuclear genes support the gnepine hypothesis (i.e., gnetophytes sister to Pinaceae [Picea and Pinus]) with 64 BP and 85 BP, respectively (Figure 2A). In contrast, our coalescent and concatenation analyses of plastid genes support the gnecup hypothesis (i.e., gnetophytes sister to cupressophytes [Cryptomeria]) with 60 BP and 94 BP, respectively (Figure 2B). Moreover, in each of these cases the rival topology is rejected using the approximately unbiased (AU) test [60]: the gnecup placement is rejected for concatenated nuclear gene matrix (p-value = 0.001) and the gnepine placement is rejected for concatenated plastid gene matrix (p-value = 0.001). This conflicting placement between the nuclear and plastid genomes is consistent with previous studies (e.g., 15,19,22), although our study is a direct comparison using a similar set of species for both genomes. These results suggest that the nuclear and plastid genomes of gnetophytes may have distinctly different evolutionary histories.

Figure 2. Species trees inferred from (A) 305 nuclear genes and (B) 47 plastid genes using the coalescent method (STAR). Bootstrap percentages (BPs) from STAR/RAxML are indicated above each branch; an asterisk indicates that the clade is supported by 100 BPs from both STAR and RAxML. Branch lengths were estimated by fitting the concatenated matrices to the inferred topology from STAR.
doi: 10.1371/journal.pone.0080870.g002

An additional well-supported placement we uncovered here relates to cycads and Ginkgo. Our coalescent and concatenation analyses of nuclear genes strongly support (100 BP and 93 BP, respectively) cycads (i.e., Cycas and Zamia) plus Ginkgo as sister to all remaining extant gymnosperms (Figure 2A and see red dots in Figure 1D for clades under consideration). The rival placement of Ginkgo alone as sister to conifers and gnetophytes (i.e., the “Ginkgo alone” hypothesis) is rejected for the concatenated nuclear gene matrix (p-value = 0.004, AU test). In addition, our coalescent analyses of plastid genes similarly support (71 BP) the monophyly of cycads plus Ginkgo (Figure 2B). The concatenation analyses of plastid genes, in contrast, weakly support (56 BP) the “Ginkgo alone” hypothesis.

Because sequences from both cycads and Ginkgo were not present in all 305 nuclear genes, we conducted an additional analysis using only those genes that included both cycads and Ginkgo (sequences from both cycads and Ginkgo were present in all 47 plastid genes; see Table 2). This allows us to test if the phylogenetic placement of Ginkgo inferred from nuclear genes is sensitive to missing data. Although the number of nuclear gene clusters declines to 69 when applying this taxon filter, the results are identical to those above: the coalescent and concatenation analyses strongly support (95 BP and 97 BP, respectively) cycads plus Ginkgo as sister to all remaining extant gymnosperms.

To further investigate if the placement of Ginkgo is sensitive to the number of sampled genes, we randomly subsampled the 305 nuclear genes in four different gene size categories (i.e., 25, 47, 100, or 200 genes; 10 replicates each). We similarly subsampled the 47 plastid genes (i.e., 25 genes with 10 replicates). Even as the sample size declines, the coalescent and concatenation analyses of nuclear genes strongly support (≥80 BP) cycads plus Ginkgo as sister to all remaining extant gymnosperms. Support for this relationship only dropped below 80 BP when the number of subsampled nuclear genes was 25 for the coalescent analyses (Figure 3A). For the 25 subsampled plastid genes, the coalescent analyses also support cycads plus Ginkgo with ≥80 BP. In contrast, concatenation analyses of 25 subsampled plastid genes support the “Ginkgo alone” hypothesis with ≥80 BP (Figure 3A). Thus, our results are robust to the number of genes sampled, including the discordant placements of Ginkgo between coalescent and concatenation analyses of plastid genes.

Figure 3. Summary of bootstrap percentages (BPs) from coalescent and concatenation analyses using different gene subsampling and rate partitions. (A) BPs from coalescent and concatenation analyses using different gene subsampling. The 305 nuclear genes were subsampled for four different gene size categories (i.e., 25, 47, 100, or 200 genes; 10 replicates each), and the 47 plastid genes were subsampled for 25 genes (10 replicates). Cells with hatching indicate that support for the placement of Ginkgo biloba from all replicates is below 80 BP; colored cells indicate relationships that received bootstrap support ≥80 BP from at least one replicate (pink = cycads plus Ginkgo as sister to all remaining extant gymnosperms, yellow = Ginkgo alone as sister to conifers and gnetophytes within extant gymnosperms; see also Figure 1D). (B) BPs from coalescent and concatenation analyses across different nucleotide rate partitions. Parsimony informative sites in concatenated matrices were sorted based on estimated evolutionary rates, and subsequently divided into two equal partitions. The index of substitution saturation (ISS) was used to measure nucleotide substitution saturation for sites within each rate partition. The two critical ISS values, i.e., ISS.C1 and ISS.C2, were estimated using an asymmetrical and symmetrical topology, respectively (for data including more than 32 species, only values estimated from 32 terminals are shown here).
doi: 10.1371/journal.pone.0080870.g003

Accommodating rate heterogeneity in coalescent and concatenation analyses

Despite the fact that our coalescent and concatenation analyses largely agree with each other, we are interested in exploring the influence of nucleotide substitution rates on phylogenetic inference of seed plant relationships. It has long been appreciated that elevated rates of molecular evolution can lead to multiple substitutions ...
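The multilocus bootstrapping approach mentioned in the excerpt above (genes resampled with replacement, followed by resampling sites with replacement within each gene) can be illustrated with a short sketch. This is only a minimal illustration under stated assumptions, not the authors' pipeline: the function name `multilocus_bootstrap`, the dictionary-of-alignments layout, and the toy sequences are all hypothetical and chosen for the example.

```python
import random


def multilocus_bootstrap(genes, rng):
    """Return one two-stage bootstrap replicate of a multi-gene dataset.

    genes: dict mapping gene name -> dict mapping taxon -> aligned sequence
           (all sequences within one gene must have the same length).
    rng:   random.Random instance, so that replicates are reproducible.
    """
    names = sorted(genes)
    # Stage 1: resample genes with replacement (same count as the input).
    sampled_names = [rng.choice(names) for _ in names]
    replicate = []
    for name in sampled_names:
        alignment = genes[name]
        n_sites = len(next(iter(alignment.values())))
        # Stage 2: resample alignment columns with replacement within this gene.
        # The same column indices are applied to every taxon so that each
        # resampled column is still a real column of the original alignment.
        cols = [rng.randrange(n_sites) for _ in range(n_sites)]
        resampled = {
            taxon: "".join(seq[c] for c in cols)
            for taxon, seq in alignment.items()
        }
        replicate.append((name, resampled))
    return replicate


# Toy data: hypothetical sequences, for illustration only.
toy_genes = {
    "geneA": {"Ginkgo": "ACGTAC", "Cycas": "ACGTTC", "Pinus": "ACCTAC"},
    "geneB": {"Ginkgo": "GGTA", "Cycas": "GGTT", "Pinus": "GCTA"},
}

replicate = multilocus_bootstrap(toy_genes, random.Random(0))
for gene_name, alignment in replicate:
    print(gene_name, alignment)
```

In a full analysis, each replicate would then be passed to gene-tree estimation and a species-tree method (or concatenated for maximum likelihood), and the bootstrap percentage of a clade would be the share of replicates that recover it. Those downstream steps are outside this sketch.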

Tutor Answer

LeoProfessor
School: Rice University

Attached.

Running Head: GENETICS, EVOLUTION AND BIODIVERSITY

Genetics, Evolution and Biodiversity (Manuscript)
Student’s Name

Institutional affiliation



Abstract

Research on genetic variation within plant species depends on random DNA markers, but how well
such markers correlate with quantitative variation is contentious. Functional variation in
quantitative characters plays a significant role in determining evolutionary potential and is a
tool for managing genetic resources. Markers are useful for assessing the genetic and
phylogenetic diversity of plants because they target specific genes in plant families. According
to the research preview for the experiment, trees derived from DNA sequences do not give full
insight into the phylogeny of seed plants. The few living lineages represent fewer lineages than
the fossil record documents. The nucleotide data used in molecular analyses support conflicting
hypotheses, so more advanced analysis is crucial. The experiment focuses on an improved way of
sampling gymnosperm diversity that will help in estimating the seed plant phylogeny. The research
uses targeted genes in the extracted DNA to assess the genetic and phylogenetic diversity that
determines the rate of evolution. The laboratory work involved extracting the targeted DNA from
seed samples of plants. A specific region of the extracted DNA was amplified for each sample
using PCR. The amplified samples were run through agarose gel electrophoresis to obtain DNA
sequences, which were used to construct a phylogenetic tree. The tree was based on the bases
visualized in the DNA samples, with the focus on the cycads and Ginkgo.

Introduction

The study of the evolutionary patterns of plants has occupied scientists for years, particularly
in light of environmental change. Understanding evolutionary patterns through the analysis of
genetic components, chiefly DNA, plays a significant role in the conservation of plant species.
Plant biologists have specific goals in their research; the primary goal of evolutionary
scientists is to acquire knowledge of the genetic diversification of plants. Research on the
evolutionary lineages of seed plants that relies on physical characters alone gives limited
information. According to studies, in-situ conservation plays a significant role in genetic
conservation.

The discovery of large phylogenies is crucial to scientists' understanding of plant life.
Macro-evolutionary patterns refer to the conspicuous characters of a plant and can help in
structuring the shape of plant life. According to previous research, phylogenies aid in the
analysis of evolutionary, systematic, and ecological relationships among different plants,
including non-flowering plants such as algae. Although a small experiment can explain the
phylogeny of most seed plants, the limited resources available for such laboratory studies
explain the current scientific gap. The irregular distribution of plant varieties is a challenge
for research on evolutionary relationships (Mathews, 2009). Evolutionary biologists opine that
the diversification of plant species can be isolated on the basis of clade age. According to the
experiments, diversification does not depend on the range of clade ages under examination, which
supports a general conclusion. Scientific studies of phylogenetic relationships provide a forum
for contrasting hypotheses about the evolutionary chain of seed plant species.

Despite molecular phylogenetic reviews of different plant species, the phylogeny of gymnosperms
remains unresolved, and that of Gnetales is still controversial. According to a
phylotranscriptomic study of thirteen families of angiosperms and gymnosperms, cycads plus
Ginkgo is closely related to the remaining gymnosperms in the classification. The rate of
molecular evolution in the angiosperms and Gnetales is similar owing to the similar selective
pressure they underwent during evolution. Present-day scientific research aims at
filli...

Review

Anonymous
Tutor went the extra mile to help me with this essay. Citations were a bit shaky, but I appreciated how well he handled APA style and how willing he was to change it even though I didn't specify. Got a B+, which is believable and acceptable.
