Biology see attached

Anonymous
Asked: Oct 22nd, 2018
$20

Phylogenomics and Coalescent Analyses Resolve Extant Seed Plant Relationships

Zhenxiang Xi1, Joshua S. Rest2, Charles C. Davis1*

1 Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America, 2 Department of Ecology and Evolution, Stony Brook University, Stony Brook, New York, United States of America

Abstract

The extant seed plants include more than 260,000 species that belong to five main lineages: angiosperms, conifers, cycads, Ginkgo, and gnetophytes. Despite tremendous effort using molecular data, phylogenetic relationships among these five lineages remain uncertain. Here, we provide the first broad coalescent-based species tree estimation of seed plants using genome-scale nuclear and plastid data. By incorporating 305 nuclear genes and 47 plastid genes from 14 species, we identify that i) extant gymnosperms (i.e., conifers, cycads, Ginkgo, and gnetophytes) are monophyletic, ii) gnetophytes exhibit discordant placements within conifers between their nuclear and plastid genomes, and iii) cycads plus Ginkgo form a clade that is sister to all remaining extant gymnosperms. We additionally observe that the placement of Ginkgo inferred from coalescent analyses is congruent across different nucleotide rate partitions. In contrast, the standard concatenation method produces strongly supported, but incongruent placements of Ginkgo between slow- and fast-evolving sites. Specifically, fast-evolving sites yield relationships in conflict with coalescent analyses. We hypothesize that this incongruence may be related to the way in which concatenation methods treat sites with elevated nucleotide substitution rates. More empirical and simulation investigations are needed to understand this potential weakness of concatenation methods.

Citation: Xi Z, Rest JS, Davis CC (2013) Phylogenomics and Coalescent Analyses Resolve Extant Seed Plant Relationships. PLoS ONE 8(11): e80870.
doi:10.1371/journal.pone.0080870

Editor: Paul V. A. Fine, University of California, Berkeley, United States of America

Received September 4, 2013; Accepted October 15, 2013; Published November 21, 2013

Copyright: © 2013 Xi et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: This study was funded by a grant from the United States National Science Foundation DEB-1120243 to C.C.D. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

* E-mail: cdavis@oeb.harvard.edu

PLOS ONE | www.plosone.org | November 2013 | Volume 8 | Issue 11 | e80870

Introduction

Seed plants originated at least 370 million years ago [1] and include more than 260,000 extant species [2], making them the most species-rich land plant clade. These species are placed in five main lineages: angiosperms, conifers, cycads, Ginkgo, and gnetophytes [3]. By far the greatest species diversity is found in the angiosperms; the remaining four lineages constitute the extant gymnosperms (Figure 1A), meaning “naked seeds”. Today’s gymnosperms are a shadow of their former glory: only ~1,000 species currently exist [2]. Nevertheless, they are of huge ecological and economic importance, especially for their timber and horticultural value.

Despite tremendous efforts to resolve phylogenetic relationships among the five extant seed plant lineages using molecular data, these relationships remain uncertain. For example, early studies identified the monophyly of extant gymnosperms [4-11], but more recent studies using duplicate gene rooting have suggested that cycads are instead more closely related to angiosperms than they are to other extant gymnosperms (Figure 1B) [3,12]. Similarly, the gnetophytes, which were previously thought to be sister to angiosperms based on morphological characters (i.e., the anthophyte hypothesis; [13,14]), are now grouped with other extant gymnosperms using molecular data. Establishing the phylogenetic placement of gnetophytes among extant gymnosperms, however, remains problematic. Recent molecular studies have suggested three conflicting hypotheses of gnetophyte relationships: the gnecup (i.e., gnetophytes sister to cupressophytes; [9,15]), gnepine (i.e., gnetophytes sister to Pinaceae; [7,8,10,16-24]), and gnetifer (i.e., gnetophytes sister to conifers; [5,25]) hypotheses (Figure 1C). In addition, early studies concatenating multiple genes placed Ginkgo alone as sister to conifers and gnetophytes within the extant gymnosperm clade [7-11,16-18,26-28]. However, more recent studies using additional genes have suggested that a clade containing cycads plus Ginkgo cannot be excluded as sister to all remaining extant gymnosperms (Figure 1D) [15,19,21-24,29,30]. In particular, attempts to include data that are less prone to saturation due to high rates of substitution (e.g., amino acid sequences and slow-evolving nucleotide sequences) have led to increasing support for the placement of cycads plus Ginkgo as sister to all remaining extant gymnosperms [15,21,23,24]. For all of these reasons, a broader comparative phylogenomic assessment of these questions is warranted to better understand the evolution of extant seed plants.

Advances in next-generation sequencing and computational phylogenomics represent tremendous opportunities for inferring species relationships using hundreds, or even thousands, of genes. Until now the reconstruction of broad seed plant phylogenies from multiple genes has relied almost entirely on concatenation methods [7-11,15-19,21,23,24,29,31-37], in which phylogenies are inferred from a single combined gene matrix [38]. These analyses assume that all genes have the same, or very similar, evolutionary histories. Theoretical and simulation studies, however, have shown that concatenation methods can yield misleading results, especially if gene trees are highly heterogeneous [39-43]. In contrast, recently developed coalescent-based methods estimate the species phylogeny from a collective set of gene trees, which permit different genes to have different evolutionary histories [44-46]. Both theoretical and empirical studies have shown that coalescent methods can better accommodate gene heterogeneity [44-48]. Here, our phylogenomic analyses of 14 species represent the first coalescent-based species tree estimation of seed plants. By incorporating hundreds of nuclear genes as well as a full complement of plastid genes, we also provide a direct comparison of phylogenetic relationships inferred from nuclear and plastid genomes.

Figure 1. Conflicting phylogenetic relationships among extant gymnosperms. (A) The four main lineages of extant gymnosperms: (1) conifers (Pinus resinosa), (2) cycads (Cycas sp.), (3) Ginkgo biloba, and (4) gnetophytes (Ephedra chilensis). (B) Two main hypotheses for phylogenetic relationships of gymnosperms. (C) Three main hypotheses for the phylogenetic placement of gnetophytes. (D) Two main hypotheses for the phylogenetic placement of Ginkgo.
doi: 10.1371/journal.pone.0080870.g001

Table 1. Data sources of nuclear gene sequences included in our phylogenetic analyses.
Species | Sources | No. of coding sequences used in clustering | No. of sequences used in phylogenetic analyses | Average GC-content
Adiantum capillus-veneris | [50] | 5,724 | 107 | 47.1%
Amborella trichopoda | [51] | 32,987 | 251 | 45.1%
Cryptomeria japonica | [50] | 8,224 | 184 | 44.0%
Cycas rumphii | [50] | 4,211 | 118 | 45.1%
Ginkgo biloba | [50] | 3,739 | 88 | 44.7%
Gnetum gnemon | [50] | 2,016 | 44 | 44.8%
Nuphar advena | [51] | 68,266 | 266 | 48.1%
Picea glauca | [50] | 23,693 | 288 | 44.7%
Picea sitchensis | [50] | 13,298 | 283 | 44.9%
Pinus contorta | [50] | 7,844 | 260 | 44.5%
Pinus taeda | [50] | 28,670 | 271 | 44.8%
Selaginella moellendorffii* | [52] | 21,094 | 305 | 54.3%
Welwitschia mirabilis | [50] | 3,170 | 80 | 43.9%
Zamia vazquezii | [51] | 11,104 | 214 | 45.0%
* Species with a sequenced genome (highlighted in bold in the original table).
doi: 10.1371/journal.pone.0080870.t001

Results and Discussion

Taxon and gene sampling of nuclear and plastid genes

Our nuclear gene taxon sampling included 12 species representing all major lineages of extant seed plants (i.e., angiosperms [Amborella trichopoda and Nuphar advena], conifers [Cryptomeria japonica, Picea glauca, Picea sitchensis, Pinus contorta, and Pinus taeda], cycads [Cycas rumphii and Zamia furfuracea], Ginkgo biloba, and gnetophytes [Gnetum gnemon and Welwitschia mirabilis]) [3]. One fern (Adiantum capillus-veneris) and one lycophyte (Selaginella moellendorffii) were included as outgroups (Table 1). Of these 14 species, the coding sequences of Selaginella were obtained from a whole-genome sequencing project, and the rest were from deeply sequenced transcriptomes that each included at least 6,000 assembled unigenes. Using a Markov clustering algorithm [49], the 234,040 protein-coding sequences (sequences with in-frame stop codons or shifted reading frames were excluded prior to clustering) from these 14 species were grouped into 14,215 gene clusters, of which 496 passed our initial criteria for establishing low-copy nuclear genes as described in the Materials and Methods section.
Following this initial filter, the average numbers of sequences and species for each gene cluster were ten and eight, respectively. Additionally, of these 496 gene clusters, 305 remained following our paralogue pruning filter (see Materials and Methods), and the average number of species and sites for each gene cluster were nine and 509, respectively (Table S1). The final concatenated nuclear gene matrix included 155,295 nucleotide sites and 37.1% missing data (including gaps and undetermined characters).

To compare the evolutionary history between nuclear and plastid genomes, we obtained the annotated plastid genomes from 12 seed plants (i.e., angiosperms [Amborella trichopoda and Nuphar advena], conifers [Cryptomeria japonica, Picea abies, Picea morrisonicola, Pinus koraiensis, and Pinus taeda], cycads [Cycas revoluta and Zamia furfuracea], Ginkgo biloba, and gnetophytes [Gnetum parvifolium and Welwitschia mirabilis]), plus one fern (Adiantum capillus-veneris) and one lycophyte (Selaginella moellendorffii) as outgroups (Table 2). These 14 species represent the same taxonomic placeholders as those in our nuclear gene analyses. The 685 protein-coding sequences from the 14 plastid genomes were grouped into 59 gene clusters, of which 47 remained following the filtering criteria described above. The average number of species and sites for these 47 gene clusters were 12 and 1,063, respectively (Table S2). The final concatenated plastid gene matrix included 49,968 nucleotide sites and 14.1% missing data.

Table 2. Data sources of plastid gene sequences included in our phylogenetic analyses.
Species | GenBank accession number | No. of sequences used in phylogenetic analyses | Average GC-content
Adiantum capillus-veneris | NC_004766 | 46 | 42.8%
Amborella trichopoda | NC_005086 | 44 | 40.1%
Cryptomeria japonica | NC_010548 | 46 | 38.0%
Cycas revoluta | NC_020319 | 47 | 40.3%
Ginkgo biloba | NC_016986 | 47 | 40.4%
Gnetum parvifolium | NC_011942 | 33 | 38.6%
Nuphar advena | NC_008788 | 44 | 40.6%
Picea abies | NC_021456 | 36 | 40.7%
Picea morrisonicola | NC_016069 | 35 | 40.7%
Pinus koraiensis | NC_004677 | 36 | 40.5%
Pinus taeda | NC_021440 | 36 | 40.4%
Selaginella moellendorffii | NC_013086 | 47 | 50.8%
Welwitschia mirabilis | NC_010654 | 32 | 37.2%
Zamia furfuracea | JQ770198-JQ770303 | 32 | 41.4%
doi: 10.1371/journal.pone.0080870.t002

Inferring Species Relationships Using Coalescent and Concatenation Methods

Species relationships were first estimated from nucleotide sequences using the recently developed coalescent method: Species Tree Estimation using Average Ranks of Coalescence (STAR) [46]. Since this method is based on summary statistics calculated across all gene trees, a small number of outlier genes that significantly deviate from the coalescent model have relatively little effect on the accurate inference of the species tree [48]. We note that while all plastid genes are generally expected to share the same history, evidence of recombination, heteroplasmy, and incomplete lineage sorting in plastid genomes suggests that this may not always apply (e.g., [53-57]). Thus, we additionally analyzed plastid genes using the coalescent method. We compared the results from coalescent analyses of both nuclear and plastid genes with those from concatenation analyses using maximum likelihood (ML) as implemented in RAxML [58]. Statistical confidence was established for both methods using a multilocus bootstrapping approach [59], in which genes were resampled with replacement followed by resampling sites with replacement within each gene.

Our species trees inferred from coalescent and concatenation methods largely agree with each other (Figure 2). Similarly, analyses of nuclear and plastid genes are largely in agreement. All analyses strongly support (≥87 bootstrap percentage [BP]) the monophyly of extant gymnosperms. The lone placement that shows conflict between the nuclear and plastid gene trees is for the gnetophytes (i.e., Gnetum and Welwitschia). Our coalescent and concatenation analyses of nuclear genes support the gnepine hypothesis (i.e., gnetophytes sister to Pinaceae [Picea and Pinus]) with 64 BP and 85 BP, respectively (Figure 2A). In contrast, our coalescent and concatenation analyses of plastid genes support the gnecup hypothesis (i.e., gnetophytes sister to cupressophytes [Cryptomeria]) with 60 BP and 94 BP, respectively (Figure 2B). Moreover, in each of these cases the rival topology is rejected using the approximately unbiased (AU) test [60]: the gnecup placement is rejected for the concatenated nuclear gene matrix (p-value = 0.001) and the gnepine placement is rejected for the concatenated plastid gene matrix (p-value = 0.001). This conflicting placement between the nuclear and plastid genomes is consistent with previous studies (e.g., [15,19,22]), although our study is a direct comparison using a similar set of species for both genomes. These results suggest that the nuclear and plastid genomes of gnetophytes may have distinctly different evolutionary histories.

An additional well-supported placement we uncovered here relates to cycads and Ginkgo. Our coalescent and concatenation analyses of nuclear genes strongly support (100 BP and 93 BP, respectively) cycads (i.e., Cycas and Zamia) plus Ginkgo as sister to all remaining extant gymnosperms (Figure 2A and see red dots in Figure 1D for clades under consideration). The rival placement of Ginkgo alone as sister to conifers and gnetophytes (i.e., the “Ginkgo alone” hypothesis) is rejected for the concatenated nuclear gene matrix (p-value = 0.004, AU test). In addition, our coalescent analyses of plastid genes similarly support (71 BP) the monophyly of cycads plus Ginkgo (Figure 2B). The concatenation analyses of plastid genes, in contrast, weakly support (56 BP) the “Ginkgo alone” hypothesis.

Because sequences from both cycads and Ginkgo were not present in all 305 nuclear genes, we conducted an additional analysis using only those genes that included both cycads and Ginkgo (sequences from both cycads and Ginkgo were present in all 47 plastid genes; see Table 2). This allows us to test if the phylogenetic placement of Ginkgo inferred from nuclear genes is sensitive to missing data. Although the number of nuclear gene clusters declines to 69 when applying this taxon filter, the results are identical to those above: the coalescent and concatenation analyses strongly support (95 BP and 97 BP, respectively) cycads plus Ginkgo as sister to all remaining extant gymnosperms.

To further investigate if the placement of Ginkgo is sensitive to the number of sampled genes, we randomly subsampled the 305 nuclear genes in four different gene size categories (i.e., 25, 47, 100, or 200 genes; 10 replicates each). We similarly subsampled the 47 plastid genes (i.e., 25 genes with 10 replicates). Even as the sample size declines, the coalescent and concatenation analyses of nuclear genes strongly support (≥80 BP) cycads plus Ginkgo as sister to all remaining extant gymnosperms. Support for this relationship only dropped below 80 BP when the number of subsampled nuclear genes was 25 for the coalescent analyses (Figure 3A). For the 25 subsampled plastid genes, the coalescent analyses also support cycads plus Ginkgo with ≥80 BP. In contrast, concatenation analyses of 25 subsampled plastid genes support the “Ginkgo alone” hypothesis with ≥80 BP (Figure 3A). Thus, our results are robust to the number of genes sampled, including the discordant placements of Ginkgo between coalescent and concatenation analyses of plastid genes.

Figure 2. Species trees inferred from (A) 305 nuclear genes and (B) 47 plastid genes using the coalescent method (STAR). Bootstrap percentages (BPs) from STAR/RAxML are indicated above each branch; an asterisk indicates that the clade is supported by 100 BPs from both STAR and RAxML. Branch lengths were estimated by fitting the concatenated matrices to the inferred topology from STAR.
doi: 10.1371/journal.pone.0080870.g002

Figure 3. Summary of bootstrap percentages (BPs) from coalescent and concatenation analyses using different gene subsampling and rate partitions. (A) BPs from coalescent and concatenation analyses using different gene subsampling. The 305 nuclear genes were subsampled for four different gene size categories (i.e., 25, 47, 100, or 200 genes; 10 replicates each), and the 47 plastid genes were subsampled for 25 genes (10 replicates). Cells with hatching indicate that support for the placement of Ginkgo biloba from all replicates is below 80 BP; colored cells indicate relationships that received bootstrap support ≥80 BP from at least one replicate (pink = cycads plus Ginkgo as sister to all remaining extant gymnosperms, yellow = Ginkgo alone as sister to conifers and gnetophytes within extant gymnosperms; see also Figure 1D). (B) BPs from coalescent and concatenation analyses across different nucleotide rate partitions. Parsimony informative sites in concatenated matrices were sorted based on estimated evolutionary rates, and subsequently divided into two equal partitions. The index of substitution saturation (ISS) was used to measure nucleotide substitution saturation for sites within each rate partition. The two critical ISS values, i.e., ISS.C1 and ISS.C2, were estimated using an asymmetrical and symmetrical topology, respectively (for data including more than 32 species, only values estimated from 32 terminals are shown here).
doi: 10.1371/journal.pone.0080870.g003

Accommodating rate heterogeneity in coalescent and concatenation analyses

Despite the fact that our coalescent and concatenation analyses largely agree with each other, we are interested in exploring the influence of nucleotide substitution rates on phylogenetic inference of seed plant relationships. It has long been appreciated that elevated rates of molecular evolution can lead to multiple substitutions at the same site [61,62], which can be especially misleading for resolving deeper relationships if the substitution model fails to correct for high levels of saturation in fast-evolving sites [24,62-68]. This is especially relevant for inferring the phylogeny of early diverging gymnosperms given their ancient origin [69-72]. Here, to assess the effect of rate heterogeneity, we partitioned nucleotide sites in our concatenated matrices according to estimated evolutionary rates.
The relative evolutionary rate of each site in our concatenated matrices was estimated using the Observed Variability (OV) method [62], which compares all sequences at a given site in a pair-wise manner, and uses the total number of mismatches between species as the measure of site variability. Importantly, since the OV is a tree-independent approach, it is free from the systematic bias of estimating evolutionary rates using an inaccurate phylogeny [62]. We sorted all parsimony informative sites in our concatenated nucleotide matrices based on their relative evolutionary rates and then divided them into two equal partitions (Figures S1A and S1B). For nuclear genes each rate partition contains 25,647 sites, and for plastid genes each partition contains 8,369 sites. When analyzing data from each rate partition separately, the coalescent method supports (≥76 BP) cycads plus Ginkgo as sister to all remaining extant gymnosperms across all rate partitions for both nuclear and plastid genes (Figure 3B). In contrast, the concatenation method produces well-supported but incongruent results across different rate partitions (Figure 3B). Here, the slow-evolving sites corroborate results from our coalescent analyses and place cycads sister to Ginkgo with 100 BP for both nuclear and plastid genes. However, fast-evolving sites support the “Ginkgo alone” hypothesis with 82 BP and 99 BP for nuclear and plastid genes, respectively. Additionally, when the placement of cycads plus Ginkgo is inferred using the concatenation method, the rival placement of “Ginkgo alone” is rejected (p-value < 0.001, AU test). Similarly, in all cases when “Ginkgo alone” is supported, the rival placement of cycads plus Ginkgo is rejected (p-value < 0.001, AU test).
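The rate partitioning described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names (ov_score, is_parsimony_informative, split_by_rate) are hypothetical, and two choices are assumptions rather than details given in the text. Gaps and ambiguous characters are skipped, and the mismatch count is divided by the number of compared pairs so that sites with missing data remain comparable.

```python
from itertools import combinations

VALID = set("ACGT")  # assumption: gaps and ambiguity codes are ignored


def ov_score(column):
    """Fraction of mismatching sequence pairs in one alignment column."""
    bases = [b for b in column.upper() if b in VALID]
    pairs = list(combinations(bases, 2))
    if not pairs:
        return 0.0
    mismatches = sum(1 for a, b in pairs if a != b)
    # Assumption: normalise by the number of compared pairs.
    return mismatches / len(pairs)


def is_parsimony_informative(column):
    """True if at least two states each occur in at least two sequences."""
    counts = {}
    for b in column.upper():
        if b in VALID:
            counts[b] = counts.get(b, 0) + 1
    return sum(1 for c in counts.values() if c >= 2) >= 2


def split_by_rate(alignment):
    """Return (slow_sites, fast_sites) as lists of column indices.

    `alignment` is a list of equal-length aligned sequences. Only
    parsimony informative columns are ranked; if their number is odd,
    the extra column goes to the fast partition.
    """
    if not alignment:
        return [], []
    n_sites = len(alignment[0])
    columns = ["".join(seq[i] for seq in alignment) for i in range(n_sites)]
    informative = [i for i, col in enumerate(columns)
                   if is_parsimony_informative(col)]
    # Sort by OV score; ties are broken by column index for reproducibility.
    ranked = sorted(informative, key=lambda i: (ov_score(columns[i]), i))
    half = len(ranked) // 2
    return sorted(ranked[:half]), sorted(ranked[half:])
```

Calling split_by_rate on a concatenated matrix gives two lists of column indices that could then be written out as separate alignments for the per-partition analyses. Because the score depends only on pairwise comparisons within a column, no tree is needed at any point, which is the property the text emphasises.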
To determine if nucleotide substitution saturation might influence the incongruent placements of Ginkgo in our concatenation analyses, we characterized sites within each of our rate partitions using an entropy-based index of substitution saturation (ISS) [73]. As ISS approaches 1, or if ISS is not smaller than the critical ISS value (ISS.C), then sequences are determined to exhibit substantial saturation [73]. Our analyses demonstrate that for plastid genes (Figure 3B), the slow-evolving sites exhibit no evidence of saturation (i.e., ISS is significantly smaller than ISS.C; p-value < 0.001, two-tailed t-test), while the fast-evolving sites show evidence of substantial saturation (i.e., ISS is greater than ISS.C when the true topology is asymmetrical). In contrast, our analyses indicate that all rate partitions for nuclear genes show evidence of substantial saturation, but the slow-evolving sites exhibit lower overall levels of saturation (Figure 3B). Thus, the nuclear and plastid genes together suggest that the incongruence we observe in the placement of Ginkgo across rate partitions using the concatenation method may be related to higher overall levels of substitution saturation in fast-evolving nucleotide sites. Further exploration of this question is warranted.

Finally, since previous studies have established the importance of taxon sampling in determining the placement of Ginkgo [15], we re-analyzed three concatenated nucleotide matrices from previous studies to confirm that our results are not biased by insufficient taxon sampling. These three matrices include a wide breadth of taxon and gene sampling: i) 16 seed plants using 52 plastid genes from Zhong et al. [24], ii) 64 vascular plants using 53 plastid genes from Wu et al. [15], and iii) 193 green plants using six genes representing all three plant genomic compartments (i.e., nucleus, plastid, and mitochondrion) from Qiu et al. [29].
Our phylogenetic analyses of these three matrices mirror the results using the concatenation method summarized above. When including only those slow-evolving sites identified by the OV method (Figures S1C–S1E), the clade containing cycads plus Ginkgo is well supported (≥82 BP; Figure 3B). In contrast, analyzing only the fast-evolving sites supports (≥78 BP) the “Ginkgo alone” hypothesis (Figure 3B). Importantly, the slow-evolving sites in all three matrices exhibit no evidence of saturation (p-value < 0.001, two-tailed t-test), while the fast-evolving sites in two of three matrices show evidence of substantial saturation (Figure 3B).

Conclusions

Our phylogenomic analyses of seed plants identify three main results: i) extant gymnosperms are monophyletic, ii) gnetophytes exhibit discordant placements within conifers between their nuclear and plastid genomes, and iii) cycads plus Ginkgo form a clade that is sister to all remaining extant gymnosperms. Our results also show that standard concatenation analyses of both nuclear and plastid genes produce well-supported but conflicting placements of key taxa across sites with different substitution rates. Determining the causes of this incongruence, however, requires more empirical and simulation studies. Here, we hypothesize that this incongruence may be related to the way in which concatenation methods treat sites with elevated nucleotide substitution rates. Although our concatenation analyses of fast-evolving nucleotide sites produced the “Ginkgo alone” topology, the signal from slow-evolving sites appears to have prevailed. Thus, we did not observe strongly conflicting placements of Ginkgo between coalescent and concatenation methods when analyzing all sites together. One interpretation of these results is that concatenation analyses of full data sets may not be heavily misled by a subset of sites with elevated substitution rates.
However, an extrapolation of our specific results suggests that as saturated sites increase in phylogenomic data sets, standard concatenation methods may produce strongly supported but incorrect results. In contrast, coalescent analyses of the same data sets demonstrated consistent placement of cycads plus Ginkgo, suggesting that coalescent-based methods better deal with rate heterogeneity [44-48].

How does this increased phylogenetic resolution enhance our understanding of seed plant evolution? Cycads and Ginkgo share a number of morphological characters, such as their unusual pattern of pollen tube development [74], flagellated male gametes [75,76], simple female strobili [77], and embryo development [78]. In light of the increasing support of cycads plus Ginkgo we identify here, some of these traits, which have been commonly thought to be symplesiomorphies of gymnosperms [13,78], may actually represent synapomorphies of the cycads plus Ginkgo clade [15]. Assessing these questions going forward will be challenging, however, given the phenomenally high rate of extinction suffered by gymnosperms [79]. A thoughtful answer to this question will likely require more exhaustive sampling of fossil lineages.

Materials and Methods

Data acquisition and sequence translation

Gene sequences from both nuclear and plastid genomes were gathered for this study. For nuclear genes, assembled unique transcripts were obtained (Table 1) and then translated to amino acid sequences using prot4EST v2.2 [80]. For plastid genes, the fully annotated plastid genomes were obtained from NCBI GenBank (Table 2).

Homology Assignment and Sequence Alignment

The establishment of sequence homology for phylogenetic analyses followed Dunn et al. [81] and Hejnol et al. [82]. Briefly, sequence similarity was first assessed for all amino acid sequences using BLASTP v2.2.25 [83] with a 10^-20 e-value threshold, and sequences were then grouped using a Markov cluster algorithm as implemented in MCL v09-308 [49] with an inflation value of 5.0. Clusters were required to i) include at least one sequence from Selaginella (for outgroup rooting), ii) include sequences from at least four species, iii) include at least 100 amino acids for each sequence [84], iv) have a mean of less than five sequences per species, and v) have a median of less than two sequences per species. Amino acid sequences from each cluster were aligned using MUSCLE v3.8.31 [85], and ambiguous sites were trimmed using trimAl v1.2rev59 [86] with the heuristic automated method. Sequences were removed from the alignment if they contained less than 70% of the total alignment length [87]. Nucleotide sequences were then aligned according to the corresponding amino acid alignments using PAL2NAL v14 [88]. For each cluster, the gene tree was inferred from nucleotide alignments using RAxML v7.2.8 with the GTRGAMMA substitution model. All but one sequence were deleted in clades of sequences derived from the same species, i.e., monophyly masking, using Phyutility v2.2.6 [89].

Paralogue pruning and species tree assessment

Paralogue pruning of each gene tree used for species tree assessment followed Hejnol et al. [82]. Briefly, we first identified the maximally inclusive subtree that contains no more than one sequence per species. This subtree is then pruned away and the remaining tree is used as a substrate for another round of pruning. The process is repeated until the remaining tree has no more than one sequence per species. Subtrees produced by paralogue pruning were then filtered to include only those with i) seven or more species and ii) 60% of the species present in the original cluster from which they were derived.

For the coalescent approach, individual gene trees were first inferred from nucleotide sequences using RAxML with the GTRGAMMA substitution model; species relationships were then estimated from gene trees using STAR as implemented in Phybase v1.3 [90]. For concatenation analyses, the concatenated nucleotide matrix was generated from individual genes using Phyutility, and the best-scoring ML tree was obtained using RAxML with the GTRGAMMA substitution model. Bootstrap support was estimated for both coalescent and concatenation methods using a multilocus bootstrap approach as described in the Results and Discussion section with 200 replicates. Alternative topology tests were performed in the ML framework using the AU test as implemented in scaleboot v0.3-3 [91]. All constrained searches were conducted in RAxML using the GTRGAMMA substitution model.

Gene subsampling

To subsample gene clusters, the 305 nuclear gene clusters were randomly selected for the sizes of 25, 47, 100, and 200 genes, and the 47 plastid gene clusters were randomly selected for the size of 25 genes. Ten sets of gene clusters were selected as replicates for each size. Species trees and bootstrap support were estimated using STAR and RAxML for each replicate as described above.

Estimation of evolutionary rate and substitution saturation assessment

The OV method was used to measure the relative evolutionary rate of each site in all five concatenated matrices (Figure 3B) as described in the Results and Discussion section. Species trees and bootstrap supports were estimated using STAR and RAxML for each rate partition as described above. Nucleotide substitution saturation was measured using ISS as implemented in DAMBE [92]. ISS was estimated for each rate partition from 200 replicates with gaps treated as unknown states.

Supporting Information

Figure S1. The estimated evolutionary rates for nucleotide sites in all five concatenated matrices analyzed in this study. Parsimony informative sites in each concatenated matrix were sorted based on the Observed Variability (OV) method, and subsequently divided into two equal partitions. (PDF)

Table S1. Data characteristics for all 305 nuclear genes, including the locus ID of sequence from Selaginella moellendorffii in each gene, number of species per gene, number of nucleotide sites per gene, and percentage of gaps per gene. (PDF)

Table S2. Data characteristics for all 47 plastid genes, including number of species per gene, number of nucleotide sites per gene, and percentage of gaps per gene. (PDF)

Acknowledgements

We thank Dannie Durand, Andrew Knoll, and members of the Davis, Durand, and Rest laboratories for advice and discussion. We also thank Casey Dunn, Mike Ethier, and Alexandros Stamatakis for technical support.

Author Contributions

Conceived and designed the experiments: ZX JSR CCD. Performed the experiments: ZX. Analyzed the data: ZX JSR CCD. Wrote the manuscript: ZX JSR CCD.

References

1. Rothwell GW, Scheckler SE, Gillespie WH (1989) Elkinsia gen. nov., a late Devonian gymnosperm with cupulate ovules. Bot Gaz 150: 170-189. doi:10.1086/337763.
2. Fiz-Palacios O, Schneider H, Heinrichs J, Savolainen V (2011) Diversification of land plants: insights from a family-level phylogenetic analysis. BMC Evol Biol 11: 341. doi:10.1186/1471-2148-11-341. PubMed: 22103931.
3. Mathews S (2009) Phylogenetic relationships among seed plants: persistent questions and the limits of molecular data. Am J Bot 96: 228-236. doi:10.3732/ajb.0800178. PubMed: 21628186.
4. Goremykin V, Bobrova V, Pahnke J, Troitsky A, Antonov A et al.
(1996) Noncoding sequences from the slowly evolving chloroplast inverted repeat in addition to rbcL data do not support Gnetalean affinities of angiosperms. Mol Biol Evol 13: 383-396. doi:10.1093/oxfordjournals.molbev.a025597. PubMed: 8587503.
5. Chaw SM, Zharkikh A, Sung HM, Lau TC, Li WH (1997) Molecular phylogeny of extant gymnosperms and seed plant evolution: analysis of nuclear 18S rRNA sequences. Mol Biol Evol 14: 56-68. doi:10.1093/oxfordjournals.molbev.a025702. PubMed: 9000754.
6. Samigullin TK, Martin WF, Troitsky AV, Antonov AS (1999) Molecular data from the chloroplast rpoC1 gene suggest deep and distinct dichotomy of contemporary spermatophytes into two monophyla: gymnosperms (including Gnetales) and angiosperms. J Mol Evol 49: 310-315. doi:10.1007/PL00006553. PubMed: 10473771.
7. Bowe LM, Coat G, dePamphilis CW (2000) Phylogeny of seed plants based on all three genomic compartments: extant gymnosperms are monophyletic and Gnetales' closest relatives are conifers. Proc Natl Acad Sci U S A 97: 4092-4097. doi:10.1073/pnas.97.8.4092. PubMed: 10760278.
8. Chaw SM, Parkinson CL, Cheng YC, Vincent TM, Palmer JD (2000) Seed plant phylogeny inferred from all three plant genomes: monophyly of extant gymnosperms and origin of Gnetales from conifers. Proc Natl Acad Sci U S A 97: 4086-4091. doi:10.1073/pnas.97.8.4086. PubMed: 10760277.
9. Nickrent DL, Parkinson CL, Palmer JD, Duff RJ (2000) Multigene phylogeny of land plants with special reference to bryophytes and the earliest land plants. Mol Biol Evol 17: 1885-1895. doi:10.1093/oxfordjournals.molbev.a026290. PubMed: 11110905.
10. Gugerli F, Sperisen C, Büchler U, Brunner L, Brodbeck S et al. (2001) The evolutionary split of Pinaceae from other conifers: evidence from an intron loss and a multigene phylogeny. Mol Phylogenet Evol 21: 167-175. doi:10.1006/mpev.2001.1004. PubMed: 11697913.
11. Soltis DE, Soltis PS, Zanis MJ (2002) Phylogeny of seed plants based on evidence from eight genes. Am J Bot 89: 1670-1681. doi:10.3732/ajb.89.10.1670. PubMed: 21665594.
12. Mathews S, Clements MD, Beilstein MA (2010) A duplicate gene rooting of seed plants and the phylogenetic position of flowering plants. Philos Trans R Soc Lond B Biol Sci 365: 383-395. doi:10.1098/rstb.2009.0233. PubMed: 20047866.
13. Crane PR (1985) Phylogenetic analysis of seed plants and the origin of angiosperms. Ann Missouri Bot Gard 72: 716-793. doi:10.2307/2399221.
14. Doyle JA, Donoghue MJ (1986) Seed plant phylogeny and the origin of angiosperms: an experimental cladistic approach. Bot Rev 52: 321-431. doi:10.1007/BF02861082.
15. Wu CS, Chaw SM, Huang YY (2013) Chloroplast phylogenomics indicates that Ginkgo biloba is sister to cycads. Genome Biol Evol 5: 243-254. doi:10.1093/gbe/evt001. PubMed: 23315384.
16. Burleigh JG, Mathews S (2004) Phylogenetic signal in nucleotide data from seed plants: implications for resolving the seed plant tree of life. Am J Bot 91: 1599-1613. doi:10.3732/ajb.91.10.1599. PubMed: 21652311.
17. Hajibabaei M, Xia JN, Drouin G (2006) Seed plant phylogeny: gnetophytes are derived conifers and a sister group to Pinaceae. Mol Phylogenet Evol 40: 208-217. doi:10.1016/j.ympev.2006.03.006. PubMed: 16621615.
18. Qiu YL, Li LB, Wang B, Chen ZD, Dombrovska O et al. (2007) A nonflowering land plant phylogeny inferred from nucleotide sequences of seven chloroplast, mitochondrial, and nuclear genes. Int J Plant Sci 168: 691-708. doi:10.1086/513474.
19. Finet C, Timme RE, Delwiche CF, Marlétaz F (2010) Multigene phylogeny of the green lineage reveals the origin and diversification of land plants. Curr Biol 20: 2217-2222. doi:10.1016/j.cub.2010.11.035. PubMed: 21145743.
20. Regina TMR, Quagliariello C (2010) Lineage-specific group II intron gains and losses of the mitochondrial rps3 gene in gymnosperms. Plant Physiol Biochem 48: 646-654. doi:10.1016/j.plaphy.2010.05.003. PubMed: 20605476.
21. Zhong B, Yonezawa T, Zhong Y, Hasegawa M (2010) The position of gnetales among seed plants: overcoming pitfalls of chloroplast phylogenomics. Mol Biol Evol 27: 2855-2863. doi:10.1093/molbev/msq170. PubMed: 20601411.
22. Wodniok S, Brinkmann H, Glöckner G, Heidel AJ, Philippe H et al. (2011) Origin of land plants: do conjugating green algae hold the key? BMC Evol Biol 11: 104. doi:10.1186/1471-2148-11-104. PubMed: 21501468.
23. Wu CS, Wang YN, Hsu CY, Lin CP, Chaw SM (2011) Loss of different inverted repeat copies from the chloroplast genomes of Pinaceae and cupressophytes and influence of heterotachy on the evaluation of gymnosperm phylogeny. Genome Biol Evol 3: 1284-1295. doi:10.1093/gbe/evr095. PubMed: 21933779.
24. Zhong B, Deusch O, Goremykin VV, Penny D, Biggs PJ et al. (2011) Systematic error in seed plant phylogenomics. Genome Biol Evol 3: 1340-1348. doi:10.1093/gbe/evr105. PubMed: 22016337.
25. Ran JH, Gao H, Wang XQ (2010) Fast evolution of the retroprocessed mitochondrial rps3 gene in Conifer II and further evidence for the phylogeny of gymnosperms. Mol Phylogenet Evol 54: 136-149. doi:10.1016/j.ympev.2009.09.011. PubMed: 19761858.
26. Qiu YL, Lee JH, Bernasconi-Quadroni F, Soltis DE, Soltis PS et al. (1999) The earliest angiosperms: evidence from mitochondrial, plastid and nuclear genomes. Nature 402: 404-407. doi:10.1038/46536. PubMed: 10586879.
27. Qiu YL, Lee J, Bernasconi-Quadroni F, Soltis DE, Soltis PS et al. (2000) Phylogeny of basal angiosperms: analyses of five genes from three genomes. Int J Plant Sci 161: S3-S27. doi:10.1086/317584.
28. Qiu YL, Li LB, Hendry TA, Li RQ, Taylor DW et al. (2006) Reconstructing the basal angiosperm phylogeny: evaluating information content of mitochondrial genes. Taxon 55: 837-856. doi:10.2307/25065680.
29. Qiu YL, Li LB, Wang B, Chen ZD, Knoop V et al. (2006) The deepest divergences in land plants inferred from phylogenomic evidence. Proc Natl Acad Sci U S A 103: 15511-15516. doi:10.1073/pnas.0603335103. PubMed: 17030812.
30. Wu CS, Wang YN, Liu SM, Chaw SM (2007) Chloroplast genome (cpDNA) of Cycas taitungensis and 56 cp protein-coding genes of Gnetum parvifolium: Insights into cpDNA evolution and phylogeny of extant seed plants. Mol Biol Evol 24: 1366-1379. doi:10.1093/molbev/msm059. PubMed: 17383970.
31. Rydin C, Kallersjo M, Friist EM (2002) Seed plant relationships and the systematic position of Gnetales based on nuclear and chloroplast DNA: conflicting data, rooting problems, and the monophyly of conifers. Int J Plant Sci 163: 197-214. doi:10.1086/338321.
32. Burleigh JG, Mathews S (2007) Assessing among-locus variation in the inference of seed plant phylogeny. Int J Plant Sci 168: 111-124. doi:10.1086/509586.
33. Rai HS, Reeves PA, Peakall R, Olmstead RG, Graham SW (2008) Inference of higher-order conifer relationships from a multi-locus plastid data set. Botany 86: 658-669. doi:10.1139/B08-062.
34. de la Torre-Bárcena JE, Kolokotronis SO, Lee EK, Stevenson DW, Brenner ED et al. (2009) The impact of outgroup choice and missing data on major seed plant phylogenetics using genome-wide EST data. PLOS ONE 4: e5764. doi:10.1371/journal.pone.0005764. PubMed: 19503618.
35. Graham SW, Iles WJD (2009) Different gymnosperm outgroups have (mostly) congruent signal regarding the root of flowering plant phylogeny. Am J Bot 96: 216-227. doi:10.3732/ajb.0800320. PubMed: 21628185.
36. Cibrián-Jaramillo A, De la Torre-Bárcena JE, Lee EK, Katari MS, Little DP et al. (2010) Using phylogenomic patterns and gene ontology to identify proteins of importance in plant evolution. Genome Biol Evol 2: 225-239. doi:10.1093/gbe/evq012. PubMed: 20624728.
37. Lee EK, Cibrian-Jaramillo A, Kolokotronis SO, Katari MS, Stamatakis A et al. (2011) A functional phylogenomic view of the seed plants. PLOS Genet 7: e1002411.
38. Huelsenbeck JP, Bull JJ, Cunningham CW (1996) Combining data in phylogenetic analysis. Trends Ecol Evol 11: 152-158. doi:10.1016/0169-5347(96)10006-9. PubMed: 21237790.
39. Mossel E, Vigoda E (2005) Phylogenetic MCMC algorithms are misleading on mixtures of trees. Science 309: 2207-2209. doi:10.1126/science.1115493. PubMed: 16195459.
40. Degnan JH, Rosenberg NA (2006) Discordance of species trees with their most likely gene trees. PLoS Genet 2: e68. doi:10.1371/journal.pgen.0020068. PubMed: 16733550.
41. Kubatko LS, Degnan JH (2007) Inconsistency of phylogenetic estimates from concatenated data under coalescence. Syst Biol 56: 17-24. doi:10.1080/10635150601146041. PubMed: 17366134.
42. Rosenberg NA, Tao R (2008) Discordance of species trees with their most likely gene trees: the case of five taxa. Syst Biol 57: 131-140. doi:10.1080/10635150801905535. PubMed: 18300026.
43. Liu L, Edwards SV (2009) Phylogenetic analysis in the anomaly zone. Syst Biol 58: 452-460. doi:10.1093/sysbio/syp034. PubMed: 20525599.
44. Liu L, Pearl DK, Brumfield RT, Edwards SV (2008) Estimating species trees using multiple-allele DNA sequence data. Evolution 62: 2080-2091. doi:10.1111/j.1558-5646.2008.00414.x. PubMed: 18462214.
45. Degnan JH, Rosenberg NA (2009) Gene tree discordance, phylogenetic inference and the multispecies coalescent. Trends Ecol Evol 24: 332-340. doi:10.1016/j.tree.2009.01.009. PubMed: 19307040.
46. Liu L, Yu L, Pearl DK, Edwards SV (2009) Estimating species phylogenies using coalescence times among sequences. Syst Biol 58: 468-477. doi:10.1093/sysbio/syp031. PubMed: 20525601.
47. Liu L, Yu L, Edwards SV (2010) A maximum pseudo-likelihood approach for estimating species trees under the coalescent model. BMC Evol Biol 10: 302. doi:10.1186/1471-2148-10-302. PubMed: 20937096.
48. Song S, Liu L, Edwards SV, Wu S (2012) Resolving conflict in eutherian mammal phylogeny using phylogenomics and the multispecies coalescent model. Proc Natl Acad Sci U S A 109: 14942-14947. doi:10.1073/pnas.1211733109. PubMed: 22930817.
49. Enright AJ, van Dongen S, Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 30: 1575-1584. doi:10.1093/nar/30.7.1575. PubMed: 11917018.
50. Duvick J, Fu A, Muppirala U, Sabharwal M, Wilkerson MD et al. (2008) PlantGDB: a resource for comparative plant genomics. Nucleic Acids Res 36: D959-D965. PubMed: 18063570.
51. Jiao Y, Wickett NJ, Ayyampalayam S, Chanderbali AS, Landherr L et al. (2011) Ancestral polyploidy in seed plants and angiosperms. Nature 473: 97-100. doi:10.1038/nature09916. PubMed: 21478875.
52. Banks JA, Nishiyama T, Hasebe M, Bowman JL, Gribskov M et al. (2011) The Selaginella genome identifies genetic changes associated with the evolution of vascular plants. Science 332: 960-963. doi:10.1126/science.1203810. PubMed: 21551031.
53. Medgyesy P, Fejes E, Maliga P (1985) Interspecific chloroplast recombination in a Nicotiana somatic hybrid. Proc Natl Acad Sci U S A 82: 6960-6964. doi:10.1073/pnas.82.20.6960. PubMed: 16593619.
54. Ogihara Y, Terachi T, Sasakuma T (1988) Intramolecular recombination of chloroplast genome mediated by short direct-repeat sequences in wheat species. Proc Natl Acad Sci U S A 85: 8573-8577. doi:10.1073/pnas.85.22.8573. PubMed: 3186748.
55. Rajora OP, Dancik BP (1995) Chloroplast DNA variation in Populus. III. Novel chloroplast DNA variants in natural Populus × canadensis hybrids. Theor Appl Genet 90: 331-334. PubMed: 24173921.
56. Wolfe AD, Randle CP (2004) Recombination, heteroplasmy, haplotype polymorphism, and paralogy in plastid genes: Implications for plant molecular systematics. Syst Bot 29: 1011-1020. doi:10.1600/0363644042451008.
57. Jakob SS, Blattner FR (2006) A chloroplast genealogy of Hordeum (Poaceae): long-term persisting haplotypes, incomplete lineage sorting, regional extinction, and the consequences for phylogenetic inference. Mol Biol Evol 23: 1602-1612. doi:10.1093/molbev/msl018. PubMed: 16754643.
58. Stamatakis A (2006) RAxML-VI-HPC: Maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22: 2688-2690. doi:10.1093/bioinformatics/btl446. PubMed: 16928733.
59. Seo TK (2008) Calculating bootstrap probabilities of phylogeny using multilocus sequence data. Mol Biol Evol 25: 960-971. doi:10.1093/molbev/msn043. PubMed: 18281270.
60. Shimodaira H (2002) An approximately unbiased test of phylogenetic tree selection. Syst Biol 51: 492-508. doi:10.1080/10635150290069913. PubMed: 12079646.
61. Olsen GJ (1987) Earliest phylogenetic branchings: comparing rRNA-based evolutionary trees inferred with various techniques. Cold Spring Harb Symp Quant Biol 52: 825-837. doi:10.1101/SQB.1987.052.01.090. PubMed: 3454291.
62. Goremykin VV, Nikiforova SV, Bininda-Emonds OR (2010) Automated removal of noisy data in phylogenomic analyses. J Mol Evol 71: 319-331. doi:10.1007/s00239-010-9398-z. PubMed: 20976444.
63. Brinkmann H, Philippe H (1999) Archaea sister group of bacteria? Indications from tree reconstruction artifacts in ancient phylogenies. Mol Biol Evol 16: 817-825. doi:10.1093/oxfordjournals.molbev.a026166. PubMed: 10368959.
64. Hirt RP, Logsdon JM, Healy B, Dorey MW, Doolittle WF et al. (1999) Microsporidia are related to Fungi: evidence from the largest subunit of RNA polymerase II and other proteins. Proc Natl Acad Sci U S A 96: 580-585. doi:10.1073/pnas.96.2.580. PubMed: 9892676.
65. Philippe H, Lopez P, Brinkmann H, Budin K, Germot A et al. (2000) Early-branching or fast-evolving eukaryotes? An answer based on slowly evolving positions. Proc Biol Sci 267: 1213-1221. doi:10.1098/rspb.2000.1130. PubMed: 10902687.
66. Gribaldo S, Philippe H (2002) Ancient phylogenetic relationships. Theor Popul Biol 61: 391-408. doi:10.1006/tpbi.2002.1593. PubMed: 12167360.
67. Pisani D (2004) Identifying and removing fast-evolving sites using compatibility analysis: an example from the arthropoda. Syst Biol 53: 978-989. doi:10.1080/10635150490888877. PubMed: 15764565.
68. Philippe H, Roure B (2011) Difficult phylogenetic questions: more data, maybe; better methods, certainly. BMC Biol 9: 91. doi:10.1186/1741-7007-9-91. PubMed: 22206462.
69. Schneider H, Schuettpelz E, Pryer KM, Cranfill R, Magallón S et al. (2004) Ferns diversified in the shadow of angiosperms. Nature 428: 553-557. doi:10.1038/nature02361. PubMed: 15058303.
70. Smith SA, Beaulieu JM, Donoghue MJ (2010) An uncorrelated relaxed-clock analysis suggests an earlier origin for flowering plants. Proc Natl Acad Sci U S A 107: 5897-5902. doi:10.1073/pnas.1001225107. PubMed: 20304790.
71. Clarke JT, Warnock RCM, Donoghue PCJ (2011) Establishing a timescale for plant evolution. New Phytol 192: 266-301. doi:10.1111/j.1469-8137.2011.03794.x. PubMed: 21729086.
72. Magallón S, Hilu KW, Quandt D (2013) Land plant evolutionary timeline: gene effects are secondary to fossil constraints in relaxed clock estimation of age and substitution rates. Am J Bot 100: 556-573. doi:10.3732/ajb.1200416. PubMed: 23445823.
73. Xia X, Xie Z, Salemi M, Chen L, Wang Y (2003) An index of substitution saturation and its application. Mol Phylogenet Evol 26: 1-7. doi:10.1016/S1055-7903(02)00326-3. PubMed: 12470932.
74. Friedman WE (1993) The evolutionary history of the seed plant male gametophyte. Trends Ecol Evol 8: 15-21. doi:10.1016/0169-5347(93)90125-9. PubMed: 21236093.
75. Brenner ED, Stevenson DW, Twigg RW (2003) Cycads: evolutionary innovations and the role of plant-derived neurotoxins. Trends Plant Sci 8: 446-452. doi:10.1016/S1360-1385(03)00190-0. PubMed: 13678912.
76. Norstog KJ, Gifford EM, Stevenson DW (2004) Comparative development of the spermatozoids of cycads and Ginkgo biloba. Bot Rev 70: 5-15. doi:10.1663/0006-8101(2004)070[0005:CDOTSO]2.0.CO;2.
77. Rudall PJ, Bateman RM (2010) Defining the limits of flowers: the challenge of distinguishing between the evolutionary products of simple versus compound strobili. Philos Trans R Soc Lond B Biol Sci 365: 397-409. doi:10.1098/rstb.2009.0234. PubMed: 20047867.
78. Wang L, Wang D, Lin MM, Lu Y, Jiang XX et al. (2011) An embryological study and systematic significance of the primitive gymnosperm Ginkgo biloba. J Syst Evol 49: 353-361. doi:10.1111/j.1759-6831.2011.00123.x.
79. Crisp MD, Cook LG (2011) Cenozoic extinctions account for the low diversity of extant gymnosperms compared with angiosperms. New Phytol 192: 997-1009. doi:10.1111/j.1469-8137.2011.03862.x. PubMed: 21895664.
80. Wasmuth JD, Blaxter ML (2004) prot4EST: translating expressed sequence tags from neglected genomes. BMC Bioinformatics 5: 187. doi:10.1186/1471-2105-5-187. PubMed: 15571632.
81. Dunn CW, Hejnol A, Matus DQ, Pang K, Browne WE et al. (2008) Broad phylogenomic sampling improves resolution of the animal tree of life. Nature 452: 745-749. doi:10.1038/nature06614. PubMed: 18322464.
82. Hejnol A, Obst M, Stamatakis A, Ott M, Rouse GW et al. (2009) Assessing the root of bilaterian animals with scalable phylogenomic methods. Proc Biol Sci 276: 4261-4270. doi:10.1098/rspb.2009.0896. PubMed: 19759036.
83. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215: 403-410. doi:10.1016/S0022-2836(05)80360-2. PubMed: 2231712.
84. Liu QP, Xue QZ (2005) Comparative studies on codon usage pattern of chloroplasts and their host nuclear genes in four plant species. J Genet 84: 55-62. doi:10.1007/BF02715890. PubMed: 15876584.
85. Edgar RC (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32: 1792-1797. doi:10.1093/nar/gkh340. PubMed: 15034147.
86. Capella-Gutiérrez S, Silla-Martínez JM, Gabaldón T (2009) trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25: 1972-1973. doi:10.1093/bioinformatics/btp348. PubMed: 19505945.
87. Jiao Y, Leebens-Mack J, Ayyampalayam S, Bowers JE, McKain MR et al. (2012) A genome triplication associated with early diversification of the core eudicots. Genome Biol 13: R3. doi:10.1186/gb-2012-13-1-r3. PubMed: 22280555.
88. Suyama M, Torrents D, Bork P (2006) PAL2NAL: robust conversion of protein sequence alignments into the corresponding codon alignments. Nucleic Acids Res 34: W609-W612. doi:10.1093/nar/gkl315. PubMed: 16845082.
89. Smith SA, Dunn CW (2008) Phyutility: a phyloinformatics tool for trees, alignments and molecular data. Bioinformatics 24: 715-716. doi:10.1093/bioinformatics/btm619. PubMed: 18227120.
90. Liu L, Yu L (2010) Phybase: an R package for species tree analysis. Bioinformatics 26: 962-963. doi:10.1093/bioinformatics/btq062. PubMed: 20156990.
91. Shimodaira H (2008) Testing regions with nonsmooth boundaries via multiscale bootstrap. J Stat Plan Infer 138: 1227-1241. doi:10.1016/j.jspi.2007.04.001.
92. Xia X, Xie Z (2001) DAMBE: software package for data analysis in molecular biology and evolution. J Hered 92: 371-373. doi:10.1093/jhered/92.4.371. PubMed: 11535656.
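The cluster-selection criteria i) to v) and the 70% length rule given in the Methods can be illustrated with a short Python sketch. This is not the authors' pipeline. The data structures (a dict mapping species names to lists of amino acid sequences, and a dict mapping sequence names to aligned strings), the function names, and the outgroup label are assumptions made for illustration. The 70% rule is read here as "non-gap characters divided by alignment length", which is only one possible interpretation.

```python
from statistics import mean, median

# Assumed label for the outgroup; the Methods only say "Selaginella".
OUTGROUP = "Selaginella"


def passes_cluster_filters(cluster, outgroup=OUTGROUP):
    """Return True if a cluster meets criteria i) to v).

    `cluster` is a hypothetical structure: a dict mapping a species name
    to the list of amino acid sequences (plain strings) from that species.
    """
    # Number of sequences per species, ignoring species with no sequences.
    counts = [len(seqs) for seqs in cluster.values() if seqs]
    if not counts:
        return False
    return (
        bool(cluster.get(outgroup))              # i) outgroup sequence present
        and len(counts) >= 4                     # ii) at least four species
        and all(len(s) >= 100                    # iii) >= 100 amino acids each
                for seqs in cluster.values() for s in seqs)
        and mean(counts) < 5                     # iv) mean < 5 sequences/species
        and median(counts) < 2                   # v) median < 2 sequences/species
    )


def drop_short_sequences(alignment, min_fraction=0.7):
    """Drop aligned sequences covering less than `min_fraction` of the alignment.

    `alignment` maps a sequence name to an aligned string with '-' for gaps.
    Coverage is taken as non-gap characters / alignment length (an assumption).
    """
    if not alignment:
        return {}
    length = len(next(iter(alignment.values())))
    return {
        name: seq
        for name, seq in alignment.items()
        if (len(seq) - seq.count("-")) >= min_fraction * length
    }
```

In a workflow of the kind described, a check like `passes_cluster_filters` would run before alignment and `drop_short_sequences` after trimming; the alignment, trimming, and tree inference steps themselves are left to the external tools named in the Methods.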
THE FLINDERS UNIVERSITY OF SOUTH AUSTRALIA
COLLEGE OF SCIENCE AND ENGINEERING

BIOL 2702 Genetics, Evolution and Biodiversity
General Information and Laboratory Manual 2018

Image from Tree of Life: http://tolweb.org/tree/phylogeny.html

BIOL 2702 – GENETICS EVOLUTION AND BIODIVERSITY – 2018

Table of Contents

GENERAL INFORMATION
AIMS OF THE TOPIC
EXPECTED LEARNING OUTCOMES
STRUCTURE OF THE TOPIC
STAFF RESPONSIBLE FOR THE TOPIC
GETTING STARTED
  Attending Tutorials
  What if I miss a tutorial?
  Attending workshops
  What if I miss a workshop?
  Attending Practicals and CMLs
  How do I prepare for practicals and CMLs?
  What do I need to bring to practicals and CMLs?
  What if I miss a practical or CML?
TUTORIAL AND WORKSHOP TIMETABLE
PRACTICAL TIMETABLE
TEXT AND REFERENCE BOOKS
LABORATORY BOOKS AND REPORTS
  Guidelines for the Preparation of the Annotated Bibliography
  Guidelines for the Preparation of the Introduction Assignment
  Guidelines for the Preparation of Scientific Manuscript
EXPERIMENTS
  Experiment 1 (Plant Stream)
  Experiment 2 (Reptile Stream)
  Manual for the Molecular Phylogenetic Analysis Component of BIOL2702
APPENDICES
  How to use a Pipetman
  Undergraduate Laboratory Safety Procedures

GENERAL INFORMATION

AIMS OF THE TOPIC

The BIOL2702 topic, Genetics, Evolution and Biodiversity, aims to introduce the flow of genetic information from generation to generation and the formation of population and species structure, and to provide a basis for understanding current biodiversity and how it has been shaped by the forces of evolution. The topic aims to be vertically integrated and so provide students with an understanding of how events and processes at the level of the cell and the gene affect and influence what happens at the level of a population and an ecosystem (and vice versa). The practical class will emphasise this vertical structure by following a single gene sequence from a variety of organisms to build a phylogenetic tree. The practical class also provides the opportunity to introduce generic practical skills in laboratory-based biology.

EXPECTED LEARNING OUTCOMES

Genetics underpins virtually all of Biology. This topic provides students with a firm background in all the major areas of genetics, ranging from heredity, to Mendelian and population genetics, through to speciation genetics and ultimately phylogenetics.
Theory:
• Understand principles of basic Mendelian and population genetics
• Understand principles of evolution
• Understand the principles of natural and man-made selective forces that shape species characteristics and population structure
• Understand the broad principles and theories of speciation
• Understand how genetic data can be used to explore evolutionary history and the Tree of Life
• Understand ecological processes influencing biodiversity and the importance of biodiversity for ecological functions
• Understand the complexity of causes affecting biodiversity on the planet

Practical:
• Basic pipetting and DNA extraction
• Setting up a PCR and agarose gel electrophoresis
• Basic understanding of sampling and storage of field material for genetic purposes
• Interpretation of DNA sequence information and bioinformatic tools of phylogeny

Students are also expected to have learned the following scientific skills:
1. how to maintain a laboratory notebook
2. how to use a set of Pipetman pipettes
3. how to use a spectrophotometer
4. how to perform a PCR
5. how to load and run an agarose gel
6. how to use basic sequence editing and analysis programs
7. how to build and interpret a phylogenetic tree
8. how to develop a laboratory protocol
9. how to present data in tables and figures in the format used in scientific journal articles

Students are also expected to have learned the following generic skills:
1. how to work independently and also in a team
2. how to solve problems by applying existing knowledge to new situations
3. how to communicate effectively (written communication)

STRUCTURE OF THE TOPIC

• 1 x 50 minute tutorial per week
• 6 x 50 minute workshops per semester (run fortnightly)
• 6 x 3-hour practicals/CMLs per semester (run fortnightly)

STAFF RESPONSIBLE FOR THE TOPIC

Topic Coordinators:
A/Prof Peter Anderson (PAA), rm303B, peter.anderson@flinders.edu.au
A/Prof Mike Gardner (MG), rm161, michael.gardner@flinders.edu.au
Dr Masha Smallhorn (MNS), rm034, masha.smallhorn@flinders.edu.au

Tutorial Instructors:
A/Prof Peter Anderson (PAA)
A/Prof Mike Gardner (MG)
Dr Michael Lee (ML)
Dr Masha Smallhorn (MNS)
Kailah Thorn (KT)

Demonstrators:
Debbie Charter
Carmel Maher
Robert O’Reilly
Melissa Oxley
Chloe Thompson-Peach

Workshop tutors:
A/Prof Peter Anderson (PAA)
A/Prof Mike Gardner (MG)
Dr Masha Smallhorn (MNS)

GETTING STARTED

Teaching philosophy
• Our teaching philosophy is student-centred and student-driven. We have designed our face-to-face teaching activities to be interactive. In these activities we provide an opportunity to apply knowledge gained through the online resources and to extend understanding of the content. At least two members of the teaching team are present during all face-to-face activities to answer questions and help consolidate learning.

Attending Tutorials
• We strongly recommend that you attend tutorials. These are interactive sessions in which students solve exam-style questions in groups. Attendance will be recorded. You should prepare by watching and reading the relevant resources on FLO and completing the relevant pre-tutorial quiz. The pre-tutorial quizzes count towards final topic assessment. The tutorial paper will be uploaded to FLO each week following the repeat tutorial.

What if I miss a tutorial?
• If you miss a tutorial, some resources can be obtained from the FLO site for this topic at https://flo.flinders.edu.au/.
You are advised to contact the relevant tutor for assistance with the material.

Attending workshops
• Attendance at workshops is not compulsory but it is strongly recommended.

What if I miss a workshop?
• If you miss a workshop, some resources can be obtained from the FLO site for this topic at https://flo.flinders.edu.au/

Attending Practicals and CMLs
• Attendance at practicals and CMLs is compulsory. Students should register themselves in a practical class by going to: https://stuadmin.flinders.edu.au/Student/login.aspx?ReturnUrl=%2fstudent%2fdefault.aspx
• Students should attend the same practical class and CML for the whole semester.

How do I prepare for practicals and CMLs?
• Read the relevant practical and CML notes
• View the relevant resources on FLO
• Complete the relevant Pre-Lab quiz

Note: Successful completion of the relevant Pre-Lab quiz is compulsory for attendance at practicals and CMLs. A score of at least 4/5 must be achieved. You have unlimited attempts at the quiz.

What do I need to bring to practicals and CMLs?
• The topic manual, a laboratory notebook and a USB drive
• A blue or black ink pen, a fine water-resistant marker pen, a ruler and a calculator
• A laboratory coat*, closed-in shoes, not sandals and certainly not thongs or ballet flats
• Further details in the appendices
*Laboratory coat is not required for CMLs

What if I miss a practical or CML?
• Practical and CML sessions are compulsory. If you cannot attend your usual practical/CML class because you are sick or have a personal circumstance that prevents you from attending, you should email masha.smallhorn@flinders.edu.au (with BIOL2702 somewhere in the subject heading). Please indicate in the email which alternate session you can attend. STUDENTS WHO MISS MORE THAN ONE PRACTICAL/CML CLASS WILL BE ADVISED TO WITHDRAW FROM THE TOPIC.
TUTORIAL AND WORKSHOP TIMETABLE

Tutorials: Tuesday 1:00pm South Lecture Theatre 2 OR Wednesday 9:00am North Lecture Theatre 1
Workshops (odd weeks only): Thursday 10:00am OR 2:00pm, South Lecture Theatre 2

WEEK   FACILITATOR     CONTENT
1      MNS; PAA        Introduction to topic
1      MNS (PAA)       WORKSHOP 1: Cell division in action
2      PAA (MNS)       Source of variation; mutation and epigenetics
3      PAA (MNS)       Changes in chromosome structure and number
3      MNS (PAA)       WORKSHOP 2: Pedigree interpretation
4      MNS (PAA)       Segregation analysis
5      MNS (PAA)       Diagnostic genetics for human disease
5      MNS (PAA)       WORKSHOP 3: Extensions to diagnostic genetics for human disease
6      ML (KT)         Fundamentals of Molecular Phylogeny
7      ML (KT)         Bayesian analysis
7      MNS (PAA)       WORKSHOP 4: Q & A session on writing an introduction
8      ML (KT)         Tempo and Mode of Evolution and inferring evolutionary history
MID-SEMESTER BREAK
9      MG (MNS)        Forms of selection and genetic drift
9      MNS (MG)        WORKSHOP 5: How to write a scientific manuscript
10     MG (MNS)        Population genetics
11     MNS, MG, PAA    Current and future direction in genetics
11     MG (MNS)        WORKSHOP 6: How to interpret your phylogenetic tree
12     MNS, MG; PAA    Revision Session

**Pre-tutorial quiz weeks 1-11
***Pre-lab quiz weeks 2, 4, 6, 8, 10
MG: Mike Gardner, ML: Michael Lee, PAA: Peter Anderson, MNS: Masha Smallhorn; KT: Kailah Thorn

PRACTICAL TIMETABLE

• Practical sessions are scheduled for weeks 2, 4, 6, 8, 10, 12. Students must attend the allocated 3-hour practical session in each of these weeks. Attendance at the practicals is compulsory.
WEEK   PRACTICAL TITLE
1      No practical
2      DNA extraction
3      No practical
4      PCR
5      No practical
6      Gel electrophoresis of PCR products
7      No practical
8      Sequence editing, multiple sequence alignment and sequence identification
MID-SEMESTER BREAK
9      No practical
10     Construction of phylogenetic tree
11     No practical
12     Construction of phylogenetic tree, preparation of report

Practicals and CMLs are held in the Biology Discovery Centre on Level 1, lab 4.

Practicals – Weeks 2, 4, 6

Group  Day        Time           Room               Demonstrator    Stream
1      Tuesday    14:00 – 17:00  BDB Level1 lab 4   Masha, Carmel   Reptile
2      Wednesday  10:00 – 13:00  BDB Level1 lab 4   Debbie, Robert  Plant
3      Thursday   10:00 – 13:00  BDB Level1 lab 4   Masha, Chloe    Plant
4      Thursday   14:00 – 17:00  BDB Level1 lab 4   Masha, Chloe    Reptile

CMLs – Weeks 8, 10, 12

Group  Day        Time           Room               Demonstrator             Stream
1      Tuesday    14:00 – 17:00  BDB Level1 lab 4   Masha, Carmel, Melissa   Reptile
2      Wednesday  10:00 – 13:00  BDB Level1 lab 4   Debbie, Robert, Masha    Plant
3      Thursday   10:00 – 13:00  BDB Level1 lab 4   Masha, Chloe, Robert     Plant
4      Thursday   14:00 – 17:00  BDB Level1 lab 4   Masha, Chloe, Carmel     Reptile

TEXT AND REFERENCE BOOKS

Recommended text book:
Griffiths, A. J. F., Wessler, S. R., Carroll, S. B. and Doebley, J. (2015). Introduction to Genetic Analysis, 11th edition, New York: W. H. Freeman and Company.

Recommended reference books:
• Freeman, S. and Herron, J. C. (2006). Evolutionary Analysis, 4th edition. Upper Saddle River: Pearson Prentice Hall.
• Rose, M. R. and Mueller, L. D. (2006). Evolution and Ecology of the Organism. Prentice Hall.
• Hall, B. G. (2011). Phylogenetic Trees Made Easy: A how to manual, 4th edition, Sunderland, USA.
• Cargill, M. and P. O’Connor (2009). Writing Scientific Research Articles. Wiley-Blackwell: Singapore.
• Higgs, P. G. and Attwood, T. K. (2004). Bioinformatics and Molecular Evolution, Blackwell Publishing: Oxford, UK.
• Russell, P. J. (2002). iGenetics: A Mendelian Approach.
San Francisco: Pearson/Benjamin Cummings.

LABORATORY BOOKS AND REPORTS

As a research scientist, your laboratory book is the most important way in which you record your experimental procedures and the data you collect. If someone wanted to repeat the experiment, would they be able to follow the procedures as outlined in your lab book? It is also a legal document, so if there are patentable findings that are contested in court, your lab book could be used as evidence. Therefore, in a professional setting, your lab book should be page numbered and signed every day by your supervisor. It is expected that all students get their book signed before leaving the practical.

After the experiments are conducted and the data collected, there are two major ways in which the data are presented: as a scientific manuscript submitted to a journal, or as a presentation at a scientific conference. However, we only allow a written form of presentation, as this is a more important approach for your training as a scientist. Details of this are presented below.

Although you work in pairs for the practical exercises, each individual must keep their own laboratory notebook, and scientific papers must be individually prepared. There will be the opportunity to ask questions about the requirements for laboratory notebooks and laboratory reports in the first tutorial.

The basic format and guidelines for laboratory notebooks

• A record book with a permanent binding and the use of permanent ink is required.
• Record the facts as observed, and record them at the time of observation. It is unacceptable to rewrite the practical after you finish.
• It is acceptable (within this topic) to cut and paste hard copy versions of the methods described in this FLO site. We suggest you tick off each step when you have done it. This is important as you may miss a step during a procedure.
Please note that the laboratory notebook must be up to date BEFORE you leave the laboratory or CML. If you don’t have a hard copy of your methods you will need to write them by hand.
• If you make a written mistake in your notebook, simply draw a single line through the appropriate words and initial the entry. Don't attempt to cover up information or written errors. In most companies and institutions it is mandatory to be accountable for all information in a notebook and thereby avoid fraudulent record keeping (i.e. no liquid paper cover-ups or post-experimental alterations of results or methods). Add a brief explanatory note if needed to explain the change (e.g. calculation error, machine fault).
• Record page numbers and always date and initial entries. This will allow cross referencing of protocols carried out over a number of days and is essential when experiments have multiple procedures carried out concurrently.
• Create an index using page numbers, dates, and 4-10 word summaries of methods used.
• Use abbreviations where convenient, but create an up-to-date list of abbreviations in the back of your notebook as you introduce each one.
• Information should be well organised so that it can be retrieved by anyone reading your notebook. Be brief, but comprehensive. Don't editorialise or write a padded story-book style account.
• Record the unusual as well as the anticipated. This makes troubleshooting any problems possible and easier.
• Record any conclusions you draw from your data, including any working (e.g. graphs and calculations). You may work on this aspect of your notebook outside the laboratory.
• The basic format for experiments includes: a title; a statement of aims; a description of materials and methods (highlighting any changes from printed materials); a listing of observations and recording of results and associated calculations; and a brief statement of conclusions.
Guidelines for the Preparation of the Annotated Bibliography

**Note: This assessment task is an NGP. To receive the NGP you must meet all of the assessment criteria listed below.

Assessment Criteria
• The Annotated Bibliography must be saved as a PDF file. Upload the document onto the BIOL2702 FLO site.
• The research question you have chosen should be included at the top of the annotated bibliography.
• The annotated bibliography should contain six peer reviewed references.
• Each peer reviewed reference should be referenced correctly using Harvard Style (see document on FLO on Harvard Style).
• Each peer reviewed reference should include a short paragraph of at least four sentences below the reference briefly explaining why the reference is relevant to the research question.
• Each peer reviewed reference should be cited in text correctly using Harvard Style in the short paragraph below each reference.

Guidelines for the Preparation of the Introduction Assignment

• An assessment rubric can be found on the BIOL2702 FLO site.
• The Introduction Assignment must be saved as a PDF file.
• Turnitin statement: you are required to submit a draft of your assignment to the DRAFT Turnitin folder. A Turnitin reflection statement should be included at the top of your assignment. See FLO for further information.
• The final copy of your assignment should be submitted to the Introduction Assignment dropbox on FLO.

Assessment Criteria
• The Introduction Assignment includes background information that concisely discusses the relevant previous literature.
• The importance of the area of study is highlighted.
• Aims of the current study are outlined and the hypothesis is clearly stated.
• Formal scientific language is used.
• Statements are supported by peer-reviewed literature. At least eight peer reviewed references should be included.
• The word limit is 600 words (+/- 10%) not including references or in-text citations.
Guidelines for the Preparation of the Scientific Manuscript

• An assessment rubric can be found on the BIOL2702 FLO site.
• The Scientific Manuscript must be saved as a PDF file.
• Turnitin statement: you are required to submit a draft of your assignment to the DRAFT Turnitin folder. A Turnitin reflection statement should be included at the top of your assignment. See FLO for further information.
• Feedback statement: you are required to write a response to the feedback provided in the Introduction assignment. See FLO for further information.
• The final copy of your assignment should be submitted to the Scientific Manuscript dropbox on FLO.

Assessment Criteria:
• The following headings should be used: Abstract, Introduction, Methods, Results, Discussion, Conclusion, References. Discussion and Conclusion can be included under one heading.
• The length of each paper must not exceed 10 typed A4 pages, inclusive of all diagrams and references.

The approximate text length guidelines for the Scientific Manuscript are:

Abstract: A brief description of the aims and what you found. Length: 250 words.

Introduction: Place the reader in the context of previous work. Identify the need for this study, i.e. the hole in our knowledge, and then state the aims of the study and the hypothesis tested (this is the Introduction previously submitted). Length: 600 words.

Methods: Break up into sub-headings. Should describe all methods used to create the results presented in the manuscript. Length: as required (concise).

Results: Includes a text section describing relevant results with reference to the relevant figure or table. Each figure or table should have a detailed legend enabling the reader to interpret the data. At least two figures are required. This could be the annotated gel photo and a phylogenetic tree, or two phylogenetic trees. Sub-headings can be used.
Tables and figures are not included in the word count. Length: 600 words.

Discussion/conclusion: This enables you to place your data in the broader context of the available literature. Break into similar sub-headings if necessary. Results are designed to present your data, whereas the discussion section is for interpretation of your data. Be careful not to mix the two up. Length: 1000 words.

References (using the Harvard Style): As required (a minimum of 10-15 references from peer reviewed journal articles or recognised web pages). Resources will be made available but this is not intended to be an exhaustive supply of literature. You will have to do some searching yourself.

(These are guides for how long to make each section; they are maximum lengths.)

• The current trend in many scientific journals is towards somewhat shorter papers. There is real skill in being concise, balancing the length, detail and clarity of a paper, whilst still convincing the reader (or reviewer) of the significance of the work as a whole. Even if you never submit a scientific paper for publication, such general writing skills are vital to your training for any profession you may enter.
• The ultimate purpose of the Scientific Manuscript is to assess your ability to present and interpret data concisely, and to draw conclusions from your data which are fully supported by your observations. You should also explain the significance of your observations and work in a broad context accessible to the general public.

EXPERIMENTS

Experiment 1 (Plant Stream)

Identified hazards in this practical:
Liquid nitrogen – causes severe burns and severe damage to eyes – wear safety glasses at all times and use thermal gloves when grinding tissue.
Note also that some of the DNA extraction solutions are hazardous, so avoid contact with skin and wear safety glasses at all times.
Midori Green is a DNA-binding dye.
In general, any DNA-binding material should be regarded as potentially mutagenic and thus be handled with caution. Wear safety glasses, gloves and lab coats when handling solutions containing Midori Green.

**IMPORTANT NOTICE:
1. All students taking part in the practical are required to wear a lab coat and safety glasses at all times.
2. Wear nitrile gloves in all activities in the lab except when using thermal gloves.
3. You will work in pairs, but will submit all associated assignments individually.

Objectives:
1. To extract DNA from a leaf sample using a commercially available kit and precipitate DNA using ethanol.
2. To amplify sample DNA using Polymerase Chain Reaction (PCR) with primers that flank the regions: 5.8S ribosomal RNA (5.8S rRNA), internal transcribed spacer 2 (ITS2) and 28S ribosomal RNA (28S rRNA).
3. To visualise the DNA by electrophoresis on an agarose gel.
4. To produce an evolutionary lineage diagram using cladistic algorithms to investigate one of the research questions listed below and test a hypothesis of your own design based on the research question.

Research Questions:

1. The last remaining Ginkgo
Ginkgo biloba is the last extant member of a prehistoric plant genus. The species shares some unusual morphological characters, such as motile sperm, with cycads and therefore has often been placed as a sister species to the cycads. However, six different placements have been suggested by molecular data, depending on which gene sequences were used. You will develop a hypothesis regarding the placement of the Ginkgo in a molecular phylogeny, or some aspect regarding its unusual morphology and a molecular phylogeny. Background information on the Ginkgo can be found in Wu et al. (2013) and Zhou (2009). These readings are available on FLO.

2. Agathis and the Drowning of New Zealand
It is believed that New Zealand was separated from the Gondwanan supercontinent eighty million years ago.
The landmass of New Zealand was significantly reduced during the subsequent Oligocene period (26 to 38 million years ago). To this day there has been much debate about whether or not New Zealand was completely submerged. If New Zealand was completely submerged, all living fauna and flora must have arrived following the submergence. However, previous phylogenetic studies including the genus Agathis have provided evidence both in support of and against this hypothesis. Molecular phylogenetic studies on New Zealand flora can be used to investigate this research question. Background information on the Drowning of New Zealand can be found in Knapp et al. (2007) and Biffin et al. (2010). These readings are available on FLO.

3. Molecular Phylogeny of Seed Plants
Living seed plants include the angiosperms, cycads, Ginkgo, conifers and gnetophytes. There is much debate in the literature regarding the phylogeny of the living seed plants, as molecular and morphological phylogenetic studies have produced conflicting results. Background information on the molecular phylogeny of seed plants can be found in Mathews (2009) and Qiu (2008). These readings are available on FLO.

* You will need to choose a research question and design a hypothesis to test that question. You will need to indicate which research question you have chosen in your annotated bibliography (assessment item). Your hypothesis needs to be included in the Introduction (assessment item). You will get feedback on your hypothesis which can be used to improve the hypothesis for the Scientific Manuscript (assessment item).

SESSION 1 (Experiment 1 Plant Stream)

Prior to this session it is advised that you watch the following demonstrations:
1. How to use a micropipette
http://www.youtube.com/watch?v=uEy_NGDfo_8&feature=relmfu
2.
Preparing an agarose gel, running the gel and various useful information regarding agarose gel electrophoresis
http://www.youtube.com/watch?v=wXiiTW3pflM&feature=relmfu

| Situation / Substance | Risk | Safety |
|---|---|---|
| Solvents, ethanol | Flammable | Keep away from sources of ignition |
| Cryogenic solution – liquid nitrogen | Risk of serious damage to eyes and skin | Wear suitable protective clothing and eye / hand protection |
| Grinding tissue with mortar and pestle | Ceramic chips entering eye | Wear suitable eye protection |
| Bioline Plant DNA extraction kit | Harmful in contact with skin and eyes | Wear suitable protective clothing, gloves and eye protection |

Materials per pair
• Two different plant leaf samples (approximately 50 mg for dry samples, 150 mg for wet samples). Note: There are 1000 mg in 1 gram.
  o Ginkgo
  o Tobacco
  o Cycad
  o QLD Kauri
  o Wollemi
  o Bunya Pine
  o Pine
• Sterile microfuge tubes (1.5ml) x 10
• A set of pipettes and tips (p20, p200, p1000)
• Bioline Isolate Plant DNA Mini Kit (www.bioline.com/isolate), including spin columns and reagents, for 2 samples
• Accessibility to benchtop centrifuge
• Safety glasses
• Mortar and pestle
• Eskie filled with ice
• Set of suitable gloves for handling liquid nitrogen
• 65°C heating block
• Preheated Elution Buffer PG (65°C)
• Flat head spatula
• Scissors
• 70% ethanol for aseptic lab practice
• Kimwipes
• Patty pan x 2
• 1.5ml microfuge tube holder

Demonstrators
• Liquid nitrogen dewar.

DNA extraction
1. Under direction of the demonstrator, collect the plant material (approximately 50 mg for dry samples or 150 mg for wet samples). Note: Some scales measure in grams (g) rather than milligrams (mg). Make sure you can do the conversion of g to mg before you start weighing the samples.
2. Place plant tissue into patty pan and dice up.
3. Put diced sample into the mortar. Add liquid nitrogen to the mortar and grind the tissue firmly with the pestle, but not so fast that it spills over the top.
4.
Using a flat head spatula, scrape the contents into a 1.5ml microfuge tube.
5. Add 400μl Lysis Buffer PA1. Vortex mixture thoroughly.
6. Add 10μl RNase A solution and thoroughly mix sample.
7. Incubate at 65°C for 30 minutes.
8. Place ISOLATE II Filter (violet) into a new 2ml Collection Tube and load lysate onto column.
9. Centrifuge 2 min at 11,000 x g (13000 rpm). Note: RCF or G-force = (1.118 x 10^-5) x radius (cm) x rpm^2
10. Collect clear flow-through and discard ISOLATE II Filter. If a pellet is visible in the flow-through, transfer clear supernatant without disturbing the pellet to a new 1.5ml microcentrifuge tube.
11. Add 450μl Binding Buffer PB. Mix thoroughly by pipetting up and down 5 times.
12. Place ISOLATE II Plant DNA Spin Column (green) into a new 2ml Collection Tube and load sample (max. of 700μl).
13. Centrifuge 1 min at 11,000 x g and discard the flow-through.
14. Add 400μl Wash Buffer PAW1.
15. Centrifuge 1 min at 11,000 x g and discard flow-through.
16. Add 700μl Wash Buffer PAW2.
17. Centrifuge 1 min at 11,000 x g and discard flow-through.
18. Add another 200μl Wash Buffer PAW2.
19. Centrifuge 2 min at 11,000 x g to remove wash buffer and to dry silica membrane completely.
20. Place ISOLATE II Plant DNA Spin Column into a new 1.5ml microcentrifuge tube.
21. Add 50μl preheated Elution Buffer PG (65°C) onto center of silica membrane. Incubate 5 min at 65°C.
22. Centrifuge 1 min at 11,000 x g.
23. Repeat steps 21 and 22 with another 50μl Elution Buffer PG (65°C) and elute into same tube.
24. Correctly label and place samples in rack provided. Technical staff will place samples in -20°C freezer for next session. Note: Make sure that your sample is clearly labelled and that you record its position in the rack in your laboratory notebook so that you can find it next session.
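The RCF note in step 9 can be checked numerically before setting a centrifuge that only displays rpm. The following is a minimal Python sketch, assuming the commonly quoted conversion RCF = 1.118 x 10^-5 x radius x rpm^2 with the rotor radius in centimetres. The function names and the 5.8 cm radius are illustrative assumptions, not values from the kit or from any centrifuge manual; use the radius stated for your own rotor.

```python
import math

# Commonly quoted conversion factor when radius is in cm and speed is in rpm.
RCF_CONSTANT = 1.118e-5


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (multiples of g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Speed in rpm needed to reach a target RCF on a rotor of the given radius."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


if __name__ == "__main__":
    radius_cm = 5.8  # hypothetical rotor radius, for illustration only
    print(f"13000 rpm is about {rcf_from_rpm(13000, radius_cm):.0f} x g")
    print(f"11000 x g needs about {rpm_from_rcf(11000, radius_cm):.0f} rpm")
```

Because RCF scales with the square of the speed, a small change in rpm has a large effect on the force applied, which is why the protocol quotes the force (x g) first and the rpm only as a guide for one particular rotor.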
SESSION 2 (Experiment 1 Plant Stream)

| Situation / Substance | Risk | Safety |
|---|---|---|
| Solvents, ethanol | Flammable | Keep away from sources of ignition |

Materials per pair
• Sterile microfuge tubes (1.5ml) x 10
• A set of pipettes and tips (p20, p200, p1000)
• Vortex mixer
• >95% ethanol
• 70% ethanol for aseptic lab practice
• 1.5ml microfuge tube holder
• PCR reagents
• PCR tubes and holder
• Sterile water
• DNA for PCR positive control
• Accessibility to benchtop centrifuge

Demonstrators
• Access to a sufficient number of PCR machines for the class.

Polymerase Chain Reaction (PCR) setup

In this exercise you will be attempting to amplify a partial gene sequence from the rRNA gene. For the PCR setup, you will make up a PCR master mix from the reagents listed below.

| Reagent | Volume for 1 reaction (μl) |
|---|---|
| GoTaq Flexi Buffer (5x) | 5.0 |
| 25mM MgCl2 | 2.0 |
| 10mM dNTPs | 0.5 |
| Forward primer “wol_deg2F” (10µM) | 0.5 |
| Reverse primer “wol deg1R” (10µM) | 0.5 |
| GoTaq polymerase, 5 units/μl | 0.39 |
| Sterile purified H2O | *use to make volume up to 25µl |
| Template DNA **add separately to each reaction** | 2.0 |
| Total Volume (μl) | 25.0 |

1. Determine how many PCR reactions are needed for each pair. Consider what controls should be done. Record this information in your laboratory notebook.
2. Calculate the volume of each reagent in the master mix required for the number of PCR reactions determined.
3. Make up the PCR master mix using the reagents supplied in a 1.5ml tube.
4. Briefly flick-mix the tube for 10 seconds.
5. Centrifuge master mix tube for 10 seconds at >8K rpm.
6. Correctly label PCR tubes (on the appropriate section of the tube) ready for samples, so they can be distinguished at a later date. Note: Handle PCR tubes with care, they are very easy to break!
7. Carefully pipette 23µl of the master mix into each of the PCR tubes.
8. Add 2µl of the correct sample to each of the tubes.
9.
Briefly spin down tubes and place on ice awaiting insertion into the PCR machine.
10. Record the position of your PCR reactions in the rack on ice for future identification.

The PCR machine will cycle through the programmed conditions and will take approximately three hours to run. The PCR cycle is as follows:

PCR cycling conditions
Pre-PCR cycle: 2 minutes @ 95°C.
34 x PCR cycle: Step 1: 45 seconds @ 95°C; Step 2: 45 seconds @ 52°C; Step 3: 1 minute @ 72°C.
Post-PCR cycle: 5 minutes @ 72°C; ∞ @ 4°C.

SESSION 3 (Experiment 1 Plant Stream)

| Situation / Substance | Risk | Safety |
|---|---|---|
| Electrophoresis of DNA | Electric shock | Use equipment as directed by demonstrator |
| Gel photography – UV transilluminator | Risk of serious damage to eyes and skin, risk of cumulative effects | Wear suitable protective clothing and eye/face protection |

In this session, the PCR products amplified from your samples last session will be visualised by electrophoresis through an agarose gel. The PCR buffer will act as a dye front for you to observe your sample products moving through the gel.

Materials per pair
• Accessibility to benchtop centrifuge.
• TAE buffer.
• Molten 1.5% agarose, containing Midori Green™, casting tray and comb
  o Midori Green™ binds DNA and will allow us to visualise DNA using the Bio-Rad Gel Doc system
• Gel electrophoresis tank with appropriate power source for constant 90V.
• 70% ethanol for aseptic lab practice
• Sealed travel container for gel
• 100 bp Promega DNA marker (500µg/ml) containing GoTaq® running buffer
• Gel loading pipette and tips.

Demonstrators
• Access to Bio-Rad Gel Doc system.

Students
1. Prepare the agarose gel casting tray in the gel tank and place the comb in position (instruction from demonstrator)
2. Pour the molten agarose into the casting tray
3. Leave for 20 minutes to set
4.
Pour TAE buffer into tank until agarose gel is completely immersed.
5. Carefully remove comb from agarose gel.
6. Pipette 3µl of DNA marker into first lane.
7. Pipette 10 µl of your samples into other lanes.
8. Take a note of which samples are in each lane.
9. Seal electrophoresis gel tank with correct polarity, set voltage to 90V and timer to 60 minutes.
Note: During this session the Kingdom of Plants DVD by David Attenborough will be shown.
10. Once timer is finished, take gel and place into travel container.
11. With the assistance of your demonstrator, take a photograph of the agarose gel using the Bio-Rad Gel Doc system. The gel contains Midori Green™ which binds DNA and fluoresces under UV light.
12. A copy of the gel photograph will be uploaded onto the FLO discussion board in week 6. Computers will be available during this practical session for you to annotate your gel photograph. Email a copy of your gel photograph to yourself before you leave the laboratory.
13. A hard copy of the gel photo should appear in your lab book with a descriptive title and a brief description of your results. The photo may also be included in your Scientific Manuscript as one of the two required figures.

REFERENCES (These are available on FLO)

Biffin, E., Hill, R. S. and Low, A. J. (2010). Did Kauri (Agathis: Araucariaceae) really survive the Oligocene drowning of New Zealand? Systematic Biology, 59(5): 594-602.
Coleman, A. W. (2003). ITS2 is a double-edged tool for eukaryote evolutionary comparisons. Trends in Genetics, 19(7): 370-375.
Knapp, M., Mudaliar, R., Havell, D., Wagstaff, S. J. and Lockhart, P. J. (2007). The drowning of New Zealand and the problem of Agathis. Systematic Biology, 56(5): 862-870.
Mathews, S. (2009). Phylogenetic relationships among seed plants: persistent questions and the limits of molecular data. American Journal of Botany, 96(1): 228-236.
Qiu, Y. L. (2008).
Phylogeny and evolution of charophytic algae and land plants. Journal of Systematics and Evolution, 46: 287-306.
Wu, C.-S., Chaw, S.-M. and Huang, Y.-Y. (2013). Chloroplast phylogenomics indicates that Ginkgo biloba is sister to cycads. Genome Biology and Evolution, 5: 243-254.
Zhou, Z. Y. (2009). An overview of fossil Ginkgoales. Palaeoworld, 18: 1-22.

Experiment 2 (Reptile Stream)

Identified hazards in this practical:
FTA punch – sharp edge, use with caution;
Flammable and hazardous solvents (ethanol and isopropanol);
Benchtop centrifuge – close lids and balance correctly;
Midori Green is a DNA-binding dye. In general, any DNA-binding material should be regarded as potentially mutagenic and thus be handled with caution. Wear safety glasses, gloves and lab coats when handling solutions containing Midori Green.
UV radiation – wear safety glasses and protect skin.

**IMPORTANT NOTICE:
1. All students taking part in the practical are required to wear a lab coat and safety glasses at all times.
2. Wear nitrile gloves in all activities in the lab except when using thermal gloves.
3. You will work in pairs, but will submit all associated assignments individually.

Objectives:
1. To obtain a clean punch disc from a blood sample on filter paper.
2. To extract genomic DNA from the clean punch disc and from frozen tissue.
3. To amplify sample DNA using Polymerase Chain Reaction (PCR) with primers located within conserved domains of the mitochondrial cytochrome b gene.
4. To visualise the DNA by electrophoresis on an agarose gel.
5. To produce an evolutionary lineage diagram using cladistic algorithms to investigate one of the research questions listed below and test a hypothesis of your own design based on the research question.

Research Questions:

1.
Evolution of parental care in squamates
Parental care is very limited and also very rare in lizards and snakes, but some instances have been reported, including in some Australian lizards. Parental care may be at a simpler level, such as tolerance of young within an adult’s home range. It has not been determined whether the instances of parental care have a common evolutionary origin or are an example of convergent evolution. Background reading on parental care in squamates can be found in While et al. (2015), Langkilde et al. (2007) and Gardner et al. (2015). These readings are available on FLO.

2. Evolution of colouration in squamates
Colouration has long captured people’s attention and there are some spectacular examples within the squamates. For example, male tawny dragons have variously yellow/orange coloured throats which may be used for sexual signalling. There are two main pigment systems that cause yellow-red colouration in the skin of snakes and lizards: carotenoids and/or pteridine pigments. Carotenoids have to be synthesized from compounds in the diet, while pteridine pigments are synthesized by the animals themselves. It is currently unknown whether pigment use by squamates has a phylogenetic basis. Background information on colouration in squamates can be found in Olsson et al. (2013) and on the phylogeny of squamates in Pyron et al. (2013). These readings are available on FLO.

3. Evolution of Body Form in squamates
Squamates have a great diversity in body form, ranging from four limbs through to a limbless snake-like body form. Previous morphological phylogenetic studies place all or most snake-like burrowers into a single clade. Results from molecular phylogenetic studies, and from studies which combine both molecular and morphological analyses, contradict these original findings. Background information on the evolution of body form can be found in Sites et al. (2011) and Weins et al. (2010). These readings are available on FLO.
*You will need to choose a research question and design a hypothesis to test that question. You will need to indicate which research question you have chosen in your annotated bibliography (assessment item). Your hypothesis needs to be included in the Introduction (assessment item). You will get feedback on your hypothesis which can be used to improve the hypothesis for the Scientific Manuscript (assessment item).

SESSION 1 (Experiment 2 Reptile Stream)

Prior to this session it is advised that you watch the following demonstrations:
1. How to use a micropipette
http://www.youtube.com/watch?v=uEy_NGDfo_8&feature=relmfu
2. Preparing an agarose gel, running the gel and various useful information regarding agarose gel electrophoresis
http://www.youtube.com/watch?v=wXiiTW3pflM&feature=relmfu

| Situation / Substance | Risk | Safety |
|---|---|---|
| Solvents, ethanol | Flammable | Keep away from sources of ignition |
| Heat block | Heat may cause burns | Wear suitable heat proof gloves |

Materials per pair
• An FTA® paper hole punch.
• Sterile microfuge tubes (1.5ml) x 4
• A set of pipettes and tips (p20, p200, p1000).
• Access to a 56°C and 95°C heating block.
• Access to a vortex
• Access to benchtop centrifuge.
• Timer
• 2ml of sterile water.
• 70% ethanol for aseptic lab practice.
• Kimwipes.
• Pencil.
• Tweezers.
• 1.5ml microfuge tube holder.
• QIAGEN: DNeasy Blood and Tissue kit (www.qiagen.com), including spin columns and reagents, for 2 samples

Demonstrators
• Accessibility to FTA cards. Note: Blood samples on FTA cards are provided from reptiles housed at the Flinders University Animal House. Flinders samples were collected by technical staff at the animal house following strict protocols.
• Crocodile meat. Note: Crocodile meat was purchased from the Adelaide Central Markets and frozen before use.
DNA extraction from FTA cards

The DNA extraction from FTA cards is adapted from the protocol outlined in Smith and Burgoyne (2004).

1. Your demonstrator will give you an FTA® card to punch out a blood sample using your FTA® hole punch.
2. Record the details of which animal the blood sample came from in your laboratory notebook. Record the scientific name if known.
3. Take your FTA® hole punch and punch out part of a blank card to remove any lingering DNA from previous use.
4. Punch out a disc of blood following the instructions provided by your demonstrator and place the blood disc in a 1.5ml microfuge tube.
5. Add 500µl of sterile water to the blood disc. This allows you to rinse the blood disc.
6. Vortex the closed microfuge tube with the blood disc and sterile water for 5 seconds.
7. Remove the sterile water using a sterile pipette and dispose of the water in the white bucket. Ensure that the blood disc remains in the microfuge tube and close it.
8. Centrifuge for 5 seconds using the bench top centrifuge.
9. Carefully pipette off any remaining liquid and dispose of the liquid in the white bucket. Note: Take care not to disturb the blood disc.
10. Add 50 µl of sterile water to the blood disc in the microfuge tube. Note: The disc should be completely submerged under the sterile water.
11. Heat the sample at 95°C for 30 mins using the heating blocks provided. Note: Ensure the lid of the microfuge tube is securely fastened.
12. Remove the microfuge tube from the heating block.
13. Gently tap the sample in the microfuge tube approximately 60 times.
14. Centrifuge the sample for 30 seconds using the bench top centrifuge. Note: This will separate the matrix (the card) from the eluate (water), which now contains the purified DNA.
15. Correctly label and place samples in the rack provided. Technical staff will place samples at 4°C until next session.
Note: Make sure that your sample is clearly labelled and that you record its position in the rack in your laboratory notebook so that you can find it next session.

DNA extraction from crocodile tissue

The DNA extraction from crocodile tissue uses the QIAGEN: DNeasy Blood and Tissue kit. Steps 1 to 5 will be completed by technical staff. Students will start the DNA extraction from step 6.

1. Prepare 25mg of frozen tissue. Cut tissue into small pieces to enable more efficient lysis.
2. Transfer to a 1.5 ml microcentrifuge tube.
3. Add 180 μl Buffer ATL.
4. Add 20 μl proteinase K, mix by vortexing, and incubate at 56°C until completely lysed. This will take approximately 3 hours.
5. Vortex occasionally during incubation.
6. Vortex for 15 seconds.
7. Add 200 μl Buffer AL. Mix thoroughly by vortexing.
8. Incubate blood samples at 56°C for 10 min.
9. Add 200 μl ethanol (96–100%). Mix thoroughly by vortexing.
10. Pipette the mixture into a DNeasy Mini spin column placed in a 2 ml collection tube.
11. Centrifuge at ≥6000 x g (8000 rpm) for 1 min. Discard the flow-through and collection tube. Note: RCF or G-force = (1.118 x 10^-5) x radius (cm) x rpm^2
12. Place the spin column in a new 2 ml collection tube.
13. Add 500 μl Buffer AW1. Centrifuge for 1 min at ≥6000 x g. Discard the flow-through and collection tube.
14. Place the spin column in a new 2 ml collection tube.
15. Add 500 μl Buffer AW2, and centrifuge for 3 min at 20,000 x g (13,500 rpm). Discard the flow-through and collection tube.
16. Transfer the spin column to a new 1.5 ml or 2 ml microcentrifuge tube.
17. Elute the DNA by adding 200 μl Buffer AE to the center of the spin column membrane.
18. Incubate for 1 min at room temperature (15–25°C).
19. Centrifuge for 1 min at ≥6000 x g.
20. Correctly label and place samples in the rack provided. Technical staff will place samples at 4°C until next session.
Note: Make sure that the sample is clearly labelled and that you record its position in the rack in your laboratory notebook so that you can find it next session.

SESSION 2 (Experiment 2 Reptile Stream)

Situation / Substance: Solvents, ethanol
Risk: Flammable
Safety: Keep away from sources of ignition

Materials per pair
• Sterile microfuge tubes (1.5 ml) x 10
• A set of pipettes and tips (p20, p200, p1000)
• Vortex mixer
• 70% ethanol for aseptic lab practice
• 1.5 ml microfuge tube holder
• PCR reagents
• PCR tubes and holder
• Sterile water
• DNA for PCR positive control

Demonstrators
• Access to a sufficient number of PCR machines for the class.

Note: Care must be taken in this practical to avoid contaminating your samples, as the primers used will also amplify your own DNA.

Polymerase Chain Reaction (PCR) setup

Objective: In this exercise you will attempt to amplify a partial segment of the mitochondrial cytochrome b gene. For the PCR setup, you will make up a PCR master mix from the reagents listed.

PCR SETUP (per pair)

Reagent                                   Volume for 1 reaction (μl)
GoTaq® Buffer (5x)                        5.0
25 mM MgCl2                               2.0
10 mM dNTPs                               0.5
Forward primer "cytb F" (10 µM)           0.5
Reverse primer "cytb R" (10 µM)           0.5
Sterile purified H2O *                    (make up to 25 µl)
GoTaq® polymerase, 5 units/μl             0.39
Template DNA **                           2
Total Volume (μl)                         25

* Use to make the volume up to 25 µl.
** Add separately.

1. Make up the PCR master mix in a 1.5 ml tube using the reagents supplied. Consider what controls you should do. How many reactions are you doing? List the reactions in your laboratory notebook.
2. Briefly mix the master mix by flicking the tube for 10 seconds.
3. Centrifuge the master mix tube for 10 seconds at >8K rpm.
4. Correctly label PCR tubes (on the appropriate section of the tube) ready for samples, so they can be distinguished at a later date. Note: Handle PCR tubes with care, as they are very easy to break.
5.
Carefully pipette 23 µl of the master mix into each of the PCR tubes.
6. Add DNA to each of the tubes.
7. Briefly spin down the tubes and place them on ice awaiting insertion into the PCR machine.
8. Record the position of your samples in the PCR machine in your laboratory notebook to enable you to find them next practical session.

The PCR machine will cycle through the programmed conditions and will take approximately three hours to run. The PCR cycle is as follows:

PCR cycling conditions
Pre-PCR cycle: 2 minutes @ 95°C.
34 x PCR cycle: Step 1: 45 seconds @ 95°C; Step 2: 45 seconds @ 52°C; Step 3: 1 minute @ 72°C.
Post-PCR cycle: 5 minutes @ 72°C; ∞ @ 4°C.

SESSION 3 (Experiment 2 Reptile Stream)

Situation / Substance: Electrophoresis of DNA
Risk: Electric shock
Safety: Use equipment as directed by demonstrator

Situation / Substance: Gel photography (UV transilluminator)
Risk: Risk of serious damage to eyes and skin, risk of cumulative effects
Safety: Wear suitable protective clothing and eye/face protection

In this session, the PCR products amplified from your samples last session will be visualised by electrophoresis through an agarose gel. The PCR buffer will act as a dye front, allowing you to observe your samples moving through the gel.

Materials per pair
• Access to a benchtop centrifuge.
• TAE buffer.
• Molten 1.5% agarose containing Midori Green™, casting tray and comb.
• Gel electrophoresis tank with appropriate power source for constant 90 V.
• 70% ethanol for aseptic lab practice.
• Sealed travel container for gel.
• 100 bp Promega DNA marker (500 µg/ml) containing GoTaq® running buffer.
• Gel loading pipette and tips.

Demonstrators
• Access to Bio-Rad Gel Doc system.

Students
1. Prepare the agarose gel casting tray in the gel tank and place the comb in position (instruction from demonstrator).
2. Pour the molten agarose into the casting tray.
3. Leave for 20 minutes to set.
4.
Pour TAE buffer into the tank until the agarose gel is completely immersed.
5. Carefully remove the comb from the agarose gel.
6. Pipette 3 µl of DNA marker into the first lane.
7. Pipette 10 µl of your samples into the other lanes.
8. Take a note of which samples are in each lane.
9. Close the electrophoresis gel tank with the correct polarity, set the voltage to 90 V and the timer to 60 minutes. Note: During this session the Life in Cold Blood DVD by David Attenborough will be shown.
10. Once the timer has finished, take the gel and place it into the travel container.
11. With the assistance of your demonstrator, take a photograph of the agarose gel using the Bio-Rad Gel Doc system. The gel contains Midori Green™, which binds DNA and fluoresces under UV light.
12. A copy of the gel photograph will be uploaded onto the FLO discussion board in week 6. Computers will be available during this practical session for you to annotate your gel photograph. Email a copy of your gel photograph to yourself before you leave the laboratory.
13. A hard copy of the gel photo should appear in your lab book, and the fully annotated version should be used in your Scientific Manuscript.

REFERENCES (These are available on FLO)

Gardner, M. G., Pearson, S. K., Johnston, G. R. and Schwarz, M. P. (2015). Group living in squamate reptiles: a review of evidence for stable aggregations. Biological Reviews. doi:10.1111/brv.12201
Langkilde, T., O'Connor, D. and Shine, R. (2007). The benefits of parental care: do juvenile lizards obtain better-quality habitat by remaining with their parents? Austral Ecology, 32: 950-954.
Olsson, M., Stuart-Fox, D. and Ballen, C. (2013). Genetics and evolution of colour patterns in reptiles. Seminars in Cell and Developmental Biology, 24: 529-541.
Pyron, R. A., Burbrink, F. T. and Wiens, J. J. (2013). A phylogeny and revised classification of Squamata, including 4161 species of lizards and snakes. BMC Evolutionary Biology, 13: 1-54.
Sites, J. W., Reeder, T. W. and Wiens, J. J.
(2011). Phylogenetic insights on evolutionary novelties in lizards and snakes: sex, birth, body, niches and venom. Annual Review of Ecology, Evolution and Systematics, 42: 227-244.
Skinner, A. and Lee, M. S. Y. (2009). Body form evolution in the scincid lizard clade Lerista and the mode of macroevolutionary transitions. Evolutionary Biology, 36: 292-300.
Smith, L. and Burgoyne, L. (2004). Collecting, archiving and processing DNA from wildlife samples using FTA® databasing paper. BMC Ecology, 4(1): 4.
Wiens, J. J., Kuczynski, C. A., Townsend, T., Reeder, T. W., Mulcahy, D. G. and Sites, J. W. (2010). Combining phylogenomics and fossils in higher-level squamate reptile phylogeny: molecular data change the placement of fossil taxa. Systematic Biology, 59: 674-688.
Wiens, J. J., Hutter, C. R., Mulcahy, D. G., Noonan, B. P., Townsend, T. M., Sites, J. W. and Reeder, T. W. (2012). Resolving the phylogeny of lizards and snakes (Squamata) with extensive sampling of genes and species. Biology Letters, 8(6): 1043-1046.
While, G. M., Chapple, D. G., Gardner, M. G., Uller, T. and Whiting, M. J. (2015). Egernia lizards. Current Biology, 25: R593-R5.

Manual for the Molecular Phylogenetic Analysis Component of BIOL2702

INTRODUCTION

Welcome to the bioinformatics section of the topic! This section of the manual should be followed in conjunction with the online video resources, which take you through each step of the bioinformatics analysis. You may wish to bring headphones with you to review the videos during the bioinformatics sessions. Please refer to the sections of the bioinformatics information relevant to your practical stream.
RESOURCES REQUIRED

MEGA 7.0.26 Software
MEGA is an integrated tool for conducting automatic and manual sequence alignment, inferring phylogenetic trees, mining web-based databases, estimating rates of molecular evolution, and testing evolutionary hypotheses. This software is pre-loaded on all teaching computers in the BDC laboratories. It is also available on the computers in the CML suites in Physical Sciences. Students who use their own devices can download the free software from: https://www.megasoftware.net/
Please download MEGA 7.0.26 rather than the more recent MEGA X, as this is the version we will be using in class. This should be done before you attend the session in week 8. The reference for the MEGA software is as follows:
Kumar, S., Stecher, G. and Tamura, K. (2016). MEGA7: Molecular Evolutionary Genetics Analysis version 7.0 for bigger datasets. Molecular Biology and Evolution, 33: 1870-1874.

Sequencing Files
The plant and reptile sequencing files can be downloaded from the FLO topic page under the week 8 tab. These files will need to be unzipped. This should be completed before you attend the session in week 8.

Laboratory notebook
Steps of the bioinformatics analysis should be recorded in the laboratory notebook so that you can repeat the analysis if necessary.

Access to National Center for Biotechnology Information (NCBI)
This database will be used to find additional nucleotide sequences relevant to your region of interest and research question. Change ‘all databases’ to ‘nucleotide’ and search for the relevant information in the space provided. URL Link: https://www.ncbi.nlm.nih.gov/

TIPS AND TROUBLESHOOTING

This tips and troubleshooting guide outlines some of the common issues associated with the bioinformatics analysis. A MEGA manual is available on FLO under the week 10 resources.
More information about troubleshooting can be found on the MEGA home page: https://www.megasoftware.net/

Tips
• Saving files: Create a folder on your computer to save all the edited files. Save each file as you go. If you are using a class computer, save the information on a USB.
• Orientation of sequence: Unsure if your sequence is in the reverse orientation? If the forward primer was found at the 5’ end of the sequence, the sequence is in the forward orientation. If the reverse primer was found at the 5’ end of the sequence, the sequence is in the reverse orientation. Only sequences in the reverse orientation need to be reverse complemented.
• File names: Change the file name while in alignment explorer. The name listed in this file for each species is what will appear in the final tree. Scientific names should be used. If you are not sure of the scientific name, BLAST the sequence.
• Outgroups: The alignment works best if the outgroup(s) is added to the top of the file. Sequences can be moved around once they have all been added, but not after they have been aligned.
• Making the tree: If you have many sequences, creating the tree could take hours or even overnight. If this is the case, create a tree initially with the number of bootstrap replications set to 100.
• Editing the tree: To remove a sequence from the phylogenetic tree without deleting it from the alignment, untick the relevant sequence in data explorer view and run the tree again.

Editing sequencing files
Editing the sequencing files (.ab1 files) requires the search function, the highlight function and the delete function. We recommend that you bring an external mouse, as this may help you with the editing. We have found that editing is more difficult on a Mac than on a PC, because you may need to use the ‘right click’ option available on a PC mouse. For those of you who are Mac savvy, this shouldn’t be an issue. PCs will always be available for use in the CML sessions.
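The orientation tip above can also be checked outside MEGA. The following Python sketch is illustrative only and is not part of the MEGA workflow: the helper names (`reverse_complement`, `primer_regex`, `orientation`) and the 150-base search window are assumptions made for this example. It expands the redundancy codes used in the primers into a regular expression, looks for each primer near the 5’ end of a read, and reverse complements the read when required.

```python
import re

# Bases represented by each redundancy (IUPAC) code.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement of every code (for example R = A/G pairs with Y = T/C).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def clean(seq):
    """Remove the readability spaces used in the manual and upper-case."""
    return seq.replace(" ", "").upper()


def reverse_complement(seq):
    """Reverse complement a DNA sequence; redundancy codes are allowed."""
    return clean(seq).translate(COMPLEMENT)[::-1]


def primer_regex(primer):
    """Turn a (possibly degenerate) primer into a regular expression."""
    return re.compile("".join("[" + IUPAC[base] + "]" for base in clean(primer)))


def orientation(read, forward_primer, reverse_primer, window=150):
    """Report which primer lies near the 5' end of the read.

    The window size is an arbitrary choice for this sketch.
    """
    start = clean(read)[:window]
    if primer_regex(forward_primer).search(start):
        return "forward"
    if primer_regex(reverse_primer).search(start):
        return "reverse"
    return "unknown"


if __name__ == "__main__":
    plant_forward = "CYC TCR SCA AYG G"   # wol deg2F, as listed in this manual
    plant_reverse = "CGC CYG ACC TGG GG"  # wol deg1R, as listed in this manual
    # Made-up read: vector bases, then one possible forward primer sequence.
    read = "GTGATT" + "CTCTCGGCAACGG" + "ACGTACGT"
    print(orientation(read, plant_forward, plant_reverse))
    print(reverse_complement(plant_forward))
```

A read reported as "reverse" would be passed through `reverse_complement` before being added to the alignment. The example read is invented for demonstration and is not real sequence data.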
Opening a particular file type
There is no single way to open all file types in MEGA. Please review the instructions in the manual, which outline how to open each file type.

Alignment issues
Does your alignment look strange? Is there little similarity between the sequencing files? Remember, only sequence from the SAME region of DNA should be added to the alignment. Perhaps you have downloaded a different region of DNA and added it to the alignment. Alternatively, you may have the correct region but also a lot of additional sequence, e.g. the whole cytochrome b gene, or regions 5’ and 3’ of the ribosomal spacer region used in the analysis. Ask your demonstrator to look at the alignment.

PLANT STREAM

Overview
During the wet labs, the ribosomal spacer region, specifically 5.8S ribosomal RNA (5.8S rRNA), internal transcribed spacer 2 (ITS_2) and 28S ribosomal RNA (28S rRNA), was PCR amplified using the degenerate primer pair wol deg2F and wol deg1R. The PCR products were then cloned into the pGEM-T Easy vector (Promega), sequenced at the Australian Genome Research Facility (AGRF) in Brisbane using SP6 and T7 primers, and provided as .ab1 files. The SP6 and T7 primers bind on either side of the multiple cloning site of the pGEM-T Easy vector, allowing the PCR product to be sequenced from its start. The .ab1 files provided are called electropherograms or chromatograms and can be viewed using the MEGA software (figure 1).

Figure 1: A chromatogram viewed using the MEGA software. Each peak represents one of the four bases, A, T, C and G. This tells us the sequence of the PCR product.

PLANT STREAM SESSION 1 (SEQUENCE EDITING, SEQUENCE IDENTIFICATION AND MULTIPLE SEQUENCE ALIGNMENT)

There are three objectives for session 1:
1) Edit the sequencing files
   a) Sequencing files (.ab1 files) are provided for ginkgo, cycad, tobacco and wollemi.
   b) Determine the orientation of the file.
   c) Remove the primer and vector sequence.
   d) Confirm the identity of the sequencing file by running it through NCBI.
2) Find additional sequences
   TIP: Additional sequence files do not need to be edited before being added to the alignment. Only sequence from the SAME region of DNA should be added to the alignment.
   a) Three text files (.fa) are provided in the folder on FLO: bunya pine, NZ kauri and QLD kauri.
   b) The accession numbers for maize, Pinus and ginkgo are also provided and can be found using NCBI.
   c) The following outgroups are suggested:
      i) Seed and Ginkgo research question: Asplenium ceterach and Gnetum neglectum
      ii) The Podocarpus genus can be used for the Drowning of New Zealand research question.
3) Build an alignment
   a) All sequences can be added to the sequence alignment. This alignment will grow and change as sequences relevant to your research question are found. The instructions below outline how to add the edited files to the alignment as you go. Not all sequences provided will be relevant to your research question.

Editing the sequencing files
TIP: Create a folder on your computer to save all the edited files. Save each file as you go. If you are using a class computer, save the information on a USB.

1) Open the gingko SP6.ab1 file. This file name indicates that the PCR product has been sequenced using the SP6 primer. To open the file, open MEGA 7, go to align, edit/view sequencer files, find the gingko.ab1 file previously downloaded and click open (figures 2 and 3).

Figure 2: How to open the .ab1 files through MEGA.
Figure 3: Gingko SP6.ab1 chromatogram opened through MEGA.

2) Determine the orientation of the ginkgo sequence. Search for the region of DNA flanking the multiple cloning site of the pGEM-T Easy vector from the SP6 transcriptional start site (figure 4, arrow). To search for sequence within the chromatogram, use the binocular icon (figure 5).
Once the GTGATT sequence is identified, determine whether the forward (wol deg2F) or reverse (wol deg1R) primer is three prime (3’) of this sequence (figure 6). If the forward primer is present, the sequence is in the forward orientation. If the reverse primer is present, the sequence is in the reverse orientation.

Figure 4: Multiple cloning site from pGEM-T Easy taken from Promega® Technical Manual #TM042. Arrow indicates bases to search for: GTGATT.
Figure 5: Find the position of the sequence using the binocular function. This can also be done using the binocular function under search.

Plant Forward Primer (wol deg2F): CYC TCR SCA AYG G
Plant Reverse Primer (wol deg1R): CGC CYG ACC TGG GG
Reverse Complement of Plant Forward Primer: CCR TTG SYG AGR G
Reverse Complement of Plant Reverse Primer: CCC CAG GTC RGG CG

Code for redundancy is as follows:
A = adenine
C = cytosine
G = guanine
T = thymine
B = C or G or T
D = A or G or T
H = A or C or T
K = G or T
M = A or C
N = A or C or G or T
R = A or G
S = C or G
V = A or C or G
W = A or T
Y = C or T

Figure 6: The plant forward and reverse primers wol deg2F and wol deg1R. Spaces in primers are for readability purposes only. Use the code for redundancy to determine the possible bases for ‘Y’, ‘R’ etc.

3) Remove the vector and primer sequence at the start of the sequence. To do this, highlight all the bases you wish to delete and press the delete key.
4) Search for the primer opposite to the one previously found. For example, if the forward primer was at the start of the sequence, search for the reverse primer. If the sequence is found, remove the primer sequence and all sequence downstream. If it is not found, this means that the chromatogram data at the end of the sequence is poor.
5) Sequences in the reverse orientation should be reverse complemented at this point.
This can be done using the edit > reverse complement function (figure 7).

Figure 7: Sequences in the reverse orientation should be reverse complemented.

6) Run the sequencing file through BLAST to confirm the identity (figure 8). Use the BLAST function indicated by the arrow. Record the results of the BLAST search.

Figure 8: BLAST search through NCBI allows you to confirm the sequence identity. It also provides the scientific name of the sequence and will identify other similar species.

7) Add the sequence to your alignment using the ‘add to alignment’ function (figure 9). This is indicated by the arrow. This can also be found under data > add to alignment explorer.
8) Follow steps 1-7 to edit the other three chromatograms provided (.ab1 files). Add each to the original alignment once it is fully edited.

Figure 9: Add the edited sequence to alignment explorer. Use the ‘add to alignment explorer’ function indicated by the arrow. This can also be done under data > add to alignment explorer.

Find additional sequences
1) Add the text files provided to the alignment. To do this, go to the alignment explorer screen, then go to edit > insert sequence from file (figure 10). Select the folder with the relevant information. Change the file type from .ab1 to .txt so that the files are visible (figure 11). Click on the relevant file and open it. This allows it to be added to the alignment.

Figure 10: How to add a text sequence to your alignment.
Figure 11: Change the file type to ‘Text’. This allows you to add the additional text files provided to your alignment. Please note that some files may be accession numbers, thus will need to be found via BLAST.

2) To search for sequences corresponding to accession numbers, go to web > query GenBank (figure 12). Copy and paste the accession number provided into the box and click on search (figure 13).
Figure 12: How to find additional sequences using GenBank through NCBI.
Figure 13: How to look up an accession number.

3) To add the sequence to the alignment, change the format to FASTA text. Then click on ‘add to alignment’ (figure 14).
4) It is recommended that the file names within the alignment are updated before too many additional sequences are found. Right click on the sequence name to edit it. You may want to consider using scientific names with a common name in brackets if appropriate.

Figure 14: Importing sequence into your alignment. First choose FASTA (text). Then the sequence can be added to the alignment using the ‘+ add to alignment’ function.

PLANT STREAM SESSION 2 (BUILDING THE ALIGNMENT AND CREATING THE PHYLOGENETIC TREE)

There are four objectives for session 2:
1) Find and add additional sequences relevant to the research question
   • This can be done using BLAST through MEGA or using Query GenBank as described in the previous session instructions.
2) Carry out the alignment
   • The alignment may need to be trimmed if some sequences are a lot longer than others.
3) Explore the data
   • Watch the video on how to edit the text file and how to treat any indels or missing data.
4) Create the phylogenetic tree
   • Watch the video on how to draw the phylogenetic tree.

Find and add additional sequences relevant to the research question
1) Consider your research question and hypothesis. Which of the sequences in your alignment from session 1 are relevant? What additional sequences are required? What is available in the database?
2) To search for additional relevant sequences, follow the instructions on how to BLAST sequences from session 1. Sequences can also be searched for through query GenBank. Type in the search box the region of DNA used for the analysis and the common or scientific name of the sequence required.
The region of DNA used in the analysis is: 5.8S ribosomal RNA (5.8S rRNA), internal transcribed spacer 2 (ITS_2) and 28S ribosomal RNA (28S rRNA).

Carry out the alignment
TIP: The alignment works best if the outgroup(s) is added to the top of the file. Sequences can be moved around once they have all been added, but not after they have been aligned.

1) Once all the sequences are in alignment explorer, you can align the sequences using the option alignment > align by ClustalW (figure 15). Click ok.

Figure 15: a) To align the sequences, click on align by ClustalW. b) The parameter message will appear. Click ok.

2) Trim the sequences as appropriate. For example, one sequence may be a lot longer than the others because it includes an additional region of DNA. This can be removed by highlighting the text and clicking delete (figure 16). The sequences will need to be realigned after any sequence is deleted.

Figure 16: The alignment can be trimmed to remove sequences which contain additional data. Highlight the relevant sequences and click delete.

3) Check the resulting alignment. Are there large sections of sequence which are aligned? The “*” denotes identical bases for all sequences at that site (figure 17). Ask a demonstrator if you are unsure whether the sequences are aligned.

Figure 17: Example of aligned data.

Exploring the data
1) Export the alignment. To do this, click on the data menu > export alignment > MEGA format (figure 18).

Figure 18: How to export the data to a .meg file.

2) Save the file in the appropriate folder. A message will ask you if this is protein-coding. Click on ‘no’ for the plant stream analysis of the ribosomal spacer region. If you later look at a gene such as rbcL you will need to click on ‘yes’.
3) Watch the video Plant prelab 5 Exploring data, which shows you how to edit and analyse the data before you create the phylogenetic tree.
4) To start editing the data, go to file > edit a text file > open a file and choose the .meg file previously saved.
5) As outlined in the Plant prelab 5 Exploring data video, you are now ready to consider the missing data and indels. Missing data is represented as a ‘-’ at the end of each sequencing file. Indels are represented as ‘-’ within the data (figure 19).

Figure 19: Edit a text file. Missing data should be replaced with ?.

6) To replace missing data with ‘?’, highlight the relevant data > search > replace. Replace - with ?. Remember to click on ‘selected text only’, as you want to maintain the indels within the sequence. Do this for all missing data in the file (figure 20). Once completed, save the file under a new name: file > save as. This is now your edited .meg file.

Figure 20: Replace missing data with a ?.

7) To start exploring the data, make sure MEGA is open, go to file > open a file/session, then open the .meg file previously edited.
8) Use the functions shown in the video to highlight the variable, conserved, singleton and parsimony informative sites (figure 21). Record this data.

Figure 21: Exploring the data. View the variable, conserved, singleton and parsimony informative sites by clicking on the relevant icon.

Create the phylogenetic tree
TIP 1: If you have many sequences, creating the tree could take hours or even overnight. If this is the case, create a tree initially with the number of bootstrap replications set to 100.
TIP 2: To remove a sequence from the phylogenetic tree without deleting it from the alignment, untick the relevant sequence and run the tree again (figure 19).

1) Close the data exploring window.
2) Go back to the ‘blue’ window and go to Phylogeny > Maximum Parsimony tree (figure 22). Under test of phylogeny, choose bootstrap method. Use the default settings in MEGA and change the number of bootstrap replications to 1000 (figure 23). Select compute. MEGA will produce a consensus tree of the samples (figure 24).
3) Once you have your tree you will need to set the ‘root’ of the tree. This is where you ‘tell’ the program which branch represents the root of your data.
4) Click on the bootstrap consensus tree. Then select the root tool in the subtree menu and click the sample lineage to root the branch (figures 25-26).

Demonstrators will be available to assist in the interpretation of your results. This is the most important output for you, so it is important that you understand the different parameters and what they do. Remember, this is your first phylogenetic tree and may not be the tree you include in the scientific manuscript. You may want to run the analysis again with additional sequences, or explore another region of DNA. GOOD LUCK!

Figure 22: How to create the phylogenetic tree. Choose the Construct/Test Maximum Parsimony tree option.
Figure 23: Choose the Bootstrap method. Change the number of bootstrap replications to 1000.
Figure 24: Choose the bootstrap consensus tree option indicated by the arrow.
Figure 25: How to set the root of the tree. First click on the ‘root’ function. Then click on the branch representing the root of the tree.
Figure 26: The final tree. Save your phylogenetic tree. A standard MEGA caption can be generated by selecting the Caption option at the top of the window. Make sure that the caption is edited and paraphrased correctly. Not all of the information in the MEGA caption will be relevant to your analysis.
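The variable, conserved, singleton and parsimony informative sites recorded in the ‘Exploring the data’ step feed directly into the Maximum Parsimony analysis, so it can help to see how one alignment column is classified. The Python sketch below is a simplified illustration, not MEGA's own code. The function names are hypothetical, and ignoring ‘-’ and ‘?’ characters when counting bases is an assumption of this example that may differ from how MEGA treats gaps and missing data under your chosen settings.

```python
from collections import Counter

# Assumption for this sketch: indels ('-') and missing data ('?') are ignored.
IGNORED = {"-", "?"}


def classify_site(column):
    """Classify one alignment column (one base per sequence)."""
    counts = Counter(base.upper() for base in column if base not in IGNORED)
    if not counts:
        return "no data"
    if len(counts) == 1:
        return "conserved"  # every sequence carries the same base
    # Count how many different bases occur in at least two sequences.
    repeated = sum(1 for n in counts.values() if n >= 2)
    if repeated >= 2:
        return "parsimony-informative"
    return "singleton"  # variable, but at most one base is shared


def summarise(alignment):
    """Tally site classes for a list of equal-length aligned sequences."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    tally = Counter(classify_site(column) for column in zip(*alignment))
    # Variable sites are the informative sites plus the singletons.
    tally["variable"] = tally["parsimony-informative"] + tally["singleton"]
    return dict(tally)


if __name__ == "__main__":
    # Made-up toy alignment of four sequences, five sites each.
    toy = ["ACGTA", "ACGCA", "ATGCT", "ATG-A"]
    print(summarise(toy))
```

In the toy alignment, the second column (C, C, T, T) is parsimony informative because two different bases each occur in at least two sequences, whereas the last column (A, A, T, A) is a singleton because only one base is shared.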
REPTILE STREAM

Overview
During the wet labs, a partial region of the cytochrome b gene (cyto b) was PCR amplified using the degenerate primer pair cytb F and cytb R. The PCR products were then directly sequenced using the primers cytb F and cytb R at the Australian Genome Research Facility (AGRF) in Brisbane. These sequences have been provided as .ab1 files. Note: only one primer is needed for a sequencing reaction. All PCR products were sequenced from both directions to allow good coverage of the PCR product. This is why two sleepy lizard .ab1 files are available. The .ab1 files provided are called electropherograms or chromatograms and can be viewed using the MEGA software (figure 1).

Figure 1: A chromatogram viewed using the MEGA software. Each peak represents one of the four bases, A, T, C and G. This tells us the sequence of the PCR product.

REPTILE STREAM SESSION 1 (SEQUENCE EDITING, SEQUENCE IDENTIFICATION AND MULTIPLE SEQUENCE ALIGNMENT)

There are three objectives for session 1:
1) Edit the sequencing files
   a) Create a consensus sequence using the sleepy lizard a and sleepy lizard b .ab1 files.
      i) First remove the primer sequence.
   b) Edit the crocodile and Egernia stokesii consensus sequences provided.
   c) Confirm the identity of the sequencing file by running it through NCBI.
2) Find additional sequences
   TIP: Additional sequence files do not need to be edited before being added to the alignment. Only sequence from the SAME region of DNA should be added to the alignment.
   a) The following accession numbers have been provided. Not all will be relevant to your research question.
      o DQ001031 Wall lizard Italy
      o FJ536153 Galápagos Islands marine iguana
      o JQ217202 sea snake
      o U69851 carpet python
      o AY194412 Yellow crowned amazon
      o AB177977 Sulphur crested cockatoo
      o AB167711.1 Gila monster **Note: complete mitochondrial genome, so you will need to find only cyto b for analysis
      o NC_008778.1 Nile monitor **Note: complete mitochondrial genome, so you will need to find only cyto b for analysis
   b) The following outgroup is suggested:
      i) Tuatara
3) Build an alignment
   a) All sequences can be added to the sequence alignment. This alignment will grow and change as sequences relevant to your research question are found. The instructions below outline how to add the edited files to the alignment as you go. Remember, not all files provided will be relevant to your research question.

Editing the sequencing files
TIP: Create a folder on your computer to save all the edited files. Save each file as you go. If you are using a class computer, save the information on a USB.

1) Open the sleepy lizard A.ab1 file. To open the file, open MEGA 7, go to align, edit/view sequencer files, find the sleepy lizard A.ab1 file previously downloaded and click open (figures 2 and 3).

Figure 2: How to open the .ab1 files through MEGA.
Figure 3: Sleepy lizard.ab1 chromatogram opened through MEGA.

2) Search for the forward primer (figure 4). To search for sequence within the chromatogram, use the binocular icon (figure 5). If the forward primer is present, the sequence is in the forward orientation.
Reptile Forward Primer: 5’ AAA AAG CTT CCA TCC AAC ATC TCA GCA TGA TGA AA 3’
Reptile Reverse Primer: 5’ AAA CTG CAG CCC CTC AGA ATG ATA TTT GTC CTC A 3’
Reverse Complement of Reptile Forward Primer: TTT CAT CAT GCT GAG ATG TTG GAT GGA AGC TTT TT
Reverse Complement of Reptile Reverse Primer: TGA GGA CAA ATA TCA TTC TGA GGG GCT GCA GTT T

Code for redundancy is as follows:
A = adenine
C = cytosine
G = guanine
T = thymine
B = C or G or T
D = A or G or T
H = A or C or T
K = G or T
M = A or C
N = A or C or G or T
R = A or G
S = C or G
V = A or C or G
W = A or T
Y = C or T

Figure 4: The reptile forward and reverse primers cytb F and cytb R. Spaces in primers are for readability purposes only. Use the code for redundancy to determine the possible bases for ‘Y’, ‘R’ etc.
Figure 5: Find the position of the sequence using the binocular function. This can also be done using the binocular function under search.

3) Remove the primer sequence at the start of the sequence and all sequence 5 prime (5’, or before) the primer. To do this, highlight all the bases you wish to delete and press the delete key.
4) Search for the primer opposite to the one previously found. For example, if the forward primer was at the start of the sequence, search for the reverse primer. If the sequence is found, remove the primer sequence and all sequence downstream. If it is not found, this means that the chromatogram data at the end of the sequence is poor.
5) Trim unclear sequence at the 3 prime (3’) end of the chromatogram. Ask your demonstrator what is meant by unclear sequence if you are unsure.
6) Sequences in the reverse orientation should be reverse complemented at this point. This can be done using the edit > reverse complement function (figure 6).
TIP: Unsure if your sequence is in the reverse orientation?
If the forward primer was found at the 5’ end of the sequence, the sequence is in the forward orientation. If the reverse primer was found at the 5’ end of the sequence, the sequence is in the reverse orientation. Only these sequences need to be reverse complemented. Figure 6: Sequences in the reverse orientation should be reverse complemented. 7) Add the sequence to your alignment using the ‘add to alignment’ function (figure 7). This is indicated by the arrow. This can also be found under data > add to alignment explorer. 8) Open the sleepy lizard B.ab1 file and follow the instructions outlined in steps 1-7 above searching for the opposite primer to that found in the sleepy lizard A.ab1 file. TIP: Sequences in the reverse orientation should be reverse complemented before they are added to the alignment. Figure 7: Add the edited sequence to alignment explorer. Use the ‘add to alignment explorer’ function indicated by the arrow. This can also be done under data > add to alignment explorer. BIOL 2702 – GENETICS EVOLUTION AND BIODIVERSITY – 2018 PAGE 57 9) Once both sleepy lizard A and sleepy lizard B sequences have been edited and added to the same alignment, you are ready to create the consensus sequence. 10) Align the sequences using the option, alignment > align by ClustalW (figure 8). Click ok. a b Figure 8: a) To align the sequences, click on align by ClustalW. b) The parameter message will appear. Click ok. 11) Check that the sequences are aligned. There should be a region of overlap between the two sequences. If you are unsure, ask your demonstrator. 12) Insert a blank sequence to the alignment. Edit > insert blank sequence. 13) Copy the whole forward sequence into the blank sequence. Copy >paste. 14) Copy the end of the reverse sequence (following the match) into the blank sequence. Add this sequence 3’ to the forward sequence just added. 15) Delete the original forward and reverse sequence. 
There now should be one sequence left in the alignment which represents the cyto b region PCR amplified from sleepy lizard. 16) Save the sequence by exporting the alignment as a .fasta file. This allows you to import this sequence into a new alignment if necessary. Data > export alignment > fasta. 17) Run the sequencing file through BLAST to confirm the identity (figure 9). Right click on the sequence and click on BLAST option. Record the results of the BLAST search. BIOL 2702 – GENETICS EVOLUTION AND BIODIVERSITY – 2018 PAGE 58 Figure 9: BLAST search through NCBI allows you to confirm the sequence identity. It also provides the scientific name of the sequence and will identify other similar species. 18) Open the crocodile consensus sequence provided (croc consensus sequence.mas). This PCR product was sequenced from both orientations and provided to us as a consensus file (figure 10) Figure 10: To open the .mas file, go to Data > open a file/session and click on the relevant file. 19) Edit the croc consensus file using the steps outlined in steps 2-4 above searching for the relevant primers and removing this sequence. TIP: Remember to save the file as a new name once you have finished editing the sequence. 20) Edit the Egernia stokesii consensus sequence using the steps previously described. BIOL 2702 – GENETICS EVOLUTION AND BIODIVERSITY – 2018 PAGE 59 Find additional sequences 1) Add the three edited sequences to the alignment explorer. To do this, open one of the saved edited sequence files as outlined in figure 10. Go to the alignment explorer screen, go to edit > insert sequence from file (figure 11). Select the folder with the relevant information. Click on the relevant file and open it. This allows it to be added to the alignment. Figure 11: How to add an edited sequence to your alignment 2) To search for sequences corresponding to accession numbers, go to web > query GenBank (figure 12). 
Copy and paste the accession number provided into the box and click on search (figure 13). Figure 12: How to find additional sequences using GenBank through NCBI. BIOL 2702 – GENETICS EVOLUTION AND BIODIVERSITY – 2018 PAGE 60 Figure 13: How to look up an accession number. 3) To add the sequence to the alignment, change the format to FASTA text. Then click on ‘add to alignment’ (figure 14). 4) It is recommend that the file names within the alignment are updated before too many additional sequences are found. Right click on the sequence name to edit. You may want to consider using scientific names with a common name in brackets if appropriate. Figure 14: Importing sequence into your alignment. First choose FASTA (text). Then the sequence can be added to alignment using the ‘+ add to alignment’ function. BIOL 2702 – GENETICS EVOLUTION AND BIODIVERSITY – 2018 PAGE 61 REPTILE STREAM SESSION 2 (BUILDING THE ALIGNMENT AND CREATING THE PHYLOGENETIC TREE) There are four objectives for session 2: 1) Find and add additional sequences relevant to research question • This can be done using BLAST through MEGA or using Query genbank as described in previous session instructions. 2) Carry out the alignment • The alignment may need to be trimmed if some sequences are a lot longer than others. 3) Explore the data • Watch the video on how to edit the text file and how to treat any indels or missing data. 4) Create the phylogenetic tree • Watch the video on how to draw the phylogenetic tree. Find and add additional sequence relevant to research question 1) Consider your research question and hypothesis. Which of the sequences in your alignment from session 1 are relevant? What additional sequences are required? What is available in the database? 2) To search for additional relevant sequences, follow the instruction on how to BLAST sequences from session 1. Sequences can also be searched for through query GenBank. 
Type in the search box the region of DNA used for the analysis and the common or scientific name of the sequence required. The region of DNA used in the analysis is: a partial region of the cytochrome b gene. Carry out the alignment *TIP: The alignment works best if the outgroup(s) is added to the top of the file. These can be moved around once all the sequences have been added but not after the sequences are aligned. 1) Once all the sequences are in alignment explorer, you can align the sequences using the option, alignment > align by ClustalW (figure 15). Click ok. a b Figure 15: a) To align the sequences, click on align by ClustalW. b) The parameter message will appear. Click ok. BIOL 2702 – GENETICS EVOLUTION AND BIODIVERSITY – 2018 PAGE 62 2) Trim the sequences as appropriate. For example, one sequence may be a lot longer than the others as it includes an additional region of DNA. This can be removed by highlighting the text and clicking delete (figure 16). The sequences will need to be realigned following any deleted sequence. Figure 16: The alignment can be trimmed to remove sequences which contain additional data. Highlight the relevant sequences and click delete. 3) Check the resulting alignment. Are there large sections of sequence which are aligned? The “*” denotes identical bases for all sequences at that site (figure 17). Ask a demonstrator if you are unsure if the sequence is aligned. Figure 17: Example of aligned data. BIOL 2702 – GENETICS EVOLUTION AND BIODIVERSITY – 2018 PAGE 63 Exploring the data 1) Export the alignment. To do this, click on the data menu > export alignment > MEGA format (figure 18). Figure 18: How to export the data to a .mega file. 2) Save the file in the appropriate folder. A message will ask you if this is protein-coding. Click on ‘yes’ for the reptile stream analysis of the partial region of the cytochrome b gene. 
3) Watch the video Reptile prelab 5 Exploring data which shows you how to edit and analyse the data before you create the phylogenetic tree. 4) To start editing the data, go to file > edit a text file > open a file and choose the .meg file previously saved. 5) As outlined in the Reptile prelab 5 Exploring data video you are now ready to consider the indels. Indels are represented as a ‘-‘ in the data (figure 19). 6) To replace indels with ‘?’ highlight the relevant data > search > replace. Replace – with ?. This will replace all indels in the file (figure 20). Check that the sequence file names haven’t been altered. This can create an error in the file. Once completed, save the file as a new name. File > save as. This is now your edited .meg file. Figure 19: Indels are shown as a ‘-‘ in the file. BIOL 2702 – GENETICS EVOLUTION AND BIODIVERSITY – 2018 PAGE 64 Figure 20: Replace all ‘-‘ with ‘?’ using the replace text function. 7) To start exploring the data, make sure MEGA is open, go to file > open a file/session, then open the .meg file previously edited. 8) Use the functions shown in the video to highlight the variable, conserved, singleton and parsimony informative sites (figure 21). Record this data. Figure 21: Exploring the data. View the variable, conserved, singleton and parsimony informative sites by clicking on the relevant icon. BIOL 2702 – GENETICS EVOLUTION AND BIODIVERSITY – 2018 PAGE 65 Create the phylogenetic tree TIP 1: If you have many sequences creating the tree could take hours or even overnight. If this is the case, create a tree initially with the replicon number set to 100. TIP 2: To remove a sequence from the phylogenetic tree without deleting it from the alignment, unclick the relevant sequence and run the tree again (figure 21). 1) Close the data exploring window. 2) Go back to the ‘blue’ window and go to Phylogeny > Maximum Parsimony tree (figure 22). Under test of phylogeny, choose bootstrap method. 
Use the default settings in MEGA and change the number of bootstrap replicates (the replicon number) to 1000 (figure 23). Select compute. MEGA will produce a consensus tree of the samples (figure 24).
3) Once you have your tree you will need to set the ‘root’ of the tree. This is where you ‘tell’ the program which branch represents the root of your data.
4) Click on the bootstrap consensus tree (figure 24). Then select the root tool in the subtree menu and click the branch that represents the root of the tree (figures 25-26). Demonstrators will be available to assist in the interpretation of your results. This is the most important output for you, so it is important that you understand the different parameters and what they do.
5) You are now equipped with the skills to conduct phylogenetic analysis on sequence data. You may want to test the hypothesis again, but this time with a different gene. Your thorough understanding of these analysis tools is critical to data interpretation and your scientific manuscript. GOOD LUCK!

Figure 22: How to create the phylogenetic tree. Choose the Construct/Test Maximum Parsimony tree option.
Figure 23: Choose the Bootstrap method. Change the number of bootstrap replicates to 1000.
Figure 24: Choose the bootstrap consensus tree option indicated by the arrow.
Figure 25: How to set the root of the tree. First click on the ‘root’ function. Then click on the branch representing the root of the tree.
Figure 26: The final tree.

Save your phylogenetic tree. A standard MEGA caption can be generated by selecting the Caption option at the top of the window. Make sure that the caption is edited and paraphrased correctly. Not all of the information in the MEGA caption will be relevant to your analysis.
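The orientation check, reverse complementing and primer trimming described in the sequence-editing steps of session 1 can also be sketched in a few lines of Python. This is only an illustration and is not part of the MEGA workflow: the function names (reverse_complement, orientation, trim_read) are hypothetical, the primers are copied from figure 4 with the spaces removed, and the sketch assumes the primers occur in the read as exact matches, which real chromatogram data with miscalled bases will often not satisfy.

```python
# Complement of each base, including the redundancy codes (assumed IUPAC).
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W",
    "K": "M", "M": "K", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}

# Primers from figure 4, spaces removed.
FORWARD_PRIMER = "AAAAAGCTTCCATCCAACATCTCAGCATGATGAAA"
REVERSE_PRIMER = "AAACTGCAGCCCCTCAGAATGATATTTGTCCTCA"


def reverse_complement(seq: str) -> str:
    """Read the sequence backwards and swap each base for its complement."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def orientation(read: str) -> str:
    """Return 'forward', 'reverse' or 'unknown' from the primer found in the read."""
    read = read.upper()
    if FORWARD_PRIMER in read:
        return "forward"
    if REVERSE_PRIMER in read:
        return "reverse"
    return "unknown"


def trim_read(read: str) -> str:
    """Put the read in forward orientation and cut away primers and flanks."""
    read = read.upper()
    state = orientation(read)
    if state == "unknown":
        raise ValueError("neither primer found; inspect the chromatogram by hand")
    if state == "reverse":
        read = reverse_complement(read)

    # Drop the forward primer and everything 5' of it, if it is present.
    start = read.find(FORWARD_PRIMER)
    start = 0 if start == -1 else start + len(FORWARD_PRIMER)

    # In forward orientation the reverse primer shows up as its reverse
    # complement; drop it and everything 3' of it, if it is present.
    end = read.find(reverse_complement(REVERSE_PRIMER), start)
    if end == -1:
        end = len(read)
    return read[start:end]


if __name__ == "__main__":
    # Toy read: flank + forward primer + made-up insert + RC(reverse primer) + flank.
    toy = "GG" + FORWARD_PRIMER + "ATGACCCACCAA" + reverse_complement(REVERSE_PRIMER) + "TT"
    print(orientation(toy), trim_read(toy))
    print(orientation(reverse_complement(toy)), trim_read(reverse_complement(toy)))
```

Both calls in the usage section return the same trimmed insert, which mirrors the manual's rule that reads in the reverse orientation are reverse complemented before being added to the alignment.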
APPENDICES

How to use a Pipetman

College of Science & Engineering
Biological Sciences Building
Biology Discovery Centre (BDC)
Undergraduate Laboratory Safety Procedures
Tanya Rodaro, Work Health Safety Officer
Updated Jul 2017

General

The Flinders University Work Health and Safety Policy (2013) details the University's commitment to the proper management of work health and safety and injury management in compliance with the South Australian Work Health and Safety Act and Regulations 2012. Available at: http://www.flinders.edu.au/ppmanual/health-safety/work-health-and-safetypolicy.cfm.

The University is committed to providing a safe and healthy workplace for all persons in the workplace. Under the University WHS policy, Supervisors (including persons supervising students) are responsible and accountable for the day-to-day health and safety within their areas of responsibility. It is the responsibility of all students to comply with all safety instructions or directions, to take care to protect their own health and safety, and to avoid adversely affecting the health and safety of all others.
In particular both staff and students are responsible for: • complying with relevant University work health and safety policies, procedures and programmes; • obeying any reasonable instruction aimed at protecting their health and safety: o in particular adhering to the general and specific safety information provided in each practical session; • using any equipment provided to protect their health and safety in the workplace; • • reporting any incident or hazard in the workplace to their demonstrator or supervisor: on-line reporting through the FlinSafe portal http://www.flinders.edu.au/whs/ • considering and providing feedback on any matters which may affect their health and safety directly to the Work Health Safety Officer tanya.rodaro@flinders.edu.au or via their supervisor/demonstrator. Further University safety policies, procedures and information: http://www.flinders.edu.au/whs/ The standard laboratory environment Each laboratory has: • Door signage of basic laboratory safety requirements; • Evacuation and emergency procedures; • Fire extinguishers (for use by trained staff only); • First Aid Kits with First Aid response procedures; BIOL 2702 – GENETICS EVOLUTION AND BIODIVERSITY – 2018 PAGE 72 • Manual eyewash stations with instructions (ground level BDC) for flushing chemical or • • • • • • biological contaminants from eyes or built-in eyewash and/or safety shower complexes (Level 1 BDC & Lab 5 Biological Sciences); Spill kits with instructions for containing minor chemical or biological fluid spills; Safety glasses with alcohol wipes to clean before use; Nitrile gloves for protection of hands from accidental exposure to hazardous substances; Handwashing facilities; A disposal trolley and/or procedure with containers and signage directing safe disposal of waste; Access to the FlinSafe incident/accident reporting system. 
Each laboratory bench may have: • A yellow ‘Sharps’ container for disposal of syringes, needles, scalpels, glass pasteur pipettes, microscope slides and coverslips, and razor blades; • Clear buckets containing hypochlorite solution or a liner for decanting supernatants and disposing of pipette tips, swabs, microtubes and similar small, contaminated items and solutions; • 70% Ethanol spray bottles and tissues to wipe and clean bench. The responsibilities of STUDENTS Each student must: • Read the above section on the standard laboratory environment and familiarise yourself with locations of all items and information; • Read and comply with any risk and safety information provided in your manual and/or provided by Topic Coordinators, Lecturers, Demonstrators and Laboratory Services Staff; • Inform your Demonstrator of any allergies or medical condition that needs to be taken into consideration or that could have health or safety implications; • Report any issues that have the potential to impact on your health and safety or that of others to Laboratory Services Staff or your Demonstrator; • Immediately report accidents to your Demonstrator and complete a FlinSafe on-line incident report (your Demonstrator is your ‘supervisor’). General: • Closed toe footwear and a laboratory coat or equivalent must be worn in all wet laboratories and/or as directed; • Long hair must be tied back; • Do not run in the laboratory or behave in a manner that could affect your own or any other person’s safety; • No food or drink is to be consumed in the laboratory; • Use of and media devices (e.g. 
iPods, MP players) is banned in all wet laboratories in the Faculty; • Use of mobile phone camera apps may be permitted but with due consideration of chemical and biological contamination risks; • Keep work books and pens etc clear of your bench work area to avoid exposure to contaminants; • Wipe down your bench on completion and clean up used materials as directed; • Remove your labcoat and wash your hands before leaving the laboratory. BIOL 2702 – GENETICS EVOLUTION AND BIODIVERSITY – 2018 PAGE 73 The responsibilities of DEMONSTRATORS Each Demonstrator must: • Be aware of the potential hazards associated with each exercise and familiarise themselves with control and containment provisions; • Raise specific hazards and controls to the attention of students AND enforce compliance; • In the event of an accident, complete the supervisor section of the Accident/Incident form; • Report any potential safety issues to the Work Health Safety Officer and/or the Teaching Support Technical Coordinator. The responsibilities of Laboratory Services technical staff The Laboratory Services technical staff must: • Identify and control all hazards in the laboratories; • Risk assess and review all safety issues in exercises and implement controls; • Ensure that appropriate hazard and safety information is provided AND enforced by Demonstrators; • Implement elimination or substitution whenever possible to reduce the risks; • Report safety issues and make considered safety recommendations to the Safety Manager with the aim to continuously improve undergraduate laboratory safety. Tanya Rodaro, Work Health Safety Officer 8201 2368 tanya.rodaro@flinders.edu.au Room 201 Biological Sciences Building
Flinders University
Bachelor of Science (Plant Biology)
Assignment Cover Sheet

DETAILS:
Topic: BIOL 2702: Genetics, Evolution and Biodiversity
Class No/Room/time of practical:
Student Name:
Student ID:
Assignment Title: Manuscript
Word count:
Date:

Abstract

To address the research question, we extracted DNA from sample seed plants, amplified a section of DNA from each sample using PCR, and ran the products through agarose gel electrophoresis. We then constructed a phylogenetic tree from the resulting DNA sequences.

Introduction

According to Qiu (2008), it has taken scientists decades to identify and decipher the origin and evolutionary pattern of land plants. A proper understanding of the evolutionary pattern of seed plants in particular is essential for the protection of plant species. Lee et al. (2011) make the point that a main objective for evolutionary biologists has been to learn more about the genomic and genetic basis of plant diversification. The lineages of seed plants have long been a target of investigation, but without a real breakthrough. Identifying the most suitable habitats of different plants may help to protect them and promote their growth. Similarly, macro-evolutionary patterns help scientists to gain a deeper understanding of plant life. Smith and Brown (2018) point out that large phylogenies can help explain the macro-evolutionary patterns that inform our understanding of how the tree of life is shaped. The authors state that these phylogenies also serve as a tool that assists other evolutionary, systematic, and ecological analyses. The authors conducted a study which demonstrates that molecular results can explain the phylogeny of seed plants.
They had limited resources, however, and much more needs to be done to improve on their results. Scientists still face several challenges, such as explaining the variety of plants on Earth. According to Harris and Davies (2016), the uneven distribution and abundance of species across plant groups has always been challenging for evolutionary biologists to explain. On the basis of their results, the authors argue that the diversification of plants is decoupled from clade age, irrespective of the age range of the clades being examined. They are of the opinion that molecular phylogenetic studies provide an opportunity to contrast competing hypotheses about the evolution of seed plant species.

Despite a large amount of time spent on molecular phylogenetic studies, the phylogeny of gymnosperms has not been resolved. The phylogeny of Gnetales remains a controversial issue in seed plant evolution. A phylotranscriptomic study was conducted which sampled all 13 families of gymnosperms and the main lineages of angiosperms. The authors claim that their study produced a well-resolved phylogeny of seed plants. The study also revealed that cycads plus Ginkgo are sister to the remaining gymnosperms. It found that angiosperms and Gnetales have similar rates of molecular evolution, which may mean that the two groups have undergone similar selective pressures in their evolutionary history (Jin-Hua et al., 2018). Because previous studies leave gaps, such as the inability to state the evolutionary tree of seed plants authoritatively, it is vital to continue molecular studies to answer the question of the origin of living seed plants.
Mathews (2009) maintains that the conflict between molecular and morphological phylogenetic studies of living seed plants can be resolved by sampling sequence data from more taxa, especially from additional gymnosperms, so that living seed plant diversity is better represented. Specifically, analyses of molecular sequences can help to uncover the evolution of seed plants. Bowe et al. (2000) state that the question of seed plant evolution can be addressed by using molecular sequences with slower rates of nucleotide substitution. Bowe et al. (2000) also state that extracted DNA can be amplified using the polymerase chain reaction (PCR), which helps in understanding the genomes and genetic composition of seed plants and thus their evolutionary patterns. Molecular experiments are the basis for understanding the phylogeny of seed plants (Qiu, 2008). This study hypothesizes that molecular experiments can be helpful in identifying the phylogeny of seed plants. Working within this hypothesis, the study aims to answer the research question: can molecular experiments reveal a clear evolutionary chain for the seed plant species?

Materials and Methods

DNA Extraction

Plant materials were collected from Ginkgo, Cycad, Wollemi, Pine, Tobacco, QLD Kauri and Bunya pine plants, with two leaf samples taken from each: a 50 mg dry sample and a 150 mg wet sample. The plant tissue was placed into a patty pan and diced. The diced sample was transferred to a mortar, liquid nitrogen was added, and the tissue was ground hard with a pestle. After grinding, the contents were transferred into a 1.5 ml microfuge tube using a flat spatula, 400 µl Lysis Buffer PA1 was added, and the mixture was vortexed. After vortexing, 10 µl of RNase A solution was added to the microfuge tube and mixed thoroughly before incubation at 65℃ for about 30 minutes.
In the next step, an ISOLATE II Filter was placed into a 2 ml Collection Tube and the lysate was loaded onto the column. The tube was then centrifuged at 13,000 rpm for two minutes. The clear flow-through was collected into a new 1.5 ml microfuge tube, while the ISOLATE II Filter and pellet were discarded. To the tube, 450 µl of Binding Buffer PB was added and mixed thoroughly by pipetting 5 times. An ISOLATE II Plant DNA Spin Column was then placed into another 2 ml Collection Tube and the sample was loaded. The sample was then centrifuged at 13,000 rpm, and the flow-through discarded. Wash Buffer PAW1 (400 µl) was added to the column, which was centrifuged again at 13,000 rpm for about a minute, and the flow-through discarded. This step was repeated by adding 700 µl Wash Buffer PAW2 to the sample, followed by 200 µl Wash Buffer PAW2. The sample was then centrifuged at 13,000 rpm for about 2 minutes to remove the wash buffer and dry the silica membrane. The ISOLATE II Plant DNA Spin Column was then placed in a new 1.5 ml microfuge tube. To this tube, 50 µl of preheated Elution Buffer PG was added onto the silica membrane before incubation at 65℃ for about five minutes, followed by centrifuging at 13,000 rpm for a minute. This step was repeated using another 50 µl Elution Buffer PG in the same tube. The samples were then labeled and stored in a freezer at -20℃ until PCR.

Polymerase Chain Reaction (PCR)

After determining the number of PCR reactions required for each pair, the necessary controls were made. The volume of each mastermix reagent needed for that number of reactions was then calculated and recorded. The PCR mastermix was prepared from the reagents provided in a 1.5 ml tube, mixed for ten seconds, and centrifuged at 8,000 rpm for ten seconds. The tubes were then labeled for easy identification. To each PCR tube, 23 µl of the mastermix was added using a pipette, followed by 2 µl of the appropriate sample.
The tubes were then briefly spun down and kept on ice until they were loaded into the PCR machine. The PCR was run for three hours with the following cycling conditions:
▪ Pre-PCR cycle: 2 minutes at 95℃.
▪ 34 x PCR cycle: Step 1: 45 sec at 95℃, Step 2: 45 sec at 52℃, Step 3: 60 sec at 72℃.
▪ Post-PCR cycle: 5 minutes at 72℃; ∞ at 4℃.

Visualization of PCR products by electrophoresis through an agarose gel

The agarose gel casting tray was prepared in the gel tank and the comb placed in position before the molten agarose was poured into the tray. After about 20 minutes of setting, TAE buffer was added to the tank until the agarose gel was completely immersed. The comb was then carefully removed from the gel. Using a pipette, 3 µl of DNA marker was loaded into the first lane, followed by 10 µl of each sample into the other lanes. The electrophoresis tank was then closed with the correct polarity, and the voltage and timer were set to 90 V and 60 minutes respectively. After 60 minutes, the gel was removed and placed into a travel container. The Biorad GelDoc system was then used to photograph the agarose gel. The gel contains Midori Green, which binds DNA and fluoresces in the presence of UV light.

Data acquisition and sequence of the PCR product

The ribosomal spacer regions (5.8S rRNA, ITS_2 and 28S rRNA) in the wet lab were PCR-amplified using the degenerate primer pair wol degIR and wol deg2F. The PCR products were cloned into the pGEM-T Easy vector and then sequenced at the Australian Genome Research Facility (AGRF) using T7 and SP6 primers to provide electropherograms that can be viewed using the MEGA software.

Sequence editing, sequence identification and multiple sequence alignment (Mega software)

Samples were examined to establish the orientation of the sequence. This was done by locating the forward and reverse primers.
The vector sequence upstream of the forward primer and the primer sequences themselves were deleted, as was the sequence downstream of the reverse primer. A BLAST search was then done on all of the samples to confirm the sequence identity. The scientific name and similar species were also recorded from the BLAST search. The samples were then added to the alignment.

Building an alignment and constructing a phylogenetic tree (Mega software)

Other species relevant to the research question were then added to the alignment. These were found both from the results of the BLAST search and by querying GenBank through NCBI. Extra sequence not needed for the alignment was trimmed from the affected sequences. Once all sequences were aligned, the data were exported to MEGA format. Missing data at the start and end of the sequences in the alignment were coded with a question mark. All data were explored, and the variable, conserved, singleton and parsimony-informative sites were recorded. All the relevant sequences were then analysed and used to construct a phylogenetic tree. The root of the tree was then selected in relation to the research question.

Results

Conclusion/ Discussion

References

Bowe, L., Coat, G., and DePamphilis, C. (2000). Phylogeny of seed plants based on all three genomic compartments: Extant gymnosperms are monophyletic and Gnetales' closest relatives are conifers. Proceedings of the National Academy of Sciences of the United States of America, 97(8), 4092-4097.
Harris, L., and Davies, T. (2016). A Complete Fossil-Calibrated Phylogeny of Seed Plant Families as a Tool for Comparative Analyses: Testing the 'Time for Speciation' Hypothesis. PLoS ONE, 11(10), e0162907.
Jin-Hua, R., Ting-Ting, S., Ming-Ming, W., and Xiao-Quan, W. (2018). Phylogenomics resolves the deep phylogeny of seed plants and indicates partial convergent or homoplastic evolution between Gnetales and angiosperms. Proceedings of the Royal Society B, issue of June 27, 2018.
Lee, E., Cibrian-Jaramillo, A., Kolokotronis, S., Katari, M., Stamatakis, A., Ott, M., . . . Sanderson, M. (2011). A Functional Phylogenomic View of the Seed Plants. PLoS Genetics, 7(12), e1002411.
Mathews, S. (2009). Phylogenetic relationships among seed plants: Persistent questions and the limits of molecular data. American Journal of Botany, 96(1), 228-236.
Qiu, Y.L. (2008). Phylogeny and evolution of charophytic algae and land plants. Journal of Systematics and Evolution, 46(3), 287-306.
Smith, S., and Brown, J. (2018). Constructing a broadly inclusive seed plant phylogeny. American Journal of Botany, 105(3), 302-314.
American Journal of Botany 96(1): 228–236. 2009. PHYLOGENETIC RELATIONSHIPS AMONG SEED PLANTS: PERSISTENT QUESTIONS AND THE LIMITS OF MOLECULAR DATA1 Sarah Mathews2 The Arnold Arboretum of Harvard University, 22 Divinity Avenue, Cambridge, Massachusetts 02138 USA Trees inferred from DNA sequence data provide only limited insight into the phylogeny of seed plants because the living lineages (cycads, Ginkgo, conifers, gnetophytes, and angiosperms) represent fewer than half of the major lineages that have been detected in the fossil record. Nevertheless, phylogenetic trees of living seed plants inferred from sequence data can provide a test of relationships inferred in analyses that include fossils. So far, however, significant uncertainty persists because nucleotide data support several conflicting hypotheses. It is likely that improved sampling of gymnosperm diversity in nucleotide data sets will help alleviate some of the analytical issues encountered in the estimation of seed plant phylogeny, providing a more definitive test of morphological trees. Still, rigorous morphological analyses will be required to answer certain fundamental questions, such as the identity of the angiosperm sister group and the rooting of crown seed plants. Moreover, it will be important to identify approaches for incorporating insights from data that may be accurate but less likely than sequence data to generate results supported by high bootstrap values. How best to weigh evidence and distinguish among hypotheses when some types of data give high support values and others do not remains an important problem. Key words: DNA sequences; fossils; morphology; phylogeny; seed plants. Living seed plants comprise the cycads, Ginkgo, conifers, gnetophytes (together, extant gymnosperms), and angiosperms. 
Extinct gymnosperms that cannot be assigned to living groups include hydraspermans, medullosans, peltasperms, glossopterids, Caytonia, Pentoxylon, Callistophyton, corystosperms (all these are often referred to as pteridosperms or seed ferns), Bennettitales (sometimes referred to as cycadeoids), Erdtmanithecales, Cordaitales, Paleozoic and Mesozoic conifers, and ginkgophytes. Seed plant diversity is great enough, and the surviving lines divergent enough, that there have been those who hesitated or were unwilling to include them in a single lineage (e.g., Chamberlain, 1935; Arnold, 1948). Arnold (1948, p. 3) took fellow botanists to task for being “completely satisfied to group together quite unrelated plants” based on the character of the seed alone. He and others placed seed plants in at least three groups that were thought to be linked with different groups of free-sporing plants: angiosperms, cycadophytes (“seed ferns,” cycads, Bennettitales) and coniferophytes (Cordaitales, ginkgos, conifers, with or without gnetophytes). Chamberlain (1935) included gnetophytes in coniferophytes, while Arnold (1948) placed them in a separate group, the Chlamydospermophytes. Discussions of seed plant origins shifted focus after the startling discovery of a connection between Archaeopteris (fragments of fern-like fronds from the Devonian) and Callixylon (permineralized twigs, branches, and trunks with wood that linked them with gymnosperms), leading to the recognition of progymnosperms (e.g., Beck, 1960a, b, 1966). Beck hypothesized a diphyletic origin of seed plants from progymnosperms, arguing that cycadophytes and coniferophytes likely arose from different progymnosperms in the order Aneurophytales (Beck 1960b, 1966; Stein and Beck, 1987; see also Bierhorst, 1971). Rothwell (1982) argued for a monophyletic 1 origin from an aneurophytalean ancestor, with both coniferophytes and cycadophytes being derived from within hydrasperman seed plants. 
The question of whether seed plants are monophyletic remains open to this day. It can only be partially tested with sequence data, despite statements by molecular systematists who claim that seed plant monophyly has been clearly confirmed by molecular phylogenetic studies that include both seed and free-sporing plants (e.g., Qiu, 2008). Sequence data could refute monophyly by placing seed plants with different groups of living free-sporing plants, but they are powerless to distinguish between the hypotheses proposed by Beck (1960b, 1966) and Rothwell (1982). To do so requires a matrix of morphological data that includes all possible representatives of the closest relatives of seed plants (progymnosperms), representatives of all seed plant lineages, living and extinct, as well as an ample diversity of lycophytes and ferns. A maximum of three progymnosperms have been included in previous phylogenetic analyses, one of which (Cecropsis) can be scored only for the anatomy and organization of the fertile shoot system (e.g., Rothwell and Serbet, 1994; Hilton and Bateman, 2006). In one of these studies (Rothwell and Serbet, 1994), lycophytes, trimerophytes, equisetalean and filicalean ferns were included in a preliminary analysis from which was inferred a hypothetical ancestor, which was then included to root the seed plant phylogeny. In the other study (Hilton and Bateman, 2006), lycophytes and ferns were not included; a progymnosperm (Tetraxylopteris) was designated as the outgroup. No criticism is intended in these observations. It is difficult to obtain the needed data because fossils are fragmentary or remain uncharacterized and because it is challenging to assess homology of morphological characters in both living and extinct taxa across seed and free-sporing plants. Relationships within seed plants also remain ambiguous. 
Morphological analyses have not supported the cycadophyte concept (Crane, 1985; Doyle and Donoghue, 1986; Nixon et al., 1994; Rothwell and Serbet, 1994; Doyle, 1996, 2001, 2006; Hilton and Bateman, 2006). These studies found “seed ferns” to be polyphyletic, consistent with their extreme heterogeneity and the wide range of sophistication in their reproductive structures, and failed to unite cycads and Bennettitales. Coniferophytes also receive little support in results from morphological analyses, although several of the inferred phylogenetic trees include a clade that unites fossil and living conifers with Cordaitales (e.g., Crane, 1985; Nixon et al., 1994; Rothwell and Serbet, 1994; Hilton and Bateman, 2006).

1 Manuscript received 26 May 2008; revision accepted 17 November 2008. The author thanks S. Renner and one anonymous reviewer for suggestions for improvements of this manuscript.
2 E-mail: smathews@oeb.harvard.edu
doi:10.3732/ajb.0800178
January 2009] Mathews—Phylogenetic relationships among seed plants

DNA sequence data can provide only limited insight into the question. The living lines are almost certainly more closely related to various extinct groups than to each other, particularly in the cases of cycads and angiosperms (e.g., Fig. 1). Nevertheless, trees from sequence data can refute relationships inferred in analyses that include fossils. For example, trimming fossils from the optimal trees inferred in recent morphological analyses (Doyle, 2006; Hilton and Bateman, 2006) would leave the living taxa united as depicted in Fig. 2, with angiosperms nested in gymnosperms, united with gnetophytes, and with cycads sister to all other seed plants. This hypothesis apparently is refuted by analyses of sequence data. Instead, molecular trees differ in a way that highlights three persistent and long-debated phylogenetic questions: What is the sister group of the angiosperms? What is the position of the gnetophytes? 
What is the rooting of the crown seed plants (spermatophytes sensu Cantino et al., 2007)? THE SISTER GROUP OF THE ANGIOSPERMS In a 1960 speech on the origin of angiosperms, T. M. Harris asked his listeners “to look back, not on a proud record of the success of famous men, but on an unbroken record of failure” (Beck, 1976, p. 1). Writing 16 years later, Beck’s analysis of progress toward understanding angiosperms was considerably more optimistic. Nonetheless, he was writing at a time when the timing of their origin was more controversial than it is today, when the identities of the earliest diverging members were obscure, when the place and habitat of origin were more controversial, when angiosperm monophyly remained to be tested in phylogenetic analyses, and when not all agreed that the angiosperm sister group was to be found among the gymnosperms. Significant advances have been achieved on all of these fronts (Crane, 1985; Doyle and Donoghue, 1986; Mathews and Donoghue, 1999, 2000; Parkinson et al., 1999; Qiu et al., 1999; Graham and Olmstead, 2000; Feild et al., 2003, 2004; Magallón and Sanderson, 2005), due in large part to the advent of molecular systematics and the development of computational approaches and resources. The question that persists concerns the relationship of angiosperms to other seed plants.

Fig. 1. Seed plant phylogeny inferred by Doyle (2006, fig. 6), with angiosperms and conifers collapsed to one or two branches, respectively.

Fig. 2. Tree of extant taxa obtained by trimming fossils from the tree inferred by Doyle (2006, fig. 6).

The tree in Fig. 2 is compatible with the anthophyte concept as articulated by Doyle and Donoghue (1987) for a clade of taxa with aggregations of sporophylls that were interpreted as flower-like. 
The clade included angiosperms, gnetophytes, Bennettitales, and Pentoxylon (e.g., Crane, 1985; Doyle and Donoghue, 1986, 1992; Nixon et al., 1994; Rothwell and Serbet, 1994), or in an expanded version, it also included glossopterids and Caytonia in a clade referred to as glossophytes (Doyle, 1996, 2006; Hilton and Bateman, 2006). Nearly all analyses of DNA sequence data contradict the concept of anthophytes or glossophytes by failing to resolve gnetophytes either as paraphyletic or as sister to the angiosperms. The exceptions are maximum parsimony (MP) or neighbor-joining (NJ) trees inferred from nuclear ribosomal DNA (rDNA; Stefanović et al., 1998; Rydin et al., 2002; but see Chaw et al., 1997, and Burleigh and Mathews, 2004, fig. 2) or RNA (rRNA; Hamby and Zimmer, 1992), and in one case, from rbcL (Rydin and Källersjö, 2002). These exceptional trees unite gnetophytes and angiosperms, but without even moderate bootstrap support. Rather, a highly supported topology from analyses of sequence data (Bowe et al., 2000; Chaw et al., 2000; Nickrent et al., 2000; Gugerli et al., 2001; Soltis et al., 2002; Burleigh and Mathews, 2004) is shown in Fig. 3A. Not only are the gnetophytes nested within conifers (discussed next), but angiosperms and extant gymnosperms are each resolved as monophyletic, suggesting that angiosperms have no close relatives among living gymnosperms. THE POSITION OF GNETOPHYTES “The Gnetales, like Minerva, seem to have sprung, full armed, from the head of Jove.” —Chamberlain (1935, p. 433) Given such a viewpoint, perhaps Chamberlain would not have been surprised when the results from analyses of sequence data suggested that gnetophytes had sprung from conifers (Fig. 3A; Bowe et al., 2000; Chaw et al., 2000; Nickrent et al., 2000; Gugerli et al., 2001). However, amid a community that had largely embraced anthophytes, the results were surprising (e.g., Palmer et al., 2004). 
Even botanists who were more familiar with characters that suggested a link with conifers or who argued that putative synapomorphies for angiosperms and gnetophytes were homoplasies (e.g., Kubitzki, 1990) greeted the idea that gnetophytes had sprung from within conifers with caution (e.g., Donoghue and Doyle, 2000). Conifer monophyly is apparently supported by a number of synapomorphies, including resin canals, tiered proembryos, single copy condition of the plastid inverted repeat, and the ovulate cone scale (Chamberlain, 1935; Crane, 1985; Hart, 1987; Raubeson and Jansen, 1992; Donoghue and Doyle, 2000). Nevertheless, trees from sequence data have consistently united gnetophytes with Pinaceae in a highly supported “gnepine” clade and placed gnepines as sister to a clade of the other conifer families (Cupressophyta sensu Cantino et al., 2007). There are notable, well-supported, exceptions, and in this sense, the results from sequence analyses extend rather than resolve the puzzle surrounding the position of the gnetophytes that has persisted through the years (Arber and Parkin, 1907, 1908; Wettstein, 1907; Thompson, 1918; Chamberlain, 1935; Bailey, 1944; Eames, 1952; Nixon et al., 1994; Doyle, 1996). One of these is depicted in Fig. 3B, which resolves gnetophytes as sister to all other seed plants. This topology is well supported in certain analyses, mostly of concatenated data sets. However, the topology is rarely supported in maximum likelihood (ML) analyses or in parsimony analyses that exclude faster-evolving sites (e.g., Rydin et al., 2002; Burleigh and Mathews, 2004; Hajibabaei et al., 2006; for exceptions, see Burleigh and Mathews, 2007a, and Rai et al., 2008), and it may possibly result from error in reconstruction (Sanderson et al., 2000; Burleigh and Mathews, 2007b). 
While the gnepine hypothesis remains controversial, a link between conifers, gnetophytes, and Ginkgo was implicit in Chamberlain’s (1935) placement of gnetophytes in coniferophytes (although not without reservation [Chamberlain, 1935, p. 433]). Conifers and gnetophytes share linear leaves, reduced sporophylls, and circular bordered pits with tori in the protoxylem, and together with Ginkgo, they uniquely share metaxylem that lacks scalariform pitting (Bailey, 1944; Bierhorst, 1971; Carlquist, 1996; Doyle, 1996). Thus, a clade in which monophyletic conifers are sister to monophyletic gnetophytes (referred to as a “gnetifer” clade) apparently would be consistent with other lines of evidence. However, gnetifer trees have rarely been inferred in molecular analyses (exceptions are in Chaw et al., 1997; Rydin and Källersjö, 2002; Hajibabaei et al., 2006; Burleigh and Mathews, 2007a). THE ROOTING OF THE CROWN SEED PLANTS “A position of the root between the cycad and Ginkgo nodes might be very difficult to detect, because this branch is so short compared to the long branches to angiosperms and Gnetales.” —Donoghue and Doyle (2000, p. R108) Both angiosperms and gnetophytes are nested well within trees that include living and fossil taxa (Fig. 1), whereas the best-supported rootings of molecular trees are along the branches to angiosperms (Fig. 3A) or gnetophytes (Fig. 3B). These are two of the longest (if not the longest) branches in most molecular trees (see Graham and Iles, 2009, pp. 216–227 in this issue); conversely, the branch between the cycad and Ginkgo nodes is very short in trees that do unite these branches in a clade. The concern voiced by Donoghue and Doyle in the opening quote is that a long branch from the outgroup may be unlikely to attach to such a short branch. Consistent with this, there is evidence that the rooting along the gnetophyte branch may result from long-branch attraction (Sanderson et al., 2000; Burleigh and Mathews, 2007b). Both trees imply that the first dichotomy in the seed plant phylogeny splits angiosperms (or gnetophytes) from all other extant seed plants, which is inconsistent with currently available stratigraphic evidence (Doyle, 1998).

Fig. 3. The hypotheses of seed plant phylogeny most commonly inferred in analyses of concatenated DNA data sets, with most data sets giving both trees, depending on the analytical method (Bowe et al., 2000; Chaw et al., 2000; Nickrent et al., 2000; Gugerli et al., 2001; Magallón and Sanderson, 2002; Rydin et al., 2002; Soltis et al., 2002; Burleigh and Mathews, 2004, 2007a, b; Hajibabaei et al., 2006; Rai et al., 2008).

ISSUES WITH DNA SEQUENCE DATA It would be an oversimplification to say that these questions remain unresolved as a result of conflict between molecular and morphological data; there is ambiguity in both types of data. On the one hand, Doyle (2006) found that morphological trees placing gnetophytes within conifers (although not with Pinaceae) are just one step longer than the most parsimonious trees, which are anthophyte trees, but neither of these results is robust. On the other hand, a single clear signal has not emerged from molecular studies. Although there have been several efforts to sample multiple loci and/or concatenate data from previously published seed plant studies to increase the number of characters and loci analyzed (e.g., Bowe et al., 2000; Chaw et al., 2000; Nickrent et al., 2000; Gugerli et al., 2001; Rydin et al., 2002; Soltis et al., 2002; Rai et al., 2003; Burleigh and Mathews, 2004; Hajibabaei et al., 2006), consensus remains elusive. 
Exploration of some of these data sets has identified several factors that may result in erroneous trees, including high taxonomic sampling error (due to extinctions), saturation at nucleotide sites (due to the age of divergence among major clades), high rate variation across sites and across clades, conflicting signal within and among genetic loci that are used as phylogenetic markers (e.g., Chaw et al., 2000; Sanderson et al., 2000; Magallón and Sanderson, 2002; Rydin et al., 2002; Soltis et al., 2002; Burleigh and Mathews, 2004, 2007a; Hajibabaei et al., 2006), and error and bias in phylogenetic reconstruction (Sanderson et al., 2000; Burleigh and Mathews, 2007b). One effective approach for reducing conflicting signal in single and concatenated data sets is to bin sites based on estimated rates of evolution and to experiment with removing different rate classes (Burleigh and Mathews, 2004; see also Rodríguez-Ezpeleta et al., 2007). For example, Burleigh and Mathews (2004) found that removal of fast-evolving positions from a 13-locus concatenated seed plant data set resulted in convergence of both MP and ML on a gnepine tree, an apparent resolution of the conflict between results from parsimony analyses of all sites, which favored gnetophytes as sister to all seed plants, and likelihood analyses of the same, which favored gnepine trees. However, this does not mean that the gnepine tree is correct, only that one signal is enhanced and the other is dampened when rapidly evolving sites are removed. Both signals cannot be correct, but both may be erroneous. Intuitively, removing noisy sites that may hinder resolution of the question of interest makes sense, but because there is evidence of bias in both slowly and rapidly evolving sites (Burleigh and Mathews, 2007b), reducing noise does not necessarily reduce error. An additional, potentially confounding factor is heterotachy, or shifts in site-specific rates of evolution across time. 
Heterotachous sites are likely to exist in seed plant data sets and their presence and effects should be explored. TAXONOMIC SAMPLING The best analytical approaches yield limited insight when too few taxa are sampled. Analyses of sequence data from seed plants have included very few extant gymnosperms, fewer than half of the genera and 6% of the species. Most of the highly cited seed plant studies have included 10, 11, 19, or 21 of ~1100 gymnosperms in 85 genera (Bowe et al., 2000; Chaw et al., 2000; Gugerli et al., 2001; Rydin et al., 2002; Soltis et al., 2002; Rai et al., 2003; Burleigh and Mathews, 2004). The negative effects of the factors just outlined on phylogenetic accuracy are likely to be exacerbated when taxonomic sampling is so limited, even when using appropriate models of nucleotide evolution, removing certain classes of sites, and using analytical approaches that are more robust to error. Increasing taxa can increase accuracy (e.g., Hillis, 1996, 1998; Graybeal, 1998; Stefanović et al., 2004; Philippe et al., 2005) and the efficiency with which a method converges on an accurate tree (e.g., Kim, 1998). Just one significant effort to increase taxonomic sampling has been made in a study that included 69 gymnosperms (Rydin et al., 2002). The fact that Bayesian or ML analysis of their data yields a highly supported gnetifer tree is intriguing (Burleigh and Mathews, 2007a; S. Mathews, unpublished data). However, it is unclear whether this might result from increased taxonomic sampling, from the choice of loci (Burleigh and Mathews, 2007a), or both. The result may be misleading, or it may be that the set of loci analyzed by Rydin et al. (2002) serendipitously captured the signal of the species phylogeny. Analyses of morphological data also have included relatively few taxa. 
Because the fossil record suggests that there are many distinctive lineages that cannot be assigned to modern groups, the pattern of seed plant evolution cannot be determined without analyses of morphological evidence. However, the detailed morphological investigations of living taxa that are required to properly interpret fossil material are often lacking (Crane et al., 2004). A further challenge to interpreting the fossils is the difficulty and slow pace of reconstructing entire fossil plants from dispersed fossil organs. Thus, while whole-plant reconstructions are the standard for which we should strive, it also will be important to experiment with the inclusion of incomplete fossils because these may increase phylogenetic accuracy (Wiens, 2003, 2005). CHARACTER SAMPLING The increasing ease with which nucleotide characters can be accumulated means that it is particularly important to grapple with the question of how best to do so and/or with the question of how best to analyze concatenated data sets. Although adding characters may increase phylogenetic accuracy (e.g., Graybeal, 1998), both theoretical and empirical studies have shown that it does not always do so and that, in fact, adding characters in some cases increases support for an erroneous tree (e.g., Felsenstein, 1978; Kolaczkowski and Thornton, 2004; Stefanović et al., 2004; Philippe et al., 2005; Matsen and Steel, 2007; Rodríguez-Ezpeleta et al., 2007). In at least some cases, gene trees will not match the species tree, and for some combinations of branch lengths in the species trees, incongruent trees may actually be more likely than congruent gene trees (Degnan and Rosenberg, 2006; Kubatko and Degnan, 2007). In these cases, the most frequently observed gene tree in combined data will be an incorrect estimate of the species tree (Degnan and Rosenberg, 2006). 
Thus, when data are concatenated from many loci, it is important to explore the different methods available for analyzing these data sets, particularly those appropriate for highly heterogenous data sets (e.g., Nylander et al., 2004; Brown and Lemmon, 2007; Edwards et al., 2007; Liu and Pearl, 2007). A contrasting problem exists with respect to morphological characters. Relatively few structural characters have been identified that can be scored for morphological analyses. Here both effort and new techniques (e.g., Friis et al., 2007) are needed. One concern surrounding the paucity of morphological characters that can be included in a phylogenetic matrix is that if added to a matrix of nucleotide characters, their signal would be swamped. With this in mind, it would be interesting to test the results of combining morphological characters with subsets of a nucleotide matrix. For example, in the case of seed plant analyses, where the faster evolving sites are likely to be saturated and may have little information regarding deeper divergences in the tree, one might combine just the slowest evolving sites with the morphological characters. SOME RECENT STUDIES One of the largest character sets to date has been assembled by Rai and Graham (Rai et al., 2008) to address both conifer and higher order seed plant relationships. Their study uses a strategy of sampling 17, noncontiguous and functionally diverse regions of the plastid genome, in total comprising approximately 14.1 kb unaligned, about one ninth of the genome. Two trees have been inferred from these data, sampled from 38 species (28 of which are gymnosperms). The parsimony tree is identical to the tree in Fig. 3B, with gnetophytes sister to all seed plants, but the topology of the ML tree is novel: gnetophytes are sister to all seed plants, but conifers are sister to a clade in which Ginkgo is sister to cycads + angiosperms. 
If the rooting of this tree is wrong and if it were to be rerooted between Ginkgo and cycads, it would give a coniferophyte clade (sensu Chamberlain, 1935) on the one hand and a clade of cycads and angiosperms on the other. Substantially larger plastid data sets were analyzed by Wu et al. (2007) and McCoy et al. (2008), sampling 56 and 57 plastid genes, respectively. However, each study included only four gymnosperm genera (Cycas, Ginkgo, Pinus, and Gnetum in Wu et al., 2007; Cycas, Ginkgo, Pinus, and Welwitschia in McCoy et al., 2008) and so cannot be used to test the relationships of conifers and gnetophytes. As in previously published studies, either Gnetum or Welwitschia and Pinus are sister taxa (e.g., Fig. 3A; all trees in Wu et al., 2007; ML and Bayesian trees in McCoy et al., 2008), or Gnetum or Welwitschia are sister to all other seed plants (e.g., Fig. 3B; MP and NJ trees in McCoy et al., 2008). An alternative approach for assembling a large character set is to sample EST databases, which has the added value of sampling nuclear genes. A recent analysis of seed plant EST data from Cycas, Ginkgo, Pinus, and Gnetum (de la Torre et al., 2006) placed Gnetum and Pinus in a well-supported clade. The utility of ESTs may be best exemplified in a recent study in which a combination of newly generated and published EST data were analyzed to resolve multiple long standing phylogenetic questions in animal phylogeny (Dunn et al., 2008). What may have been a key in the apparent success of the study was the strategic accumulation of new EST data to fill in critical taxonomic gaps.

Fig. 4. (A) Relationships of seed plant phytochromes; angiosperms have PHYA, PHYB, and PHYC, while gymnosperms have PHYN, PHYO, and PHYP, with the exception that PHYO apparently is missing from the gnetophytes (Mathews, 2006). (B) Species tree in the PHYN/PHYA clade when all nucleotide sites are included in the analysis. (C) Species tree in the PHYN/PHYA clade including only nucleotide sites estimated to be evolving most slowly.

Supermatrices are an alternative to phylogenomic approaches that use orthologous genes from whole genome or EST sequences of a relatively small number of taxa. Supermatrices assembled from data in GenBank take advantage of the large number of sequences deposited there from phylogenetic and population studies. Due to very heterogenous sampling (few taxa represented by many genes, many taxa represented by few genes), these supermatrices may have sequences from many more taxa, but will also have a high percentage of missing data (e.g., Driskell et al., 2004; McMahon and Sanderson, 2006). More than 700 gymnosperms are represented in GenBank by at least one sequence and approximately 680 were included in a supermatrix assembled by Burleigh and Mathews (unpublished data). The matrix has 88 815 sites, but 95.4% of the data cells are empty. Relationships among the major seed plant clades are highly supported in trees inferred from this sparse supermatrix, and gnetophytes are united not with Pinaceae but with cupressophytes (all conifer families but Pinaceae). This is true of both the ML and MP bootstrap trees, except for the MP trees that include outgroup sequences, in which case, gnetophytes are sister to all other seed plants (J. G. Burleigh and S. Mathews, unpublished data). However, analyses of a denser matrix (taxa trimmed to include only those with a minimum of 10 000 nucleotides of data in the matrix, leaving 38 gymnosperms, 12 angiosperms, and 4 outgroups) yield gnepine trees, except again in the case where parsimony is used to analyze the matrix that includes outgroup data, which yields gnetophytes as sister to all other seed plants. Overall, these data thus reduce confidence in gnepine trees, but provide additional support for a link between conifers and gnetophytes. 
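The two supermatrix quantities mentioned above, the share of empty data cells and the trimming of taxa that fall below a minimum number of nucleotides, can be illustrated with a minimal Python sketch. It is not the procedure used in the unpublished analyses. The function names, the toy sequences, and the choice to count "-", "?", and "N" as empty cells are all assumptions made for illustration.

```python
from typing import Dict

# Assumption: these characters mark an empty cell in the aligned matrix.
MISSING = set("-?Nn")


def count_nucleotides(seq: str) -> int:
    """Count the cells in one taxon's row that hold an actual nucleotide."""
    return sum(1 for ch in seq if ch not in MISSING)


def missing_fraction(matrix: Dict[str, str]) -> float:
    """Return the fraction of taxon-by-site cells that are empty."""
    total = sum(len(seq) for seq in matrix.values())
    if total == 0:
        raise ValueError("matrix has no cells")
    filled = sum(count_nucleotides(seq) for seq in matrix.values())
    return 1.0 - filled / total


def trim_taxa(matrix: Dict[str, str], min_nucleotides: int) -> Dict[str, str]:
    """Keep only taxa with at least min_nucleotides non-missing cells."""
    return {
        taxon: seq
        for taxon, seq in matrix.items()
        if count_nucleotides(seq) >= min_nucleotides
    }


# Hypothetical toy matrix: four taxa, ten aligned sites each.
toy = {
    "Pinus": "ACGTACGTAC",
    "Gnetum": "ACGT------",
    "Ginkgo": "----??----",
    "Cycas": "ACGTAC----",
}

sparse_missing = missing_fraction(toy)
dense = trim_taxa(toy, min_nucleotides=5)
dense_missing = missing_fraction(dense)
```

In this toy case, trimming removes the two most poorly sampled taxa and lowers the share of empty cells, which mirrors the trade-off between taxon number and matrix density described in the text.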
Duplicate gene data sets allow the inference of rooted species trees without the inclusion of sequences from outgroups (Gogarten et al., 1989; Iwabe et al., 1989; Doolittle and Brown, 1994; Mathews and Donoghue, 1999, 2000). This may be particularly worth exploring in analyses of seed plant molecular data because free-sporing and seed plants last shared a common ancestor up to 380 million years ago (Pryer et al., 2004), and because all the basal seed plant lineages are extinct, making it hard to employ the strategy of adding taxa to break up the very long branch from free-sporing plants to extant seed plants. Some preliminary results from analyses of a duplicate phytochrome gene data set from seed plants (S. Mathews and M. J. Donoghue, unpublished data) are worth commenting on here because they indicate a level of uncertainty in the rooting of seed plant phylogenies inferred from sequence data that has not been suggested by other studies. These analyses focus on three phytochrome genes, PHYN/A, PHYO/C, and PHYP/B, which are related as depicted in Fig. 4A. The data sets are incomplete, and I highlight here just two patterns observed in the PHYN/A clade, where the data are most complete. One question being addressed in these analyses is whether different topologies are inferred when sites are successively excluded from searches based on their rate class category, beginning with removal of the fastest sites and ending with inclusion of only the slowest. In particular, what do topologies inferred from the sites estimated to be evolving most slowly suggest about the rooting of the seed plant phylogeny and about the position of the gnetophytes? A rationale for this approach is the expectation that at least some rapidly evolving sites may be essentially randomized with respect to deep divergences (e.g., Swofford et al., 1996). Saturated sites will contribute to phylogenetic accuracy in many cases (Yang, 1998), but as noted by Burleigh and Mathews (2004), sites in different rate classes may favor different topologies. This appears to be the case where the placement of the root is concerned. In analyses that differed with respect to which sites were included based on their rate class assignment, two topologies were recovered, one that has a gnepine clade and that places angiosperms as sister to a gymnosperm clade (Fig. 4B) and one that is novel, uniting cycads and angiosperms in a clade that is sister to the remaining gymnosperms (Fig. 4C). The relationship between topologies and the set of rate classes included in the analysis is complex, but generally, as faster evolving sites are successively excluded, ML bootstrap support for cycads being sister to the remaining gymnosperms tends to drop while support for a clade of cycads and angiosperms increases. In contrast, support for the gnepine clade is remarkably consistent across the analyses, and even when just sites in the four most slowly evolving rate classes are analyzed, the clade receives 100% maximum likelihood bootstrap support. The gnepine result is not unexpected, but the nonmonophyly of extant gymnosperms in gnepine trees is surprising given the support for this split that is seen in other analyses (Bowe et al., 2000; Chaw et al., 2000; Nickrent et al., 2000; Gugerli et al., 2001; Soltis et al., 2002; Burleigh and Mathews, 2004). CONCLUDING REMARKS Significant uncertainty persists in seed plant phylogenies inferred from both molecular and morphological data. Analyses of supermatrices (J. G. Burleigh and S. Mathews, unpublished data) and plastid genome data sets (Chumley et al., 2008) bring a new twist to the question of the position of gnetophytes, maintaining a link with conifers but placing them sister to cupressophytes. 
This adds to the number of published DNA sequence data sets that have yielded highly supported but conflicting trees, all of which cannot be correct. To some extent, analytical issues encountered in the estimation of seed plant phylogeny may arise from the fact that given the nature of the problem, only limited insight is gained from data sets with few taxa and many characters. This can be addressed by sampling sequence data from more taxa, particularly from extant gymnosperms, so that living seed plant diversity is better represented in nucleotide data sets. Still, our best efforts to sample extant taxa more adequately for sequence data will leave fundamental questions unanswered. Perhaps chief among these, and most relevant to this volume, is the identity of the angiosperm sister group. Resolution of this question, as well as a general understanding of seed plant evolution, will not be obtained without rigorous morphological analyses, and therein lies a challenge. This will require that we identify approaches for incorporating our insights from data that may be accurate but perhaps less likely than sequence data to generate results supported by high bootstrap values. High bootstrap values give us confidence in the groups we are trying to delineate. However, the knowledge that erroneous clades can be highly supported should temper our thinking, especially in cases where other lines of evidence are contradictory, even if not well supported. It is possible that our tendency to prefer the hypothesis with high support values, and to be uncomfortable with uncertainty, may at least sometimes lead us astray. How best to weigh evidence and distinguish among hypotheses when some types of data are likely to give high support values and others are not remains an important problem in plant systematics. LITERATURE CITED Arber, E. A. N., and J. Parkin. 1907. On the origin of angiosperms. Botanical Journal of the Linnean Society 38: 29–80. Arber, E. A. N., and J. Parkin. 1908. 
Studies on the evolution of the angiosperms. The relationship of the angiosperms to Gnetales. Annals of Botany 22: 489–515. Arnold, C. A. 1948. Classification of gymnosperms from the viewpoint of paleobotany. Botanical Gazette (Chicago, Ill.) 110: 2. Bailey, I. W. 1944. The development of vessels in angiosperms and its significance in morphological research. American Journal of Botany 31: 421–428. Beck, C. B. 1960a. Connection between Archaeopteris and Callixylon. Science 131: 1524–1525. Beck, C. B. 1960b. The identity of Archaeopteris and Callixylon. Brittonia 12: 351–368. Beck, C. B. 1966. On the origin of gymnosperms. Taxon 15: 337–339. Beck, C. B. 1976. Origin and early evolution of angiosperms. Columbia University Press, New York, New York, USA. Bierhorst, D. W. 1971. Morphology of vascular plants. Macmillan, New York, New York, USA. Bowe, L. M., G. Coat, and C. W. dePamphilis. 2000. Phylogeny of seed plants based on all three genomic compartments: Extant gymnosperms are monophyletic and Gnetales’ closest relatives are conifers. Proceedings of the National Academy of Sciences, USA 97: 4092–4097. Brown, J. M., and A. R. Lemmon. 2007. The importance of data partitioning and the utility of Bayes factors in Bayesian phylogenetics. Systematic Biology 56: 643–655. Burleigh, J. G., and S. Mathews. 2004. Phylogenetic signal in nucleotide data from seed plants: Implications for resolving the seed plant tree of life. American Journal of Botany 91: 1599–1613. Burleigh, J. G., and S. Mathews. 2007a. Assessing among-locus variation in the inference of seed plant phylogeny. International Journal of Plant Sciences 168: 111–124. Burleigh, J. G., and S. Mathews. 2007b. Assessing systematic error in the inference of seed plant phylogeny. International Journal of Plant Sciences 168: 125–135. Cantino, P. D., J. A. Doyle, S. W. Graham, W. S. Judd, R. G. Olmstead, D. E. Soltis, P. S. Soltis, and M. J. Donoghue. 2007. Towards a phylogenetic nomenclature of Tracheophyta. 
Taxon 56: 822–846. Carlquist, S. 1996. Wood, bark, and stem anatomy of Gnetales: A summary. International Journal of Plant Sciences 157: S58–S76. Chamberlain, C. J. 1935. Gymnosperms. Structure and evolution. University of Chicago Press, Chicago, Illinois, USA. Chaw, S. M., C. L. Parkinson, Y. Cheng, T. M. Vincent, and J. D. Palmer. 2000. Seed plant phylogeny inferred from all three plant genomes: Monophyly of extant gymnosperms and origin of Gnetales from conifers. Proceedings of the National Academy of Sciences, USA 97: 4086–4091. Chaw, S. M., A. Zharkikh, H. M. Sung, T. C. Lau, and W. H. Li. 1997. Molecular phylogeny of extant gymnosperms and seed plant evolution: Analysis of nuclear 18S rRNA sequences. Molecular Biology and Evolution 14: 56–68. Chumley, T. W., S. K. R. McCoy, and L. A. Raubeson. 2008. Gnedeep: Exploring Gnetalean affinities in seed plant phylogeny with 83 plastid genes. Botany 2008: Joint Annual Meeting of Canadian Botanical Association, American Fern Society, American Society of Plant Taxonomists, and the Botanical Society of America, Vancouver, British Columbia, Canada [online abstract, http://2008.botanyconference.org/engine/search/index.php?func=detail&aid=770]. Crane, P. R. 1985. Phylogenetic analysis of seed plants and the origin of angiosperms. Annals of the Missouri Botanical Garden 72: 716–793. Crane, P. R., P. Herendeen, and E. M. Friis. 2004. Fossils and plant phylogeny. American Journal of Botany 91: 1683–1699. de la Torre, J., M. Egan, M. Katari, E. Brenner, D. Stevenson, G. Coruzzi, and R. DeSalle. 2006. ESTimating plant phylogeny: Lessons from partitioning. BMC Evolutionary Biology 6: 48. Degnan, J. H., and N. A. Rosenberg. 2006. Discordance of species trees with their most likely gene trees. PLoS Genetics 2: e68. Donoghue, M. J., and J. A. Doyle. 2000. Seed plant phylogeny: Demise of the anthophyte hypothesis? Current Biology 10: R106–R109. Doolittle, W.
F., and J. R. Brown. 1994. Tempo, mode, the progenote, and the universal root. Proceedings of the National Academy of Sciences, USA 91: 6721–6728. Doyle, J. A. 1996. Seed plant phylogeny and the relationships of Gnetales. International Journal of Plant Sciences 157: S3–S39. Doyle, J. A. 1998. Molecules, morphology, fossils, and the relationship of angiosperms and Gnetales. Molecular Phylogenetics and Evolution 9: 448–462. Doyle, J. A. 2001. Significance of molecular phylogenetic analyses for paleobotanical investigations on the origin of angiosperms. Palaeobotanist 50: 63–95. Doyle, J. A. 2006. Seed ferns and the origin of angiosperms. Journal of the Torrey Botanical Society 133: 169–209. Doyle, J. A., and M. J. Donoghue. 1986. Seed plant phylogeny and the origin of angiosperms: An experimental cladistic approach. Botanical Review 52: 321–431. Doyle, J. A., and M. J. Donoghue. 1987. The origin of angiosperms: A cladistic approach. In E. M. Friis, W. G. Chaloner, and P. R. Crane [eds.], The origins of angiosperms and their biological consequences, 17–49. Cambridge University Press, Cambridge, UK. Doyle, J. A., and M. J. Donoghue. 1992. Fossils and seed plant phylogeny reanalyzed. Brittonia 44: 89–106. Driskell, A. C., C. Ané, J. G. Burleigh, M. M. McMahon, C. O’Meara B, and M. J. Sanderson. 2004. Prospects for building the tree of life from large sequence databases. Science 306: 1172–1174. Dunn, C. W., A. Hejnol, D. Q. Matus, K. Pang, W. E. Browne, S. A. Smith, E. Seaver, et al. 2008. Broad phylogenomic sampling improves resolution of the animal tree of life. Nature 452: 745–749. Eames, A. J. 1952. Relationships of the Ephedrales. Phytomorphology 2: 79–100. Edwards, S. V., L. Liu, and D. K. Pearl. 2007. High-resolution species trees without concatenation. Proceedings of the National Academy of Sciences, USA 104: 5936–5941. Feild, T. S., N. C. Arens, and T. E. Dawson. 2003. The ancestral ecology of angiosperms: Emerging perspectives from extant basal lineages. 
International Journal of Plant Sciences 164: S129–S142. Feild, T. S., N. C. Arens, J. A. Doyle, T. E. Dawson, and M. J. Donoghue. 2004. Dark and disturbed: A new image of early angiosperm ecology. Paleobiology 30: 82–107. Felsenstein, J. 1978. Cases in which parsimony or compatibility methods will be positively misleading. Systematic Zoology 27: 401–410. Friis, E. M., P. R. Crane, K. R. Pedersen, S. Bengtson, P. C. J. Donoghue, G. W. Grimm, and M. Stampanoni. 2007. Phase-contrast X-ray microtomography links Cretaceous seeds with Gnetales and Bennettitales. Nature 450: 549–552. Gogarten, J. P., H. Kilbak, P. Dittrich, L. Taiz, E. J. Bowman, B. J. Bowman, M. F. Manolson, et al. 1989. Evolution of vacuolar H+-ATPase: Implications for the origin of eukaryotes. Proceedings of the National Academy of Sciences, USA 86: 6661–6665. Graham, S. W., and W. J. D. Iles. 2009. Different gymnosperm outgroups have (mostly) congruent signal regarding the root of flowering-plant phylogeny. American Journal of Botany 96: 216–227. Graham, S. W., and R. G. Olmstead. 2000. Utility of 17 chloroplast genes for inferring the phylogeny of the basal angiosperms. American Journal of Botany 87: 1712–1730. Graybeal, A. 1998. Is it better to add taxa or characters to a difficult phylogenetic problem? Systematic Biology 47: 9–17. Gugerli, F., C. Sperisen, U. Buchler, I. Brunner, S. Brodbeck, J. D. Palmer, and Y. L. Qiu. 2001. The evolutionary split of Pinaceae from other conifers: Evidence from an intron loss and a multigene phylogeny. Molecular Phylogenetics and Evolution 21: 167–175. Hajibabaei, M., J. Xia, and G. Drouin. 2006. Seed plant phylogeny: Gnetophytes are derived conifers and a sister group to Pinaceae. Molecular Phylogenetics and Evolution 40: 208–217. Hamby, R. K., and E. A. Zimmer. 1992. Ribosomal RNA as a phylogenetic tool in plant systematics. In P. S. Soltis, D. E. Soltis, and J. J. Doyle [eds.], Molecular systematics of plants, 50–91. Chapman and Hall, New York, New York, USA.
Hart, J. A. 1987. A cladistic analysis of conifers: Preliminary results. Journal of the Arnold Arboretum of Harvard University 68: 269–307. Hillis, D. M. 1996. Inferring complex phylogenies. Nature 383: 130–131. Hillis, D. M. 1998. Taxonomic sampling, phylogenetic accuracy, and investigator bias. Systematic Biology 47: 3–8. Hilton, J., and R. M. Bateman. 2006. Pteridosperms are the backbone of seed-plant phylogeny. Journal of the Torrey Botanical Society 133: 119–168. Iwabe, N., K. Kuma, M. Hasegawa, S. Osawa, and T. Miyata. 1989. Evolutionary relationship of archaebacteria, eubacteria, and eukaryotes inferred from phylogenetic trees of duplicated genes. Proceedings of the National Academy of Sciences, USA 86: 9355–9359. Kim, J. 1998. Large-scale phylogenies and measuring performance of phylogenetic estimators. Systematic Biology 47: 43–60. Kolaczkowski, B., and J. W. Thornton. 2004. Performance of maximum parsimony and likelihood phylogenetics when evolution is heterogeneous. Nature 431: 980–984. Kubatko, L. S., and J. H. Degnan. 2007. Inconsistency of phylogenetic estimates from concatenated data under coalescence. Systematic Biology 56: 17–24. Kubitzki, K. 1990. Gnetales. In K. U. Kramer and P. S. Green [eds], The families and genera of vascular plants, vol. 1, 379. Springer-Verlag, Berlin, Germany. Liu, L., and D. K. Pearl. 2007. Species trees from gene trees: Reconstructing Bayesian posterior distributions of a species phylogeny using estimated gene tree distributions. Systematic Biology 56: 504–514. Magallón, S., and M. J. Sanderson. 2002. Relationships among seed plants inferred from highly conserved genes: Sorting conflicting phylogenetic signals among ancient lineages. American Journal of Botany 89: 1991–2006. Magallón, S. A., and M. J. Sanderson. 2005. Angiosperm divergence times: The effect of genes, codon positions, and time constraints. Evolution; International Journal of Organic Evolution 59: 1653–1670. Mathews, S. 2006. 
Phytochrome-mediated development in land plants: Red light sensing evolves to meet the challenges of changing light environments. Molecular Ecology 15: 3483–3503. Mathews, S., and M. J. Donoghue. 1999. The root of angiosperm phylogeny inferred from duplicate phytochrome genes. Science 286: 947–950. Mathews, S., and M. J. Donoghue. 2000. Basal angiosperm phylogeny inferred from duplicate phytochromes A and C. International Journal of Plant Sciences 161: S41–S55. Matsen, F. A., and M. Steel. 2007. Phylogenetic mixtures on a single tree can mimic a tree of another topology. Systematic Biology 56: 767–775. McCoy, S. R., J. V. Kuehl, J. L. Boore, and L. A. Raubeson. 2008. The complete plastid genome sequence of Welwitschia mirabilis: An unusually compact plastome with accelerated divergence rates. BMC Evolutionary Biology 8: 130. McMahon, M. M., and M. J. Sanderson. 2006. Phylogenetic supermatrix analysis of GenBank sequences from 2228 papilionoid legumes. Systematic Biology 55: 818–836. Nickrent, D. L., C. L. Parkinson, J. D. Palmer, and R. J. Duff. 2000. Multigene phylogeny of land plants with special reference to bryophytes and the earliest land plants. Molecular Biology and Evolution 17: 1885–1895. Nixon, K. C., W. L. Crepet, D. Stevenson, and E. M. Friis. 1994. A reevaluation of seed plant phylogeny. Annals of the Missouri Botanical Garden 81: 484–533. Nylander, J. A., F. Ronquist, J. P. Huelsenbeck, and J. L. Nieves-Aldrey. 2004. Bayesian phylogenetic analysis of combined data. Systematic Biology 53: 47–67. Palmer, J. D., D. E. Soltis, and M. W. Chase. 2004. The plant tree of life: An overview and some points of view. American Journal of Botany 91: 1437–1445. Parkinson, C. L., K. L. Adams, and J. D. Palmer. 1999. Multigene analyses identify the three earliest lineages of extant flowering plants. Current Biology 9: 1485–1488. Philippe, H., Y. Zhou, H. Brinkmann, N. Rodrigue, and F. Delsuc. 2005.
Heterotachy and long-branch attraction in phylogenetics. BMC Evolutionary Biology 5: 50. Pryer, K. M., E. Scheuttpelz, P. G. Wolf, H. Schneider, A. R. Smith, and R. Cranfill. 2004. Phylogeny and evolution of ferns (Monilophytes) with a focus on the early leptosporangiate divergences. American Journal of Botany 91: 1582–1598. Qiu, Y.-L. 2008. Phylogeny and evolution of charophytic algae and land plants. Journal of Systematics and Evolution 46: 287–306. Qiu, Y.-L., J. Lee, F. Bernasconi-Quadroni, D. E. Soltis, P. S. Soltis, M. Zanis, E. A. Zimmer, et al. 1999. The earliest angiosperms: Evidence from mitochondrial, plastid and nuclear genomes. Nature 402: 404–407. Rai, H. S., H. E. O’Brien, P. A. Reeves, R. G. Olmstead, and S. W. Graham. 2003. Inference of higher-order relationships in the cycads from a large chloroplast data set. Molecular Phylogenetics and Evolution 29: 350–359. Rai, H. S., P. A. Reeves, R. Peakall, R. G. Olmstead, and S. W. Graham. 2008. Inference of higher-order conifer relationships from a multilocus plastid data set. Canadian Journal of Botany 86: 658–669. Raubeson, L. A., and R. K. Jansen. 1992. A rare chloroplast-DNA structural mutation is shared by all conifers. Biochemical Systematics and Ecology 402: 404–407. Rodríguez-Ezpeleta, N., H. Brinkmann, B. Roure, N. Lartillot, B. F. Lang, and H. Philippe. 2007. Detecting and overcoming systematic errors in genome-scale phylogenies. Systematic Biology 56: 389–399. Rothwell, G. W. 1982. New interpretations of the earliest conifers. Review of Palaeobotany and Palynology 37: 7–28. Rothwell, G. W., and R. Serbet. 1994. Lignophyte phylogeny and the evolution of Spermatophytes: A numerical cladistic analysis. Systematic Botany 19: 443–482. Rydin, C., and M. Källersjö. 2002. Taxon sampling and seed plant phylogeny. Cladistics 18: 484–513. Rydin, C., M. Källersjö, and E. M. Friis. 2002. 
Seed plant relationships and the systematic position of Gnetales based on nuclear and chloroplast DNA: Conflicting data, rooting problems, and the monophyly of conifers. International Journal of Plant Sciences 163: 197–214. Sanderson, M. J., M. F. Wojciechowski, J. M. Hu, T. S. Khan, and S. G. Brady. 2000. Error, bias, and long-branch attraction in data for two chloroplast photosystem genes in seed plants. Molecular Biology and Evolution 17: 782–797. Soltis, D. E., P. S. Soltis, and M. J. Zanis. 2002. Phylogeny of seed plants based on evidence from eight genes. American Journal of Botany 89: 1670–1681. Stefanović, S., M. Jager, J. Deutsch, J. Broutin, and M. Masselot. 1998. Phylogenetic relationships of conifers inferred from partial 28S rRNA gene sequences. American Journal of Botany 85: 688–697. Stefanović, S., D. W. Rice, and J. D. Palmer. 2004. Long branch attraction, taxon sampling, and the earliest angiosperms: Amborella or monocots? BMC Evolutionary Biology 4: 35. Stein, W. E.. and Beck, C. B. 1987. Paraphyletic groups in phylogenetic analysis—Progymnospermopsida and Prephanerogames in alternative views of seed plant relationships. Bulletin de la Société Botanique de France-Actualités Botaniques 134: 107–119. Swofford, D. L., G. J. Olsen, P. Waddell, and D. M. Hillis. 1996. Phylogenetic inference. In D. M. Hillis, C. Moritz, and B. K. Mable [eds.], Molecular systematics, 407–425. Sinauer, Sunderland, Massachusetts, USA. Thompson, W. P. 1918. Independent evolution of vessels in gnetales and angiosperms. Botanical Gazette (Chicago, Ill.) 65: 83–90. Wettstein, R. R. 1907. Handbuch der Systematischen Botanik. Franz Deuticke, Leipzig, Germany. Wiens, J. J. 2003. Missing data, incomplete taxa, and phylogenetic accuracy. Systematic Biology 52: 528–538. Wiens, J. J. 2005. Can incomplete taxa rescue phylogenetic analyses from long-branch attraction? Systematic Biology 54: 731–742. Wu, C.-S., Y.-N. Wang, S.-M. Liu, and S.-M. Chaw. 2007. 
Chloroplast genome (cpDNA) of Cycas taitungensis and 56 cp protein-coding genes of Gnetum parvifolium: Insights into cpDNA evolution and phylogeny of extant seed plants. Molecular Biology and Evolution 24: 1366–1379. Yang, Z. 1998. On the best evolutionary rate for phylogenetic analysis. Systematic Biology 47: 125–133.
Journal of Systematics and Evolution 46 (3): 287–306 (2008) (formerly Acta Phytotaxonomica Sinica)
doi: 10.3724/SP.J.1002.2008.08035
http://www.plantsystematics.com

Phylogeny and evolution of charophytic algae and land plants

Yin-Long QIU*
(Department of Ecology & Evolutionary Biology, The University Herbarium, University of Michigan, Ann Arbor, MI 48109-1048, USA)

Abstract Charophytic algae and land plants together make up a monophyletic group, streptophytes, which represents one of the main lineages of multicellular eukaryotes and has contributed greatly to the change of the environment on earth in the Phanerozoic Eon. Significant progress has been made to understand phylogenetic relationships among members of this group by phylogenetic studies of morphological and molecular data over the last twenty-five years. Mesostigma viride is now regarded as among the earliest diverging unicellular organisms in streptophytes. Characeae are the sister group to land plants. Liverworts represent the first diverging lineage of land plants. Hornworts and lycophytes are extant representatives of bryophytes and vascular plants, respectively, when early land plants changed from gametophyte to sporophyte as the dominant generation in the life cycle. Equisetum, Psilotaceae, and ferns constitute the monophyletic group of monilophytes, which are sister to seed plants. Gnetales are related to conifers, not to angiosperms as previously thought. Amborella, Nymphaeales, Hydatellaceae, Illiciales, Trimeniaceae, and Austrobaileya represent the earliest diverging lineages of extant angiosperms.
These phylogenetic results, together with recent progress on elucidating genetic and developmental aspects of the plant life cycle, multicellularity, and gravitropism, will facilitate evolutionary developmental studies of these key traits, which will help us to gain a mechanistic understanding of how plants adapted to environmental challenges when they colonized the land during one of the major transitions in the evolution of life.

Key words charophytes, evolution, gravitropism, land plants, life cycle, multicellularity, origin, phragmoplast, phylogeny, plasmodesmata, the tree of life.

Received: 20 March 2008 Accepted: 6 May 2008
* Author for correspondence. Email: ylqiu@umich.edu; Tel.: 1-734-7648279, Fax: 1-734-763-0544.

The origin of land plants (embryophytes) was one of the major events in the history of life; it irreversibly changed the evolutionary course of life and the environment on earth (Graham, 1993; Gray, 1993; Kenrick & Crane, 1997; Hagemann, 1999; Gensel & Edwards, 2001). To gain a full understanding of how such a major evolutionary transition unfolded, it is not only necessary to study the event itself, but also essential to investigate other events and processes that led to and happened after the origin of land plants, which undoubtedly contributed to the evolutionary success of this large clade of photosynthetic eukaryotes. A fully resolved phylogeny of major lineages of land plants and their algal relatives represents a foundation for comparative biological research on extant and extinct organisms to elucidate the nature of these events. Until recently, however, critical parts of the phylogeny of land plants and their close algal relatives had remained elusive, despite a century and a half of effort by plant systematists in exploring morphological, ultrastructural, phytochemical, and serological characters since Charles Darwin (1859) proposed that all life shared common descent. Over the last
twenty-five years, rapid progress has been made in molecular systematics, as the development of PCR, cloning, automated DNA sequencing technology, and high-speed computing hardware and software has permitted extensive surveys of living organisms. For the first time in the history of biology, an unprecedented amount of historical information encoded in the genomes has become available to rigorous quantitative analysis for resolving many difficult phylogenetic problems. In this paper, I will review the recent progress on phylogenetic reconstruction of charophytic algae and land plants. Because phylogenetic hypotheses play an important role in shaping our understanding of the evolution of organisms, I will also discuss implications of the new phylogenetic hypotheses for several key aspects of plant evolution, especially focusing on recent progress in genetic and developmental biological studies of the plant life cycle, multicellularity, and gravitropism. Hopefully, this dual-focus approach will bring a more comprehensive understanding of the evolution of plants.

1 Phylogeny of charophytic algae and land plants

Reconstructing phylogeny of organisms has been one of the main goals of evolutionary biologists ever since the publication of Charles Darwin’s theory of evolution (Darwin, 1859). In fact, biologists before the mid-1800’s were already pursuing the interconnected relationships among organisms in their classification work (Mayr, 1982). Formulation of cladistic principles by Hennig (1966) and others in the 1960s–’70s established a clear conceptual framework to uncover relationships among organisms through examination of their similarities and differences. The rise of molecular biology and advancement of computer science in the 1980’s unlocked an unprecedented amount of information and analytical power for quantitative analyses, making it possible to realize the dream of reconstructing the Tree of Life (Haeckel, 1866).
However, the development path of phylogenetics has not been without detours. Early morphological cladistic studies made a great contribution to systematics by establishing the first explicit phylogenetic frameworks for many groups of organisms, but misinterpretation of character homology and underestimation of homoplasy resulted in some major erroneous hypotheses. During the early phase of molecular systematics, use of single genes, often without extensive taxon sampling, produced a bewildering array of competing phylogenetic hypotheses, creating an impression that molecular phylogenetic analysis was just another one of those methodological innovations that came and went. Fortunately, this dilemma soon ended with the invention of automated DNA sequencing technology, which enabled most systematists to use several genes, often from one to several cellular compartments, to conduct phylogenetic studies for virtually any group of organisms. With extensive taxon sampling, this multigene approach, dubbed the supermatrix approach, has proven most effective for tackling difficult phylogenetic problems (Delsuc et al., 2005). More recently, genomic scale data have been applied to phylogenetic reconstruction of organisms, but this approach has so far produced only mixed results, primarily because of analytical errors amplified by the imbalance between undersampling of taxa and oversampling of characters (Leebens-Mack et al., 2005; Brinkmann & Philippe, 2008; Heath et al., 2008). One contribution emerging from phylogenomic studies is the analysis of genomic structural changes, an approach that dates back to the early days of molecular systematics. This type of analysis, because characters are selected based on frequency of changes and a large number of them are available in entirely sequenced genomes, can provide independent data sets and is often quite informative in resolving difficult phylogenetic problems (Kelch et al., 2004; Jansen et al., 2007).
Furthermore, as high-throughput sequencing technology develops, allowing increased taxon sampling, and as our ability to understand and solve problems in phylogenomic studies improves (Brinkmann & Philippe, 2008), phylogenomic analysis is likely to become more commonly used to resolve difficult phylogenetic issues. Following the tradition in systematics of using all sources of data, modern phylogeneticists have much more information and many more tools to unravel the historical patterns among organisms resulting from evolution. Clearly, a lot of progress has been made on clarifying phylogenetic patterns among organisms by taking an integrated approach.

1.1 Phylogeny of charophytic algae

Characeae, Coleochaete, Desmidiaceae, and Zygnemataceae (all of Charophyceae), together with Fritschiella, Oedogonium, and Ulothrix (all of Chlorophyta, see Lewis & McCourt, 2004), were among the green algae that were discussed as potential relatives of land plants from the mid 1800’s to the early 1900’s (Pringsheim, 1860, 1878; Celakovsky, 1874; Bower, 1890, 1908, 1935; Fritsch, 1916; Svedelius, 1927). Characeae were in fact often mistaken to be higher multicellular plants and placed together with mosses, e.g., in Celakovsky (1874). However, it was not until the discovery of the phragmoplast in Chara, Coleochaete, and Spirogyra around 1970 that the status of these algae as the closest algal relatives of land plants was firmly established (Pickett-Heaps, 1967, 1975; Fowke & Pickett-Heaps, 1969; Pickett-Heaps & Marchant, 1972; Marchant & Pickett-Heaps, 1973). A formal circumscription of Charophyceae by Mattox and Stewart in 1984, based on information from cell division and ultrastructure of the flagellar apparatus, largely defined membership of this important group of green algae, which included: Chlorokybales, Klebsormidiales, Zygnematales, Coleochaetales, and Charales.
Recent studies, mostly molecular phylogenetic ones, have accumulated a large body of evidence to support the hypothesis that land plants indeed had a charophytic ancestry (Delwiche et al., 1989; Manhart & Palmer, 1990; Melkonian et al., 1995; Chapman et al., 1998; Karol et al., 2001; Petersen et al., 2006; Lemieux et al., 2007; Turmel et al., 2007). However, two questions have figured conspicuously in studies of these algae over the last twenty-five years. One concerns whether the scaly green flagellate alga Mesostigma viride Lauterborn is a member of charophytes or not. The other asks which group of charophytes represents the sister lineage of land plants.

Mesostigma viride was not in Charophyceae as originally circumscribed by Mattox and Stewart (1984), yet the species was shown to have a multilayered structure (MLS) in its flagellar apparatus that is very similar to that of charophytes and land plants (Rogers et al., 1981; Melkonian, 1989). Since these two studies, molecular phylogenetic analyses have obtained conflicting results, with two main different positions for the taxon. Two early molecular phylogenetic studies, analyzing nuclear encoded small subunit (SSU) rRNA gene and actin gene respectively, placed the species at the base of streptophytes (i.e., charophytes + land plants) (Melkonian et al., 1995; Bhattacharya et al., 1998). However, a phylogenomic study analyzing both entire chloroplast genome sequences and genomic structural changes showed that the species represented the first diverging lineage in Viridiplantae (i.e., Prasinophyceae, Chlorophyta, and Streptophyta) (Lemieux et al., 2000). It was quickly pointed out that this result might be an analytical artifact caused by sparse taxon sampling (Qiu & Lee, 2000).
Several recent studies, either analyzing sequences of more genes from nuclear, mitochondrial, and chloroplast genomes or surveying distribution of nuclear gene families, have confirmed a streptophytic affinity of the species (Karol et al., 2001; Kim et al., 2006; Nedelcu et al., 2006; Petersen et al., 2006; Rodriguez-Ezpeleta et al., 2007). More importantly, the original authors who published the result placing Mesostigma as the sister to the rest of Viridiplantae have since analyzed a data set with significantly increased taxon sampling (including almost all major lineages of land plants as well as many green and other algae) using a wider variety of methods. They conclude that the species is indeed a member of streptophytes, specifically being sister to Chlorokybus atmophyticus, and that the two taxa constitute the first diverging lineage within streptophytes (Lemieux et al., 2007) (Fig. 1).

Although claimed to be a monophyletic group when formally defined (Mattox & Stewart, 1984), charophytes are now clearly established as a paraphyletic group (Bremer, 1985; Mishler & Churchill, 1985; Sluiman, 1985; Melkonian et al., 1995; Chapman et al., 1998; Karol et al., 2001; Qiu et al., 2006b, 2007; Lemieux et al., 2007; Turmel et al., 2007). A question that has intrigued many botanists who study the green algae-land plants transition is which one of the extant charophyte lineages is sister to land plants. While Coleochaete was favored as the closest extant algal relative of land plants in an early cladistic analysis of morphological and biochemical characters (Graham et al., 1991), two recent molecular phylogenetic studies, with sufficient sampling of taxa and genes (from all three cellular compartments), have suggested that Charales are sister to land plants with moderate to strong bootstrap support (Karol et al., 2001; Qiu et al., 2006b) (Fig. 1).
This hypothesis is also consistent with the data from group II intron distribution in the chloroplast genome as well as gene content, gene order, and intron composition in the mitochondrial genome of charophytes and land plants (Turmel et al., 2003, 2006, 2007). However, a phylogenomic analysis of 76 chloroplast proteins and genes has challenged this view, indicating that Zygnematales are sister to land plants (Turmel et al., 2006). Two factors might have contributed to this result. One is the highly rearranged chloroplast genomes in the two zygnematalean taxa that have been investigated, Staurastrum punctulatum and Zygnema circumcarinatum, both of which lack an inverted repeat typically present in the chloroplast genomes of photosynthetic eukaryotes (Turmel et al., 2007). It has been known for some time that loss of the inverted repeat can cause dramatic rearrangement in the chloroplast genome and produce many autapomorphic (unique) structural changes (Palmer & Thompson, 1982). The other is the two large evolutionary gaps involved in the taxa analyzed, one between Mesostigma/Chlorokybus and Staurastrum/Zygnema/Chaetosphaeridium/Chara, and the other between the latter group and land plants. These large evolutionary gaps can easily lead phylogenetic analysis astray, particularly in phylogenomic analyses of gene-rich, taxon-sparse sequence matrices, where taxon-character imbalance is so severe that systematic biases in the data sets become virtually irremovable (Delsuc et al., 2005; Leebens-Mack et al., 2005; Brinkmann & Philippe, 2008; Heath et al., 2008). While the zygnematalean ancestry of land plants hypothesis certainly should not be abandoned yet, further work in three areas can help to test whether it is an analytical artifact. First, chloroplast genomes of more taxa need to be analyzed, including Klebsormidium, Entransia, Coleochaete, and more members of Charales and Zygnematales.
Second, a parallel study of mitochondrial genomes needs to be conducted, as the data from this genome already show incongruence with those from chloroplast genomes (Turmel et al., 2007). Finally, more analyses using the supermatrix approach to sample many nuclear, mitochondrial, and chloroplast genes can help to distinguish the two hypotheses.

Fig. 1. A representative phylogeny of charophytic algae and land plants. The thicker lines are roughly proportional to the species numbers in the clades (the clades with <500 species are drawn with thin lines). Major evolutionary changes or evolution of major features are labeled at some nodes.

In addition to progress on these two major questions, one recent development in phylogenetic investigation of charophytic algae involves the freshwater species Entransia fimbricata Hughes. It was only discovered about half a century ago (Hughes, 1948) and was placed in Zygnemataceae tentatively (see McCourt et al., 2000). In a major molecular phylogenetic survey of Zygnematales, the species was shown to be placed outside of the order and instead was grouped with Chaetosphaeridium with weak bootstrap support (McCourt et al., 2000). Three later studies with broader sampling of charophytes showed that Entransia was sister to Klebsormidium with moderate to strong bootstrap support (Karol et al., 2001; Turmel et al., 2002; Sluiman et al., 2008). The study by Sluiman et al. (2008) also reported that another obscure green algal taxon, Hormidiella, might belong to Klebsormidiales, a position originally proposed based on ultrastructural evidence (Lokhorst et al., 2000). These two examples, together with the recent resolution of phylogenetic position for Mesostigma, demonstrate that charophytes are vastly under-studied relative to their importance in our quest to understand the origin of land plants.
Future research should place some emphasis on diversity exploration in this group of green algae of pivotal importance, as recently stressed elsewhere (Lewis & McCourt, 2004). More missing links may be discovered that will fill large gaps among currently divergent groups and facilitate phylogenomic and other evolutionary studies.

1.2 Phylogeny of land plants: bryophytes

The monophyly of land plants has been robustly established by phylogenetic analyses of morphological and biochemical data (Bremer, 1985; Mishler & Churchill, 1985; Kenrick & Crane, 1997), multigene supermatrices (Qiu et al., 2006b, 2007), and chloroplast genome sequences and gene content (Lemieux et al., 2007), although an early morphological cladistic study suggested that land plants may not be a strictly monophyletic group (Sluiman, 1985). This is one of the few major phylogenetic issues that were highly controversial in the pre-cladistic days but on which explicit phylogenetic studies quickly reached a consensus. On the other hand, relationships among basal lineages of land plants have been vigorously debated over the last twenty-five years, from morphological cladistic studies to molecular phylogenetic analyses of genome sequences and multigene supermatrices. Three questions are at the center of the debate. First, do bryophytes constitute a mono- or paraphyletic group? If they form a paraphyletic group, two questions then follow. Which group of bryophytes represents the first diverging lineage of land plants, and which lineage is sister to vascular plants?

Two early cladistic studies of morphological and biochemical characters concluded that bryophytes were a paraphyletic group (Mishler & Churchill, 1984; Bremer, 1985). This hypothesis was later confirmed by two analyses of somewhat different morphological data sets (Kenrick & Crane, 1997; Renzaglia et al., 2000).
However, two studies of spermatogenesis characters reached the conclusion that bryophytes represent a monophyletic group (Garbary et al., 1993; Renzaglia et al., 2000). Two recent studies of entire chloroplast genome sequences, sampling one species each from liverworts, mosses, and hornworts, also recovered a monophyletic group of bryophytes, which is sister to vascular plants (Nishiyama et al., 2004; Goremykin & Hellwig, 2005). On the other hand, an extensive survey of three mitochondrial group II introns across 350 diverse land plants and several red and green algae showed that liverworts exhibit the same condition of lacking the introns as the algae, supporting the paraphyly hypothesis (Qiu et al., 1998). This conclusion was recently reinforced by an expanded study of 28 mitochondrial group II introns in a smaller number of taxa (Qiu et al., 2006b). Most recently, parsimony and likelihood analyses of a multigene supermatrix with extensive taxon sampling of bryophytes and vascular plants have shown that the paraphyly of bryophytes is virtually indisputable (Qiu et al., 2006b). Considering all the studies cited above, it is clear that the paraphyly hypothesis of bryophytes has strong support from diverse sources of data, whereas the monophyly hypothesis is supported only by studies that suffer from weakness in either character or taxon sampling. Hence, the paraphyly of bryophytes can be regarded as one of the most clearly established aspects of the early land plant phylogeny (Fig. 1). Identifying the earliest diverging lineage of land plants can provide significant insight into the transition from algae to land plants. Early cladistic studies of morphological and biochemical data suggested that liverworts occupied such a position (Mishler & Churchill, 1984; Bremer, 1985; Kenrick & Crane, 1997). 
This hypothesis, however, was challenged by at least four subsequent analyses, one on morphological and developmental characters (Renzaglia et al., 2000) and three on multigene matrices (Nishiyama & Kato, 1999; Nickrent et al., 2000; Renzaglia et al., 2000), which all argued that hornworts represented the sister lineage to the rest of land plants. A large survey of three mitochondrial group II introns, on the other hand, provided some of the most unequivocal evidence supporting liverworts as the basalmost lineage in land plants (Qiu et al., 1998). This result was later corroborated by an independent study of chloroplast genomic structural changes as well as an expanded survey of 28 mitochondrial group II introns (Kelch et al., 2004; Qiu et al., 2006b). Furthermore, two phylogenomic analyses of entire chloroplast genome sequences, after sufficient taxon sampling of major land plant lineages was achieved, produced the same topology (Wolf et al., 2005; Qiu et al., 2006b). Finally, both parsimony and likelihood analyses of a multigene supermatrix with extensive taxon sampling across all major lineages of land plants resolved the liverworts’ basalmost position in land plants with strong support (Qiu et al., 2006b). The paraphyly of bryophytes and the basalmost position of liverworts in land plants were resolved in three early cladistic studies of morphological and biochemical data (Mishler & Churchill, 1984; Bremer, 1985; Kenrick & Crane, 1997), and they stood tests by numerous molecular phylogenetic analyses over the last twenty-five years. However, the status of mosses as the sister lineage to vascular plants established in those studies was challenged very early on by molecular studies. Hornworts were often recovered as sister to vascular plants in some early single gene analyses (Lewis et al., 1997; Samigullin et al., 2002; Dombrovska & Qiu, 2004). 
More convincing evidence for this position of hornworts came in several recent studies of chloroplast and mitochondrial genomic structural features (Malek & Knoop, 1998; Kelch et al., 2004; Groth-Malonek et al., 2005) and phylogenomic analyses of entire chloroplast genome sequences (Wolf et al., 2005; Qiu et al., 2006b). In particular, phylogenetic analyses of a supermatrix with dense taxon sampling in charophytes, bryophytes, pteridophytes, and seed plants have provided decisive support for the position of hornworts as the sister to vascular plants (Qiu et al., 2006b). In retrospect, the early morphological cladistic studies misinterpreted the analogy between the vascularized conducting tissues in moss sporophytes and the vasculature of vascular plants as homology. On the other hand, development of nutritionally largely independent sporophytes in hornworts (Stewart & Rodgers, 1977), a key character syndrome that facilitates completion of alternation of generations during early evolution of land plants, has been greatly under-appreciated (Qiu et al., 2006b, 2007). This is an exemplary case where new molecular phylogenetic results lead to discovery of previously neglected morphological and developmental characters and consequently greatly enhance our understanding of plant phylogeny and evolution through reciprocal illumination (Hennig, 1966).

1.3 Phylogeny of land plants: pteridophytes

While the knowledge of extinct fossil taxa is essential for our understanding of the origin and evolution of vascular plants, I will limit this review mostly to studies of extant plants in order to keep the paper within a reasonable length. Among several extant basal vascular plant lineages, Psilotaceae were often compared to the extinct early vascular plants Rhyniopsida and were suggested to be sister to the rest of extant vascular plants in a morphological cladistic study (Bremer, 1985). 
However, discovery of a 30 kb inversion in the chloroplast genome shared by all vascular plants except lycophytes, which exhibit the same condition as bryophytes, clinched the status of lycophytes as the earliest diverging lineage among extant vascular plants (Raubeson & Jansen, 1992). This result has recently been confirmed by completely sequenced chloroplast genomes of Physcomitrella patens (Sugiura et al., 2003), Anthoceros formosae (Kugita et al., 2003), Huperzia lucidula (Wolf et al., 2005), and Psilotum nudum (Wakasugi et al., unpublished). In Selaginella uncinata, however, there is a shorter (20 kb) inversion in the same region of the chloroplast genome that appears to show the same condition as non-lycophyte vascular plants (Tsuji et al., 2007). Nevertheless, adjacent genes immediately outside of this inversion still exhibit the same order as those in Huperzia lucidula, thus suggesting that this species acquired a superficially similar inversion via an independent genome rearrangement event. Recently, the position of lycophytes as the sister to other vascular plants has also been corroborated by analyses of a multigene supermatrix with extensive sampling of all major land plant lineages (Qiu et al., 2007) (Fig. 1). The other major breakthrough in pteridophyte systematics over the last twenty-five years is represented by identification of a major clade that unites Equisetum, Psilotaceae and true ferns and placement of this clade as the sister to seed plants (Fig. 1). This clade, named monilophytes, was first recognized in a morphological cladistic analysis on extinct and living taxa, and it possesses one synapomorphy, mesarch protoxylem confined to the lobes of the xylem strand (Kenrick & Crane, 1997). Later, a molecular phylogenetic study identified the clade with strong bootstrap support and also uncovered a highly diagnostic three codon insertion in the chloroplast gene rps4 (Pryer et al., 2001). 
A large-scale phylogenetic study with extensive taxon sampling in bryophytes, pteridophytes, and seed plants has also identified this clade and placed it as the sister to seed plants with strong statistical support (Qiu et al., 2007). Resolution of these relationships represents significant progress toward achieving a complete understanding of the origin of seed plants, as early morphological cladistic studies had great difficulty clarifying relationships among the so-called fern allies (Equisetum, Psilotaceae, and lycophytes), ferns, and seed plants (Bremer, 1985; Garbary et al., 1993). While placement of Equisetum and Psilotaceae with true ferns in the monilophyte clade has clarified relationships among early vascular plants, relationships among these two taxa, two eusporangiate fern families (Marattiaceae and Ophioglossaceae), and the clade of leptosporangiate ferns are still not resolved. The only resolved part here is the sister relationship between Psilotaceae and Ophioglossaceae, which receives strong bootstrap support in molecular phylogenetic analyses (Pryer et al., 2001; Qiu et al., 2007). Future studies that sample more genes from mitochondrial, nuclear, and chloroplast genomes may offer resolution to this problem.

1.4 Phylogeny of land plants: seed plants

Early morphological cladistic analyses of extinct and extant taxa concluded that seed plants were a monophyletic group (Crane, 1985; Doyle & Donoghue, 1986), although there was a possibility of biphyletic origin of seed plants (Doyle & Donoghue, 1986). The seed plant monophyly has now been clearly confirmed by a large-scale molecular phylogenetic study that sampled both non-seed plants and seed plants extensively (Qiu et al., 2007). 
The other major finding from several morphological cladistic analyses of seed plants (Crane, 1985; Doyle & Donoghue, 1986; Nixon et al., 1994; Rothwell & Serbet, 1994) was a close relationship between Gnetales and angiosperms, but this result was almost never recovered in any molecular study. Instead, Gnetales were often shown to be sister to Pinaceae (Goremykin et al., 1996; Winter et al., 1999; Bowe et al., 2000; Chaw et al., 2000; Frohlich & Parker, 2000; Gugerli et al., 2001; Magallon & Sanderson, 2002; Soltis et al., 2002; Burleigh & Mathews, 2004; Qiu et al., 2007) (Fig. 1) or sometimes to conifers (Chaw et al., 1997; Burleigh & Mathews, 2004). In analyses of fast-evolving genes or nucleotide positions, typically chloroplast genes or 3rd codon positions of other genes, Gnetales were placed as the sister to all other seed plants (Magallon & Sanderson, 2002; Rydin et al., 2002). Only in a study of nuclear 18S and 26S rRNA genes were Gnetales shown to be sister to angiosperms, and then with low bootstrap support (Rydin et al., 2002). Given the kinds and number of genes sampled, the diversity of taxon sampling schemes used, and the variety of methods employed in all these analyses, it is difficult to imagine what types of systematic errors were present in all these molecular data sets that would prevent recovery of a close relationship between Gnetales and angiosperms if there was one. Therefore, it seems reasonable to conclude that Gnetales are related to conifers rather than to angiosperms. One of the most spectacular discoveries in molecular systematics over the last two and a half decades is the identification of several basal angiosperm taxa as the earliest diverging lineages among living angiosperms, which include Amborella, Nymphaeales, Hydatellaceae, and Illiciales/Trimeniaceae/Austrobaileya (ANHITA; Fig. 
1) (Mathews & Donoghue, 1999; Parkinson et al., 1999; Qiu et al., 1999; Soltis et al., 1999; Barkman et al., 2000; Graham & Olmstead, 2000; Saarela et al., 2007). Subsequent analyses sampling more genes and employing a wider variety of analytical methods have solidified this result (Qiu et al., 2000, 2001, 2005, 2006a; Zanis et al., 2002; Borsch et al., 2003; Hilu et al., 2003; Stefanovic et al., 2004; Leebens-Mack et al., 2005; Jansen et al., 2007; Moore et al., 2007). Until recently, it was thought that the diversification pattern among the earliest angiosperms could never be resolved despite nearly two centuries of research (see Qiu et al., 1993). The divergence gap between ANHITA and euangiosperms (angiosperms exclusive of Amborella, Nymphaeales, Hydatellaceae, and Illiciales/Trimeniaceae/Austrobaileya (Qiu et al., 1999)) in fact is quite large, and has been identified in most molecular phylogenetic studies that sample several genes (Parkinson et al., 1999; Qiu et al., 1999, 2005, 2006a; Graham & Olmstead, 2000; Soltis et al., 2000; Borsch et al., 2003; Stefanovic et al., 2004; Jansen et al., 2007; Moore et al., 2007). This gap has also been independently corroborated by studies of fossil evidence (Friis et al., 1999) and morphology (Endress & Igersheim, 2000; Williams & Friedman, 2002). In retrospect, early appearance of ANHITA in angiosperm evolution was already detected in the comparative analyses of extant angiosperms (Stebbins, 1974; Endress, 1986) and the fossil record (Upchurch, 1984). This is another exemplary case where phylogenetic relationships, once resolved by molecular systematic studies, suddenly reveal a consistent evolutionary pattern in data that already existed or are being gathered, shedding significant light on a long-standing evolutionary enigma, the origin of angiosperms in this case. 
Another spectacular discovery in the recent history of plant systematics involves the recognition of a large monophyletic group of angiosperms termed eudicots (or tricolpates) (Doyle & Hotton, 1991), which encompasses 75% of extant angiosperm diversity (Mabberley, 1987) (Fig. 1). It was suggested as early as the 1930s that angiosperms with tricolpate pollen and derived pollen types may represent a natural group based on extensive surveys of extant and fossil angiosperm pollen (Wodehouse, 1935, 1936). Several other authors later supported this hypothesis from their comparative studies of plant morphology and pollen (Bailey & Nast, 1943; Hu, 1950; Walker & Doyle, 1975). In an explicit cladistic analysis of basal angiosperms using morphological data, the monophyly of eudicots was established for the first time (Donoghue & Doyle, 1989). However, limited taxon sampling in that study prevented this finding from being widely recognized. The first large-scale molecular phylogenetic analysis of angiosperms using sequences of the chloroplast gene rbcL established monophyly of this large group beyond any doubt (Chase et al., 1993). All major molecular phylogenetic studies of angiosperms since then have shown that monophyly of eudicots is one of the best established aspects of the angiosperm phylogeny (Qiu et al., 1999, 2006a; Savolainen et al., 2000; Soltis et al., 2000; Hilu et al., 2003; Jansen et al., 2007; Moore et al., 2007). Evolution of tricolpate pollen turns out to be such an infrequent event that it happened only twice outside eudicots, once in Illiciales and once in Arecaceae (the palm family), and in both cases the pollen developmental pattern is actually different from that in eudicots (see Cronquist, 1981; Qiu et al., 1993). The history of discovery of this large clade of land plants, spanning more than half a century, demonstrates that our ability to explore nature is highly dependent on the advancement of technology. 
In this particular case, the invention of the light microscope, the electron microscope, DNA sequencing techniques, and the computer all had an instrumental role in the eventual recognition of this major clade of angiosperms. Besides these two major findings, recent molecular phylogenetic analyses of large data sets with extensive sampling of genes and taxa have greatly clarified relationships among angiosperms (Hilu et al., 2003; Qiu et al., 2005, 2006a). Overall, euangiosperms can be divided into five monophyletic groups: Ceratophyllum, Chloranthaceae, eudicots, magnoliids (which include two pairs of sister taxa, Canellales/Piperales and Magnoliales/Laurales), and monocots. Currently, relationships among three large clades with significant diversity, magnoliids, monocots, and eudicots, have been resolved differently in studies using multigene and phylogenomic data sets. In a study analyzing 8 mitochondrial, chloroplast, and nuclear genes from 144 taxa with a compatibility method, magnoliids and eudicots were shown to be sister to each other, and they were sister to monocots (Qiu & Estabrook, 2008). However, in two analyses of entire chloroplast genome sequences, monocots were sister to eudicots, and together they were sister to magnoliids (Jansen et al., 2007; Moore et al., 2007). Because the power of the compatibility method in resolving deep phylogenetic patterns remains relatively untested, and phylogenomic studies have often suffered from systematic errors in data sets (Delsuc et al., 2005; Leebens-Mack et al., 2005; Brinkmann & Philippe, 2008; Heath et al., 2008), it is best to view relationships among the major angiosperm lineages as unresolved at present.

2 Evolutionary implications of a newly reconstructed phylogeny of charophytic algae and land plants

From the above review, it is clear that our understanding of the phylogeny of charophytic algae and land plants has been significantly improved over the last two and a half decades (Fig. 1). 
Several cladistic analyses of mostly morphological characters for the first time formulated explicit phylogenetic hypotheses on relationships among major lineages of these photosynthetic eukaryotes (Mishler & Churchill, 1984, 1985; Bremer, 1985; Crane, 1985; Doyle & Donoghue, 1986; Donoghue & Doyle, 1989; Graham et al., 1991; Kenrick & Crane, 1997). These hypotheses served as paradigms for guiding evolutionary studies of various aspects of these organisms during this period, for example, origin of sporopollenin in charophytes (Delwiche et al., 1989), fertilization in Gnetales and angiosperms (Friedman, 1990) and auxin metabolism in early land plants (Sztein et al., 1995; Cooke et al., 2002, 2004). Undoubtedly, these morphological cladistic studies represented a major step forward from traditional taxonomy, enforcing a rigorous criterion on identifying homologous characters and defining strictly monophyletic groups. Nevertheless, these studies also had their limitations, especially in misinterpreting some of the morphological characters and underestimating the extent of homoplasy in plant evolution. Molecular phylogenetic studies, in particular those based on supermatrices and infrequent genomic structural changes, have greater resolving power because of access to a much larger amount of historical information and higher quality characters (Manhart & Palmer, 1990; Raubeson & Jansen, 1992; Chase et al., 1993; Qiu et al., 1998, 1999, 2006a, b, 2007; Bowe et al., 2000; Chaw et al., 2000; Graham & Olmstead, 2000; Savolainen et al., 2000; Soltis et al., 2000; Karol et al., 2001; Pryer et al., 2001; Hilu et al., 2003; Burleigh & Mathews, 2004; Kelch et al., 2004). These molecular studies have remedied to a good extent the weakness of morphological cladistic studies, by circumventing the problem of relying on a few morphological characters that might have experienced convergent evolution due to similar selection pressure. 
As a result, the combined use of morphology and molecules in rigorous quantitative analyses over the last twenty-five years has led to one of the most rapid growth periods in our knowledge on evolutionary relationships among organisms. The significantly improved organismal phylogeny is providing a momentum for the pendulum of evolutionary research to swing back to the study of mechanisms and processes. It allows tracing macro-evolutionary patterns among major clades of organisms on large scales and helps formulating hypotheses on mechanisms and processes of some major evolutionary transitions. The emergence of evolutionary developmental biology will further catalyze this transformation by providing experimental approaches to test the hypotheses. This interplay among studies of phylogenetic patterns, developmental mechanisms, and evolutionary processes in a large diversity of organisms is likely to lead to a new level of understanding on functioning and evolution of life in general. Charophytes and land plants together represent one of the major lineages in eukaryotic evolution, which spans the diversity from unicellular aquatic algae to highly evolved multicellular angiosperms. Studies of their evolutionary patterns and developmental mechanisms under a new phylogenetic framework will help us not only to understand how this major clade has evolved, but also to learn how eukaryotes in general adapt to environment challenges during several major evolutionary transitions that were not unique to plants, e.g., from unicellularity to multicellularity, from a gravity-water buoyancy environment to a gravity-air buoyancy environment, and from a haploid gametophyte to a diploid sporophyte as the dominant generation in the life cycle. 
Below, I will review and discuss recent progress on genetic, developmental, and cell biological studies of several plant traits, which, when investigated with an evolutionary developmental approach under the new phylogenetic hypotheses, are likely to further our understanding of plant evolution.

2.1 Evolution of life cycle in land plants

One of the most interesting and important, but somehow recently neglected, aspects of plant evolution is the change of life cycle in various lineages of charophytic algae and land plants. The phylogeny of these organisms as currently understood (Fig. 1) and their order of appearance in the fossil record (Gray, 1993; Taylor & Taylor, 1993; Kenrick & Crane, 1997; Wellman et al., 2003) clearly demonstrate a trend of expansion of the diploid sporophyte generation with concomitant reduction of the haploid gametophyte generation. However, had the phylogenetic pattern not been clear, it would have been much more difficult, if not impossible, to detect this trend based purely on fossil evidence. Three issues in the phylogeny of charophytic algae and early land plants, all resolved over the last several decades, can directly affect interpretation of evolution of life cycle in land plants. First, charophytes, rather than Ulvophyceae in Chlorophyta, are identified as the closest algal relatives of land plants. Although this relationship was recognized based on surveys of cell division and ultrastructure of the flagellar apparatus among green algae and land plants (Pickett-Heaps, 1975; Mattox & Stewart, 1984), phylogenetic analyses of morphological and molecular data provided robust assurance and independent corroboration to this result (Manhart & Palmer, 1990; Melkonian et al., 1995; Chapman et al., 1998; Karol et al., 2001; Lemieux et al., 2007; Turmel et al., 2007). 
Second, the monophyly of land plants was much less certain in the pre-cladistic time, and an early morphological cladistic study even expressed doubt on the issue (Sluiman, 1985). However, cladistic studies of morphological and biochemical data firmly established monophyly of land plants (Bremer, 1985; Mishler & Churchill, 1985; Kenrick & Crane, 1997). Recent phylogenetic analyses of molecular data have provided further and more convincing evidence to support this conclusion (Qiu et al., 2006b, 2007; Lemieux et al., 2007; Turmel et al., 2007). Finally, the paraphyly of bryophytes, though established by some early morphological cladistic studies (Mishler & Churchill, 1984; Bremer, 1985), has been challenged by both morphological and molecular phylogenetic analyses from time to time (Garbary et al., 1993; Renzaglia et al., 2000; Nishiyama et al., 2004; Goremykin & Hellwig, 2005). The recent large scale analyses of multigene supermatrices and a broad survey of mitochondrial group II introns, all with extensive taxon sampling across land plants, have decisively resolved this issue (Qiu et al., 1998, 2006b, 2007). Failure to resolve any of these issues would have resulted in a much less clear phylogenetic pattern, impeding investigation of the origin of land plants and evolution of alternation of generations. Alternatively, if some of these three issues were resolved with different outcomes, e.g., Ulvophyceae were identified as the closest algal relatives of land plants, or bryophytes were shown to form a monophyletic group sister to vascular plants, an entirely different hypothesis, the homologous hypothesis (Pringsheim, 1878), than the one discussed below would have to be considered to explain the origin of land plants and evolution of life cycle in land plants. 
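The distinction between monophyly and paraphyly that underlies these three issues can be made concrete with a small sketch. The code below is only an illustration, not an analysis from any of the cited studies: it encodes, as nested tuples, a simplified land plant topology following the relationships described in section 1.2 (liverworts as the first diverging lineage, hornworts as sister to vascular plants), and the names `TREE`, `leaf_set`, `clades`, and `is_monophyletic` are hypothetical helpers introduced here. A group is treated as monophyletic when its members are exactly the tips of one subtree.

```python
# Simplified topology assumed from the text of section 1.2 (an illustration,
# not a reproduction of Fig. 1): charophytes as outgroup, liverworts first
# diverging, hornworts sister to vascular plants.
TREE = ("charophytes",
        ("liverworts",
         ("mosses",
          ("hornworts", "vascular plants"))))


def leaf_set(node):
    """Return the set of tip names below a node (a tip is a plain string)."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def clades(node):
    """Yield the tip set of every subtree, including single tips."""
    yield leaf_set(node)
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_monophyletic(tree, group):
    """A group is monophyletic if it equals the tip set of some subtree."""
    return frozenset(group) in set(clades(tree))


bryophytes = {"liverworts", "mosses", "hornworts"}
land_plants = bryophytes | {"vascular plants"}

print("bryophytes monophyletic:", is_monophyletic(TREE, bryophytes))
print("land plants monophyletic:", is_monophyletic(TREE, land_plants))
```

On this assumed topology the three bryophyte lineages never form a subtree by themselves, which is what paraphyly means here, whereas land plants as a whole do. Swapping in a topology with bryophytes as a single subtree sister to vascular plants would flip the first result, which is the alternative outcome discussed above.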
The phylogenetic pattern among charophytic algae and early land plants inferred from morphological and molecular data sheds significant light on two major events in the history of plant life: colonization of land and change from a haploid gametophyte to a diploid sporophyte as the dominant generation in the life cycle. One school of thought, often known as the antithetic hypothesis, first developed by Celakovsky in 1874 and later greatly expanded by Bower (1890, 1908, 1935) and others (Campbell, 1924; Svedelius, 1927; Smith, 1955), actually used a phylogenetic scheme that is largely congruent with what is reconstructed now to explain evolutionary changes at these two major events. These authors examined and compared developmental patterns of life cycle in various algal and plant lineages by following this phylogenetic scheme. They hypothesized that land plants originated as a consequence of interpolation of a new phase (sporophyte generation) into the life cycle of some green algae that were more likely related to today’s charophytes. They further suggested that as early land plants (mosses as recognized by those early botanists) evolved, the sporophyte generation expanded through structural elaboration and progressive sterilization of potentially sporogenous tissues and ultimately became a free-living dominant generation as seen in the life cycle of ferns and seed plants. It is noteworthy that the different ploidy levels of gametophyte and sporophyte generations (Strasburger, 1894) and meiosis (Van Beneden, 1883) (see Hamoir, 1992) were both discovered after Celakovsky (1874) had proposed this hypothesis. Hence, one has to be amazed by the power of comparative developmental biology, which can only be realized when there is a correct phylogenetic framework to guide interpretation of the observed pattern. The third major event in plant evolution is the origin of seed plants (and the origin of angiosperms can be regarded as an extension of this process). 
This event has not been so much targeted in the study of evolution of life cycle in land plants, particularly in the debate between the antithetic (Celakovsky, 1874; Bower, 1890; Campbell, 1924; Svedelius, 1927; Smith, 1955) and the homologous hypotheses (Pringsheim, 1878). Equally puzzling is that despite intense interest in the origins of seed plants and angiosperms throughout the entire last century, few have looked at the problems from a life cycle evolutionary developmental perspective, with perhaps one exception (Takhtajan, 1976), who alluded to neoteny as one of the possible mechanisms contributing to the origin of angiosperms. What has received most attention is emergence of new structures such as seeds and flowers (Crane, 1985; Doyle & Donoghue, 1986; Frohlich & Parker, 2000; Theissen et al., 2000), but equally important aspects during this phase of land plant evolution are reduction of gametophytes and further increase of male meiocyte number per fertilization event (this number, in a non-heterospory situation, was already greatly increased when the sporophyte became a dominant generation during the origin of vascular plants). These evolutionary changes are obvious when reproductive cycles of seed plants, monilophytes, and lycophytes (Gifford & Foster, 1989) are compared under a phylogenetic framework, which again has only been available from recent phylogenetic analyses of morphological and molecular data (Kenrick & Crane, 1997; Pryer et al., 2001; Qiu et al., 2007). More importantly, these changes fit the trend of sporophyte expansion and gametophyte reduction since plants colonized the land (Fig. 1). 
Adaptive significance of this trend in life cycle change lies in the fact that it allows generation of a larger number of genetically different gametes through increase of the meiocyte number, which then leads to occupation of more variable environmental niches on the land than in the water by more genetically variable offspring after fertilization (Svedelius, 1927). Once this macro-evolutionary trend is revealed, it becomes relatively straightforward to design an integrated experimental strategy to explore developmental mechanisms that have shaped the pattern of life cycles in extant land plants, though it will take many years to elucidate these mechanisms. The study of developmental events in the life cycle, especially the control of timing of meiosis initiation, has been pursued for many years in the fungal system, Schizosaccharomyces pombe. One particular gene that has been identified to play an important role in meiosis initiation is mei2, which encodes an RNA-binding protein that is essential for premeiotic DNA synthesis and the commitment to meiosis (Watanabe & Yamamoto, 1994). This gene is likely to be conserved among protists, fungi, and plants (Jeffares et al., 2004). In green algae and land plants, it appears that the gene has undergone a major duplication event, with one gene family (TEL) involved in cell differentiation in shoot and root meristems (Jeffares et al., 2004; Veit et al., 1998) and the other (AML) playing a role in vegetative meristem activity as well as meiosis in Arabidopsis thaliana (Kaur et al., 2006). It is no coincidence that both meiosis and mitosis are targets of action by this gene family, as cell divisions in vegetative and reproductive growth are key processes to regulate in order to mold the life cycle of a certain lineage. 
Hence, genes controlling the timing of mitosis and meiosis should be high priority targets to investigate in the effort to understand evolution of the life cycle in land plants. Since the primary focus of this review is on the phylogeny, no other genes will be discussed here though many have been identified (Ma, 2005; Hamant et al., 2006). The above example is provided merely to show that it is feasible to take an evolutionary developmental approach to study mechanistic aspects of evolution of life cycle in land plants.

2.2 Transition from unicellularity to multicellularity in streptophytes

Land plants represent one of the most successful groups and the most sophisticated kinds of multicellular organisms (Hagemann, 1999). The transition from uni- to multicellularity actually took place before they came onto land, and the current phylogeny suggests that it happened either in the common ancestor of all streptophytes or shortly after the origin of this clade (Fig. 1). Placement of Mesostigma viride, a unicellular organism, is critical to pinpoint the origin of multicellularity in this part of eukaryotic evolution. Since there is no longer any dispute on its inclusion within charophytes (Karol et al., 2001; Kim et al., 2006; Nedelcu et al., 2006; Petersen et al., 2006; Lemieux et al., 2007; Rodriguez-Ezpeleta et al., 2007), it is reasonable to suggest that multicellularity evolved in the common ancestor of streptophytes or later, because the green algae below this node on the phylogeny, Prasinophyceae, are all unicellular (Baldauf, 2003; Lewis & McCourt, 2004). On the other hand, whether Mesostigma alone (Karol et al., 2001; Petersen et al., 2006) or together with Chlorokybus (Qiu et al., 2006b; Lemieux et al., 2007) is sister to all other streptophytes has not been clearly resolved. 
Hence, it is also likely that multicellularity evolved shortly after the origin of streptophytes, particularly in consideration of the sarcinoid organization in Chlorokybus atmophyticus, which is a packet of a few cells held together by a gelatinous matrix without plasmodesmatal connections (van den Hoek et al., 1995), a quasi-state of multicellularity. Regardless of these topological variations, the currently resolved phylogenetic relationships among early diverging charophytic algae provide a sound evolutionary framework under which the transition from uni- to multicellularity can be meaningfully investigated. Multicellularity evolved more than two dozen times in bacteria, archaea, and eukaryotes (Bonner, 1999; Grosberg & Strathmann, 2007). It confers several advantages to organisms by enhancing their metabolic and reproductive capabilities (Niklas, 1997; Grosberg & Strathmann, 2007). First, the sheer increase of physical size allows the organisms to use resources from the environment better than their unicellular competitors. Because the cell size increase has an upper bound constrained by physico-chemical properties of phospholipids, proteins, and other compounds that make up the plasma membrane and cell wall, multicellularity provides the only solution to the problems of out-competing other organisms when resources stay the same or decrease in the environment, or increasing metabolic activity in a resource-richer environment. Multicellularity also ensures better protection of genetic material than unicellularity. Second, once multicellularity emerges, functional differentiation and specialization (division of labor) among cells in an organism will confer a greater fitness to its metabolism and reproduction. Complexity will be achieved through structural (morphological), metabolic (chemical, physiological, and behavioral), and reproductive differentiation, with formation of tissues, organs, and member groups within a social group. 
Indeed, empirical analyses have detected a positive correlation between size and complexity (Bell & Mooers, 1997; Bonner, 2004). Finally, because the life span of a cell is constrained by physico-chemical properties of carbohydrates, phospholipids, fatty acids, amino acids, proteins, nucleic acids, and other compounds that make up the cell, multicellular organisms have the advantage of out-living unicellular competitors. Development of multicellularity depends on two basic processes at the cellular level: cell cohesion and cell-cell exchange of information and materials (Alberts et al., 1989). At present, little is known about cell cohesion during the transition from uni- to multicellularity in early streptophytes. In comparison, more information is available on plasmodesmata, the cytoplasmic bridges that connect adjacent cells and allow exchange of hormones, RNAs, carbohydrates, proteins, and other compounds between cells (Lucas & Lee, 2004). In eukaryotes, plasmodesmata have evolved several times independently, in Fungi, Phaeophyta, Chlorophyta, and Streptophyta (Lucas et al., 1993; Raven, 1997). Evolution of this cell-cell communication/transportation device in early streptophytes has undoubtedly contributed to the success of building large complex multicellular organisms in this lineage of eukaryotes. Among all extant charophytes, Mesostigma viride is probably the only ancestrally unicellular organism. Chlorokybus atmophyticus is sarcinoid (no plasmodesmata), exhibiting a primitive type of multicellularity. The three genera of Klebsormidiales, Klebsormidium, Entransia, and Hormidiella (Lewis & McCourt, 2004; Sluiman et al., 2008), all contain unbranched filamentous species (Hughes, 1948; van den Hoek et al., 1995; Lokhorst et al., 2000; Cook, 2004). There are no plasmodesmata connecting cells in species of Klebsormidium (van den Hoek et al., 1995) and Entransia (M. E.
Cook, personal communication); no information about plasmodesmata is currently available in Hormidiella. Zygnematales contain a large number of unicellular, colonial, or unbranched filamentous species. The phylogenetic distribution of plasmodesmata in charophytes (McCourt et al., 2000; Karol et al., 2001) suggests that unicellularity in this group might have been secondarily derived. So far, no plasmodesmata have been reported in any species of Zygnematales (van den Hoek et al., 1995). Coleochaetales and Charales are the only multicellular charophytic algae that have plasmodesmata connecting their cells (Franceschi et al., 1994; van den Hoek et al., 1995; Cook et al., 1997). From their distribution on the currently resolved phylogeny of charophytes and land plants (Fig. 1), it seems reasonable to suggest that plasmodesmata evolved in the common ancestor of Coleochaetales, Charales, and land plants. Another structure that has likely contributed to evolution of multicellularity in streptophytes, and particularly formation of the three-dimensional plant body, is the phragmoplast, which is a unique arrangement of vesicles and microtubules during cytokinesis whereby microtubules are oriented perpendicular to the plane of cytokinesis (Fowke & Pickett-Heaps, 1969; Pickett-Heaps, 1975). It is found primarily in Zygnematales, Coleochaetales, Charales, and land plants (Pickett-Heaps, 1967; Fowke & Pickett-Heaps, 1969; Marchant & Pickett-Heaps, 1973); elsewhere it is only found in Trentepohliales of Chlorophyta (Chapman & Henk, 1986). Klebsormidium, though lacking a clear phragmoplast, exhibits two characteristics required for evolution of the structure: an ingrowing cleavage furrow and a persistent system of interzonal microtubules separating daughter nuclei and derived from a spindle apparatus (Floyd et al., 1972; Pickett-Heaps & Marchant, 1972).
Hence, it seems that the phragmoplast evolved shortly after streptophytes originated, as can be inferred from its distribution on the phylogeny of charophytes and land plants (Fig. 1). It has been suggested that the phragmoplast is perhaps essential for organisms to achieve a two- to three-dimensional pattern of cell division, instead of a one-dimensional, filamentous type of simple division, to develop complex plant bodies (Hagemann, 1999; Pickett-Heaps et al., 1999). Hagemann (1999) in particular has argued that the type of cell division involving formation of a phragmoplast is related to the unique way of cell wall construction in land plants, which are multicellular organisms that grow against the direction of gravity in a less buoyant medium (air, in comparison to water for most algae) and have few, if any, parallels in eukaryotes. These ideas are certainly consistent with the pattern of morphological complexity of plant bodies exhibited by various charophyte and land plant lineages on the phylogeny (Fig. 1). The plasmodesmata and phragmoplasts, the evolution of which does not seem to depend on each other, may have been largely responsible for evolution of multicellularity in streptophytes (Lucas et al., 1993; Franceschi et al., 1994; Hagemann, 1999). Identification of genes encoding various components of both structures will significantly increase our understanding of how multicellularity was achieved step by step during the transition of photosynthetic eukaryotes from the aquatic to the terrestrial environment. The knowledge accumulated from cell biology research over the last several decades has laid down a solid foundation (Pickett-Heaps et al., 1999; Lucas & Lee, 2004), but fine-scale evolutionary genetic and developmental studies are needed to elucidate the process of uni- to multicellularity transition in streptophytes.
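The character-mapping reasoning used in this section (inferring where plasmodesmata and the phragmoplast arose from their distribution on the phylogeny) can be sketched with Fitch parsimony. The Python sketch below is illustrative only. The ladder-like topology is a simplification assumed from the relationships described in the text: it places Mesostigma and Chlorokybus as successive early branches, which is only one of the two unresolved options, and Charales as sister to land plants. The presence/absence codings follow the statements above, with Klebsormidiales coded as lacking a phragmoplast. The names `TREE`, `PLASMODESMATA`, `PHRAGMOPLAST`, and `fitch` are hypothetical.

```python
# Assumed, simplified topology of streptophytes as nested pairs
# (an illustration, not a reproduction of the published Fig. 1).
TREE = ("Mesostigma",
        ("Chlorokybus",
         ("Klebsormidiales",
          ("Zygnematales",
           ("Coleochaetales",
            ("Charales", "Land plants"))))))

# 1 = structure reported, 0 = not reported, coded from the text.
PLASMODESMATA = {
    "Mesostigma": 0, "Chlorokybus": 0, "Klebsormidiales": 0,
    "Zygnematales": 0, "Coleochaetales": 1, "Charales": 1,
    "Land plants": 1,
}
PHRAGMOPLAST = {
    "Mesostigma": 0, "Chlorokybus": 0, "Klebsormidiales": 0,
    "Zygnematales": 1, "Coleochaetales": 1, "Charales": 1,
    "Land plants": 1,
}


def fitch(node, states):
    """Bottom-up Fitch pass.

    Returns (set of candidate states at this node,
             minimum number of state changes in the subtree).
    """
    if isinstance(node, str):  # a leaf: its observed state, no changes
        return {states[node]}, 0
    left_set, left_changes = fitch(node[0], states)
    right_set, right_changes = fitch(node[1], states)
    shared = left_set & right_set
    if shared:  # children agree on at least one state: no extra change
        return shared, left_changes + right_changes
    # children disagree: take the union and count one change
    return left_set | right_set, left_changes + right_changes + 1


for name, coding in (("plasmodesmata", PLASMODESMATA),
                     ("phragmoplast", PHRAGMOPLAST)):
    root_states, changes = fitch(TREE, coding)
    print(name, "root states:", sorted(root_states),
          "minimum changes:", changes)
```

Under these assumptions each structure requires only a single gain on the tree, which matches the single-origin inference drawn in the text; a different assumed topology or coding could give a different count.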
Finally, it should be added that the uni- to multicellularity transition actually happened twice during streptophyte evolution: once at the gametophyte whole-organism level in early evolution of charophytes and again at the sporophyte level during the origin of land plants (only part of the organism was involved in Characeae and liverworts). For the latter, a life cycle with the diploid sporophyte generation being dominant (Svedelius, 1927; Coelho et al., 2007; McManus & Qiu, 2008) and the origin of lignin (for cell cohesion) have probably also contributed to the building of large and complex multicellular plant bodies.

2.3 Origin and evolution of gravitropism in streptophytes

Land plants, as the major primary producers in the terrestrial ecosystem, have developed a body plan of a vertical axis with photosynthetic organs (leaves) in the air and absorption-anchorage organs (roots or rhizoids) in the soil. Although this body plan is best manifested in seed plants (Cooke et al., 2004), prototypic forms are found in pteridophytes, bryophytes, and Characeae. Gravitropism has played an instrumental role in the origin and evolution of this body plan, as streptophytes underwent the transition from free-swimming/planktonic green algae such as Mesostigma viride and Zygnematales to aquatic rhizophytic Characeae (Raven & Edwards, 2001), and to land-grown bryophytes, pteridophytes and seed plants. Essentially, two types of cells, “root” and “shoot” meristematic cells that respond to gravity positively and negatively, are responsible for building this body plan. Though gravity has always accompanied life on earth (Volkmann & Baluska, 2006), at no time has it figured so conspicuously in influencing evolution of organisms as during the water-land transition of plants, since no other organisms have built a body with the size and mass of giant sequoia or eucalyptus trees.
Hence, elucidating the origin and evolution of gravitropism in streptophytes will not only help us to understand the origin and evolution of land plants, but also provide insight into the role of gravity in shaping evolution of life. The phylogeny of charophytic algae and land plants as currently reconstructed (Fig. 1) shows that gravitropism in streptophytes likely evolved in the common ancestor of Characeae and land plants, because both groups are rhizophytes (Raven & Edwards, 2001) and other charophytes are either free-swimming/planktonic or epiphytic in aquatic or terrestrial habitats (van den Hoek et al., 1995). In some early phylogenetic studies of streptophytes and green algae using nuclear 18S rDNA data, Characeae were shown to be the first diverging lineage among charophytes (Kranz et al., 1995; Friedl, 1997). The cladistic analysis of morphological and biochemical characters placed Coleochaete as the closest extant algal relative of land plants (Graham et al., 1991). Both topologies suggest either two independent origins of gravitropism, in Characeae and land plants separately, or loss of the trait in some charophytes. These scenarios are not entirely without merit since rhizophytes are also found in Chlorophyta (Raven & Edwards, 2001), and gravitropism has clearly evolved more than once in eukaryotes. However, the current strong support from two multigene analyses clearly favors the position of Characeae as the sister to land plants (Karol et al., 2001; Qiu et al., 2007). Thus, the best hypothesis at present is that gravitropism evolved only once in streptophytes. Gravitropism in Characeae has been studied at the cellular level in great detail (Braun & Limbach, 2006). In general, actin has been shown to be intimately involved in gravity sensing and polarized cell growth in this system.
Actomyosin plays a key role in gravity sensing by first coordinating the position of statoliths, which are BaSO4-crystal-filled vesicles (different from starch-filled amyloplasts in angiosperms). Upon a change in the cell’s orientation relative to the direction of gravity, it directs sedimenting statoliths to specific areas of the plasma membrane, where contact with membrane-bound gravisensor molecules elicits short gravitropic pathways. In controlling polarized cell growth, actin and a steep gradient of cytoplasmic free calcium make up crucial components of a feedback mechanism. So far, only limited knowledge on the role of auxin in regulating rhizoid growth and gravitropism has been obtained in Chara (Klambt et al., 1992; Cooke et al., 2002). In Arabidopsis thaliana, much more has been learned about gravitropism from fine-scale genetic analyses conducted over the last ten years. A family of genes (named PIN after pin-formed mutants) encoding auxin efflux carrier proteins has been isolated (Galweiler et al., 1998; Paponov et al., 2005). It has been demonstrated that upon gravity stimulation, the PIN3 protein, positioned symmetrically at the plasma membrane of the root columella cells, rapidly relocalizes to the lateral plasma membrane surface and to vesicles that cycle in an actin-dependent manner, which provides a mechanism for redirecting auxin flux to trigger asymmetric growth (Friml et al., 2002; Palme et al., 2006). Recently, it has been shown that five PIN genes (PIN1, 2, 3, 4, and 7) collectively control auxin distribution to regulate cell division and expansion in the primary root, and that they work with another family of genes (PLT for PLETHORA) to specify the meristematic identity of cells in the root (Blilou et al., 2005). In another recent paper, sterol composition has been implicated in polar localization of the PIN2 protein, which is also an auxin transporter and directs root gravitropism (Men et al., 2008). This body of work basically confirms the classical view that auxin, via its basipetal transport, is the main secondary messenger regulating gravitropic growth (Boonsirichai et al., 2002), but has revealed much more genetic insight into the mechanisms by which plants respond to gravity. The detailed genetic and cell biological studies of gravitropism in Characeae and Arabidopsis set the stage for broad-scale evolutionary investigation of the phenomenon. A third line of work that may aid this research is the isolation of various gravitropic mutants in two mosses, Physcomitrella patens (Knight et al., 1991) and Ceratodon purpureus (Wagner et al., 1997; Cove & Quatrano, 2006). Given the momentum of research on Physcomitrella, it is conceivable that the genes will be isolated from these mutants soon. Thus far, the PIN genes have been found in both eudicots and monocots (Paponov et al., 2005). It will be desirable to extend isolation of this family of genes and the PLT genes, which mediate patterning of the root stem cell niche in Arabidopsis (Aida et al., 2004), to gymnosperms and pteridophytes, since the root apical meristem first appeared at the beginning of vascular plant evolution. Further, the search for the PIN genes should be pursued in bryophytes and Characeae. Although evidence is lacking at present to support a hypothesis that gravitropism in Characeae and all land plants is controlled by the same genetic machinery, it is logical to expect so. A generally consistent evolutionary pattern of auxin metabolism and transport in Characeae, bryophytes, pteridophytes, and seed plants offers some optimism for this hypothesis (Cooke et al., 2002).
When the algae colonized the land, among the various challenges they faced (desiccation, nutrient shortage, less buoyancy, UV, and hindrance of sperm locomotion), absorption of water and nutrients was the first they had to deal with in order to survive, and gravitropism had to have evolved before the algae moved onto the land so that underground organs could develop to overcome this challenge. Moreover, a brief examination of the morphology of the organisms along the phylogeny also supports such a hypothesis. Characeae, liverworts, mosses, hornworts, and vascular plants all have rhizoids or roots, which are all gravity-sensing underground organs. The negative gravitropism of the shoot has not been studied as much as the positive gravitropism of the root, but the evolutionary history of the shoot portrays a similar and virtually universal role of negative gravitropism in shoot development. At the gametophytic level, Characeae, Haplomitrium, some leafy liverworts, Takakia and acrocarpous mosses all have a vertical upward-growing axis. At the sporophytic level, Haplomitrium, simple thalloid and leafy liverworts, all mosses, hornworts, and vascular plants also have such an organ. In development of multicellular organisms, cell polarity is the fundamental problem that organisms have to solve during evolution (Cove, 2000). During evolution of streptophytes, gravity clearly provided the most decisive environmental cue for polarity establishment once these plants became anchored in the soil. Therefore, the current exquisite genetic information obtained from Arabidopsis and the emerging results from studies of Characeae and Physcomitrella provide a solid foundation to investigate the origin and evolution of gravitropism in charophytic algae and land plants. In summary, the last twenty-five years witnessed one of the most rapid growth periods in the history of systematics.
For the phylogeny of charophytic algae and land plants, not only has the backbone, which was proposed even before morphological cladistic studies, been confirmed with rigorous quantitative analyses of morphological and molecular data, but many important details crucial for our understanding of some major evolutionary transitions have also been clarified, mostly by molecular phylogenetic studies. This progress on one hand brings us closer to our goal of reconstructing the Tree of Life (Haeckel, 1866), and on the other hand sets the stage for new endeavors to explore how major transitions in the evolution of plant life happened. Advances in genetics and developmental biology in particular offer a great opportunity to integrate the knowledge from model organisms such as Physcomitrella and Arabidopsis and the new phylogenetic information on lineages that preceded and followed some major transitions, e.g., Coleochaetaceae/Characeae and the origin of gravitropism, and hornworts/lycophytes and alternation of generations in the life cycle of early land plants. This integrated multidisciplinary approach is likely to help us gain a mechanistic understanding of how some of the major evolutionary events in plant history unfolded.

Acknowledgements

I would like to express my gratitude to Christiane Anderson, Peter K. Endress, and Michael J. Wynne for help with literature search and translation, and to Todd J. Cooke, Hilary A. McManus and Paul G. Wolf for suggestions on improvement of the manuscript. I also would like to thank the National Science Foundation (USA), the National Natural Science Foundation of China, and DIVERSITAS (bioGENESIS program) for financial support.

References

Aida M, Beis D, Heidstra R, Willemsen V, Blilou I, Galinha C, Nussaume L, Noh YS, Amasino R, Scheres B. 2004. The PLETHORA genes mediate patterning of the Arabidopsis root stem cell niche. Cell 119: 109–120.
Alberts B, Bray D, Lewis J, Raff M, Roberts K, Watson JD. 1989. Molecular biology of the cell. 2nd ed. New York: Garland Publishing, Inc. Bailey IW, Nast CG. 1943. The comparative morphology of the Winteraceae I. Pollen and stamens. Journal of the Arnold Arboretum 24: 340–346. Baldauf SL. 2003. The deep roots of eukaryotes. Science 300: 1703–1706. Barkman TJ, Chenery G, McNeal JR, Lyons-Weiler J, Ellisens WJ, Moore G, Wolfe AD, dePamphilis CW. 2000. Independent and combined analyses of sequences from all three genomic compartments converge on the root of flowering plant phylogeny. Proceedings of the National Academy of Sciences USA 97: 13166–13171. Bell G, Mooers AO. 1997. Size and complexity among multicellular organisms. Biological Journal of the Linnean Society 60: 345–363. Bhattacharya D, Weber K, An SS, Berning-Koch W. 1998. Actin phylogeny identifies Mesostigma viride as a flagellate ancestor of the land plants. Journal of Molecular Evolution 47: 544–550. Blilou I, Xu J, Wildwater M, Willemsen V, Paponov I, Friml J, Heidstra R, Aida M, Palme K, Scheres B. 2005. The PIN auxin efflux facilitator network controls growth and patterning in Arabidopsis roots. Nature 433: 39–44. Bonner JT. 1999. The origins of multicellularity. Integrative Biology 1: 27–36. Bonner JT. 2004. Perspective: The size-complexity rule. Evolution 58: 1883–1890. Boonsirichai K, Guan C, Chen R, Masson PH. 2002. Root gravitropism: An experimental tool to investigate basic cellular and molecular processes underlying mechanosensing and signal transmission in plants. Annual Review of Plant Biology 53: 421–447. Borsch T, Hilu KW, Quandt D, Wilde V, Neinhuis C, Barthlott W. 2003. Noncoding plastid trnT-trnF sequences reveal a well resolved phylogeny of basal angiosperms. Journal of Evolutionary Biology 16: 558–576. Bowe LM, Coat G, dePamphilis CW. 2000. Phylogeny of seed plants based on all three genomic compartments: Extant gymnosperms are monophyletic and Gnetales’ closest relatives are conifers. 
Proceedings of the National Academy of Sciences USA 97: 4092–4097. Bower FO. 1890. On antithetic as distinct from homologous alternation of generations in plants. Annals of Botany 4: 347–370. Bower FO. 1908. The origin of land flora: a theory based upon the facts of alternation. London: MacMillan and Co., Ltd. Bower FO. 1935. Primitive land plants, also known as the archegoniatae. London: MacMillan and Co., Ltd. Braun M, Limbach C. 2006. Rhizoids and protonemata of characean algae: model cells for research on polarized growth and plant gravity sensing. Protoplasma 229: 133–142. Bremer K. 1985. Summary of green plant phylogeny and classification. Cladistics 1: 369–385. Brinkmann H, Philippe H. 2008. Animal phylogeny and large-scale sequencing: progress and pitfalls. Journal of Systematics and Evolution 46: 274–286. Burleigh JG, Mathews S. 2004. Phylogenetic signal in nucleotide data from seed plants: Implications for resolving the seed plant tree of life. American Journal of Botany 91: 1599–1613. Campbell DH. 1924. A remarkable development of the sporophyte in Anthoceros fusiformis, Aust. Annals of Botany 37: 473–483. Celakovsky L. 1874. Ueber die verschiedenen Formen und die Bedeutung des Generationwechsels der Pflanzen. Sitzungsberichte der koeniglichen Boehmischen Gesellschaft der Wissenschaften in Prague 2: 21–61. Chapman RL, Buchheim MA, Delwiche CF, Friedl T, Huss VAR, Karol KG, Lewis LA, Manhart J, McCourt RM, Olsen JL, Waters DA. 1998. Molecular systematics of the green algae. In: Soltis DE, Soltis PS, Doyle JJ eds. Molecular systematics of plants, II. Boston: Kluwer Academic Publishers. Chapman RL, Henk MC. 1986. Phragmoplasts in cytokinesis of Cephaleuros parasiticus (Chlorophyta) vegetative cells. Journal of Phycology 22: 83–88.
Chase MW, Soltis DE, Olmstead RG, Morgan D, Les DH, Mishler BD, Duvall MR, Price RA, Hills HG, Qiu YL, Kron KA, Rettig JH, Conti E, Palmer JD, Manhart JR, Sytsma KJ, Michaels HJ, Kress WJ, Karol KG, Clark WD, Hedren M, Gaut BS, Jansen RK, Kim KJ, Wimpee CF, Smith JF, Furnier GR, Strauss SH, Xiang QY, Plunkett GM, Soltis PS, Swensen SM, Williams SE, Gadek PA, Quinn CJ, Eguiarte LE, Golenberg E, Learn GH, Graham SW, Barrett SCH, Dayanandan S, Albert VA. 1993. Phylogenetics of seed plants: an analysis of nucleotide sequences from the plastid gene rbcL. Annals of the Missouri Botanical Garden 80: 528–580. Chaw SM, Parkinson CL, Cheng YC, Vincent TM, Palmer JD. 2000. Seed plant phylogeny inferred from all three plant genomes: Monophyly of extant gymnosperms and origin of Gnetales from conifers. Proceedings of the National Academy of Sciences USA 97: 4086–4091. Chaw SM, Zharkikh A, Sung HM, Lau TC, Li WH. 1997. Molecular phylogeny of extant gymnosperms and seed plant evolution: Analysis of nuclear 18S rRNA sequences. Molecular Biology and Evolution 14: 56–68. Coelho SM, Peters AF, Charrier B, Roze D, Destombe C, Valero M, Cock JM. 2007. Complex life cycles of multicellular eukaryotes: New approaches based on the use of model organisms. Gene 406: 152–170. Cook ME. 2004. Structure and asexual reproduction of the enigmatic charophycean green alga Entransia fimbriata (Klebsormidiales, Charophyceae). Journal of Phycology 40: 424–431. Cook ME, Graham LE, Botha CEJ, Lavin CA. 1997. Comparative ultrastructure of plasmodesmata of Chara and selected bryophytes: toward an elucidation of the evolutionary origin of plant plasmodesmata. American Journal of Botany 84: 1169–1178. Cooke TJ, Poli D, Cohen JD. 2004. Did auxin play a crucial role in the evolution of novel body plans during the Late Silurian–Early Devonian radiation of land plants? In: Hemsley AR, Poole I eds.
The evolution of plant physiology: from whole plants to ecosystems. Amsterdam: Elsevier Academic Press. Cooke TJ, Poli D, Sztein AE, Cohen JD. 2002. Evolutionary patterns in auxin action. Plant Molecular Biology 49: 319–338. Cove DJ. 2000. The generation and modification of cell polarity. Journal of Experimental Botany 51: 831–838. Cove DJ, Quatrano RS. 2006. Agravitropic mutants of the moss Ceratodon purpureus do not complement mutants having a reversed gravitropic response. Plant Cell and Environment 29: 1379–1387. Crane PR. 1985. Phylogenetic analysis of seed plants and the origin of angiosperms. Annals of the Missouri Botanical Garden 72: 716–793. Cronquist A. 1981. An integrated system of classification of flowering plants. New York: Columbia University Press. Darwin C. 1859. On the origin of species by means of natural selection. London: J. Murray. Delsuc F, Brinkmann H, Philippe H. 2005. Phylogenomics and the reconstruction of the tree of life. Nature Reviews Genetics 6: 361–375. Delwiche CF, Graham LE, Thomson N. 1989. Lignin-like compounds and sporopollenin in Coleochaete, an algal model for land plant ancestry. Science 245: 399–401. Dombrovska O, Qiu Y-L. 2004. Distribution of introns in the mitochondrial gene nad1 in land plants: phylogenetic and molecular evolutionary implications. Molecular Phylogenetics and Evolution 32: 246–263. Donoghue MJ, Doyle JA. 1989. Phylogenetic analysis of angiosperms and the relationships of Hamamelidae. In: Crane PR, Blackmore S eds. Evolution, systematics, and fossil history of the Hamamelidae. Vol. 1. Oxford: Clarendon Press. Doyle JA, Donoghue MJ. 1986. Seed plant phylogeny and the origin of angiosperms—an experimental cladistic approach. Botanical Review 52: 321–431. Doyle JA, Hotton CL. 1991. Diversity of early angiosperm pollen in a cladistic context. In: Blackmore S, Barnes A eds. Pollen and spores. Oxford: Clarendon Press. Endress PK. 1986. 
Reproductive structures and phylogenetic significance of extant primitive angiosperms. Plant Systematics and Evolution 152: 1–28. Endress PK, Igersheim A. 2000. Gynoecium structure and evolution in basal angiosperms. International Journal of Plant Sciences 161: S211–S223. Floyd GL, Stewart KD, Mattox KR. 1972. Cellular organization, mitosis, and cytokinesis in the ulotrichalean alga, Klebsormidium. Journal of Phycology 8: 176–184. Fowke LC, Pickett-Heaps JD. 1969. Cell division in Spirogyra. II cytokinesis. Journal of Phycology 5: 273–281. Franceschi VR, Ding B, Lucas WJ. 1994. Mechanism of plasmodesmata formation in characean algae in relation to evolution of intercellular communication in higher plants. Planta 192: 347–358. Friedl T. 1997. The evolution of the green algae. Plant Systematics and Evolution (Suppl.) 11: 87–101. Friedman WE. 1990. Double fertilization in Ephedra, a nonflowering seed plant: its bearing on the origin of angiosperms. Science 247: 951–954. Friis EM, Pedersen KR, Crane PR. 1999. Early angiosperm diversification: The diversity of pollen associated with angiosperm reproductive structures in Early Cretaceous floras from Portugal. Annals of the Missouri Botanical Garden 86: 259–296. Friml J, Wisniewska J, Benkova E, Mendgen K, Palme K. 2002. Lateral relocation of auxin efflux regulator PIN3 mediates tropism in Arabidopsis. Nature 415: 805–809. Fritsch FE. 1916. The algal ancestry of the higher plants. New Phytologist 15: 233–250. Frohlich MW, Parker DS. 2000. The mostly male theory of flower evolutionary origins: from genes to fossils. Systematic Botany 25: 155–170. Galweiler L, Guan C, Muller A, Wisman E, Mendgen K, Yephremov A, Palme K. 1998. Regulation of polar auxin transport by AtPIN1 in Arabidopsis vascular tissue. Science 282: 2226–2230. Garbary DJ, Renzaglia KS, Duckett JG. 1993. The phylogeny of land plants: a cladistic analysis based on male gametogenesis. Plant Systematics and Evolution 188: 237–269. Gensel PG, Edwards D. 2001.
Plants Invade the Land. New York: Columbia University Press. Gifford EM, Foster AS. 1989. Morphology and evolution of vascular plants. 3rd ed. New York: W. H. Freeman and Company. Goremykin V, Bobrova V, Pahnke J, Troitsky A, Antonov A, Martin W. 1996. Noncoding sequences from the slowly evolving chloroplast inverted repeat in addition to rbcL data do not support Gnetalean affinities of angiosperms. Molecular Biology and Evolution 13: 383–396. Goremykin VV, Hellwig FH. 2005. Evidence for the most basal split in land plants dividing bryophyte and tracheophyte lineages. Plant Systematics and Evolution 254: 93–103. Graham LE. 1993. Origin of land plants. New York: John Wiley & Sons, Inc. Graham LE, Delwiche CF, Mishler BD. 1991. Phylogenetic connections between the “green algae” and the “bryophytes”. Advances in Bryology 4: 213–244. Graham SW, Olmstead RG. 2000. Utility of 17 chloroplast genes for inferring the phylogeny of the basal angiosperms. American Journal of Botany 87: 1712–1730. Gray J. 1993. Major paleozoic land plant evolutionary bio-events. Palaeogeography Palaeoclimatology Palaeoecology 104: 153–169. Grosberg RK, Strathmann RR. 2007. The evolution of multicellularity: A minor major transition? Annual Review of Ecology Evolution and Systematics 38: 621–654. Groth-Malonek M, Pruchner D, Grewe F, Knoop V. 2005. Ancestors of trans-splicing mitochondrial introns support serial sister group relationships of hornworts and mosses with vascular plants. Molecular Biology and Evolution 22: 117–125. Gugerli F, Sperisen C, Buchler U, Brunner L, Brodbeck S, Palmer JD, Qiu YL. 2001. The evolutionary split of Pinaceae from other conifers: Evidence from an intron loss and a multigene phylogeny. Molecular Phylogenetics and Evolution 21: 167–175. Haeckel E. 1866. Generelle Morphologie der Organismen: Allgemeine Grundzuge der organischen Formen-Wissenschaft, mechanisch begrundet durch die von Charles Darwin reformirte Descendenz-Theorie.
Berlin: Georg Riemer. Hagemann W. 1999. Towards an organismic concept of land plants: the marginal blastozone and the development of the vegetation body of selected frondose gametophytes of liverworts and ferns. Plant Systematics and Evolution 216: 81–133. Hamant O, Ma H, Cande WZ. 2006. Genetics of meiotic prophase I in plants. Annual Review of Plant Biology 57: 267–302. Hamoir G. 1992. The discovery of meiosis by E. Van Beneden, a breakthrough in the morphological phase of heredity. The International Journal of Developmental Biology 36: 9–15. Heath TA, Hedtke SM, Hillis DM. 2008. Taxon sampling and the accuracy of phylogenetic analyses. Journal of Systematics and Evolution 46: 239–251. Hennig W. 1966. Phylogenetic systematics. Urbana: University of Illinois Press. Hilu KW, Borsch T, Muller K, Soltis DE, Soltis PS, Savolainen V, Chase MW, Powell MP, Alice LA, Evans R, Sauquet H, Neinhuis C, Slotta TAB, Rohwer JG, Campbell CS, Chatrou LW. 2003. Angiosperm phylogeny based on matK sequence information. American Journal of Botany 90: 1758–1776. Hu H-H. 1950. A polyphyletic system of classification of angiosperms. Science Record 3: 221–230. Hughes EO. 1948. New fresh-water Chlorophyceae from Nova Scotia. American Journal of Botany 35: 424–427. Jansen RK, Cai Z, Raubeson LA, Daniell H, Depamphilis CW, Leebens-Mack J, Muller KF, Guisinger-Bellian M, Haberle RC, Hansen AK, Chumley TW, Lee SB, Peery R, McNeal JR, Kuehl JV, Boore JL. 2007. Analysis of 81 genes from 64 plastid genomes resolves relationships in angiosperms and identifies genome-scale evolutionary patterns. Proceedings of the National Academy of Sciences USA 104: 19369–19374. Jeffares DC, Phillips MJ, Moore S, Veit B. 2004. A description of the Mei2-like protein family; structure, phylogenetic distribution and biological context. Development Genes and Evolution 214: 149–158. Karol KG, McCourt RM, Cimino MT, Delwiche CF. 2001. The closest living relatives of land plants. Science 294: 2351–2353. 
Kaur J, Sebastian J, Siddiqi I. 2006. The Arabidopsis-mei2-like genes play a role in meiosis and vegetative growth in Arabidopsis. Plant Cell 18: 545–559. Kelch DG, Driskell A, Mishler BD. 2004. Inferring phylogeny using genomic characters: a case study using land plant plastomes. In: Goffinet B, Hollowell V, Magill R eds. Molecular systematics of bryophytes. St. Louis: Missouri Botanical Garden Press. Kenrick P, Crane PR. 1997. The origin and early diversification of land plants: a cladistic study. Washington, DC: Smithsonian Institution Press. Kim E, Wilcox LW, Fawley MW, Graham LE. 2006. Phylogenetic position of the green flagellate Mesostigma viride based on alpha-tubulin and beta-tubulin gene sequences. International Journal of Plant Sciences 167: 873–883. Klambt D, Knauth B, Dittmann I. 1992. Auxin dependent growth of rhizoid of Chara globularis. Physiologia Plantarum 85: 537–540. Knight CD, Futers TS, Cove DJ. 1991. Genetic analysis of a mutant class of Physcomitrella patens in which the polarity of gravitropism is reversed. Molecular & General Genetics 230: 12–16. Kranz HD, Miks D, Siegler M-L, Capesius I, Sensen CW, Huss VAR. 1995. The origin of land plants: phylogenetic relationships among charophytes, bryophytes, and vascular plants inferred from complete small-subunit ribosomal RNA gene sequences. Journal of Molecular Evolution 41: 74–84. Kugita M, Kaneko A, Yamamoto Y, Takeya Y, Matsumoto T, Yoshinaga K. 2003. The complete nucleotide sequence of the hornwort (Anthoceros formosae) chloroplast genome: insight into the earliest land plants. Nucleic Acids Research 31: 716–721. Leebens-Mack J, Raubeson LA, Cui LY, Kuehl JV, Fourcade MH, Chumley TW, Boore JL, Jansen RK, dePamphilis CW. 2005. Identifying the basal angiosperm node in chloroplast genome phylogenies: Sampling one’s way out of the Felsenstein zone. Molecular Biology and Evolution 22: 1948–1963. Lemieux C, Otis C, Turmel M. 2000.
Ancestral chloroplast genome in Mesostigma viride reveals an early branch of green plant evolution. Nature 403: 649–652. Lemieux C, Otis C, Turmel M. 2007. A clade uniting the green algae Mesostigma viride and Chlorokybus atmophyticus represents the deepest branch of the Streptophyta in chloroplast genome-based phylogenies. BMC Biology 5: 2. Lewis LA, McCourt RM. 2004. Green algae and the origin of land plants. American Journal of Botany 91: 1535–1556. Lewis LA, Mishler BD, Vilgalys R. 1997. Phylogenetic relationships of the liverworts (Hepaticae), a basal embryophyte lineage, inferred from nucleotide sequence data of the chloroplast gene rbcL. Molecular Phylogenetics and Evolution 7: 377–393. Lokhorst GM, Star W, Lukesova A. 2000. The new species Hormidiella attenuata (Klebsormidiales), notes on morphology and reproduction. Algological Studies 100: 11–27. Lucas WJ, Ding B, Vanderschoot C. 1993. Plasmodesmata and the supracellular nature of plants. New Phytologist 125: 435–476. Lucas WJ, Lee JY. 2004. Plasmodesmata as a supracellular control network in plants. Nature Reviews Molecular Cell Biology 5: 712–726. Ma H. 2005. Molecular genetic analyses of microsporogenesis and microgametogenesis in flowering plants. Annual Review of Plant Biology 56: 393–434. Mabberley DJ. 1987. The plant book. Cambridge: Cambridge University Press. Magallon S, Sanderson MJ. 2002. Relationships among seed plants inferred from highly conserved genes: Sorting conflicting phylogenetic signals among ancient lineages. American Journal of Botany 89: 1991–2006. Malek O, Knoop V. 1998. Trans-splicing group II introns in plant mitochondria: The complete set of cis-arranged homologs in ferns, fern allies, and a hornwort. RNA 4: 1599–1609. Manhart JR, Palmer JD. 1990. The gain of two chloroplast transfer-RNA introns marks the green algal ancestors of land plants. Nature 345: 268–270. Marchant HJ, Pickett-Heaps JD. 1973.
Mitosis and cytokinesis in Coleochaete scutata. Journal of Phycology 9: 461–471. Mathews S, Donoghue MJ. 1999. The root of angiosperm phylogeny inferred from duplicate phytochrome genes. Science 286: 947–950. Mattox KR, Stewart KD. 1984. Classification of the green algae: a concept based on comparative cytology. In: Irvine DEG, John DM eds. Systematics of the green algae. London and Orlando: Academic Press. Mayr E. 1982. The growth of biological thought: diversity, evolution, and inheritance. Cambridge, Massachusetts: The Belknap Press of Harvard University Press. McCourt RM, Karol KG, Bell J, Helm-Bychowski KM, Grajewska A, Wojciechowski MF, Hoshaw RW. 2000. Phylogeny of the conjugating green algae (Zygnemophyceae) based on rbcL sequences. Journal of Phycology 36: 747–758. McManus HA, Qiu Y-L. 2008. Life cycles in major lineages of photosynthetic eukaryotes, with a special reference to the origin of land plants. Fieldiana: in press. Melkonian M. 1989. Flagellar apparatus ultrastructure in Mesostigma viride (Prasinophyceae). Plant Systematics and Evolution 164: 93–122. Melkonian M, Marin B, Surek B. 1995. Phylogeny and evolution of the algae. In: Arai R, Kato M, Doi Y eds. Biodiversity and evolution. Tokyo: National Science Museum Foundation. Men SZ, Boutte Y, Ikeda Y, Li XG, Palme K, Stierhof YD, Hartmann MA, Moritz T, Grebe M. 2008. Sterol-dependent endocytosis mediates post-cytokinetic acquisition of PIN2 auxin efflux carrier polarity. Nature Cell Biology 10: 237–U124. Mishler BD, Churchill SP. 1984. A cladistic approach to the phylogeny of the bryophytes. Brittonia 36: 406–424. Mishler BD, Churchill SP. 1985. Transition to a land flora: phylogenetic relationships of the green algae and bryophytes. Cladistics 1: 305–328. Moore MJ, Bell CD, Soltis PS, Soltis DE. 2007. Using plastid genome-scale data to resolve enigmatic relationships among basal angiosperms. Proceedings of the National Academy of Sciences USA 104: 19363–19368. Nedelcu AM, Borza T, Lee RW. 2006.
A land plant-specific multigene family in the unicellular Mesostigma argues for its close relationship to Streptophyta. Molecular Biology and Evolution 23: 1011–1015. Nickrent DL, Parkinson CL, Palmer JD, Duff RJ. 2000. Multigene phylogeny of land plants with special reference to bryophytes and the earliest land plants. Molecular Biology and Evolution 17: 1885–1895. Niklas KJ. 1997. The evolutionary biology of plants. Chicago: The University of Chicago Press. Nishiyama T, Kato M. 1999. Molecular phylogenetic analysis among bryophytes and tracheophytes based on combined data of plastid coded genes and the 18S rRNA gene. Molecular Biology and Evolution 16: 1027–1036. Nishiyama T, Wolf PG, Kugita M, Sinclair RB, Sugita M, Sugiura C, Wakasugi T, Yamada K, Yoshinaga K, Yamaguchi K, Ueda K, Hasebe M. 2004. Chloroplast phylogeny indicates that bryophytes are monophyletic. Molecular Biology and Evolution 21: 1813–1819. Nixon KC, Crepet WL, Stevenson D, Friis EM. 1994. A reevaluation of seed plant phylogeny. Annals of the Missouri Botanical Garden 81: 484–533. Palme K, Dovzhenko A, Ditengou FA. 2006. Auxin transport and gravitational research: perspectives. Protoplasma 229: 175–181. Palmer JD, Thompson WF. 1982. Chloroplast DNA rearrangements are more frequent when a large inverted repeat sequence is lost. Cell 29: 537–550. Paponov IA, Teale WD, Trebar M, Blilou I, Palme K. 2005. The PIN auxin efflux facilitators: evolutionary and functional perspective. Trends in Plant Science 10: 170–177. Parkinson CL, Adams KL, Palmer JD. 1999. Multigene analyses identify the three earliest lineages of extant flowering plants. Current Biology 9: 1485–1488. Petersen J, Teich R, Becker B, Cerff R, Brinkmann H. 2006. The GapA/B gene duplication marks the origin of Streptophyta (Charophytes and land plants). Molecular Biology and Evolution 23: 1109–1118. Pickett-Heaps JD. 1967. Ultrastructure and differentiation in Chara sp. II. Mitosis.
Australian Journal of Biological Science 20: 883–894. Pickett-Heaps JD. 1975. Green algae: structure, reproduction and evolution in selected genera. Sunderland, MA: Sinauer. Pickett-Heaps JD, Gunning BES, Brown RC, Lemmon BE, Cleary AL. 1999. The cytoplast concept in dividing plant cells: Cytoplasmic domains and the evolution of spatially organized cell division. American Journal of Botany 86: 153–172. Pickett-Heaps JD, Marchant HJ. 1972. The phylogeny of the green algae: a new proposal. Cytobios 6: 255–264. Pringsheim N. 1860. Beitraege zur Morphologie und Systematik der Algen. III. Die Coleochaeteen. Jahrbuecher fuer wissenschaftliche Botanik 2: 1–38. Pringsheim N. 1878. Ueber Sprossung der Moosfruechte und den Generationswechsel der Thallophyten. Jahrbuecher fuer wissenschaftliche Botanik 11: 1–46. Pryer KM, Schneider H, Smith AR, Cranfill R, Wolf PG, Hunt JS, Sipes SD. 2001. Horsetails and ferns are a monophyletic group and the closest living relatives to seed plants. Nature 409: 618–622. Qiu Y-L, Chase MW, Les DH, Parks CR. 1993. Molecular phylogenetics of the Magnoliidae: cladistic analyses of nucleotide-sequences of the plastid gene rbcL. Annals of the Missouri Botanical Garden 80: 587–606. Qiu Y-L, Cho YR, Cox JC, Palmer JD. 1998. The gain of three mitochondrial introns identifies liverworts as the earliest land plants. Nature 394: 671–674. Qiu Y-L, Dombrovska O, Lee J, Li LB, Whitlock BA, Bernasconi-Quadroni F, Rest JS, Davis CC, Borsch T, Hilu KW, Renner SS, Soltis DE, Soltis PS, Zanis MJ, Cannone JJ, Gutell RR, Powell M, Savolainen V, Chatrou LW, Chase MW. 2005. Phylogenetic analyses of basal angiosperms based on nine plastid, mitochondrial, and nuclear genes. International Journal of Plant Sciences 166: 815–842. Qiu Y-L, Estabrook GF. 2008. Resolving phylogenetic relationships among key angiosperm lineages using a compatibility method on a molecular data set. Journal of Systematics and Evolution 46: 130–141.
Qiu Y-L, Lee J. 2000. Transition to a land flora: A molecular phylogenetic perspective. Journal of Phycology 36: 799–802. Qiu Y-L, Lee J, Bernasconi-Quadroni F, Soltis DE, Soltis PS, Zanis M, Zimmer EA, Chen Z, Savolainen V, Chase MW. 2000. Phylogeny of basal angiosperms: Analyses of five genes from three genomes. International Journal of Plant Sciences 161: S3–S27. Qiu Y-L, Lee J, Whitlock BA, Bernasconi-Quadroni F, Dombrovska O. 2001. Was the ANITA rooting of the angiosperm phylogeny affected by long-branch attraction? Molecular Biology and Evolution 18: 1745–1753. Qiu Y-L, Lee JH, Bernasconi-Quadroni F, Soltis DE, Soltis PS, Zanis M, Zimmer EA, Chen ZD, Savolainen V, Chase MW. 1999. The earliest angiosperms: evidence from mitochondrial, plastid and nuclear genomes. Nature 402: 404–407. Qiu Y-L, Li LB, Hendry TA, Li R, Taylor DW, Issa MJ, Ronen AJ, Vekaria ML, White AM. 2006a. Reconstructing the basal angiosperm phylogeny: evaluating information content of the mitochondrial genes. Taxon 55: 837–856. Qiu Y-L, Li LB, Wang B, Chen ZD, Dombrovska O, Lee J, Kent L, Li RQ, Jobson RW, Hendry TA, Taylor DW, Testa CM, Ambros M. 2007. A nonflowering land plant phylogeny inferred from nucleotide sequences of seven chloroplast, mitochondrial, and nuclear genes. International Journal of Plant Sciences 168: 691–708. Qiu Y-L, Li LB, Wang B, Chen ZD, Knoop V, Groth-Malonek M, Dombrovska O, Lee J, Kent L, Rest J, Estabrook GF, Hendry TA, Taylor DW, Testa CM, Ambros M, Crandall-Stotler B, Duff RJ, Stech M, Frey W, Quandt D, Davis CC. 2006b. The deepest divergences in land plants inferred from phylogenomic evidence. Proceedings of the National Academy of Sciences USA 103: 15511–15516. Raubeson LA, Jansen RK. 1992. Chloroplast DNA evidence on the ancient evolutionary split in vascular land plants. Science 255: 1697–1699. Raven JA. 1997. Multiple origins of plasmodesmata. European Journal of Phycology 32: 95–101. Raven JA, Edwards D. 2001. 
Roots: evolutionary origins and biogeochemical significance. Journal of Experimental Botany 52: 381–401. Renzaglia KS, Duff RJ, Nickrent DL, Garbary DJ. 2000. Vegetative and reproductive innovations of early land plants: implications for a unified phylogeny. Philosophical Transactions of the Royal Society of London Series B-Biological Sciences 355: 769–793. Rodriguez-Ezpeleta N, Philippe H, Brinkmann H, Becker B, Melkonian M. 2007. Phylogenetic analyses of nuclear, mitochondrial, and plastid multigene data sets support the placement of Mesostigma in the Streptophyta. Molecular Biology and Evolution 24: 723–731. Rogers CE, Domozych DS, Stewart KD, Mattox KR. 1981. The flagellar apparatus of Mesostigma viride (Prasinophyceae): multilayered structures in a scaly green flagellate. Plant Systematics and Evolution 138: 247–258. Rothwell GW, Serbet R. 1994. Lignophyte phylogeny and the evolution of spermatophytes: a numerical cladistic analysis. Systematic Botany 19: 443–482. Rydin C, Kallersjo M, Friis EM. 2002. Seed plant relationships and the systematic position of Gnetales based on nuclear and chloroplast DNA: Conflicting data, rooting problems, and the monophyly of conifers. International Journal of Plant Sciences 163: 197–214. Saarela JM, Rai HS, Doyle JA, Endress PK, Mathews S, Marchant AD, Briggs BG, Graham SW. 2007. Hydatellaceae identified as a new branch near the base of the angiosperm phylogenetic tree. Nature 446: 312–315. Samigullin TK, Yacentyuk SP, Degtyaryeva GV, Valiehoroman KM, Bobrova VK, Capesius I, Martin WM, Troitsky AV, Filin VR, Antonov AS. 2002. Paraphyly of bryophytes and close relationship of hornworts and vascular plants inferred from analysis of chloroplast rDNA ITS (cpITS) sequences. Arctoa 11: 31–43. Savolainen V, Chase MW, Hoot SB, Morton CM, Soltis DE, Bayer C, Fay MF, De Bruijn AY, Sullivan S, Qiu YL. 2000. Phylogenetics of flowering plants based on combined analysis of plastid atpB and rbcL gene sequences.
Systematic Biology 49: 306–362. Sluiman HJ. 1985. A cladistic evaluation of the lower and higher green plants (Viridiplantae). Plant Systematics and Evolution 149: 217–232. Sluiman HJ, Guihal C, Mudimu O. 2008. Assessing phylogenetic affinities and species delimitations in Klebsormidiales (Streptophyta): Nuclear-encoded rDNA phylogenies and ITS secondary structure models in Klebsormidium, Hormidiella, and Entransia. Journal of Phycology 44: 183–195. Smith GM. 1955. Cryptogamic botany. Vol. I. Bryophytes and pteridophytes. New York: McGraw-Hill Book Co., Inc. Soltis DE, Soltis PS, Chase MW, Mort ME, Albach DC, Zanis M, Savolainen V, Hahn WH, Hoot SB, Fay MF, Axtell M, Swensen SM, Prince LM, Kress WJ, Nixon KC, Farris JS. 2000. Angiosperm phylogeny inferred from 18S rDNA, rbcL, and atpB sequences. Botanical Journal of the Linnean Society 133: 381–461. Soltis DE, Soltis PS, Zanis MJ. 2002. Phylogeny of seed plants based on evidence from eight genes. American Journal of Botany 89: 1670–1681. Soltis PS, Soltis DE, Chase MW. 1999. Angiosperm phylogeny inferred from multiple genes as a tool for comparative biology. Nature 402: 402–404. Stebbins GL. 1974. Flowering plants: evolution above the species level. Belknap, Cambridge, MA: Harvard University Press. Stefanovic S, Rice DW, Palmer JD. 2004. Long branch attraction, taxon sampling, and the earliest angiosperms: Amborella or monocots? BMC Evolutionary Biology 4: 35. Stewart WDP, Rodgers GA. 1977. The cyanophyte-hepatic symbiosis. 2. Nitrogen fixation and the interchange of nitrogen and carbon. New Phytologist 78: 459–471. Strasburger E. 1894. The periodic reduction of the number of the chromosomes in the life-history of living organisms. Annals of Botany 8: 281–316. Sugiura C, Kobayashi Y, Aoki S, Sugita C, Sugita M. 2003.
Complete chloroplast DNA sequence of the moss Physcomitrella patens: evidence for the loss and relocation of rpoA from the chloroplast to the nucleus. Nucleic Acids Research 31: 5324–5331. Svedelius N. 1927. Alternation of generations in relation to reduction division. Botanical Gazette 83: 362–384. Sztein AE, Cohen JD, Slovin JP, Cooke TJ. 1995. Auxin metabolism in representative land plants. American Journal of Botany 82: 1514–1521. Takhtajan A. 1976. Neoteny and the origin of flowering plants. In: Beck CB ed. Origin and early evolution of angiosperms. New York: Columbia University Press. Taylor TN, Taylor EL. 1993. The biology and evolution of fossil plants. Englewood Cliffs, New Jersey: Prentice Hall. Theissen G, Becker A, Di Rosa A, Kanno A, Kim JT, Munster T, Winter KU, Saedler H. 2000. A short history of MADS-box genes in plants. Plant Molecular Biology 42: 115–149. Tsuji S, Ueda K, Nishiyama T, Hasebe M, Yoshikawa S, Konagaya A, Nishiuchi T, Yamaguchi K. 2007. The chloroplast genome from a lycophyte (microphyllophyte), Selaginella uncinata, has a unique inversion, transpositions and many gene losses. Journal of Plant Research 120: 281–290. Turmel M, Ehara M, Otis C, Lemieux C. 2002. Phylogenetic relationships among streptophytes as inferred from chloroplast small and large subunit rRNA gene sequences. Journal of Phycology 38: 364–375. Turmel M, Otis C, Lemieux C. 2003. The mitochondrial genome of Chara vulgaris: Insights into the mitochondrial DNA architecture of the last common ancestor of green algae and land plants. Plant Cell 15: 1888–1903. Turmel M, Otis C, Lemieux C. 2006. The chloroplast genome sequence of Chara vulgaris sheds new light into the closest green algal relatives of land plants. Molecular Biology and Evolution 23: 1324–1338. Turmel M, Pombert JF, Charlebois P, Otis C, Lemieux C. 2007. The green algal ancestry of land plants as revealed by the chloroplast genome. International Journal of Plant Sciences 168: 679–689. Upchurch GR. 1984. 
Cuticle evolution in Early Cretaceous angiosperms from the Potomac Group of Virginia and Maryland. Annals of the Missouri Botanical Garden 71: 522–550. Van Beneden E. 1883. Recherches sur la maturation de l’oeuf et la fécondation. Ascaris megalocephala. Archives de Biologie 4: 265–640. van den Hoek C, Mann DG, Jahns HM. 1995. Algae: an introduction to phycology. Cambridge: Cambridge University Press. Veit B, Briggs SP, Schmidt RJ, Yanofsky MF, Hake S. 1998. Regulation of leaf initiation by the terminal ear 1 gene of maize. Nature 393: 166–168. Volkmann D, Baluska F. 2006. Gravity: one of the driving forces for evolution. Protoplasma 229: 143–148. Wagner TA, Cove DJ, Sack FD. 1997. A positively gravitropic mutant mirrors the wild-type protonemal response in the moss Ceratodon purpureus. Planta 202: 149–154. Wakasugi T, Nishikawa A, Yamada K, Sugiura M. unpublished. Complete nucleotide sequence of the chloroplast genome from a fern, Psilotum nudum. GenBank AP004638. Walker JW, Doyle JA. 1975. Bases of angiosperm phylogeny: palynology. Annals of the Missouri Botanical Garden 62: 664–723. Watanabe Y, Yamamoto M. 1994. S. pombe mei2+ encodes an RNA-binding protein essential for premeiotic DNA synthesis and meiosis I, which cooperates with a novel RNA species meiRNA. Cell 78: 487–498. Wellman CH, Osterloff PL, Mohiuddin U. 2003. Fragments of the earliest land plants. Nature 425: 282–285. Williams JH, Friedman WE. 2002. Identification of diploid endosperm in an early angiosperm lineage. Nature 415: 522–526. Winter KU, Becker A, Munster T, Kim JT, Saedler H, Theissen G. 1999. MADS-box genes reveal that gnetophytes are more closely related to conifers than to flowering plants. Proceedings of the National Academy of Sciences USA 96: 7342–7347. Wodehouse RP. 1935. Pollen grains: their structure, identification and significance in science and medicine. New York: McGraw-Hill Book Co., Inc. Wodehouse RP. 1936. Evolution of pollen grains. Botanical Review 2: 67–84.
Wolf PG, Karol KG, Mandoli DF, Kuehl J, Arumuganathan K, Ellis MW, Mishler BD, Kelch DG, Olmstead RG, Boore JL. 2005. The first complete chloroplast genome sequence of a lycophyte, Huperzia lucidula (Lycopodiaceae). Gene 350: 117–128. Zanis MJ, Soltis DE, Soltis PS, Mathews S, Donoghue MJ. 2002. The root of the angiosperms revisited. Proceedings of the National Academy of Sciences USA 99: 6848–6853.

Tutor Answer

LeoProfessor
School: Rice University

Attached.

Running Head: GENETICS, EVOLUTION AND BIODIVERSITY

Genetics, Evolution and Biodiversity (Manuscript)
Student’s Name

Institutional affiliation

Abstract

Research on genetic variation within plant species depends on random DNA markers, but the
correlation of these markers with quantitative variation is contentious. Functional variation in
quantitative characters plays a significant role in determining evolutionary potential and serves as
a tool for managing genetic resources. The markers are useful for assessing the genetic and
phylogenetic diversity of plants by targeting specific genes in plant families. According to the
research reviewed for the experiment, trees derived from DNA sequence data do not give full
insight into the phylogeny of seed plants. The few living lineages represent only a fraction of the
lineages known from the fossil record. Nucleotide data in molecular analyses support conflicting
hypotheses, and thus more advanced analysis is crucial. The experiment focuses on using an improved
way of sampling gymnosperm diversity that will help in estimating the seed plant phylogeny.
The research uses targeted genes in the extracted DNA to assess the significance of genetic
and phylogenetic diversity, which determines the rate of evolution. The laboratory research
involved extracting the targeted DNA from seed samples of plants. A specific region of the
extracted DNA was amplified for each sample using PCR. The amplified samples were run through
agarose gel electrophoresis, and the resulting DNA sequences were used to construct a phylogenetic
tree. The tree was based on the results obtained from visualizing the bases of the DNA samples,
with a focus on the cycads and Ginkgo.
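
The last step described in the abstract, building a tree from aligned DNA sequences, can be illustrated with a minimal sketch. It assumes a simple distance-based approach (p-distance followed by UPGMA clustering), which is a swapped-in teaching technique and not the method used in the manuscript or in the attached papers. The four sequences and the function names `p_distance` and `upgma` are invented for illustration and are not real data.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of aligned sites at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(seqs):
    """Cluster aligned sequences with UPGMA; return the tree as nested tuples."""
    trees = {name: name for name in seqs}   # cluster label -> subtree
    sizes = {name: 1 for name in seqs}      # cluster label -> number of leaves
    dist = {frozenset((a, b)): p_distance(seqs[a], seqs[b])
            for a, b in combinations(seqs, 2)}
    while len(trees) > 1:
        # Pick the closest pair of clusters and give the merged cluster a label.
        a, b = sorted(min(dist, key=dist.get))
        merged = f"({a},{b})"
        # Distance from the merged cluster to every other cluster is the
        # size-weighted average of the two old distances.
        for c in trees:
            if c not in (a, b):
                d_ac = dist[frozenset((a, c))]
                d_bc = dist[frozenset((b, c))]
                dist[frozenset((merged, c))] = (
                    (sizes[a] * d_ac + sizes[b] * d_bc) / (sizes[a] + sizes[b])
                )
        # Drop distances that involve the two clusters just merged.
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        trees[merged] = (trees.pop(a), trees.pop(b))
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(trees.values()))


# Hypothetical toy alignment (invented sequences, not real plant data).
toy_seqs = {
    "cycad":      "ACGTACGTAC",
    "ginkgo":     "ACGTACGTAA",
    "conifer":    "ACGAACCTAA",
    "angiosperm": "TGCAACCTGG",
}

tree = upgma(toy_seqs)
print(tree)
```

With these invented sequences the two most similar taxa are joined first, and the remaining taxa attach in order of increasing average distance. Real analyses would start from a proper multiple sequence alignment and use model-based methods.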

Introduction

The study of the evolutionary patterns of plants has occupied scientists for years, particularly
in the context of environmental progression. Understanding evolutionary patterns through the
analysis of genetic components, chiefly DNA, plays a significant role in the conservation of plant
species. Plant biologists have specific goals in their research; the primary goal of the
evolutionary scientist is to acquire knowledge of the genetic diversification of plants. Research on
the evolutionary lineages of seed plants through the study of physical characters alone gives less
information. According to studies, in-situ conservation plays a significant role in genetic
conservation.

The discovery of large phylogenies is crucial to scientists' understanding of plant life.
Macro-evolutionary patterns refer to the conspicuous characters of a plant and can help in
structuring the shape of plant life. According to previous research, phylogenies aid in the analysis
of the evolutionary, systematic, and ecological relationships of different plants, including
non-flowering plants such as algae. Although a small experiment can explain the phylogeny of most
seed plants, the limited resources available for such laboratory studies explain the scientific gap
that remains. The irregular distribution of plant varieties is a challenge to research on
evolutionary relationships (Mathews, 2009). Evolutionary biologists hold that the diversification
of plant species can be assessed in relation to clade age. According to the experiments,
diversification does not depend on the range of clade ages under examination, which allows a
general conclusion. Scientific studies of phylogenetic relationships provide a forum for contrasting
hypotheses about the evolutionary chain of seed plant species.

Despite molecular phylogenetic reviews of different plant species, the phylogeny of
gymnosperms remains unresolved, and that of Gnetales is still controversial. According to a
phylotranscriptomic study conducted on thirteen families of angiosperms and gymnosperms,
cycads plus Ginkgo form a clade that is sister to the remaining gymnosperms. The rate of
molecular evolution in the angiosperms and Gnetales is similar owing to the similarity in
selective pressure undergone during evolution. Scientific research today aims at
filli...
