Research Article On the Tempo of Genome Size Evolution in Angiosperms

Journal of Botany
Volume 2010, Article ID , 8 pages
doi: /2010/

Research Article
On the Tempo of Genome Size Evolution in Angiosperms

Jeremy M. Beaulieu,1 Stephen A. Smith,2 and Ilia J. Leitch3

1 Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT , USA
2 National Evolutionary Synthesis Center, 2024 W. Main St. A200, Durham, NC , USA
3 Jodrell Laboratory, Royal Botanic Gardens, Kew, Richmond, Surrey TW9 3AD, UK

Correspondence should be addressed to Jeremy M. Beaulieu,

Received 4 January 2010; Accepted 17 April 2010

Academic Editor: Jan Suda

Copyright 2010 Jeremy M. Beaulieu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Broadly sampled phylogenies have uncovered extreme deviations from a molecular clock, with the rates of molecular substitution varying dramatically within and among lineages. While growth form, a proxy for life history, is strongly correlated with molecular rate heterogeneity, its influence on trait evolution has yet to be examined. Here, we explore genome size evolution in relation to growth form by combining recent advances in large-scale phylogeny construction with model-based phylogenetic comparative methods. We construct phylogenies for Monocotyledonae (monocots) and Fabaceae (legumes), including all species with genome size information, and assess whether rates of genome size evolution depend on growth form. We found that the rates of genome size evolution for woody lineages were consistently an order of magnitude slower than those of herbaceous lineages. Our findings also suggest that growth form constrains genome size evolution, not through consequences associated with the phenotype, but instead through the influence of life history attributes on the tempo of evolution.
Consequences associated with life history now extend to genomic evolution and may shed light on the frequently observed threshold effect of genome size variation on higher phenotypic traits.

1. Introduction

The concept of a molecular clock predicts that nucleotide substitution rates should scale linearly with time and therefore be equal among lineages. However, datasets rarely conform to a molecular clock (e.g., [1]), and broadly sampled phylogenies have clearly documented dramatic lineage-specific molecular rate heterogeneity across the Tree of Life (e.g., [2–6]). Life history, or more specifically, generation time, is a strong correlate of among- and within-lineage rate heterogeneity in both animals and plants [6–8]. In plants, molecular rates are consistently more variable and typically higher in herbaceous species when compared to woody (i.e., trees/shrubs) species. Generation time may play a role in this pattern, as herbaceous species typically have shorter generation times than woody species and hence a greater capacity to accumulate nucleotide substitutions per unit time. Implicit in these results is a renewed appreciation for the link between microevolutionary process and macroevolutionary pattern [9].

The consistent pattern of life history influences on rates of molecular evolution across several loci [5, 6, 10] implies this pattern may manifest at the whole-genome level, the first phenotypic scale above molecules. The size of any given genome is determined by rates of DNA accumulation (e.g., retrotransposition and polyploidy) and deletion (e.g., via unequal crossing over and illegitimate recombination). The rate of genome size evolution is therefore set by the interplay between selection and drift promoting and eliminating these mutational changes [11–13]. Indeed, several phylogenetic studies have revealed increases and decreases in genome size [14–17].

Extant angiosperms exhibit a growth form dependent distribution in genome size.
Woody angiosperms are characterized by small genome sizes with lower overall variance compared to herbaceous species [18, 19]. This asymmetry in genome size variance among growth forms has been interpreted as an indication of large increases in DNA content negatively impacting woody species [19, 20]. However, when viewed in the context of microevolutionary processes, the growth form dependent distribution of genome size could also be explained in part by consequences associated with life history. For example, woody angiosperms take many years to reach reproductive maturity [21]. For genome size, this may allow fewer opportunities for insertions/deletions to occur per unit time. Therefore, in terms of generation time, the smaller genome sizes and lower variance exhibited by woody species need not be explained only by functional constraints on the phenotype [19, 22].

Here, we test for growth-form dependent rates of genome size evolution between woody and herbaceous lineages. Specifically, we test whether woody species exhibit slower rates of genome size evolution than related herbaceous species. To explore genome size evolution in relation to growth form, we combine recent advances in large-scale phylogeny construction [23] with model-based phylogenetic comparative methods [24]. We focus our analyses on two major branches of the angiosperms that are well represented in the Plant DNA C-values database [25]: the Monocotyledonae (monocots; [26]) and the Fabaceae (Leguminosae or legumes). The monocots are a large clade of mainly herbaceous angiosperms that also contain a few clades of predominantly woody species, including the palms (Arecales; [27]). The legumes are the third largest family of angiosperms and exhibit a wide range of growth habits throughout the clade. It is worth noting that unlike the woody legumes, woody monocots do not produce true wood.
In this context, however, we generally define the tree/shrub or woody category as simply large plants with long generation times (e.g., [6]).

2. Methods and Materials

2.1. Genome Size Data. The amount of DNA in the unreplicated gametic nucleus (i.e., pollen or egg) is referred to as the 1C DNA amount or holoploid genome size, regardless of ploidy level [28]. However, since many angiosperms undergo polyploidy, the monoploid genome size, or 1Cx value, is also often reported and analyzed. The monoploid genome size represents the amount of DNA in the unreplicated monoploid chromosome set and is calculated by dividing the 2C DNA amount by ploidy. Because rates of evolution can be inflated due to polyploidy, we compare and contrast evolutionary rates between the two measures (see below). We compiled genome size estimates for legume and monocot species where both the 1C amount and the ploidy level were known. Data from the Plant DNA C-values database [25] were combined with additional genome size estimates not yet listed in the database but published in the literature, resulting in an initial list of 1659 monocot and 565 legume species with which to search GenBank (see below).

2.2. Mega-Phylogeny Construction. We constructed a mega-phylogeny of legumes and monocots using the procedures described in [23]. The mega-phylogeny method applies orthology tests, sequence saturation analyses, and multiple profile-to-profile alignment methodology to user-specified gene regions. Sequence saturation is detected by calculating the median absolute deviation (MAD) assessed on the one-dimensional Euclidean distance between the raw and Jukes-Cantor corrected pairwise sequence distances. For a given gene region, if the most inclusive grouping of these sequences is saturated (MAD 0.01), then the group is broken up into less inclusive groups using the next level in the NCBI (National Center for Biotechnology Information) taxonomic hierarchy.
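The saturation score described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the pipeline of [23]: the function names are hypothetical, gaps are assumed to be coded as "-", and the raw distance is taken to be the simple proportion of differing sites. The Jukes-Cantor correction used is the standard d = -(3/4) ln(1 - 4p/3). The 0.01 cutoff mentioned in the text is left to the caller.

```python
import math
from itertools import combinations
from statistics import median


def p_distance(seq_a, seq_b):
    """Raw distance: proportion of differing sites, ignoring gapped columns."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped sites shared by the two sequences")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance; undefined once p reaches 0.75."""
    if p >= 0.75:
        raise ValueError("Jukes-Cantor distance is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def saturation_mad(aligned_seqs):
    """MAD of |corrected - raw| over all pairs of aligned sequences."""
    gaps = []
    for seq_a, seq_b in combinations(aligned_seqs, 2):
        p = p_distance(seq_a, seq_b)
        gaps.append(abs(jukes_cantor(p) - p))
    centre = median(gaps)
    return median([abs(g - centre) for g in gaps])


if __name__ == "__main__":
    # Toy alignment (hypothetical data, for illustration only).
    toy = ["ACGTACGTAC", "ACGTACGTAA", "ACGTTCGTAA", "ACCTTCGTGA"]
    print(saturation_mad(toy))
```

The score grows as the corrected and raw distances diverge unevenly across pairs, which is the signal the text uses to decide whether a group should be split at the next taxonomic level.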
After every sequence has been placed in an alignment, the individual alignments are then profile aligned into a larger alignment. Profile-to-profile alignment combines separate alignments while preserving the structural elements that are highly conserved between them [29, 30]. We employed a guide tree based on the phylogeny of the NCBI taxonomy to carry out profile alignments.

For the monocots, we specified atpB, matK, ndhF, rbcL, rps16, trnL-F, and ITS as our gene regions of interest. For the legumes, we specified matK, psbA-trnH, rbcL, trnL-F, ITS, and ETS. However, instead of compiling all possible monocot and legume taxa for a given gene region, we limited our GenBank search to return only sequences for taxa represented in our genome size dataset. The mega-phylogeny matrix construction pipeline was carried out in Python (Ver. 2.5) with the BioPython (Ver. 1.48) module using the BioSQL (Ver ) database schema. Each phylogeny was inferred from the resulting matrix using RAxML (Ver ; [31]), partitioning each gene region and applying a GTRMIX model of rate substitution. For monocots, the maximum likelihood tree was rooted with Acorales (sensu [32]), and the legumes were rooted with the tribe Cercideae (sensu [33]). In both cases, due to synonymy and errors in GenBank, the trees were further pruned to match our genome size data sets (for a complete list see supplementary materials).

2.3. Time Calibrating the Mega-Phylogeny. We time-calibrated the legume mega-phylogeny using the nonparametric rate smoothing method (NPRS; [34]) with the Powell algorithm in r8s (Ver. 1.71; [35]). The NPRS analysis was restarted three times with different starting values to ensure convergence to a global optimum. We selectively assigned five age constraints from age estimates inferred by Lavin et al. [36]. These included the Umtiza crown group (54.0 million years ago, Mya), the Hologalegina crown (50.6 Mya), the Vigna-Phaseolus split (8.0 Mya), and one assigned to crown Fabaceae (59.0 Mya).
We also assigned a constraint within the dalbergioid clade that corresponded to a node in our tree (49.1 Mya). For the monocots, we selectively assigned eight age constraints using the mean absolute age estimates from Smith et al. [37]. Five age constraints corresponded to the crown age estimates for major clades of monocots (Asparagales, 99.8 Mya; Arecales, 70.9 Mya; Poales, 74.8 Mya; Zingiberales, 88.5 Mya; Commelinales, 76.8 Mya), two corresponded to deep divergences (Liliales + Asparagales, Mya; crown Commelinids, Mya), and one was assigned to crown monocots (163.5 Mya). We initially used the same procedure to date the monocot tree as above, but the nonparametric rate smoothing analysis did not run to completion. To deal with this problem, we reduced the dataset to 200 tips and reran the NPRS analysis to completion. We obtained the estimated ages for all nodes in the reduced dataset and placed them in the full dataset. We then used the nonparametric dating method PATHD8 [38] to infer ages for the remaining uncalibrated nodes. PATHD8 uses mean path lengths from the node to tips and deals with substitution rate variation by smoothing rates locally.

2.4. Comparative Analyses. To test for differences in the rate of genome size evolution (1C and 1Cx DNA content) among woody and herbaceous lineages, we compared the fit of single- and two-rate models of Brownian motion evolution. Any phenotypic trait found to accumulate evolutionary change in proportion to time is best described by Brownian motion [39]. The time-independent parameter, $\sigma^2$, or the variance of phenotypic evolution, describes the rate at which this process proceeds. The single-rate model assumes that all analyzed branches accumulate evolutionary changes in genome size at the same rate, $\sigma^2$, while the multiple-rate model assigns a separate rate to each lineage that differs in a particular discrete character state (e.g., $\sigma^2_{\mathrm{woody}}$, $\sigma^2_{\mathrm{herb}}$).
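To make the difference between the two models concrete: under Brownian motion the variance accumulated along a branch is the rate multiplied by the branch length, so a root-to-tip path that crosses woody and herbaceous branches accumulates a sum of state-specific terms. The sketch below illustrates only that property and is not BROWNIE's implementation; the rate values, branch lengths, and function names are all hypothetical.

```python
import math
import random


def expected_variance(path, rates):
    """Variance accumulated along a path: sum of rate * branch length."""
    return sum(rates[state] * length for state, length in path)


def simulate_tip(path, rates, root_value, rng):
    """Simulate one tip value by adding a Gaussian step for each branch."""
    value = root_value
    for state, length in path:
        # Each branch contributes a normal deviate with variance rate * length.
        value += rng.gauss(0.0, math.sqrt(rates[state] * length))
    return value


if __name__ == "__main__":
    # Hypothetical rates (per My) and a hypothetical root-to-tip path (in My).
    two_rate = {"woody": 0.001, "herb": 0.01}
    one_rate = {"woody": 0.005, "herb": 0.005}
    path = [("herb", 10.0), ("woody", 30.0)]

    print("two-rate expected variance:", expected_variance(path, two_rate))
    print("single-rate expected variance:", expected_variance(path, one_rate))

    rng = random.Random(1)
    print("one simulated tip value:", simulate_tip(path, two_rate, 0.0, rng))
```

Setting both entries of the rate table to the same value recovers the single-rate model, which is why the two models are nested and can be compared with an information criterion.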
We carried out the single- versus two-rate model comparisons using the noncensored approach in BROWNIE (Ver. 2.1; [24]). Because the noncensored approach assumes that the discrete character states of internal branches are known, we used a procedure implemented in BROWNIE that estimates the likeliest growth form state (e.g., woody or herbaceous) across all branches in a given tree based on character codings at the tips. Evaluating the best-fit model between the single- and two-rate models was based on the sample size corrected Akaike Information Criterion (AICc; [40]). The best-fit model was chosen based on a slightly modified ΔAICc. Because we are only comparing two models, we always calculated ΔAICc as the AICc obtained from the single-rate model minus the AICc from the two-rate model. A ΔAICc < 2 was taken as evidence for the single-rate model, whereas a ΔAICc > 2 indicated considerable evidence for the two-rate model.

We also tested for mean differences in genome size among extant woody and herbaceous species in both our monocot and legume datasets. However, many types of evolutionary processes could have produced the observed trait differences, including Brownian motion. Therefore, we assessed genome size differences among growth forms and compared the results of a conventional ANOVA to a null distribution based on ANOVA results obtained from simulations of Brownian motion evolution [41]. This was used to test whether significant species differences between growth forms were larger than would be expected given a random model of Brownian motion evolution. We used the R [42] package geiger [43] to generate 1000 Monte Carlo simulations using our input tree topology and time-calibrated branch lengths. We compared the observed F-statistic calculated using an ordinary ANOVA to a null distribution of F-statistics obtained from the Monte Carlo simulations to test for significance.
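The model comparison described above reduces to a small calculation once each model's log-likelihood is known. The sketch below uses the standard small-sample correction, AICc = -2 lnL + 2k + 2k(k + 1)/(n - k - 1). It makes several assumptions that are not stated in the text: n is taken to be the number of tips, the single-rate model is counted as having two free parameters (one rate plus the root state) and the two-rate model three, and the log-likelihood values in the example are invented for illustration.

```python
def aicc(log_likelihood, k, n):
    """Sample-size corrected AIC for a model with k parameters and n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = -2.0 * log_likelihood + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)


def delta_aicc(lnl_single, lnl_two, n, k_single=2, k_two=3):
    """AICc of the single-rate model minus AICc of the two-rate model."""
    return aicc(lnl_single, k_single, n) - aicc(lnl_two, k_two, n)


def preferred_model(delta, threshold=2.0):
    """Apply the cutoff of 2: larger differences favour the two-rate model."""
    return "two-rate" if delta > threshold else "single-rate"


if __name__ == "__main__":
    # Hypothetical log-likelihoods for a tree with 250 tips.
    delta = delta_aicc(lnl_single=-100.0, lnl_two=-90.0, n=250)
    print(delta, preferred_model(delta))
```

Because the difference is always taken as single-rate minus two-rate, a positive value means the extra rate parameter improved the fit by more than its penalty.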
If the observed F-statistic was greater than 95% of the null distribution, then trait differences were greater than expected based on a model of Brownian motion evolution. We carried out this test within each clade separately, using both 1C and 1Cx DNA content.

We log10-transformed the genome size data prior to all analyses to ensure the data minimally conformed to Brownian motion evolution [23, 44]. Under a simple Brownian motion model of evolution (as we employ throughout), a given trait should have an equal probability of increasing or decreasing by the same magnitude given its current state. However, this assumption is inherently violated when traits, such as genome size, are constrained to be nonzero. For example, given a genome size of 0.25 pg, an increase or decrease of 0.50 pg is not likely to occur with equal probability. Rather, in this case, change would be better expressed as a proportion, where an increase or decrease of, say, 50% is equally likely to occur regardless of the initial genome size at speciation. Thus, it is generally acknowledged that genome size evolution may be better represented as proportional change through an a priori log10 transformation [23, 44].

3. Results

3.1. Mega-Phylogeny. Our final matrices for the Monocotyledonae (monocots) and Fabaceae (legumes) consisted of 495 and 250 species, respectively. The combined matrix for the legumes comprised 60 woody species from 20 genera and 190 herbaceous species from 21 genera. The woody species were mostly confined to the clades corresponding to the Cercideae, Mimosoideae, and Caesalpinioideae, with additional occurrences found within the Papilionoideae. For monocots, the matrix comprised 213 genera belonging to 9 of the 10 orders of monocots recognized by the Angiosperm Phylogeny Group [27].
Slow-growing, tall, and/or woody genera have been described in several different monocot families, including Arecaceae, (e.g., Cordyline, Dasylirion, Dracaena, Nolina), Bromeliaceae (e.g., Puya), Dasypogonaceae (Dasypogon, Kingia), Pandanaceae (Pandanus), Strelitziaceae (e.g., Ravenala), Velloziaceae (Vellozia), Xanthorrhoeaceae (e.g., Aloe and Xanthorrhoea), and the woody bamboo genera in the tribe Bambuseae of Poaceae (e.g., Phyllostachys, Sasa, Semiarundinaria). However, due to the absence of genome size and/or sequence data for many of these genera, analyses of the effect of growth form were restricted to comparisons between (i) Dasypogon (Dasypogonaceae; 1 species), the woody palms (Arecaceae; 34 species), and the woody Aloe (Xanthorrhoeaceae; 5 species), and (ii) the remaining species, which were classified as herbaceous (452 species).

The combined matrix for the monocots contained 10,922 sites and 74.5% gaps or missing sequence, while the legume matrix had a total length of 8221 sites that contained 80.4% gaps or missing sequence. In both cases, the majority of the sequence data came from ITS (Table 1). Additionally, the degree of saturation varied among gene regions, ranging from profiling broad clades (e.g., rbcL) to profiling mostly tribes and genera (e.g., ITS; Table 1). Interestingly, of all the genes sampled, only rbcL did not require some degree of profile alignment (Table 1). It is worth noting that the degree of saturation was not related to whether or not the gene was protein coding. For example, in both the legume and monocot data sets, the noncoding trnL-F regions required as much profile aligning as the coding matK (Table 1).

Table 1: Gene regions specified in the mega-phylogeny construction of Monocotyledonae (monocots) and Fabaceae (legumes).
The median absolute deviation (MAD) was used to assess sequence saturation and to parse sequences into separate files based on NCBI taxonomy, which were then brought together again using an NCBI-based guide tree and profile-to-profile alignment methodology (see Methods and Materials).

Phylogeny | Gene region | Description | N | MAD | Profiles
Monocotyledonae | atpB | ATP synthase beta chain | | | none
Monocotyledonae | ITS | Internal transcribed spacer 1, 5.8S ribosomal RNA, and internal transcribed spacer 2 | | | mostly to tribe and genus
Monocotyledonae | matK | Maturase K | | | mostly to family
Monocotyledonae | ndhF | NADH-plastoquinone oxidoreductase | | | mostly to order
Monocotyledonae | rbcL | Ribulose bisphosphate carboxylase | | | none
Monocotyledonae | rps16 | Ribosomal protein S16 intron | | | mostly to order
Monocotyledonae | trnL-trnF | trnL-trnF intergenic spacer | | | mostly to family
Fabaceae | ETS | External transcribed spacer and 18S ribosomal RNA | | | mostly to tribe and genus
Fabaceae | ITS | Internal transcribed spacer 1, 5.8S ribosomal RNA, and internal transcribed spacer 2 | | | mostly to tribe and genus
Fabaceae | matK | Maturase K | | | mostly to tribe and genus
Fabaceae | psbA-trnH | psbA-trnH intergenic spacer | | | none
Fabaceae | rbcL | Ribulose bisphosphate carboxylase | | | none
Fabaceae | trnL-trnF | trnL-trnF intergenic spacer | | | mostly to tribe

MAD scores in bold italics indicate the gene region was saturated across the most inclusive taxonomic level and broken up into profiles of various taxonomic levels. N indicates the number of sequences in GenBank returned according to our input search list; however, due to synonymy and errors in GenBank the final tree was pruned to exactly match our genome size data set.

Table 2: Parameter estimates from comparisons of single- versus two-rate models of Brownian motion (BM), applied to both 1C DNA and 1Cx DNA content separately.
Clade | 1C DNA content, single-rate $\sigma^2$ (My$^{-1}$) | 1C DNA content, two-rate $\sigma^2_{\mathrm{woody}}$ (My$^{-1}$) | 1C DNA content, two-rate $\sigma^2_{\mathrm{herb}}$ (My$^{-1}$) | 1C DNA content, ΔAICc | 1Cx DNA content, single-rate $\sigma^2$ (My$^{-1}$) | 1Cx DNA content, two-rate $\sigma^2_{\mathrm{woody}}$ (My$^{-1}$) | 1Cx DNA content, two-rate $\sigma^2_{\mathrm{herb}}$ (My$^{-1}$) | 1Cx DNA content, ΔAICc
Monocotyledonae | | | | | | | |
Fabaceae | | | | | | | |

ΔAICc is calculated as the AICc obtained from the single-rate model minus the AICc obtained from the two-rate model, where a ΔAICc < 2 was taken as evidence for the single-rate model, whereas a ΔAICc > 2 indicated strong evidence for the two-rate model.

3.2. Rates of Genome Size Evolution. In both the monocots and legumes, we found that the genome size data were best fit by a two-rate model of Brownian motion evolution, which inferred a separate rate for woody and herbaceous lineages (Table 2). For legumes, the two-rate model applied to the 1C DNA content was strongly supported (ΔAICc = 85.4), and woody lineages were inferred to accumulate changes in genome size an order of magnitude slower than related herbaceous lineages. Even when testing 1Cx DNA content, the disparity in rates between woody and herbaceous