Birds as a Model for Comparative Genomic Studies
Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Ecology and Genetics, Evolutionary Biology.
2011 (English)Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Comparative genomics provides tools for investigating large genomic datasets. In my thesis I focused on inferring patterns of selection in coding and non-coding regions of avian genomes. Until recently, large comparative studies on selection were mainly restricted to model species with sequenced genomes. This limitation has been overcome with advances in sequencing technologies, and it is now possible to gather large genomic datasets for non-model species.

Next-generation sequencing data were used to study patterns of nucleotide substitution, and from these we inferred how selection has acted in the genomes of 10 non-model bird species. In general, we found evidence for a negative correlation between neutral substitution rate and chromosome size in birds. In a follow-up study, we investigated two closely related bird species to study expression levels in different tissues and patterns of selection. We found that between 2% and 18% of all genes were differentially expressed between the two species.

We showed that non-coding regions adjacent to genes are under evolutionary constraint in birds, which suggests that non-coding DNA plays an important functional role in the genome. Regions downstream of genes (3') showed a particularly high level of constraint. The level of constraint in these regions was not correlated with the length of untranslated regions, which suggests that other factors also play a role in sequence conservation.

We compared the rate of nonsynonymous substitutions to the rate of synonymous substitutions in order to infer levels of selection in protein-coding sequences. Synonymous substitutions are often assumed to evolve neutrally. We studied synonymous substitutions by estimating constraint on 4-fold degenerate sites of avian genes and found significant evolutionary constraint on this category of sites (between 24% and 43%). These results call for a reappraisal of synonymous substitution rates being used as neutral standards in molecular evolutionary analysis (e.g. the dN/dS ratio to infer positive selection).

Finally, the problem of sequencing errors in next-generation sequencing data was investigated. We developed a program that removes erroneous bases from the reads. We showed that low coverage sequencing projects and large genome sequencing projects will especially gain from trimming erroneous reads.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2011. 62 p.
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 868
Keyword [en]
Birds, Selection, Gene expression, Sequence evolution, Next-generation sequencing, Comparative genomics, Molecular evolution, Genomics, Substitution Rates, Non-coding DNA
National Category
Evolutionary Biology; Bioinformatics and Systems Biology
Research subject
Biology with specialization in Molecular Evolution
Identifiers
URN: urn:nbn:se:uu:diva-159766
ISBN: 978-91-554-8186-5 (print)
OAI: oai:DiVA.org:uu-159766
DiVA: diva2:447764
Public defence
2011-11-25, Lindahlsalen, Evolutionary Biology Centre, Norbyvägen 18A, Uppsala, 13:00 (English)
Available from: 2011-11-04 Created: 2011-10-10 Last updated: 2011-11-10 Bibliographically approved
List of papers
1. Comparative genomics based on massive parallel transcriptome sequencing reveals patterns of substitution and selection across 10 bird species
2010 (English) In: Molecular Ecology, ISSN 0962-1083, E-ISSN 1365-294X, Vol. 19, no Suppl. 1, 266-276 p. Article in journal (Refereed) Published
Abstract [en]

Next-generation sequencing technology provides an attractive means to obtain large-scale sequence data necessary for comparative genomic analysis. To analyse the patterns of mutation rate variation and selection intensity across the avian genome, we performed brain transcriptome sequencing using Roche 454 technology of 10 different non-model avian species. Contigs from de novo assemblies were aligned to the two available avian reference genomes, chicken and zebra finch. In total, we identified 6499 different genes across all 10 species, with ∼1000 genes found in each full run per species. We found evidence for a higher mutation rate of the Z chromosome than of autosomes (male-biased mutation) and a negative correlation between the neutral substitution rate (dS) and chromosome size. Analyses of the mean dN/dS ratio (ω) of genes across chromosomes supported the Hill-Robertson effect (the effect of selection at linked loci) and point to stochastic problems with ω as an independent measure of selection. Overall, this study demonstrates the usefulness of next-generation sequencing for obtaining genomic resources for comparative genomic analysis of non-model organisms.

Keyword
Avian genomics, Hill-Robertson effect, Male-mutation bias, Next generation sequencing 454, Selection
National Category
Biological Sciences
Identifiers
urn:nbn:se:uu:diva-136234 (URN); 10.1111/j.1365-294X.2009.04487.x (DOI); 000275645700021 (ISI); 20331785 (PubMedID)
Available from: 2010-12-10 Created: 2010-12-10 Last updated: 2017-12-11
2. Gene content and patterns of gene expression in the flycatcher genome
(English) Manuscript (preprint) (Other academic)
Abstract [en]

Phenotypic evolution may be driven by changes in the sequence of protein-coding genes or by the way (when, where, at what level) proteins are expressed. Generally, our knowledge about the evolution of gene expression is relatively limited, and this is particularly so for wild populations. Collared flycatcher (Ficedula albicollis) and pied flycatcher (F. hypoleuca) are two recently diverged passerine birds, which have been subject to extensive ecological research, including aspects of speciation. We obtained RNA-seq data with Illumina technology from 10 adult individuals per species (five females and five males) using brain, kidney, liver, lung, muscle, skin, ovary, and testis tissue (plus eight embryos of each species). A total of more than 1 billion sequencing reads were assembled into >15,000 gene models for each species. The proportion of differentially expressed genes between species ranged from 8% to 18% per adult tissue. Very few GO categories were found to be overrepresented among differentially expressed genes, which at least in part might reflect that orphan and not yet annotated genes are prone to evolve more rapidly in gene expression level. However, in testis, the category olfactory receptor activity was significantly overrepresented among differentially expressed genes, and it is of interest to note that this category of genes is involved in sperm-egg communication and thereby may contribute to reproductive incompatibility between the two species. Genes with a high degree of differentiation in gene expression between species tended to have high rates of sequence evolution (high dN/dS). Overall, this study illustrates both the feasibility and usefulness of deep transcriptome sequencing in non-model organisms.

Keyword
Collared flycatcher, Pied flycatcher, Zebra finch, RNA-Seq, Transcriptome sequencing, Species comparison, Gene expression
National Category
Evolutionary Biology
Identifiers
urn:nbn:se:uu:diva-159916 (URN)
Available from: 2011-10-11 Created: 2011-10-11 Last updated: 2011-11-10
3. Evolutionary Constraint in Flanking Regions of Avian Genes
2011 (English) In: Molecular Biology and Evolution, ISSN 0737-4038, E-ISSN 1537-1719, Vol. 28, no 9, 2481-2489 p. Article in journal (Refereed) Published
Abstract [en]

An important insight from comparative genomic analysis is that sequence conservation beyond neutral expectations is frequently found outside protein-coding regions, indicating important functional roles of noncoding DNA. Understanding the causes of constraint on noncoding sequence evolution forms an important area of research, not least because of its relevance to the evolution of gene expression. We aligned all orthologous genes of chicken and zebra finch together with 5 kb of their upstream and downstream noncoding sequences, to study the evolution of gene flanking sequences in the avian genome. Using ancestral repeats as a neutral reference, we detected significant evolutionary constraint in the 3' flanking region, highest directly after termination (60%) and then gradually decreasing to about 20% 5 kb downstream. Constraint was higher in annotated 3' untranslated regions (UTRs) than in non-UTRs at the same distance from the stop codon and higher in sequences annotated as microRNA (miRNA)-binding sites than in non-miRNA-binding sites within 3' UTRs. Constraint was also higher when estimated for a smaller data set of genes from more closely related songbird species, indicating turnover of functional elements during avian evolution. On the 5' flanking side, constraint was readily seen within the first 125 bp immediately upstream of the start codon (34%) and was about 10% for remaining sequence within 5 kb upstream. Analysis of chicken polymorphism data gave further support for the highest constraint directly before and after the translated region. Finally, we found that genes evolving under the highest constraint measured by d(N)/d(S) also had the highest level of constraint in the 3' flanking region. This study broadens the insights into gene flanking sequence evolution by adding new findings from a vertebrate lineage other than mammals.
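
The constraint percentages quoted in this abstract can be read as the fraction of substitutions removed by purifying selection relative to a neutral reference. The sketch below uses a common definition, constraint = 1 - (divergence at the sites of interest / divergence at neutral sites); it is an illustration only, and the exact estimator used in the paper may differ (for example in how multiple hits are corrected). The function name and the divergence values are hypothetical.

```python
def constraint(d_sites: float, d_neutral: float) -> float:
    """Fraction of substitutions removed by selection: 1 - d_sites / d_neutral."""
    if d_neutral <= 0:
        raise ValueError("neutral divergence must be positive")
    return 1.0 - d_sites / d_neutral


# Hypothetical divergences (substitutions per site), chosen only for illustration.
d_ancestral_repeats = 0.50  # neutral reference
d_flanking = 0.20           # sites of interest, e.g. a 3' flanking window

print(f"constraint = {constraint(d_flanking, d_ancestral_repeats):.0%}")
```

A value of 0 means the sites evolve at the neutral rate, and values approaching 1 mean that nearly all substitutions have been removed by selection.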

Keyword
UTR, non-coding DNA, purifying selection, chicken, zebra finch
National Category
Biological Sciences
Identifiers
urn:nbn:se:uu:diva-158881 (URN); 10.1093/molbev/msr066 (DOI); 000294552700010 (ISI)
Available from: 2011-09-20 Created: 2011-09-19 Last updated: 2017-12-08 Bibliographically approved
4. Significant Selective Constraint at 4-Fold Degenerate Sites in the Avian Genome and Its Consequence for Detection of Positive Selection
2011 (English) In: Genome Biology and Evolution, ISSN 1759-6653, E-ISSN 1759-6653, Vol. 3, 1381-1389 p. Article in journal (Refereed) Published
Abstract [en]

A major conclusion from comparative genomics is that many sequences that do not code for proteins are conserved beyond neutral expectations, indicating that they evolve under the influence of purifying selection and are likely to have functional roles. Due to the degeneracy of the genetic code, synonymous sites within protein-coding genes have previously been seen as "silent" with respect to function and thereby invisible to selection. However, there are indications that synonymous sites of vertebrate genomes are also subject to selection and this is not necessarily because of potential codon bias. We used divergence in ancestral repeats as a neutral reference to estimate the constraint on 4-fold degenerate sites of avian genes in a whole-genome approach. In the pairwise comparison of chicken and zebra finch, constraint was estimated at 24-32%. Based on three-species alignments of chicken, turkey, and zebra finch, lineage-specific estimates of constraint were 43%, 29%, and 24%, respectively. The finding of significant constraint at 4-fold degenerate sites from data on interspecific divergence was replicated in an analysis of intraspecific diversity in the chicken genome. These observations corroborate recent data from mammalian genomes and call for a reappraisal of the use of synonymous substitution rates as neutral standards in molecular evolutionary analysis, for example, in the use of the well-known d(N)/d(S) ratio and in inferences on positive selection. We show by simulations that the rate of false positives in the detection of positively selected genes and sites increases several-fold at the levels of constraint at 4-fold degenerate sites found in this study.
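
The effect on the d(N)/d(S) ratio can be illustrated with a back-of-the-envelope calculation. The sketch below is not the simulation described in the abstract. It assumes, as a simplification, that constraint C at synonymous sites reduces the observed d(S) to (1 - C) times the truly neutral rate, so that the observed ratio is inflated by a factor of 1/(1 - C). The function name and the example ratio of 0.8 are hypothetical.

```python
def inflated_omega(true_omega: float, constraint: float) -> float:
    """Observed dN/dS when synonymous sites are themselves under constraint.

    Assumption (a simplification, not taken from the paper):
    dS_observed = (1 - constraint) * dS_neutral,
    hence omega_observed = true_omega / (1 - constraint).
    """
    if not 0.0 <= constraint < 1.0:
        raise ValueError("constraint must be in [0, 1)")
    return true_omega / (1.0 - constraint)


# Range of constraint at 4-fold degenerate sites reported in the abstract.
for c in (0.24, 0.43):
    # Hypothetical gene whose true ratio (0.8) lies below the neutral value of 1.
    print(f"C = {c:.2f}: observed omega = {inflated_omega(0.8, c):.2f}")
```

Under this simplification, a gene with a true ratio of 0.8 would appear to exceed 1 at both ends of the reported range, which is the kind of false signal of positive selection the abstract warns about.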

Keyword
Chicken, turkey, zebra finch, 4-fold degenerate sites, purifying selection, nearly neutral theory, comparative genomics
National Category
Evolutionary Biology
Research subject
Biology with specialization in Molecular Biology
Identifiers
urn:nbn:se:uu:diva-159765 (URN); 10.1093/gbe/evr112 (DOI); 000301535100030 (ISI)
Available from: 2011-10-10 Created: 2011-10-10 Last updated: 2017-12-08 Bibliographically approved
5. ConDeTri: A content dependent read trimmer for Illumina data
2011 (English) In: PLoS ONE, ISSN 1932-6203, E-ISSN 1932-6203, Vol. 6, no 10, e26314. Article in journal (Refereed) Published
Abstract [en]

During the last few years, DNA and RNA sequencing have started to play an increasingly important role in biological and medical applications, especially due to the greater amount of sequencing data yielded by the new sequencing machines and the enormous decrease in sequencing costs. In particular, Illumina/Solexa sequencing has had an increasing impact on gathering data from model and non-model organisms. However, accurate and easy-to-use tools for quality filtering have not yet been established. We present ConDeTri, a method for content dependent read trimming for next-generation sequencing data using quality scores of each individual base. The main focus of the method is to remove sequencing errors from reads so that sequencing reads can be standardized. Another aspect of the method is to incorporate read trimming in next-generation sequencing data processing and analysis pipelines. It can process single-end and paired-end sequence data of arbitrary length and it is independent of sequencing coverage and user interaction. ConDeTri is able to trim and remove reads with low quality scores to save computational time and memory usage during de novo assemblies. Low coverage or large genome sequencing projects will especially gain from trimming reads. The method can easily be incorporated into preprocessing and analysis pipelines for Illumina data.

Availability and implementation:

Freely available on the web at http://code.google.com/p/condetri
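
To make the idea of trimming on per-base quality scores concrete, the sketch below removes low-quality bases from the 3' end of a read. It is a deliberately simple illustration and does not reproduce the ConDeTri algorithm. The function names, the thresholds (`min_q`, `min_len`) and the Phred+33 quality encoding are all assumptions made for this example.

```python
def phred_scores(quality: str, offset: int = 33) -> list[int]:
    """Decode an ASCII quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def trim_3prime(seq: str, quality: str, min_q: int = 25, min_len: int = 20):
    """Remove low-quality bases from the 3' end of a read.

    Returns (sequence, quality) after trimming, or None if the trimmed
    read is shorter than min_len. Thresholds are illustrative defaults.
    """
    if len(seq) != len(quality):
        raise ValueError("sequence and quality must have equal length")
    scores = phred_scores(quality)
    end = len(scores)
    # Walk back from the 3' end until a base meets the quality threshold.
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    if end < min_len:
        return None  # too little high-quality sequence left: discard the read
    return seq[:end], quality[:end]


# Hypothetical 26-base read: 22 high-quality bases followed by 4 poor ones.
read = "ACGT" * 6 + "AC"
qual = "I" * 22 + "##!!"
print(trim_3prime(read, qual))
```

In this example the last four bases fall below the threshold and are removed, while a read left with fewer than `min_len` bases would be discarded entirely.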

Keyword
Next Generation Sequencing, Software, Sequencing Errors
National Category
Bioinformatics and Systems Biology
Research subject
Bioinformatics
Identifiers
urn:nbn:se:uu:diva-159761 (URN); 10.1371/journal.pone.0026314 (DOI); 000296507500049 (ISI)
Available from: 2011-10-10 Created: 2011-10-10 Last updated: 2017-12-08 Bibliographically approved

Open Access in DiVA

fulltext (1966 kB), 1153 downloads
File information
File name: FULLTEXT01.pdf
File size: 1966 kB
Checksum (SHA-512): cdb31e9ece13b0cfcaa388fb10bb1770f66241fbf463949345c9d59882c807144f55a99b38839cede438fa004c4eb09553e2097458310d9172b00fdfbafad68e
Type: fulltext
Mimetype: application/pdf
By author/editor
Künstner, Axel