Phylogenetics I, WS 2010/2011
Lecture with Exercises (Vorlesung mit Übung):
Instructor: Prof. Dr. Dirk Metzler
EES students who attend this course will practice applying the methods
described below by analysing datasets of Michael Schrödl, Ulrich Schliewen
or Michael Balke at the Bavarian State Collection of Zoology.
Time: EES Block III from 30.11.2010 to 23.12.2010
Lecture: Each Tuesday and each Thursday from 9 to 11 a.m. in room C00.013
Exercises: Each Tuesday from 11 a.m. to 12 noon and each Thursday from 3 p.m. to
5 p.m. in computer room C00.005
Contents
Data sets of DNA, RNA or protein sequences contain a lot of hidden information
about the history of evolution, about evolutionary processes and about the roles of
particular genes in evolutionary adaptation. It is a challenge to develop
methods to uncover this information. Methods that are based on explicit
models of evolutionary processes and on the application of statistical principles
(like likelihood maximization or Bayesian inference) are the most promising. Some of these methods,
however, can be very demanding, both computationally and intellectually. A
thorough understanding of the models and methods is crucial, not only for those
who aim to contribute to the further development of such methods but also for those who
want to apply these methods to their datasets and have to decide which method
to choose, how to set its optional parameters and how to interpret the outcome.
We discuss methods from computational statistics and their applications in
phylogenetic tree reconstruction. First we compare maximum-likelihood (ML)
methods with parsimony and distance-based methods. Then we turn to Bayesian
methods that are based on Markov chain Monte Carlo (MCMC) approaches such as the
Metropolis-Hastings algorithm and Gibbs sampling. Such methods make it possible
to sample phylogenies (approximately) according to their posterior probability,
i.e. conditioned on the given sequence data. Thus, it is also possible to
assess the uncertainty of the estimate.
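To give a feeling for how the Metropolis-Hastings algorithm samples from a posterior distribution, here is a minimal Python sketch. It is a toy example and not how any of the programs listed below are implemented: instead of a whole phylogeny it samples a single evolutionary distance d between two sequences under the Jukes-Cantor model. The data (100 differing sites out of 1000), the exponential prior with rate 10, the proposal step size and all function names are assumptions made for illustration only.

```python
import math
import random


def log_posterior(d, n_sites, n_diff, prior_rate):
    """Unnormalized log posterior of the distance d (toy Jukes-Cantor model)."""
    if d <= 0.0:
        return float("-inf")
    # Jukes-Cantor probability that a site differs after d substitutions per site
    p = 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))
    if p <= 0.0:
        return float("-inf")
    log_lik = n_diff * math.log(p) + (n_sites - n_diff) * math.log(1.0 - p)
    log_prior = math.log(prior_rate) - prior_rate * d  # assumed exponential prior
    return log_lik + log_prior


def metropolis_hastings(n_sites, n_diff, n_steps=6000, step_sd=0.02,
                        prior_rate=10.0, start=0.2, seed=1):
    """Return the chain of sampled distances (burn-in is not removed here)."""
    rng = random.Random(seed)
    d = start
    logp = log_posterior(d, n_sites, n_diff, prior_rate)
    samples = []
    for _ in range(n_steps):
        # Symmetric proposal: Gaussian step, reflected at zero
        proposal = abs(d + rng.gauss(0.0, step_sd))
        logp_new = log_posterior(proposal, n_sites, n_diff, prior_rate)
        # Accept with probability min(1, posterior ratio)
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            d, logp = proposal, logp_new
        samples.append(d)
    return samples


if __name__ == "__main__":
    chain = metropolis_hastings(n_sites=1000, n_diff=100)
    kept = sorted(chain[1000:])  # discard the first 1000 steps as burn-in
    mean = sum(kept) / len(kept)
    lower, upper = kept[int(0.025 * len(kept))], kept[int(0.975 * len(kept))]
    print("posterior mean:", mean)
    print("approx. 95% credible interval:", lower, upper)
```

The spread of the retained samples is what allows the uncertainty of the estimate to be assessed; in tree reconstruction the same idea is applied with proposals that also change the tree topology.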
Statistical methods are always based on probabilistic models for the origin of
the data. Therefore, we discuss models of sequence evolution
(Jukes-Cantor, PAM, F81, HKY, F84, GTR, Gamma-distributed rates, ...) and the
fundamentals of Markov processes that are necessary to understand these
models.
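As a small illustration of the simplest of these models, the following Python sketch uses the standard Jukes-Cantor formulas: after d expected substitutions per site, two sequences differ at a given site with probability 3/4 (1 - exp(-4d/3)), and inverting this relation turns an observed proportion of differing sites into a distance estimate. The function names and the example sequences are hypothetical and chosen only for this sketch.

```python
import math


def jc_p_diff(d):
    """Probability that a site differs after d expected substitutions per site."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def jc_distance(p):
    """Jukes-Cantor distance for an observed proportion p of differing sites."""
    if not 0.0 <= p < 0.75:
        raise ValueError("the Jukes-Cantor distance is only defined for 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, non-empty and of equal length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


if __name__ == "__main__":
    p = p_distance("ACGTACGT", "ACGTACGA")  # made-up example alignment
    print("observed proportion of differences:", p)
    print("Jukes-Cantor distance:", jc_distance(p))
```

The corrected distance is always at least as large as the observed proportion of differences, because the model accounts for multiple substitutions at the same site.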
Software:
PHYLIP,
Seq-Gen,
R with the ape package,
RAxML,
MrBayes,
BEAST,
BAli-Phy,
...
Language: English
Announcement in official LMU course overview
web page last updated: Dirk Metzler, October 5, 2010