Statistical methods in genetics and bioinformatics (A): Pattern Recognition and biological data analysis
[Credits]
[Literature]
[Syllabus]
[Labs]
Lecturers: Matteo Pardo, Christine Steinhoff
Time: Thursday: 10:15-12:00
Timetable:
23.10.2008 10.15 - 11.45 Takustr.9 Lecture
30.10.2008 10.15 - 11.45 Takustr.9 Lecture
06.11.2008 10.15 - 11.45 Takustr.9 Lecture
13.11.2008 10.15 - 11.45 MPI Lecture
20.11.2008 10.15 - 11.45 MPI Lecture
26.11.2008 14.15 - 16.45 MPI Lab 1
27.11.2008 10.15 - 11.45 MPI Lecture
04.12.2008 10.15 - 11.45 MPI Lecture
05.12.2008 15.00 - 16.30 MPI Lecture
10.12.2008 14.15 - 16.45 MPI Lecture
11.12.2008 10.15 - 11.45 MPI Lecture
17.12.2008 14.15 - 16.45 MPI Lecture
18.12.2008 10.15 - 11.45 MPI Lecture
- Christmas Break -
19.01.2009 10.15 - 12.00 Lecture
19.01.2009 16.00 - 17.30 Lecture
28.01.2009 15.00 - 16.30 MPI Lab 2
29.01.2009 10.00 - 13.00 MPI Lab 3
30.01.2009 15.00 - 18.00 MPI Lab 4
23.02.2009 11.00 - 12.30 Lecture
23.02.2009 14.00 - 17.00 MPI Lab 5
Course Description: The course will consist of three blocks:
* General pattern recognition (~20h). We will mainly follow Duda-Hart-Stork’s book.
* Specific bioinformatics topics (~10h).
* Labs (15h)
Location: Takustr. 9, SR 053
From 13 November 2008: Seminar room, Max Planck Institute, Ihnestr. 73, tower 2
KVV
Web page
Contacts:
pardo@molgen.mpg.de
Credits
There will be a term project that will involve reading papers and applying pattern recognition techniques. The project will require a final report and a presentation at the end of the semester.
Possibly, there will be some homework exercises.
Students will earn 6 credit points for regular attendance, successful participation in the lab meetings, and the exam.
Literature
* R. O. Duda, P. E. Hart, D. G. Stork, Pattern Classification, 2nd edition, John Wiley & Sons, Inc., 2001 (Main textbook).
* A. Webb, Statistical Pattern Recognition, 2nd edition, John Wiley & Sons, Inc., 2002.
* C. M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
* T. Hastie, R. Tibshirani, J. Friedman, The Elements of Statistical Learning, Springer, 2003.
* Hahne, F., Huber, W., Gentleman, R., Falcon, S. Bioconductor Case Studies. Series: Use R. Springer, 2008
We thank Selim Aksoy (Bilkent University, Ankara, Turkey); Tim Beißbarth (DKFZ, Heidelberg, Germany) and colleagues from NGFN; and Rainer Spang (University of Regensburg, Germany) for their slides.
Syllabus
* Introduction to Pattern Recognition (30.10.2008)
Slides (pdf)
* Experimental platforms for high-throughput genome-wide investigation and microarray preprocessing (06.11.2008 + 20.11.2008)
Slides (pdf)
* Bayesian Decision Theory (13.11.2008 + 27.11.2008)
Slides (pdf)
* Second generation sequencing (20.11.2008)
Slides (pdf)
* Differential gene expression and multiple testing (04.12.2008)
Slides (differential expr) (pdf)
Slides (multiple testing) (pdf)
* Parametric Models (05. + 10. + 11.12.2008)
– Maximum-likelihood estimation
– Bayesian estimation
Slides (pdf)
– Sufficient Statistic
– Expectation-Maximization and mixture density estimation
(Link)
* Discriminant analysis (17.12.2008)
– Linear discriminant functions
(Link)
* Combining classifiers (18.12.2008)
– Bagging, Boosting, Random forests
(Link)
* General Strategies for the Analysis of [ChIP,RNA]-Seq data by Hugues Richard (19.01.2009)
(Link)
* Feature Selection (19.01.2009)
(Link)
* Algorithm-Independent Learning Issues (23.02.2009)
– Performance assessment
– Classifier comparison
– Bias-Variance decomposition
(Link)
Labs
* Lab 1: High-throughput data analysis (26.11.2008)
Preprocessing (quality control, normalization and experimental design)
Exercises
Solexa data
* Lab 2+3: Classification (28.01.2009 and 29.01.2009)
Exercises
* Lab 4: [ChIP,RNA]-Seq data analysis by Hugues Richard (30.01.2009)
Exercises
Data
* Lab 5: Exercises (23.02.2009)
Exercises
Matteo Pardo, Christine Steinhoff, 2008