Essential skills in the analysis of high-throughput data
20.06.2017 - 07.07.2017
Prof. Dr. Jochen Wolf
Dr. Saurabh Pophaly
The completion of the first human genome draft sequence at the turn of the century heralded the post-genomic era. Yet the Human Genome Project could only be accomplished by an international consortium at a cost of several billion dollars, which limited genomic approaches to a few individuals from selected model organisms. High-throughput nano-sequencing (next-generation sequencing, NGS), which entered the stage only ten years ago, changed the game and revolutionized many areas of biology. Progress in NGS technologies is analogous to the semiconductor revolution: it involves massive miniaturization and parallelization of individual sequencing reactions, thereby dramatically reducing costs and increasing throughput. This not only enabled the sequencing of the genomes of several new species, but also catalyzed the development of detailed catalogs of variation within and between species, among communities (metagenomics) or within individuals (e.g. single-cell sequencing). NGS-based deep expression quantification via RNA-Seq, genome occupancy assays via ChIP-Seq and NGS variants that detect epigenetic DNA modifications are closing the knowledge gap between genotype and phenotype. As a consequence, NGS is permeating nearly all areas of biology and medicine. This has created a demand, in academia and industry alike, for individuals trained in handling, processing and assimilating the resulting flood of data.
This master-level course will help you comprehend basic terminology, navigate NGS-based analyses, understand the available literature and design your own studies and analyses. During the course you will get hands-on experience that will prepare you for handling large genomics datasets.
In brief, you will:
- get insights into current technologies generating massively parallel genome and transcriptome data.
- obtain basic skills in navigating UNIX-based computer clusters.
- familiarize yourself with established data formats (.fastq, .bam, .vcf files).
- get hands-on experience with basic components of common bioinformatics analysis pipelines including quality assessment of raw data, genome assembly, read mapping, variant calling and genotyping.
This course consists mainly of hands-on problem solving accompanied by lectures, and is complemented by a final project that you conduct independently with guidance from the instructors. A final report is required to pass the course. Attendance is compulsory.
Please take the online course called Learn the command line. Take a screenshot of the page when you finish the course. This screenshot, showing that you have completed the course, is a mandatory prerequisite.
Register for the course on LMU Moodle here.
- saurabh AT bio.lmu.de
- j.wolf AT bio.lmu.de
6 ECTS, 6 SWS
Every week Tuesday to Friday, 10:00-17:00
To Be Decided
Students will be given genomics data analysis projects in the last week. Grading will be based on the resulting final reports.
Laptops running Linux will be provided. You will use these to log in to the computer cluster where all analyses will be performed.