Introduction to RNA Sequencing Analysis Summer 2019
Instructor: William L. Kath, Tech Room M460, x1-8784

Class times: MW 10:00-11:30 a.m., Tech M416.

Office hours: TBA.

Textbook: There is no required textbook.

References and online notes will be posted here or on the Notes page.
New postings will be indicated on the Notices page.

A starting point is the book "Computational Biology: Unix/Linux Data Processing and Programming", available electronically through the Northwestern Library.

Assignments: Grades in the course will be based on computational projects.

Course Description:

This course gives an introduction to the theory and practice of analyzing high-throughput RNA sequencing data. Through lectures and hands-on exercises, it will cover:

  1. The format of raw sequencing data and how to work with it
  2. Aligning reads to a reference genome
  3. The format of aligned SAM/BAM files and how to work with them
  4. Different ways to perform read-based gene counting
  5. How to visually explore reads and read counts; variance shrinkage and principal components
  6. The theory and practice of differential expression analysis

Module 1: Files and software tools
This module will discuss the formats of the various files that arise in high-throughput sequencing, and the Unix commands and other software tools most useful for downloading data and doing the analysis.
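As a rough illustration of what working with a raw sequencing file involves, the sketch below reads records from a FASTQ stream. It assumes the raw data are in the common four-line FASTQ layout (header, sequence, separator, quality), which this page does not state explicitly; the names `FastqRecord` and `read_fastq` are made up for this example and are not part of any course software.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    name: str       # read identifier (header line without the leading '@')
    sequence: str   # base calls
    quality: str    # per-base quality string, same length as the sequence


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a FASTQ stream, assuming four lines per record."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # skip blank lines between or after records
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline()
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError("malformed FASTQ record")
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        yield FastqRecord(header[1:].strip(), sequence, quality)


# Two toy reads, held in memory instead of in a file.
example = "@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!!5I\n"
records = list(read_fastq(io.StringIO(example)))
```

For a real file, the same function could be given an open file handle, or a `gzip.open(path, "rt")` handle for compressed data.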

Module 2: Aligning and counting reads
This module will cover some of the theory of aligning reads to reference genomes or of doing pseudo-alignments. It will also cover viewing reads, assessing read quality, and the expected probability distribution for read counts.
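One common way to assess read quality is to decode the per-base quality characters into Phred scores. The sketch below assumes the Phred+33 encoding (an assumption; older data may use a different offset) and uses the standard relation between a Phred score Q and the base-call error probability, p = 10^(-Q/10). The function names are illustrative only.

```python
from typing import List


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Convert a quality string to integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def error_probabilities(quality: str, offset: int = 33) -> List[float]:
    """Convert a quality string to per-base error probabilities."""
    return [10 ** (-q / 10) for q in phred_scores(quality, offset)]


def mean_error_probability(quality: str, offset: int = 33) -> float:
    """Average error probability over all bases of one read."""
    probs = error_probabilities(quality, offset)
    return sum(probs) / len(probs)


scores = phred_scores("!5I")          # '!' is the lowest Phred+33 character
probs = error_probabilities("!5I")
```

Averaging the error probabilities, rather than the Phred scores themselves, is one reasonable design choice here because the scores are on a logarithmic scale.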

Module 3: Normalizing and transforming read counts
This module will cover how to normalize counts for sequencing depth differences, the difference between counts and TPM ("transcripts per million"), log2 transformation, variance reduction, and principal components analysis.
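The difference between depth-normalized counts and TPM can be sketched in a few lines. The example below is a minimal illustration, not the course's method: it assumes raw counts and transcript lengths are available as plain lists for a single sample with a nonzero total count, and the pseudocount of 1 in the log2 transformation is one common but arbitrary choice. All function names are hypothetical.

```python
import math
from typing import List


def counts_per_million(counts: List[float]) -> List[float]:
    """Scale raw counts so that they sum to one million (depth only)."""
    total = sum(counts)
    return [c * 1e6 / total for c in counts]


def transcripts_per_million(counts: List[float],
                            lengths_bp: List[float]) -> List[float]:
    """Divide counts by transcript length first, then scale to one million."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    scale = sum(rates)
    return [r * 1e6 / scale for r in rates]


def log2_transform(values: List[float], pseudocount: float = 1.0) -> List[float]:
    """Apply log2(x + pseudocount) so that zero counts stay finite."""
    return [math.log2(v + pseudocount) for v in values]


# Toy sample: counts are proportional to length, so the three transcripts
# have different counts per million but identical TPM values.
counts = [100, 200, 700]
lengths = [1000, 2000, 7000]
cpm = counts_per_million(counts)
tpm = transcripts_per_million(counts, lengths)
logged = log2_transform([0, 1, 3])
```

In this toy sample the longer transcripts collect more reads simply because they are longer; TPM removes that length effect, while counts per million corrects only for sequencing depth.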

Module 4: Differential expression analysis
This module will cover how to use DESeq2 to identify transcripts that are differentially expressed between two conditions, the theory behind the analysis, and how to visualize the results.
