From Reads to Results: Reference vs. De Novo RNA-Seq
The Ultimate Guide to Mastering Transcriptome Assembly and Differential Gene Expression
Course Description
In the era of big-data genomics, knowing how to process Next-Generation Sequencing (NGS) data is a critical skill for any life scientist. This course provides an end-to-end deep dive into RNA-Seq analysis, comparing the reference-based workflow, which maps reads to an existing genome, with the more demanding "from-scratch" de novo assembly approach. You will learn to work at the command line to perform quality control, read mapping, and transcript reconstruction. Beyond traditional pipelines, the course explores how AI and Machine Learning (ML) are being applied to gene expression estimation and isoform discovery. By the end of this course, you will be able to turn millions of raw FASTQ reads into statistically supported results and professional-grade visualizations, ready for publication or clinical application.
What You'll Learn
The fundamental differences between Reference-Guided and De Novo workflows.
How to perform rigorous Quality Control (QC) and adapter trimming on raw reads.
Mastering alignment tools like HISAT2 and STAR for model organisms.
Assembling transcriptomes for non-model species using Trinity.
Utilizing AI algorithms for feature selection and batch effect removal.
Conducting Differential Gene Expression (DGE) analysis with DESeq2 and edgeR.
Functional annotation and Gene Ontology (GO) enrichment.
Curriculum
Module 1: Foundations of Transcriptomics & Raw Read Quality Control
Core principles of RNA-Seq experimental design: single-end vs. paired-end sequencing strategies and sequencing depth calculation.
Navigating raw data files: understanding the four-line record structure of the FASTQ format and Phred quality scores.
Quality assessment execution: running FastQC to detect sequence bias, adapter contamination, and low-quality bases.
Raw data cleaning: automated adapter and quality trimming and read filtering with Trimmomatic or Cutadapt.
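As a small illustration of the FASTQ and Phred topics in this module, the sketch below parses four-line FASTQ records and decodes quality strings. It assumes the Phred+33 encoding used by current Illumina output; the two reads in `FASTQ_TEXT` and the function names are made up for the example and are not taken from the course material.

```python
from statistics import mean

# Hypothetical two-record FASTQ text; real files hold millions of records.
FASTQ_TEXT = """@read1
ACGTACGT
+
IIIIIIII
@read2
ACGTNNGT
+
III##5II
"""


def parse_fastq(text):
    """Yield (read_id, sequence, quality_string) from 4-line FASTQ records."""
    lines = text.strip().splitlines()
    for i in range(0, len(lines), 4):
        # Line 1: @identifier, line 2: bases, line 3: '+', line 4: qualities.
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"Malformed FASTQ record starting at line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"Sequence/quality length mismatch in {header}")
        yield header[1:], seq, qual


def phred_scores(qual, offset=33):
    """Decode a quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in qual]


def error_probability(q):
    """Convert a Phred score Q into a base-call error probability 10^(-Q/10)."""
    return 10 ** (-q / 10)


records = list(parse_fastq(FASTQ_TEXT))
for read_id, seq, qual in records:
    scores = phred_scores(qual)
    print(read_id, seq, scores, "mean Q =", round(mean(scores), 1))
```

A Phred score of 20 corresponds to a 1-in-100 chance that the base call is wrong, and 30 to 1 in 1,000, which is why trimming tools commonly cut read ends that fall below a chosen score.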
Module 2: Reference-Guided RNA-Seq Workflow (Model Organisms)
Understanding splice-aware mapping logic: how gene structure (exons, introns, and untranslated regions) shapes read alignment.
Index creation: preparing reference genomes and gene annotation files (GTF/GFF3) for alignment.
Alignment execution: running splice-aware mappers such as HISAT2, STAR, or the older TopHat2 to generate alignment files.
Alignment file management: using Samtools to convert SAM alignments into compressed BAM files, then sort and index them.
Feature quantification: applying featureCounts or HTSeq-count to generate precise, gene-level raw read counts.
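To make the idea of splice-aware mapping concrete, here is a minimal sketch of how a spliced alignment shows up in a SAM record. In the SAM format, the CIGAR operation `N` marks a skipped reference region, which for RNA-Seq reads usually corresponds to an intron. The function name and example coordinates below are illustrative assumptions, not output from any specific aligner.

```python
import re

# One CIGAR element: a length followed by an operation code from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

# Operations that advance the position on the reference sequence.
REF_CONSUMING = set("MDN=X")


def introns_from_cigar(pos, cigar):
    """Return 1-based inclusive (start, end) reference intervals of 'N' gaps.

    pos   -- 1-based leftmost mapping position (the SAM POS field)
    cigar -- CIGAR string, e.g. '50M1000N50M'
    """
    if not re.fullmatch(r"(\d+[MIDNSHP=X])+", cigar):
        raise ValueError(f"Unsupported or malformed CIGAR string: {cigar!r}")
    introns = []
    ref = pos
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            # The skipped region spans 'length' reference bases starting here.
            introns.append((ref, ref + length - 1))
        if op in REF_CONSUMING:
            ref += length
    return introns


# A hypothetical 100 bp read split across two exons by a 1,000 bp intron.
print(introns_from_cigar(100, "50M1000N50M"))
# An unspliced read contains no 'N' operation.
print(introns_from_cigar(1, "76M"))
```

A mapper that is not splice-aware would have to treat the same read as a long deletion or fail to align it, which is why RNA-Seq reads are mapped to a genome with tools built to handle such gaps.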
Module 3: De Novo Transcriptome Assembly Workflow (Non-Model Organisms)
The computational challenge of reference-free reconstruction: introducing the mathematics of short-read assembly through de Bruijn graphs.
In silico read normalization: down-sampling massive datasets to optimize high-performance cluster compute times.
Complete de novo execution: assembling transcript contigs with the Trinity software suite.
Post-assembly quality control: measuring contig metrics such as N50 and the expression-aware ExN50 (E90) statistics, and assessing completeness with BUSCO.
Functional transcript annotation: predicting open reading frames (ORFs) with TransDecoder and annotating transcripts with the Trinotate suite.
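Two ideas from this module are easy to sketch in a few lines: building a de Bruijn graph from k-mers and computing the N50 of an assembly. The code below is a toy illustration with invented reads and contig lengths; it is not how Trinity is implemented, and the function names are assumptions made for this example.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer prefix to the set of (k-1)-mer suffixes it precedes.

    Every k-mer in a read contributes one edge: prefix -> suffix.
    """
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


def n50(lengths):
    """Return the N50: the length L such that contigs of length >= L
    together cover at least half of the total assembled bases."""
    if not lengths:
        raise ValueError("N50 is undefined for an empty assembly")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Two overlapping toy reads; walking the edges spells out 'ACGTA'.
print(de_bruijn_graph(["ACGT", "CGTA"], k=3))

# Hypothetical contig lengths in base pairs.
print(n50([100, 200, 300, 400, 500]))  # -> 400
```

For transcriptomes, N50 should be read with care, since a higher value does not necessarily mean a better assembly; this is the motivation for pairing it with expression-aware and completeness metrics.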
Module 4: Downstream Statistical Analysis & Biological Insights
The mechanics of normalization: understanding how FPKM and TPM values differ from raw read counts.
Differential Gene Expression (DGE): building statistical pipelines in R with the DESeq2 or edgeR packages.
Data visualization workflows: rendering publication-quality volcano plots, hierarchical clustering heatmaps, and PCA plots.
Biological enrichment: running Gene Ontology (GO) term and KEGG pathway enrichment analyses to interpret the results biologically.
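The difference between raw counts and length-normalized units can be shown with a short calculation. The sketch below uses the standard FPKM and TPM definitions on three hypothetical genes; the counts, gene lengths, and function names are invented for illustration.

```python
def fpkm(counts, lengths_bp):
    """Fragments per kilobase of transcript per million mapped fragments.

    FPKM_i = count_i * 1e9 / (total_count * length_i)
    """
    total = sum(counts)
    return [c * 1e9 / (total * length) for c, length in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """Transcripts per million.

    First divide each count by gene length, then rescale so that the
    values in a sample sum to one million.
    """
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    rate_sum = sum(rates)
    return [r / rate_sum * 1e6 for r in rates]


# Hypothetical genes: longer genes collect proportionally more reads.
counts = [100, 200, 300]
lengths_bp = [1000, 2000, 3000]

print("raw counts:", counts)
print("FPKM:", [round(v, 2) for v in fpkm(counts, lengths_bp)])
print("TPM:", [round(v, 2) for v in tpm(counts, lengths_bp)])
print("TPM total:", round(sum(tpm(counts, lengths_bp))))
```

In this example the raw counts differ threefold, yet all three genes have the same FPKM and TPM because the count differences are explained entirely by gene length. Note that DESeq2 and edgeR take raw counts as input and apply their own normalization, so FPKM or TPM values are generally used for reporting and visualization rather than for the statistical test itself.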