eRibo Pro Peak Differential Expression | Bioinformatics eBlog

A common end goal of an RNA-Seq experiment is to identify which genes have responded to a treatment. For example, has a newly developed drug increased the expression of a target, or has a knockdown worked to decrease expression? To answer these questions, we perform a type of analysis called differential expression (DE) analysis.

A DE analysis is a statistical procedure that identifies genes that are up- or downregulated between two or more conditions or samples. It involves comparing the expression level of each gene in one group of samples (e.g., disease samples) to its expression level in another (e.g., healthy samples) to identify genes that have changed across conditions. DE analysis can have a significant impact by identifying disease-associated genes (which can be used as potential drug development targets) and biomarkers that can be used for diagnosis, prognosis, or monitoring of disease progression.

DE analysis is typically performed using specialized software that maximizes our ability to identify differences when the number of replicates per condition is small (e.g., 2 or 3 replicates), while accounting for differences in library size and controlling the false discovery rate across the many tests conducted at once. At Eclipsebio, we use a powerful tool called DESeq2 to identify differentially expressed genes. One way that we use DESeq2 is with our eRibo service, where we can detect changes in ribosome-associated and total transcriptome counts between different conditions.
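To make the two corrections mentioned above more concrete, here is a minimal Python sketch of each: median-of-ratios size factors (the approach DESeq2 uses to account for library size) and the Benjamini-Hochberg adjustment (a standard way to control the false discovery rate). The function names and the toy counts are our own illustrations, not DESeq2's API, and the sketch leaves out many details of the real software.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of genes, each a list of raw counts per sample.
    Returns one scaling factor per sample.
    """
    n_samples = len(counts[0])
    # Keep genes with nonzero counts in every sample so the
    # geometric mean (computed on the log scale) is defined.
    usable = [row for row in counts if all(c > 0 for c in row)]
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - g for row, g in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Toy example: sample 2 was sequenced twice as deeply as sample 1.
toy_counts = [[10, 20], [100, 200], [5, 10], [0, 3]]
toy_factors = size_factors(toy_counts)
normalized = [[c / f for c, f in zip(row, toy_factors)] for row in toy_counts]
```

Dividing each sample's counts by its size factor puts the samples on a common scale, so a gene is not called "changed" just because one library was sequenced more deeply.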

DESeq2 uses information across all genes in the experiment to produce a robust estimate of the variability (dispersion) between samples for each gene, using a negative binomial model suited to read count data. It then uses these dispersions to estimate the standard error of each log2 fold change between conditions, and divides the log2 fold change by that standard error to calculate the statistic for a test called the “Wald” test. This test helps us determine whether our observed differences are likely real or just due to chance, and provides robust lists of DE genes to support answering specific scientific questions.
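As a rough illustration of how dispersion feeds into the Wald test, the Python sketch below computes a Wald statistic (a log2 fold change divided by its standard error) and a two-sided p-value from the normal distribution. The standard error formula is a simplified two-group approximation under a negative binomial model with equal library sizes; it is an assumption made for this example and is not how DESeq2 computes it internally, since DESeq2 also shrinks dispersion estimates across genes. All function names are hypothetical.

```python
import math

def log2_fold_change_se(mean_a, mean_b, n_a, n_b, dispersion):
    """Approximate standard error of a log2 fold change between two groups.

    Under a negative binomial model the variance of a count is
    mean + dispersion * mean**2, so the variance of a group's log mean
    is roughly (1 / mean + dispersion) / n. This is a simplification.
    """
    var_log_a = (1.0 / mean_a + dispersion) / n_a
    var_log_b = (1.0 / mean_b + dispersion) / n_b
    # Divide by ln(2) to convert from natural log to log2 units.
    return math.sqrt(var_log_a + var_log_b) / math.log(2)

def wald_test(log2_fold_change, standard_error):
    """Return the Wald z statistic and its two-sided normal p-value."""
    z = log2_fold_change / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Toy gene: mean count of 100 in control and 200 in treated, 3 replicates each.
lfc = math.log2(200 / 100)
se = log2_fold_change_se(100, 200, 3, 3, dispersion=0.1)
z, p_value = wald_test(lfc, se)
```

Note that a larger dispersion gives a larger standard error, so the same fold change becomes less significant for a noisier gene.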

The same framework that is used to identify differentially expressed genes can also be applied to different data modalities. For example, a similar analysis can be performed with eCLIP peaks to determine if a region has differential enrichment following a treatment. In the case of eCLIP, to account for the presence of an input, we compare the ratio of fold changes (immunoprecipitated over input, in each condition) rather than the observed counts in the immunoprecipitated libraries alone.
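The idea of comparing a ratio of fold changes can be sketched in a few lines of Python. This is only an illustration of the arithmetic, assuming counts that have already been normalized for library size; the function names and the pseudocount are hypothetical choices for this example, and a real analysis would fit a statistical model with replicates rather than compare single numbers.

```python
import math

def log2_enrichment(ip_count, input_count, pseudocount=1.0):
    """log2 fold change of IP over input for one peak in one condition."""
    return math.log2((ip_count + pseudocount) / (input_count + pseudocount))

def differential_enrichment(ip_treated, input_treated,
                            ip_control, input_control, pseudocount=1.0):
    """log2 ratio of fold changes: (IP/input, treated) over (IP/input, control)."""
    treated = log2_enrichment(ip_treated, input_treated, pseudocount)
    control = log2_enrichment(ip_control, input_control, pseudocount)
    return treated - control

# IP signal doubles, but so does the input (e.g., the transcript itself went up):
# the peak is not more enriched after treatment.
same_binding = differential_enrichment(80, 20, 40, 10, pseudocount=0.0)

# IP signal doubles while the input stays flat: enrichment really increased.
more_binding = differential_enrichment(80, 10, 40, 10, pseudocount=0.0)
```

In the first case the immunoprecipitated counts alone would suggest a change, while the ratio of fold changes shows that the increase is explained by the input.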

Creating the right framework for an accurate differential analysis can take a lot of effort. At Eclipsebio, we have experts with extensive experience in statistical methods to take the guesswork out of identifying differential genes or peaks. Contact us today to see how we can help you examine differentials in your experiment.
