The median-of-ratios method is used to estimate a size factor for each sample.
The size factor is used to normalize counts (per gene, per sample).
Normalized counts minimize the bias linked to library size.
By normalizing the counts, DESeq2 aims to ensure that differential expression reflects the factors under study and not differences in sequencing depth.
/!\ Gene length is not taken into account!
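
The median-of-ratios idea can be sketched in a few lines of plain Python. This is a minimal illustration, not DESeq2's own code: the function names `size_factors` and `normalize` and the toy count matrix are assumptions made for the example. It builds a per-gene pseudo-reference (the geometric mean across samples), takes the ratio of each sample to that reference, and uses the median ratio per sample as the size factor.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of genes, each a list of raw counts (one value per sample).
    Hypothetical helper written for illustration only.
    """
    n_samples = len(counts[0])
    # Genes with a zero count in any sample are skipped (log is undefined at 0).
    usable = [row for row in counts if all(c > 0 for c in row)]
    # Log of the geometric mean per gene: the pseudo-reference sample.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        # Log ratio of sample j to the pseudo-reference, gene by gene.
        log_ratios = [math.log(row[j]) - ref
                      for row, ref in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors

def normalize(counts, factors):
    """Divide every raw count by the size factor of its sample."""
    return [[c / s for c, s in zip(row, factors)] for row in counts]

# Toy data: sample 2 was sequenced exactly twice as deeply as sample 1.
counts = [
    [100, 200],
    [50, 100],
    [30, 60],
    [0, 10],   # contains a zero, so it does not contribute to the size factors
]
sf = size_factors(counts)
norm = normalize(counts, sf)
print(sf)
print(norm[0])
```

In this toy case the second size factor is twice the first, so the normalized counts of the first three genes become equal across the two samples, which is the intended effect of correcting for sequencing depth.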
### 2) Estimate dispersion
Purpose: estimate the variability between replicates <br/>
Get a dispersion estimate for each gene using maximum likelihood estimation <br/>
Fit a curve to the gene-wise dispersion estimates
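
To make the gene-wise step concrete, here is a simplified sketch of a maximum likelihood dispersion estimate for a single gene. It assumes a single condition (all replicates share one expected expression level), uses a plain grid search instead of a proper optimizer, and leaves out the curve-fitting step entirely. The function names, the grid bounds, and the toy counts are assumptions made for the example, not part of DESeq2.

```python
import math

def nb_log_likelihood(counts, means, alpha):
    """Negative Binomial log-likelihood with mean mu and dispersion alpha.

    Parameterized so that Var = mu + alpha * mu**2.
    """
    r = 1.0 / alpha
    total = 0.0
    for k, mu in zip(counts, means):
        total += (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
                  + r * math.log(r / (r + mu))
                  + k * math.log(mu / (r + mu)))
    return total

def genewise_dispersion(counts, size_factors, grid_size=200):
    """Grid-search MLE of the dispersion for one gene (illustrative only)."""
    # Mean of the normalized counts, assuming a single condition.
    q = sum(k / s for k, s in zip(counts, size_factors)) / len(counts)
    if q == 0:
        raise ValueError("gene has no counts; dispersion cannot be estimated")
    means = [s * q for s in size_factors]
    # Log-spaced candidate values from 1e-4 to 10 (arbitrary bounds).
    grid = [10 ** (-4 + 5 * t / (grid_size - 1)) for t in range(grid_size)]
    return max(grid, key=lambda a: nb_log_likelihood(counts, means, a))

sf = [1.0, 1.0, 1.0, 1.0]
tight = [100, 102, 98, 101]   # replicates agree closely
noisy = [20, 250, 90, 40]     # replicates vary a lot
a_tight = genewise_dispersion(tight, sf)
a_noisy = genewise_dispersion(noisy, sf)
print(a_tight, a_noisy)
```

The gene whose replicates disagree gets a much larger dispersion estimate than the gene whose replicates agree. With only a few replicates these per-gene estimates are noisy, which is why a curve is then fitted across all genes.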
### 3) Fit linear model
The differential expression analysis uses a generalized linear model of the form: <br/>
$K_{ij} \sim \mathrm{NB}(\mu_{ij}, \alpha_i)$ <br/>
$\mu_{ij} = s_j q_{ij}$ <br/>
$\log_2(q_{ij}) = x_{j.} \beta_i$ <br/>
where counts $K_{ij}$ for gene i, sample j are modeled using a Negative Binomial distribution with
fitted mean $\mu_{ij}$ and a gene-specific dispersion parameter $\alpha_i$. The fitted mean is composed of a
sample-specific size factor $s_j$ and a parameter $q_{ij}$ proportional to the expected true concentration of fragments for sample j. The coefficients $\beta_i$ give the log2 fold changes for gene i for each column of the model matrix X. <br/>
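
As a small worked example of the model above, the fitted mean for one gene can be computed from a design matrix, a coefficient vector, and the size factors. All values below are made up for illustration, and `fitted_means` is a hypothetical helper, not a DESeq2 function.

```python
def fitted_means(design, beta, size_factors):
    """Fitted mean per sample: size factor times 2 ** (design row . beta)."""
    means = []
    for x_j, s_j in zip(design, size_factors):
        # Linear predictor on the log2 scale.
        log2_q = sum(x * b for x, b in zip(x_j, beta))
        means.append(s_j * 2.0 ** log2_q)
    return means

# Two control samples and two treated samples.
# Columns of the design matrix: intercept, treatment indicator.
design = [[1, 0], [1, 0], [1, 1], [1, 1]]
# Intercept of 5 (baseline of 2**5 = 32) and a log2 fold change of 1.
beta = [5.0, 1.0]
size_factors = [1.0, 0.5, 1.0, 2.0]

mu = fitted_means(design, beta, size_factors)
print(mu)
```

A log2 fold change of 1 doubles the expected concentration in the treated samples, and each size factor then rescales that expectation to the sequencing depth of its sample.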
### 4) Wald test
H0: log2(FC) = 0 (no differential expression between the two groups) <br/>
With DESeq2, the Wald test is the default used for hypothesis testing when comparing two groups. The Wald test is a test of hypothesis usually performed on parameters that have been estimated by maximum likelihood. The Wald test is also a standard way to extract a P value from a regression fit.
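
The arithmetic behind a Wald test is short enough to sketch directly: the estimated log2 fold change is divided by its standard error, and the resulting z statistic is compared to a standard normal distribution. The function name `wald_test` and the input values are assumptions for illustration. In practice both the estimate and its standard error come from the fitted model.

```python
import math

def wald_test(log2_fc, standard_error):
    """Two-sided Wald test of H0: log2 fold change = 0 (illustrative helper)."""
    z = log2_fc / standard_error
    # Two-sided p-value under a standard normal: 2 * (1 - Phi(|z|)),
    # written with the complementary error function.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

z, p = wald_test(1.2, 0.4)   # made-up estimate and standard error
print(z, p)
```

The same fold change with a larger standard error gives a smaller z statistic and a larger p-value, so the test accounts for both the size of the effect and how precisely it was estimated.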