Verified commit faa9be07 authored by Laurent Modolo

update dea

parent 52a1d388
Pipeline #350 passed
...@@ -26,7 +26,7 @@ classoption: aspectratio=169 ...@@ -26,7 +26,7 @@ classoption: aspectratio=169
- Differential expression analysis between groups - Differential expression analysis between groups
- Regression analysis - Regression analysis
- Multiple testing - Multiple testing
- Multivariate Differential expression analysis - Multivariate differential expression analysis
# Hypothesis testing # Hypothesis testing
...@@ -38,7 +38,7 @@ classoption: aspectratio=169 ...@@ -38,7 +38,7 @@ classoption: aspectratio=169
We reject the hypothesis at risk $\alpha$, the probability of rejecting the null hypothesis when it is true. We reject the hypothesis at risk $\alpha$, the probability of rejecting the null hypothesis when it is true.
### $p$-value ### $p$-value
The $p$-value is the probability to observe a value as or more extreme under the null hypothesis model. the $p$-value is the probability to observe a value as or more extreme under the null hypothesis model.
## Hypothesis testing ## Hypothesis testing
...@@ -162,7 +162,7 @@ receiver operating characteristic ...@@ -162,7 +162,7 @@ receiver operating characteristic
\end{tikzpicture} \end{tikzpicture}
\end{center} \end{center}
\column{0.5\textwidth} \column{0.5\textwidth}
For a given gene $x_i$ we can test: For a given gene $x_i$ we can test
\vspace{1em} \vspace{1em}
\begin{itemize} \begin{itemize}
\item $H_0$: $E\left(x_i\right) = E\left(x_{i'}\right)$ \item $H_0$: $E\left(x_i\right) = E\left(x_{i'}\right)$
...@@ -206,7 +206,7 @@ $P(X = x)$ for $\mathcal{NB}(\lambda, \alpha = 1)$ ...@@ -206,7 +206,7 @@ $P(X = x)$ for $\mathcal{NB}(\lambda, \alpha = 1)$
\includegraphics[width=0.7\textwidth]{img/NB_sigma_1.png} \includegraphics[width=0.7\textwidth]{img/NB_sigma_1.png}
\end{center} \end{center}
## Non-parametric approaches ## Nonparametric approaches
### We don't try to model the data distribution ### We don't try to model the data distribution
...@@ -214,16 +214,16 @@ Instead we work with: ...@@ -214,16 +214,16 @@ Instead we work with:
- ranks of the values - ranks of the values
- the sign of the difference between two groups (Wilcoxon) - the sign of the difference between two groups (Wilcoxon)
- the distribution of differances - the distribution of differences
If we know the distribution the parametric approach is often more powerfull If we know the distribution, the parametric approach is often more powerful.
### Often limited to the 2 groups setting ### Often limited to the 2 groups setting
## Wilcoxon rank sum test ## Wilcoxon rank sum test
### $H_0$: the median are equal ### $H_0$: the medians are equal
\begin{center} \begin{center}
\href{https://www.nature.com/articles/s41467-021-27464-5}{ \href{https://www.nature.com/articles/s41467-021-27464-5}{
...@@ -233,7 +233,7 @@ If we know the distribution the parametric approach is often more powerfull ...@@ -233,7 +233,7 @@ If we know the distribution the parametric approach is often more powerfull
## WaddR ## WaddR
### Base on 2-Wasserstein distance ### Based on 2-Wasserstein distance
\begin{center} \begin{center}
\href{https://pubmed.ncbi.nlm.nih.gov/33792651/}{ \href{https://pubmed.ncbi.nlm.nih.gov/33792651/}{
...@@ -241,7 +241,7 @@ If we know the distribution the parametric approach is often more powerfull ...@@ -241,7 +241,7 @@ If we know the distribution the parametric approach is often more powerfull
} }
\end{center} \end{center}
## Model based approaches ## Model-based approaches
\begin{center} \begin{center}
\begin{columns} \begin{columns}
...@@ -282,14 +282,14 @@ X \sim \pi \delta_0 + \left(1 - \pi\right) \mathcal{NB}(\lambda, \alpha) ...@@ -282,14 +282,14 @@ X \sim \pi \delta_0 + \left(1 - \pi\right) \mathcal{NB}(\lambda, \alpha)
\end{center} \end{center}
## Model based approaches ## Model-based approaches
### NB distributed counts with excess of zeros ### NB distributed counts with excess of zeros
\begin{center} \begin{center}
\includegraphics[width=0.8\textwidth]{img/ziNB_1} \includegraphics[width=0.8\textwidth]{img/ziNB_1}
\end{center} \end{center}
## Model based approaches ## Model-based approaches
### Mixture of two NB distributions ### Mixture of two NB distributions
...@@ -297,9 +297,32 @@ X \sim \pi \delta_0 + \left(1 - \pi\right) \mathcal{NB}(\lambda, \alpha) ...@@ -297,9 +297,32 @@ X \sim \pi \delta_0 + \left(1 - \pi\right) \mathcal{NB}(\lambda, \alpha)
\includegraphics[width=0.8\textwidth]{img/ziNB_2} \includegraphics[width=0.8\textwidth]{img/ziNB_2}
\end{center} \end{center}
## Model based approaches
### $y = \beta_0 + \beta_1 x$ ## Model-based approaches
### GLM framework
\[
X_i \sim \mathcal{NB}(\lambda, \alpha)
\]
\[E(X_i|\mathbf{Y}) = \boldsymbol{\mu}_i = g^{-1}(\mathbf{Y}\boldsymbol{\beta})\]
with:
\begin{itemize}
\item $\boldsymbol{\mu}_i$ the mean of the gene $i$ distribution
\item $g$ is the link function
\item $\beta$ the unknown parameters of the model
\end{itemize}
\[E(X_i|\mathbf{Y}) = \boldsymbol{\mu}_i = g^{-1}(Y_1 \beta_1 + \dots + Y_n \beta_n)\]
### We can also model the variance as a function of the mean
\[ \operatorname{Var}(X_i|\mathbf{Y}) = \operatorname{V}(\boldsymbol{\mu}_i) = \operatorname{V}(g^{-1}(\mathbf{Y}\boldsymbol{\beta})).\]
## Model-based approaches
### $\boldsymbol{\mu}_i = \beta_0 + \beta_1 y_1$
\begin{center} \begin{center}
\includegraphics[width=0.7\textwidth]{img/lm_2_groups_b0_3_b1_05.png} \includegraphics[width=0.7\textwidth]{img/lm_2_groups_b0_3_b1_05.png}
...@@ -307,19 +330,19 @@ X \sim \pi \delta_0 + \left(1 - \pi\right) \mathcal{NB}(\lambda, \alpha) ...@@ -307,19 +330,19 @@ X \sim \pi \delta_0 + \left(1 - \pi\right) \mathcal{NB}(\lambda, \alpha)
$\beta_0 = 3$, $\beta_1 = 0.5$ $\beta_0 = 3$, $\beta_1 = 0.5$
## Model based approaches ## Model-based approaches
### $y = \beta_0 + \beta_1 x$ ### $\boldsymbol{\mu}_i = \beta_0 + \beta_1 y_1$
### Wald test: ### Wald test:
\[H_0: \beta_1 = 0\] \[H_0: \beta_1 = 0\]
### Likelihood ratio test (LRT) ### Likelihood ratio test (LRT)
\[H_0: L\left(y = \beta_0\right) = L\left(y = \beta_0 + \beta_1 x\right)\] \[H_0: L\left(\boldsymbol{\mu}_i = \beta_0\right) = L\left(\boldsymbol{\mu}_i = \beta_0 + \beta_1 y_1\right)\]
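As a rough illustration of the likelihood ratio test stated above, the sketch below compares a single-mean model ($\beta_1 = 0$) with a two-group model for one gene. It is only a sketch under simplifying assumptions: counts are treated as Poisson rather than NB (no dispersion parameter), the covariate $y_1$ is a binary group indicator so the fitted means are the group means, and the names `poisson_loglik` and `lrt_two_groups` are hypothetical helpers, not part of any package.

```python
import math


def poisson_loglik(counts, mu):
    """Poisson log-likelihood of the counts for a common mean mu (mu > 0)."""
    return sum(x * math.log(mu) - mu - math.lgamma(x + 1) for x in counts)


def lrt_two_groups(group_a, group_b):
    """Likelihood ratio test of H0: beta_1 = 0 for a binary group covariate.

    Assumption: Poisson counts stand in for the NB model of the slides,
    and every group contains at least one non-zero count.
    """
    pooled = list(group_a) + list(group_b)
    mu_0 = sum(pooled) / len(pooled)      # null model: one mean (beta_0 only)
    mu_a = sum(group_a) / len(group_a)    # full model: one mean per group
    mu_b = sum(group_b) / len(group_b)
    ll_null = poisson_loglik(pooled, mu_0)
    ll_full = poisson_loglik(group_a, mu_a) + poisson_loglik(group_b, mu_b)
    stat = max(2.0 * (ll_full - ll_null), 0.0)
    # Survival function of a chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


stat_same, p_same = lrt_two_groups([3, 4, 5, 4], [3, 4, 5, 4])
stat_diff, p_diff = lrt_two_groups([3, 4, 5, 4], [20, 22, 19, 21])
```

With identical groups the statistic is zero and the $p$-value is one; with clearly separated groups the $p$-value becomes very small. A real analysis would fit the NB GLM with an estimated dispersion instead.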
## Model based approaches ## Model-based approaches
### $y = \beta_0 + \beta_1 x$ ### $\boldsymbol{\mu}_i = \beta_0 + \beta_1 y_1$
\begin{center} \begin{center}
\includegraphics[width=0.7\textwidth]{img/lm_b0_3_b1_05.png} \includegraphics[width=0.7\textwidth]{img/lm_b0_3_b1_05.png}
...@@ -327,62 +350,83 @@ $\beta_0 = 3$, $\beta_1 = 0.5$ ...@@ -327,62 +350,83 @@ $\beta_0 = 3$, $\beta_1 = 0.5$
$\beta_0 = 3$, $\beta_1 = 0.5$ $\beta_0 = 3$, $\beta_1 = 0.5$
## Model based approaches ## Model-based approaches
### $y = \beta_0 + \beta_1 x$ ### $\boldsymbol{\mu}_i = \beta_0 + \beta_1 y_1$
\begin{center} \begin{center}
\href{https://cole-trapnell-lab.github.io/monocle3/}{ \href{https://cole-trapnell-lab.github.io/monocle3/}{
\includegraphics[width=0.7\textwidth]{img/deg_pseudotime.png} \includegraphics[width=0.6\textwidth]{img/deg_pseudotime.png}
} }
\end{center} \end{center}
## Model based approaches ## Model-based approaches
### $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$ ### $\boldsymbol{\mu}_i = \beta_0 + \beta_1 y_1 + \beta_2 y_2$
\begin{center} \begin{center}
\includegraphics[width=0.7\textwidth]{img/lm_2_groups_b0_b0_3_b1_05.png} \includegraphics[width=0.7\textwidth]{img/lm_2_groups_b0_b0_3_b1_05.png}
\end{center} \end{center}
$\beta_0 = 3$, $\beta_1 = 0.5$ $\beta_2 = 5$ $\beta_0 = 3$, $\beta_1 = 0.5$, $\beta_2 = 5$
## Model-based approaches
### $\boldsymbol{\mu}_i = \beta_0 + \beta_1 y_1 + \beta_2 y_2$
\begin{center}
\href{https://www.sciencedirect.com/science/article/pii/S2211124721005192}{
\includegraphics[width=0.9\textwidth]{img/deg_time_group.png}
}
\end{center}
## Model based approaches ## Model-based approaches
### $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$ ### $\boldsymbol{\mu}_i = \beta_0 + \beta_1 y_1 + \beta_2 y_2 + \beta_3 y_1 y_2$
\begin{center} \begin{center}
\includegraphics[width=0.7\textwidth]{img/lm_2_groups_b0_b0_3_b1_05_interaction.png} \includegraphics[width=0.7\textwidth]{img/lm_2_groups_b0_b0_3_b1_05_interaction.png}
\end{center} \end{center}
$\beta_0 = 3$, $\beta_1 = 0.5$ $\beta_2 = 5$ $\beta_0 = 3$, $\beta_1 = 0.5$, $\beta_2 = 5$, $\beta_3 = -0.4$
## Model based approaches ## Model-based approaches
### $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$ ### $\boldsymbol{\mu}_i = \beta_0 + \beta_1 y_1 + \beta_2 y_2 + \beta_3 y_1 y_2$
\begin{center} \begin{center}
\includegraphics[width=0.7\textwidth]{img/lm_2_groups_2_factors_b0_b0_3_b1_05_interaction.png} \includegraphics[width=0.7\textwidth]{img/lm_2_groups_2_factors_b0_b0_3_b1_05_interaction.png}
\end{center} \end{center}
$\beta_0 = 3$, $\beta_1 = 0.5$ $\beta_2 = 5$ $\beta_0 = 3$, $\beta_1 = 0.5$, $\beta_2 = 5$, $\beta_3 = -0.4$
## Model based approaches ## Model-based approaches
### $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$ ### $\boldsymbol{\mu}_i = \beta_0 + \beta_1 y_1 + \beta_2 y_2$
\begin{center} \begin{center}
href{doi: 10.1093/nar/gky675}{ \href{doi: 10.1093/nar/gky675}{
\includegraphics[width=0.6\textwidth]{img/deg_time_group.png} \includegraphics[width=0.35\textwidth]{img/deg_time_group_inter.png}
}
\end{center}
## Model-based approaches
### $\boldsymbol{\mu}_i = \beta_0 + \beta_1 y_1 + \beta_2 Z$
$Z \sim \mathcal{N}(\mu_z, \sigma_z)$
\begin{center}
\href{https://www.sciencedirect.com/science/article/pii/S2211124721005192}{
\includegraphics[width=0.35\textwidth]{img/deg_time_mixed.png}
} }
\end{center} \end{center}
# Multiple hypotheses testing # Multiple hypotheses testing
## Multiple hypotheses problem ## Multiple hypothesis problem
\begin{center} \begin{center}
\only<1>{\includegraphics[width=10cm]{img/dnorm_abs}\\[-2.5em]} \only<1>{\includegraphics[width=10cm]{img/pval_2_0.05}}
\only<1>{\includegraphics[width=10cm]{img/pval_alpha}}
\only<2>{\includegraphics[width=10cm]{img/pval_alpha_random_H0_1} \only<2>{\includegraphics[width=10cm]{img/pval_alpha_random_H0_1}
\begin{center} \begin{center}
n = 10 n = 10
...@@ -426,7 +470,7 @@ $\beta_0 = 3$, $\beta_1 = 0.5$ $\beta_2 = 5$ ...@@ -426,7 +470,7 @@ $\beta_0 = 3$, $\beta_1 = 0.5$ $\beta_2 = 5$
\end{center} \end{center}
## Multiple hypotheses solutions ## Multiple hypothesis solutions
\begin{block}{Family Wise Error Rate (FWER)} \begin{block}{Family Wise Error Rate (FWER)}
\begin{itemize} \begin{itemize}
...@@ -438,13 +482,13 @@ $\beta_0 = 3$, $\beta_1 = 0.5$ $\beta_2 = 5$ ...@@ -438,13 +482,13 @@ $\beta_0 = 3$, $\beta_1 = 0.5$ $\beta_2 = 5$
\end{block} \end{block}
\begin{example} \begin{example}
\begin{center} \begin{center}
\emph{``We reject 14 hypothesis with a FWER of 0.05''} \emph{``We reject 14 hypotheses with a FWER of 0.05''}
\emph{``We reject 14 hypothesis at a level of 0.05 after Bonferroni correction''} \emph{``We reject 14 hypotheses at a level of 0.05 after Bonferroni correction''}
\end{center} \end{center}
Means: 14 hypotheses do not follow the null distribution, and we make this statement with a probability of at most 0.05 of having one or more false positives among the 14 tests. Means: 14 hypotheses do not follow the null distribution, and we make this statement with a probability of at most 0.05 of having one or more false positives among the 14 tests.
\end{example} \end{example}
## Multiple hypotheses solutions ## Multiple hypothesis solutions
\begin{block}{False Discovery Rate (FDR)} \begin{block}{False Discovery Rate (FDR)}
\begin{itemize} \begin{itemize}
...@@ -456,26 +500,28 @@ $\beta_0 = 3$, $\beta_1 = 0.5$ $\beta_2 = 5$ ...@@ -456,26 +500,28 @@ $\beta_0 = 3$, $\beta_1 = 0.5$ $\beta_2 = 5$
\end{block} \end{block}
\only<1>{ \only<1>{
\vspace{2em} \vspace{2em}
\begin{center}
\begin{tabular}{l|ccc} \begin{tabular}{l|ccc}
hypothesis & Claimed non-significant & Claimed significant & Total\\ hypothesis & Claimed nonsignificant & Claimed significant & Total\\
\hline \hline
Null & TN & FP & $m_0$\\ Null & TN & FP & $m_0$\\
Non-null & FN & TP & $m_1$\\ Non-null & FN & TP & $m_1$\\
Total & S & R & $m$ Total & S & R & $m$
\end{tabular} \end{tabular}
\end{center}
} }
\only<2-3>{ \only<2-3>{
\begin{example} \begin{example}
\begin{center} \begin{center}
\emph{``We reject 254 hypothesis with a FDR of 0.05''} \emph{``We reject 254 hypotheses with a FDR of 0.05''}
\emph{``We reject 254 hypothesis with a level of 0.05 after BH correction''} \emph{``We reject 254 hypotheses with a level of 0.05 after BH correction''}
\end{center} \end{center}
Means: 254 hypotheses are not following the null distribution and we expect on average 5\% or less of false positives in the 254. Means: 254 hypotheses are not following the null distribution and we expect on average 5\% or less of false positives in the 254.
\end{example} \end{example}
} }
\only<3>{ \only<3>{
\begin{center} \begin{center}
The number of FPs increases with the number of TPs {\bf The number of FPs increases with the number of TPs}
\end{center} \end{center}
} }
...@@ -485,18 +531,18 @@ $\beta_0 = 3$, $\beta_1 = 0.5$ $\beta_2 = 5$ ...@@ -485,18 +531,18 @@ $\beta_0 = 3$, $\beta_1 = 0.5$ $\beta_2 = 5$
\end{center} \end{center}
$$\Pr\left(FP \geq 1\right) \leq \alpha_{FWER}$$ $$\Pr\left(FP \geq 1\right) \leq \alpha_{FWER}$$
$$\mathbb{E}\left[\left.\frac{FP}{R} \,\right|\, R > 0\right]\Pr\left(R > 0\right) \leq \alpha_{FDR}$$ $$\mathbb{E}\left[\left.\frac{FP}{R} \,\right|\, R > 0\right]\Pr\left(R > 0\right) \leq \alpha_{FDR}$$
When $TP \leq 1$ FWER and FDR control are identical.\\ when $TP \leq 1$ FWER and FDR control are identical.\\
The difference increases with the number of $TP$s The difference increases with the number of $TP$s
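To make the contrast between FWER and FDR control concrete, here is a minimal sketch of the two corrections named in the examples, Bonferroni and Benjamini-Hochberg (BH). The function names and the toy $p$-values are illustrative assumptions, not taken from the slides.

```python
def bonferroni(pvalues, alpha=0.05):
    """FWER control: reject H0_i when p_i <= alpha / m."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg(pvalues, alpha=0.05):
    """FDR control: reject the k smallest p-values, where k is the
    largest rank such that p_(k) <= k * alpha / m."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * alpha / m:
            k_max = rank
    rejected = [False] * m
    for i in order[:k_max]:
        rejected[i] = True
    return rejected


# Toy p-values (hypothetical), sorted for readability.
pvalues = [0.001, 0.008, 0.012, 0.03, 0.04, 0.2, 0.5, 0.9]
n_fwer = sum(bonferroni(pvalues))          # rejections under Bonferroni
n_fdr = sum(benjamini_hochberg(pvalues))   # rejections under BH
```

On these toy values Bonferroni rejects one hypothesis and BH rejects three, which illustrates that FDR control is less conservative as the number of true positives grows.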
## FDR control ## FDR control
\begin{center} \begin{center}
\includegraphics[width=12cm]{img/pval_hist_H0_H1}\\[-1em] \includegraphics[width=11cm]{img/pval_hist_H0_H1}\\[-1em]
\pause \pause
When we analyse data we hope to get a mixture between:\\ When we analyze data we hope to get a mixture between:\\
\includegraphics[width=12cm]{img/pval_hist_H0}\\[-2em] \includegraphics[width=11cm]{img/pval_hist_H0}\\[-2em]
\pause \pause
\includegraphics[width=12cm]{img/pval_hist_H1} \includegraphics[width=11cm]{img/pval_hist_H1}
\end{center} \end{center}
## FDR control: local FDR ($\ell FDR$) of Efron ## FDR control: local FDR ($\ell FDR$) of Efron
...@@ -525,11 +571,19 @@ When we analyse data we hope to get a mixture between:\\ ...@@ -525,11 +571,19 @@ When we analyse data we hope to get a mixture between:\\
} }
\end{center} \end{center}
## Post-selection inference
\begin{center}
\href{https://pubmed.ncbi.nlm.nih.gov/30206223/}{
\includegraphics[width=0.75\textwidth]{img/post_inference_example.png}
}
\end{center}
## SimCD ## SimCD
\begin{center} \begin{center}
\href{https://arxiv.org/abs/2104.01512v1}{ \href{https://arxiv.org/abs/2104.01512v1}{
\includegraphics[width=0.6\textwidth]{img/simCD.png} \includegraphics[width=\textwidth]{img/simCD.png}
} }
\end{center} \end{center}
6_dea/img/deg_time_group.png

1.13 MiB | W: | H:

6_dea/img/deg_time_group.png

1.92 MiB | W: | H:

6_dea/img/deg_time_group_inter.png

1.13 MiB

6_dea/img/deg_time_mixed.png

575 KiB

6_dea/img/post_inference_example.png

2.22 MiB
