Verified Commit cd907e0a authored by Laurent Modolo

add img to normalization.Rmd

parent dfef7256
Tags v0.2.8
Pipeline #321 failed
2_normalization/img/cell_barcode_rank_vs_umi.png

57.3 KiB

--- ---
title: "single-cell RNA-Seq data: Normalization" title: "single-cell RNA-Seq: Normalization"
author: "Laurent Modolo [laurent.modolo@ens-lyon.fr](mailto:laurent.modolo@ens-lyon.fr)" author: "Laurent Modolo [laurent.modolo@ens-lyon.fr](mailto:laurent.modolo@ens-lyon.fr)"
date: "Friday 3 June 2022" date: "Friday 3 June 2022"
output: output:
...@@ -8,7 +8,7 @@ output: ...@@ -8,7 +8,7 @@ output:
fig_caption: no fig_caption: no
highlight: tango highlight: tango
latex_engine: xelatex latex_engine: xelatex
slide_level: 1 slide_level: 2
theme: metropolis theme: metropolis
ioslides_presentation: ioslides_presentation:
highlight: tango highlight: tango
...@@ -16,32 +16,263 @@ output: ...@@ -16,32 +16,263 @@ output:
highlight: tango highlight: tango
classoption: aspectratio=169 classoption: aspectratio=169
--- ---
# Introduction # Introduction
## Programme ## Introduction
### Program
1. Single-cell RNASeq data from 10X Sequencing (Friday 3 June 2022 - 14:00) 1. Single-cell RNASeq data from 10X Sequencing (Friday 3 June 2022 - 14:00)
2. Normalization and spurious effects (Wednesday 8 June 2022 - 14:00) 2. Normalization and spurious effects (Wednesday 8 June 2022 - 14:00)
3. Dimension reduction and data visualization (Monday 13 June 2022 - 15:00) 3. Dimension reduction and data visualization (Monday 13 June 2022 - 15:00)
4. Clustering and annotation (Thursday 23 June 2022 - 14:00) 4. Clustering and annotation (Thursday 23 June 2022 - 14:00)
5. Pseudo-time and velocity inference (Thursday 30 June 2022 - 14:00) 5. Pseudo-time and velocity inference (Thursday 30 June 2022 - 14:00)
6. Differental expression analysis (Friday 8 July 2022 - 14:00) 6. Differential expression analysis (Friday 8 July 2022 - 14:00)
# Introduction ## Introduction
## Programme ### Program
1. Single-cell RNASeq data from 10X Sequencing (Friday 3 June 2022 - 14:00) 1. Single-cell RNASeq data from 10X Sequencing (Friday 3 June 2022 - 14:00)
2. Normalization and spurious effects (Wednesday 8 June 2022 - 14:00) 2. Normalization and spurious effects (Wednesday 8 June 2022 - 14:00)
- Normalization vers transformation - Quality control
- Spurious effects - Normalization
- library effect - Variance stabilization
- dropout effect - Depth normalization
- batch effect - The monotonicity of the normalization
- Variance decomposition - batch effects
- Residuals - Heterogeneous data
3. Dimension reduction and data visualization (Monday 13 June 2022 - 15:00) 3. Dimension reduction and data visualization (Monday 13 June 2022 - 15:00)
4. Clustering and annotation (Thursday 23 June 2022 - 14:00) 4. Clustering and annotation (Thursday 23 June 2022 - 14:00)
5. Pseudo-time and velocity inference (Thursday 30 June 2022 - 14:00) 5. Pseudo-time and velocity inference (Thursday 30 June 2022 - 14:00)
6. Differential expression analysis (Friday 8 July 2022 - 14:00) 6. Differential expression analysis (Friday 8 July 2022 - 14:00)
# Quality control
## Cell filtering
\begin{center}
\begin{columns}
\column{0.5\textwidth}
\begin{center}
\begin{tikzpicture}
\fill
(0.5,3.5) node {\bf $\text{gene}_1$}
-- (0.5,2.5) node {\bf $\text{gene}_2$}
-- (0.5,1.5) node {\bf $\vdots$}
-- (0.5,0.5) node {\bf $\text{gene}_n$};
\fill
(1.5,4.5) node {\bf{$\text{cell}_1$}}
-- (1.5,3.5) node {mRNA}
-- (1.5,2.5) node {mRNA}
-- (1.5,1.5) node {$\vdots$}
-- (1.5,0.5) node {mRNA};
\fill
(2.5,4.5) node {\color{red}\bf{$\text{0 cell}_2$}}
-- (2.5,3.5) node {\color{red}mRNA}
-- (2.5,2.5) node {\color{red}mRNA}
-- (2.5,1.5) node {\color{red}$\vdots$}
-- (2.5,0.5) node {\color{red}mRNA};
\fill
(3.5,4.5) node {\bf{$\cdots$}}
-- (3.5,3.5) node {$\cdots$}
-- (3.5,2.5) node {$\cdots$}
-- (3.5,1.5) node {$\ddots$}
-- (3.5,0.5) node {$\cdots$};
\fill
(4.5,4.5) node {\bf{$\text{cell}_c$}}
-- (4.5,3.5) node {mRNA}
-- (4.5,2.5) node {mRNA}
-- (4.5,1.5) node {$\vdots$}
-- (4.5,0.5) node {mRNA};
\draw (1,0) grid (5,4);
\end{tikzpicture}
\end{center}
\column{0.5\textwidth}
{\large Some cells are not cells.}
\begin{itemize}
\item matrix columns are defined by {\bf cell barcode sequences}
\item {\bf cell barcode sequences identify droplets} in the 10X protocol
\end{itemize}
\end{columns}
\end{center}
## Cell filtering
\begin{center}
\begin{columns}
\column{0.5\textwidth}
\begin{center}
\begin{tikzpicture}
\fill
(0.5,3.5) node {\bf $\text{gene}_1$}
-- (0.5,2.5) node {\bf $\text{gene}_2$}
-- (0.5,1.5) node {\bf $\vdots$}
-- (0.5,0.5) node {\bf $\text{gene}_n$};
\fill
(1.5,4.5) node {\bf{$\text{bc}_1$}}
-- (1.5,3.5) node {mRNA}
-- (1.5,2.5) node {mRNA}
-- (1.5,1.5) node {$\vdots$}
-- (1.5,0.5) node {mRNA};
\fill
(2.5,4.5) node {\color{red}\bf{$\text{bc}_2$}}
-- (2.5,3.5) node {\color{red}mRNA}
-- (2.5,2.5) node {\color{red}mRNA}
-- (2.5,1.5) node {\color{red}$\vdots$}
-- (2.5,0.5) node {\color{red}mRNA};
\fill
(3.5,4.5) node {\bf{$\cdots$}}
-- (3.5,3.5) node {$\cdots$}
-- (3.5,2.5) node {$\cdots$}
-- (3.5,1.5) node {$\ddots$}
-- (3.5,0.5) node {$\cdots$};
\fill
(4.5,4.5) node {\bf{$\text{bc}_c$}}
-- (4.5,3.5) node {mRNA}
-- (4.5,2.5) node {mRNA}
-- (4.5,1.5) node {$\vdots$}
-- (4.5,0.5) node {mRNA};
\draw (1,0) grid (5,4);
\end{tikzpicture}
\end{center}
\column{0.5\textwidth}
{\large Some cells are not cells.}
\begin{itemize}
\item {\bf v2} chemistry: $\sim 737{,}000$ cell barcodes
\item {\bf v3} chemistry: $\sim 3{,}500{,}000$ cell barcodes
\end{itemize}
\vspace{1em}
To avoid cell barcode collisions, we need
\[
\text{\bf cell number} \ll \text{\bf cell barcode number}
\]
Most of the droplets will be empty
\end{columns}
\end{center}
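As a rough illustration of why the cell number must stay far below the cell barcode number, the sketch below estimates the probability that a given cell shares its barcode with at least one other cell. It assumes that each cell receives a barcode drawn uniformly at random from the pool, which is a simplification and not a description of the 10X chemistry. The function name `collision_fraction` is hypothetical.

```python
def collision_fraction(n_cells: int, n_barcodes: int) -> float:
    """Probability that a given cell shares its barcode with another cell.

    Assumption: every cell draws its barcode independently and uniformly
    from a pool of `n_barcodes` barcodes.
    """
    if n_cells < 1 or n_barcodes < 1:
        raise ValueError("n_cells and n_barcodes must be positive")
    # Each of the other (n_cells - 1) cells misses this barcode with
    # probability (1 - 1 / n_barcodes).
    return 1.0 - (1.0 - 1.0 / n_barcodes) ** (n_cells - 1)


# Barcode pool sizes quoted on the slide (approximate values).
pools = {"v2": 737_000, "v3": 3_500_000}

for n_cells in (1_000, 10_000, 100_000):
    for chemistry, n_barcodes in pools.items():
        p = collision_fraction(n_cells, n_barcodes)
        print(f"{chemistry}, {n_cells} cells: P(collision) = {p:.4f}")
```

Under this assumption, the collision probability grows roughly linearly with the number of cells while it stays small, and it shrinks as the barcode pool grows, which is the motivation for the inequality above.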
## Cell filtering
\begin{center}
\begin{columns}
\column{0.35\textwidth}
Sequenced empty droplets:
\begin{itemize}
\item do not express many genes
\item look like experimental noise
\end{itemize}
\vspace{1em}
The number of UMIs per cell barcode
\column{0.7\textwidth}
\vspace{1.5em}
\includegraphics[width=\textwidth]{img/cell_barcode_rank_vs_umi.png}
\end{columns}
\end{center}
## Cell filtering
\begin{center}
\begin{columns}
\column{0.35\textwidth}
We have {\bf two populations} of cell barcodes:
\begin{itemize}
\item one with {\bf low} total UMI counts
\item one with {\bf high} total UMI counts
\end{itemize}
\column{0.7\textwidth}
\vspace{1.5em}
\includegraphics[width=\textwidth]{img/cell_barcode_rank_vs_umi.png}
\end{columns}
\end{center}
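One simple way to separate the two populations is to look for the knee of the barcode rank curve. The sketch below is only an illustration and not the method used in this course: it sorts the barcodes by total UMI count, works on the log-log curve, and picks the point that lies farthest above the straight line joining the first and last barcodes. The function name `knee_threshold` and the toy counts are hypothetical.

```python
import math
from typing import Iterable


def knee_threshold(umi_counts: Iterable[int]) -> int:
    """Return the total UMI count found at the knee of the log-log rank curve."""
    # Rank the barcodes by decreasing total UMI count, ignoring empty ones.
    counts = sorted((c for c in umi_counts if c > 0), reverse=True)
    if len(counts) < 3:
        raise ValueError("need at least three barcodes with non-zero counts")

    xs = [math.log10(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log10(c) for c in counts]

    # Chord joining the first and the last point of the curve.
    x0, y0 = xs[0], ys[0]
    dx, dy = xs[-1] - x0, ys[-1] - y0

    def height_above_chord(i: int) -> float:
        # Cross product: proportional to the signed distance to the chord,
        # positive for points lying above it.
        return dx * (ys[i] - y0) - dy * (xs[i] - x0)

    knee = max(range(len(counts)), key=height_above_chord)
    return counts[knee]


# Toy data (made up): 100 droplets with a cell and 10,000 empty droplets.
toy_counts = [5000] * 100 + [10] * 10_000
threshold = knee_threshold(toy_counts)
n_kept = sum(c >= threshold for c in toy_counts)
print(f"threshold = {threshold} UMIs, barcodes kept = {n_kept}")
```

Barcodes with a total UMI count at or above the returned threshold would be kept as putative cells. On real data the curve is noisier, so a threshold found this way should be checked against the plot.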
## Cell filtering
\begin{center}
\begin{columns}
\column{0.5\textwidth}
\begin{center}
\begin{tikzpicture}
\fill
(0.5,3.5) node {\bf $\text{gene}_1$}
-- (0.5,2.5) node {\bf $\text{gene}_2$}
-- (0.5,1.5) node {\bf $\vdots$}
-- (0.5,0.5) node {\bf $\text{gene}_n$};
\fill
(1.5,4.5) node {\bf{$\text{cell}_1$}}
-- (1.5,3.5) node {mRNA}
-- (1.5,2.5) node {mRNA}
-- (1.5,1.5) node {$\vdots$}
-- (1.5,0.5) node {mRNA};
\fill
(2.5,4.5) node {\color{red}\bf{$\text{2 cells}_2$}}
-- (2.5,3.5) node {\color{red}mRNA}
-- (2.5,2.5) node {\color{red}mRNA}
-- (2.5,1.5) node {\color{red}$\vdots$}
-- (2.5,0.5) node {\color{red}mRNA};
\fill
(3.5,4.5) node {\bf{$\cdots$}}
-- (3.5,3.5) node {$\cdots$}
-- (3.5,2.5) node {$\cdots$}
-- (3.5,1.5) node {$\ddots$}
-- (3.5,0.5) node {$\cdots$};
\fill
(4.5,4.5) node {\bf{$\text{cell}_c$}}
-- (4.5,3.5) node {mRNA}
-- (4.5,2.5) node {mRNA}
-- (4.5,1.5) node {$\vdots$}
-- (4.5,0.5) node {mRNA};
\draw (1,0) grid (5,4);
\end{tikzpicture}
\end{center}
\column{0.5\textwidth}
{\large Some cells are many cells.}
\begin{itemize}
\item not all tissues are easily dissociable
\item two cells glued together will share the same droplet
\end{itemize}
\vspace{1em}
Cell barcodes corresponding to $n$-plets should be in the minority if the preparation went well.
\end{columns}
\end{center}
## Cell filtering
Apoptotic cells express mitochondrial (MT) genes.
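A common way to use this observation is to compute, for each cell barcode, the fraction of UMIs that come from mitochondrial genes and to flag barcodes where this fraction is high. The sketch below assumes human gene symbols, where mitochondrial genes start with `MT-`. The cutoff of 0.2, the function names, and the toy counts are hypothetical and would need to be adapted to the data.

```python
from typing import Dict, List


def mito_fraction(gene_counts: Dict[str, int], prefix: str = "MT-") -> float:
    """Fraction of a barcode's UMIs assigned to genes whose symbol starts with `prefix`."""
    total = sum(gene_counts.values())
    if total == 0:
        return 0.0
    # Case-insensitive match so that symbols such as "mt-Co1" are also caught.
    mito = sum(
        n for gene, n in gene_counts.items()
        if gene.upper().startswith(prefix.upper())
    )
    return mito / total


def flag_high_mito(
    cells: Dict[str, Dict[str, int]], max_fraction: float = 0.2
) -> List[str]:
    """Return the barcodes whose mitochondrial fraction exceeds `max_fraction`."""
    return [bc for bc, counts in cells.items() if mito_fraction(counts) > max_fraction]


# Toy data (made up): UMI counts per gene for two cell barcodes.
toy_cells = {
    "bc_1": {"MT-CO1": 5, "ACTB": 95},
    "bc_2": {"MT-CO1": 60, "MT-ND1": 10, "ACTB": 30},
}
print(flag_high_mito(toy_cells))
```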
# Normalization
# Variance stabilization
# Depth normalization
# Monotonicity of the normalization
# Batch effects
# Heterogeneous data
# Goals of Normalization
- Variance stabilization
- Depth normalization
- Monotonicity of the normalization
- Make different data sets similar
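To make the first three goals concrete, here is a minimal sketch of one simple normalization: divide each count by the cell's total UMI count, rescale to a fixed depth, and apply `log(1 + x)`. This is a widely used convention and not necessarily the method these slides go on to present. The function name `normalize_cell` and the scale of 10,000 are assumptions made for the example.

```python
import math
from typing import List, Sequence


def normalize_cell(counts: Sequence[int], scale: float = 10_000) -> List[float]:
    """Depth-normalize one cell's UMI counts, then apply log(1 + x)."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    # Depth normalization: counts become "UMIs per `scale` UMIs".
    # log1p compresses large values, which dampens the growth of the
    # variance with the mean.
    return [math.log1p(c / total * scale) for c in counts]


# Toy data (made up): same composition, ten times deeper sequencing.
shallow = [1, 2, 3]
deep = [10, 20, 30]
print(normalize_cell(shallow))
print(normalize_cell(deep))
```

Both toy cells end up with the same normalized values because they differ only in depth, and the transformation is increasing, so the ranking of genes within a cell is preserved.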
\ No newline at end of file
File deleted