diff --git a/Practical_a.Rmd b/Practical_a.Rmd
index 84860a608aa8cbfaf903e2246b54b67260971f4a..37eb20fe95c60584e36d1d543d4fdcc4619512c7 100644
--- a/Practical_a.Rmd
+++ b/Practical_a.Rmd
@@ -1,14 +1,7 @@
 ---
 # https://www.gnu.org/licenses/agpl-3.0.txt
-title: "Introduction to Principal Component Analysis"
+title: "Practice: Introduction to Principal Component Analysis"
 author: "Ghislain Durif, Laurent Modolo, Franck Picard"
-output:
-  rmdformats::downcute:
-    self_contain: true
-    use_bookdown: true
-    default_style: "light"
-    lightbox: true
-    css: "./www/style_Rmd.css"
 ---
 
 <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>
@@ -33,6 +26,8 @@ rm(list = ls())
 knitr::opts_chunk$set(echo = TRUE)
 knitr::opts_chunk$set(comment = NA)
 
+if("conflicted" %in% .packages()) conflicted::conflicts_prefer(dplyr::filter)
+
 first_pc_projection_code <- function(line_slope, x, y){
   a <- c(x, y)
   b <- c(1, line_slope)
@@ -41,22 +36,6 @@ first_pc_projection_code <- function(line_slope, x, y){
 }
 ```
 
-```{r klippy_install, echo=FALSE, include=FALSE}
-if (!require("klippy")) {
-  install.packages("remotes")
-  remotes::install_github("rlesur/klippy")
-}
-```
-
-
-```{r klippy, echo=FALSE, include=TRUE}
-klippy::klippy(
-  position = c('top', 'right'),
-  color = "white",
-  tooltip_message = 'Click to copy',
-  tooltip_success = 'Copied !')
-```
-
 ## Introduction
 
 One of the most widely used tools in big data analysis is the principal component analysis or PCA method. PCA applications are multiples, it can be used for data visualization, data exploration or as a preprocessing step to reduce the dimension of your data before applying other methods.
@@ -105,7 +84,6 @@ summary(penguins)
 
 What are the continuous and categorial variables in this table ? What is going to be the problem with the raw `penguins` table ?
 </div>
-
 <details><summary>Solution</summary>
 <p>
 
diff --git a/Practical_b.Rmd b/Practical_b.Rmd
index f2daaf8f59b2812752407da7d19957a1a7ef11a8..a3c7a669a59d56464299a9ab9ae0d3d5a1a18c7b 100644
--- a/Practical_b.Rmd
+++ b/Practical_b.Rmd
@@ -1,15 +1,7 @@
 ---
 # https://www.gnu.org/licenses/agpl-3.0.txt
-title: "Introduction to clustering"
+title: "Practice: Introduction to clustering"
 author: "Ghislain Durif, Laurent Modolo, Franck Picard"
-output:
-  rmdformats::downcute:
-    self_contain: true
-    use_bookdown: true
-    default_style: "light"
-    lightbox: true
-    css: "./www/style_Rmd.css"
-
 ---
 
 <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>
@@ -45,22 +37,10 @@ if (!require("umap"))
 rm(list = ls())
 knitr::opts_chunk$set(echo = TRUE)
 knitr::opts_chunk$set(comment = NA)
-```
 
-```{r klippy_install, echo=FALSE, include=FALSE}
-if (!require("klippy")) {
-  remotes::install_github("rlesur/klippy")
-}
+if("conflicted" %in% .packages()) conflicted::conflicts_prefer(dplyr::filter)
 ```
-
-```{r klippy, echo=FALSE, include=TRUE}
-klippy::klippy(
-  position = c('top', 'right'),
-  color = "white",
-  tooltip_message = 'Click to copy',
-  tooltip_success = 'Copied !')
-```
 
 ## Introduction
 
 The goal of single-cell transcriptomics is to measure the transcriptional states of large numbers of cells simultaneously. The input to a single-cell RNA sequencing (scRNAseq) method is a collection of cells. Formally, the desired output is a transcript or genes ($M$) x cells ($N$) matrix $X^{N \times M}$ that describes, for each cell, the abundance of its constituent transcripts or genes. More generally, single-cell genomics methods seek to measure not just transcriptional state, but other modalities in cells, e.g., protein abundances, epigenetic states, cellular morphology, etc.
diff --git a/Practical_c.Rmd b/Practical_c.Rmd
index 18710566be66e185ad9c1497f74274d37c4a7a1e..c3ac99d0108d284cfdfbf6308ac81f2fe083bb58 100644
--- a/Practical_c.Rmd
+++ b/Practical_c.Rmd
@@ -1,15 +1,7 @@
 ---
 # https://www.gnu.org/licenses/agpl-3.0.txt
-title: "Introduction to linear models and multiple testing"
+title: "Practice: Introduction to linear models and multiple testing"
 author: "Ghislain Durif, Laurent Modolo, Franck Picard"
-output:
-  rmdformats::downcute:
-    self_contain: true
-    use_bookdown: true
-    default_style: "light"
-    lightbox: true
-    css: "./www/style_Rmd.css"
-
 ---
 
 <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>
@@ -50,20 +42,8 @@ library(palmerpenguins) # to evaluate model
 rm(list = ls())
 knitr::opts_chunk$set(echo = TRUE)
 knitr::opts_chunk$set(comment = NA)
-```
-
-```{r klippy_install, echo=FALSE, include=FALSE}
-if (!requireNamespace("klippy")) {
-  remotes::install_github("rlesur/klippy")
-}
-```
-```{r klippy, echo=FALSE, include=TRUE}
-klippy::klippy(
-  position = c('top', 'right'),
-  color = "white",
-  tooltip_message = 'Click to copy',
-  tooltip_success = 'Copied !')
+
+if("conflicted" %in% .packages()) conflicted::conflicts_prefer(dplyr::filter)
 ```
 
 
 ::: {.hidden}
@@ -121,7 +101,16 @@ Throughout this practical, we will explore the following subjects:
 
 ## Requirements
 
-We will need the following packages
+We will need the following packages. If they are not already installed, you can install them as follows:
+
+```{r eval=FALSE}
+install.packages("tidyverse") # to manipulate data and make plot
+install.packages("performance") # to evaluate model
+install.packages("FactoMineR") # for dimension reduction
+install.packages("factoextra") # to plot dimension reduction output
+```
+
+Now, we can load them:
 
 ```{r echo=T, message=F}
 library(tidyverse) # to manipulate data and make plot
@@ -1011,7 +1000,7 @@ How to avoid drawing false conclusions because of confounding factors?
 
 - Use randomization to build your experiment, like [randomized controlled trial](https://en.wikipedia.org/wiki/Randomized_controlled_trial), possibly with [double-blinding](https://en.wikipedia.org/wiki/Blinded_experiment), to remove the potential effect of confounding variables (should be used for any serious drug trial), or at least control the potential sampling bias caused by confounding factors (like [case-control studies](https://en.wikipedia.org/wiki/Case%E2%80%93control_study), [cohort studies](https://en.wikipedia.org/wiki/Cohort_study), [stratification](https://en.wikipedia.org/wiki/Stratified_sampling)).
 
-2. If not possible (it is not always possible depending on the design of the experiment and/or the object of the study), measure and log various metadata regarding your subjects/individuals so that you will be able to account for the potential effect of confounding variables in your analysis (c.f. [later](#one-factor-anova)).
+- If not possible (it is not always possible depending on the design of the experiment and/or the object of the study), measure and log various metadata regarding your subjects/individuals so that you will be able to account for the potential effect of confounding variables in your analysis (c.f. [later](#one-factor-anova)).
 
 It generally requires a certain level of technical expertise/knowledge in the considered subject to be able to identify potential confounding factors before the experiments (so that you can monitor and log the corresponding quantities during your experiment).
 
diff --git a/index.md b/index.md
index 75afaa9e27003618d4bff0b2e65e35786293b196..98fcc6475be949d1633e9d7a1c6a9a017ad7e2d3 100644
--- a/index.md
+++ b/index.md
@@ -1,18 +1,21 @@
----
-title: ENS M1 ML
----
+# Welcome {.unnumbered}
 
 ## Introduction to dimension reduction
 
-1. [slides](./dimension_reduction.pdf)
-2. [Practical a](./Practical_a.html)
+1. [Course](./dimension_reduction.pdf) (pdf)
+2. [Practice](./Practical_a.html)
 
 ## Introduction to clustering
 
-1. [slides](./clustering.pdf)
-2. [Practical b](./Practical_b.html)
+1. [Course](./clustering.pdf) (pdf)
+2. [Practice](./Practical_b.html)
 
 ## Introduction to linear models and multiple testing
 
-1. [slides](./regression_multiple_testing.pdf)
-2. [Practical c](./Practical_c.html)
+1. [Course](./regression_multiple_testing.pdf) (pdf)
+2. [Practice](./Practical_c.html)
+
+## Introduction to model selection and regularization
+
+1. [Course](./model_selection_regularization.pdf) (pdf)
+2. [Practice](./Practical_d.html)
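For context, each setup chunk above now gains `if("conflicted" %in% .packages()) conflicted::conflicts_prefer(dplyr::filter)`, which resolves the `dplyr::filter()` / `stats::filter()` name clash whenever the `conflicted` package is attached. Below is a minimal sketch of the behaviour this guard addresses (assuming the `conflicted` and `dplyr` packages are installed; `mtcars` is only used as an illustrative dataset):

```r
library(conflicted) # turns ambiguous function names into explicit errors
library(dplyr)

# With conflicted attached, a bare filter() call is ambiguous between
# dplyr::filter() and stats::filter(), so it raises a conflict error:
# filter(mtcars, cyl == 6)

# Declaring a preference, as the setup chunks now do, restores dplyr's filter():
conflicts_prefer(dplyr::filter)
filter(mtcars, cyl == 6)
```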