From d18af35796f370acae1de2ff0240ec84895c2464 Mon Sep 17 00:00:00 2001
From: aduvermy <arnaud.duvermy@ens-lyon.fr>
Date: Mon, 28 Mar 2022 15:28:51 +0000
Subject: [PATCH] update tuto

---
 src/tutorial_htrsim.Rmd | 58 ++++++++++++++++++++++++++++++++++-------
 1 file changed, 49 insertions(+), 9 deletions(-)

diff --git a/src/tutorial_htrsim.Rmd b/src/tutorial_htrsim.Rmd
index 36d2092..40526a2 100644
--- a/src/tutorial_htrsim.Rmd
+++ b/src/tutorial_htrsim.Rmd
@@ -9,21 +9,15 @@ $$ Phenotype = Genotype + Environment + Genotype.Environment $$
 
 From this expression, $\beta_{G}$, $\beta_{E}$, $\beta_{G*E}$ can be seen as coefficients which allow quantifying the participation of each factors (Genotype, Environment and interaction Genotype/Environment).
 
-Then,
+In mathematical terms, this leads to a linear expression such as:
 
 $$ P = \beta_{G}*G + \beta_{E}*E + \beta_{G*E}*G.E + \beta_{0} $$
 
 In order to estimate these coefficients, a Generalized Linear Model (GLM) can be used.
 
+# B. HTRSIM getting started
 
-```{r echo=FALSE, out.width='100%'}
-knitr::include_graphics('../img/schema_loop.jpg')
-```
-
-
-
-This is a tutorial for *htrsim* utilization
-
+ <u>a. Required</u>
 
 ```{r required, message=FALSE, echo = T, results = "hide"}
 library(data.table)
@@ -40,6 +34,52 @@ setwd("/home/rstudio/mydatalocal/counts_simulation/src/")
 ```
 
 
+ <u>b. Workflow</u>
+
+
+```{r echo=FALSE, out.width='50%'}
+knitr::include_graphics('../img/schema_loop.jpg')
+```
+
+ <u>c. RNA-seq pipeline</u>
+
+You can use your favorite pipeline to obtain a count table from real data.
+If you do not know how to obtain such a count table, see [here](https://gitbio.ens-lyon.fr/aduvermy/rna-seq_public_library_investigations).
+
+
+ <u> d. BioProject PRJNA675209b as input</u>
+
+To easily test *HTRSIM*, we produced a typical count table from BioProject PRJNA675209b.
+Take the time to clean up your count table.
+
+```{r}
+tabl_cnts <- read.table("/home/rstudio/mydatalocal/rna-seq_public_library_investigations/results/2022-02-09/salmon.merged.gene_counts.tsv", header = TRUE)
+rownames(tabl_cnts) <- tabl_cnts$gene_id
+tabl_cnts <- tabl_cnts %>% select(-gene_id) ## drop the gene_id column
+tabl_cnts <- tabl_cnts %>% select(-gene_name) ## drop the gene_name column
+tabl_cnts
+```
+
+ <u> e. Launch HTRSIM</u>
+
+```{r message=FALSE, warning=FALSE}
+## import the design of the BioProject
+bioDesign <- read_csv2(file = "/home/rstudio/mydatalocal/rna-seq_public_library_investigations/data/design_deseq__PRJNA675209.csv")
+#bioDesign
+source(file = "htrsim/main.R")
+tabl_cnts %>% dim()
+bioDesign %>% dim()
+simul_cnts = htrsim(tabl_cnts, bioDesign = bioDesign, 2)
+```
+
+
+
+
+# C. Advanced user
+
+
+This is a tutorial for *htrsim* utilization
+
 To perform counts per genes *htrsim* needs some inputs.</br>
 Following parameters are required:</br>
 - number of samples</br>
-- 
GitLab