From 59e7c332f8effd0f5baecb24ed4bdef029eea2d8 Mon Sep 17 00:00:00 2001
From: GD <gd.dev@libertymail.net>
Date: Fri, 28 Oct 2022 11:29:50 +0200
Subject: [PATCH] highlight questions to be treated in the report

---
 Practical_c.Rmd | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/Practical_c.Rmd b/Practical_c.Rmd
index c1116eb..1871056 100644
--- a/Practical_c.Rmd
+++ b/Practical_c.Rmd
@@ -796,7 +796,7 @@ ggplot(yeast_av_data, aes(factor(YDL200C_427), A101)) + geom_boxplot() +
 </p>
 </details>
 
-<div class="pencadre">
+<div class="red_pencadre">
 How to account for the cell cycle in the previous representations? Is it important?
 </div>
 
@@ -1095,7 +1095,7 @@ anova(lm(A101 ~ factor(YDL200C_427), data = yeast_av_subdata))
 </details>
 
 
-<div class="pencadre">
+<div class="red_pencadre">
 What should we check before interpreting the results of the ANOVA?
 </div>
 
@@ -1214,14 +1214,14 @@ Then $\mu$ and $B$ are estimated by a **least square linear regression** (see [h
 </details>
 
 
-<div class="pencadre">
+<div class="red_pencadre">
 **Interpretation of the ANOVA results:** do you find a significant effect of the respective SNPs on the considered morphological trait? Comment the results in relation with your intuition from the analysis of the descriptive statistics [above](#data-description)? 
 </div>
 
 <details><summary>Solution</summary>
 <p>
 
-After verifying the normality and homoskedasticity of the residuals (**if either is not verified, we cannot use the results from the ANOVA significance test because it assumes a Gaussian model**), we find a significant effect of SNP `YAL069W_1` and a non-significant effect of SNP `YDL200C_427` onto the morphological trait `A101` (when focusing on the `C` cell cycle phase), which confirms our intuition from the graphical representation.
+After verifying the normality and homoskedasticity of the residuals (**if either assumption does not hold, we cannot use the results from the ANOVA significance test because it assumes a Gaussian model**), we find a significant effect of SNP `YAL069W_1` and a non-significant effect of SNP `YDL200C_427` on the morphological trait `A101` (when focusing on the `C` cell cycle phase), which confirms our intuition from the graphical representation.
 
 ---
 
@@ -1593,7 +1593,7 @@ The exploration of the SNP data by linear dimension reduction approaches did not
 
 To do so, we are going to run an ANOVA for each SNP.
 
-<div class="pencadre">
+<div class="red_pencadre">
 Given the number of SNPs in the data (i.e. `r nrow(gt_data)`), what could be the risk when running such an analysis?
 </div>
 
@@ -1602,7 +1602,7 @@ Given the number of SNPs in the data (i.e. `r nrow(gt_data)`), what could be the
 
 We are going to do thousands of tests, computing and using thousands of p-values to assess the potential significant effect of each SNP on the considered morphological trait.
 
-We have a non-negligible risk to wrong reject the null hypothesis for many of the SNPs, and conclude to a non-existing significant effect.
+We have a non-negligible risk of wrongly rejecting the null hypothesis for many of the SNPs, and of concluding that an effect is significant when it does not exist.
 
 Thus, we have to use p-values correction (or adjustment) procedure adapted to the case of multiple testing.
 
@@ -1660,7 +1660,7 @@ ggplot(test_result) + geom_point(aes(x=SNP_index, y=p_values)) +
 
 
 
-<div class="pencadre">
+<div class="red_pencadre">
 What can you say about these results?
 </div>
 
@@ -1720,7 +1720,7 @@ ggplot(
 ```
 
 
-<div class="pencadre">
+<div class="red_pencadre">
 What can you say about the different corrections?
 </div>
 
@@ -1791,7 +1791,7 @@ test_result %>%
 </details>
 
 
-<div class="pencadre">
+<div class="red_pencadre">
 
 What can we do with these results?
 
@@ -1816,7 +1816,7 @@ What can we do with these results?
 ## Full data analysis
 
 
-<div class="pencadre">
+<div class="red_pencadre">
 Open (and optional) question: run the previous analysis to find SNPs with significant effect on the morphological trait `A101` with the full dataset `yeast_data`, i.e. without the average by strain for the morphological traits.
 
 In this case, you will have to account for the `strain_id` factor in the ANOVA.
-- 
GitLab