Verified Commit fc54ef09 authored by Laurent Modolo
fix session_5.Rmd name

parent a36ca9f8
 ---
-title: "R#5: Pipping and grouping"
-author: "Laurent Modolo [laurent.modolo@ens-lyon.fr](mailto:laurent.modolo@ens-lyon.fr)
+title: "R.5: Pipping and grouping"
+author: "Laurent Modolo [laurent.modolo@ens-lyon.fr](mailto:laurent.modolo@ens-lyon.fr)"
 date: "2021"
 output:
   rmdformats::downcute:
@@ -116,7 +116,7 @@ Then, when you use the function you already know on grouped data frame and they
 You can use the following code to compute the average delay per months across years.
-```{r summarise_group_by, include=TRUE, fig.width=8, fig.height=3.5}
+```{r summarise_group_by, include=TRUE, message=FALSE, fig.width=8, fig.height=3.5}
 flights_delay <- flights %>%
   group_by(year, month) %>%
   summarise(delay = mean(dep_delay, na.rm = TRUE), sd = sd(dep_delay, na.rm = TRUE)) %>%
@@ -138,6 +138,8 @@ Why did we `group_by` `year` and `month` and not only `year` ?
 You may have wondered about the `na.rm` argument we used above. What happens if we don’t set it?
 </div>
+<details><summary>Solution</summary>
+<p>
 ```{r summarise_group_by_NA, include=TRUE}
 flights %>%
   group_by(dest) %>%
@@ -146,6 +148,8 @@ flights %>%
     delay = mean(arr_delay)
   )
 ```
+</p>
+</details>
 Aggregation functions obey the usual rule of missing values: **if there’s any missing value in the input, the output will be a missing value**.
@@ -361,7 +365,7 @@ Which carrier has the worst delays?
 <details><summary>Solution</summary>
 <p>
-```{r grouping_challenges_c, eval=F, echo = T, message=FALSE, cache=T}
+```{r grouping_challenges_c1, eval=F, echo = T, message=FALSE, cache=T}
 flights %>%
   group_by(carrier) %>%
   summarise(
@@ -380,7 +384,7 @@ Can you disentangle the effects of bad airports vs. bad carriers? (Hint: think a
 <details><summary>Solution</summary>
 <p>
-```{r grouping_challenges_c, eval=F, echo = T, message=FALSE, cache=T}
+```{r grouping_challenges_c2, eval=F, echo = T, message=FALSE, cache=T}
 flights %>%
   group_by(carrier, dest) %>%
   summarise(
......
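
The missing-value rule quoted in the diff (any `NA` in the input gives an `NA` output) can be checked without the `flights` data. The sketch below uses base R only. The `delays` and `dest` vectors are made-up toy values, not taken from the lesson, and `tapply` stands in for the `group_by`/`summarise` pipeline purely for illustration.

```r
# Toy delays with one missing value (hypothetical numbers, not from `flights`).
delays <- c(10, NA, 30)

# Without na.rm, the NA propagates to the result.
mean_with_na <- mean(delays)

# With na.rm = TRUE, missing values are dropped before averaging.
mean_without_na <- mean(delays, na.rm = TRUE)

# The same rule applies per group: only the group containing an NA is affected.
dest <- c("A", "A", "B")
by_dest <- tapply(delays, dest, mean)
by_dest_clean <- tapply(delays, dest, mean, na.rm = TRUE)

print(mean_with_na)
print(mean_without_na)
print(by_dest)
print(by_dest_clean)
```

Only group `"A"`, which contains the missing value, differs between the two grouped results. This is why the `summarise_group_by_NA` chunk can return `NA` for some destinations and not others.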