Commit c9a59b1c authored by Gilquin

fix: correct exercises and typos

* unified the vowel definition (in some exercises the "y" character was missing)
* moved one exercise that relied on the repetition special character "+" to after the section introducing it
* renamed the "Grouping" section to "Capture group"
* illustrated the difference between the functions str_extract and str_extract_all
parent 62a49c9e
Pipeline #2673 failed
@@ -254,6 +254,12 @@ str_view(x, "^apple$")
 d. Have seven letters or more.
 Since this list is long, you might want to use the match argument to `str_view()` to show only the matching or non-matching words.
+3. What is the difference between these two commands:
+```{r, str_viewanchorsdiff, eval=F, cache=T}
+str_view(stringr::words, "(or|ing$)")
+str_view(stringr::words, "(or|ing)$")
+```
 :::
 <details><summary>Solution</summary>
@@ -261,7 +267,7 @@ str_view(x, "^apple$")
 1. We would need the pattern `"\\$\\^\\$"`
-<p></p>
+</p><p>
 2.
 a. start with "y": `"^y"`
@@ -269,6 +275,10 @@ str_view(x, "^apple$")
 c. three letters long: `"^...$"`
 d. seven letters or more: `"......."`
+</p><p>
+3. `"(or|ing$)"` matches words that either contain "or" or end with "ing", while `"(or|ing)$"` matches words that end either with "or" or "ing".
 </p>
 </details>
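The difference described in solution 3 can be checked on a handful of words. The sketch below is only an illustration: the vector `x` is a hypothetical sample, not taken from `stringr::words`, and `str_detect()` is used instead of `str_view()` so that the result is a plain logical vector.

```r
library(stringr)

# Hypothetical sample words chosen to separate the two patterns
x <- c("word", "bring", "order", "ingot", "for")

# "$" applies only to "ing": contains "or" anywhere, or ends with "ing"
contains_or_or_ends_ing <- str_detect(x, "(or|ing$)")
contains_or_or_ends_ing
# TRUE TRUE TRUE FALSE TRUE

# "$" applies to the whole group: ends with "or" or ends with "ing"
ends_or_or_ing <- str_detect(x, "(or|ing)$")
ends_or_or_ing
# FALSE TRUE FALSE FALSE TRUE
```

"word" and "order" contain "or" but do not end with it, so only the first pattern matches them.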
@@ -301,9 +311,8 @@ str_view(c("grey", "gray"), "gr(e|a)y")
 Create regular expressions to find all words that:
 1. Start with a vowel.
-2. That only contains consonants (Hint: thinking about matching "not"-vowels).
-3. End with "ed", but not with "eed".
-4. End with "ing" or "ise".
+2. End with "ed", but not with "eed".
+3. End with "ing" or "ise".
 :::
@@ -311,17 +320,10 @@ Create regular expressions to find all words that:
 <p>
 1. start with a vowel: `"^[aeiouy]"`
-2. decomposition:
-- start with a consonant: `"^[^aeiouy]"`
-- contains one or more consonant: `"[^aeiouy]+"`
-- end with a consonant: `"[^aeiouy]$"`
-result is: `"^[^aeiouy][^aeiouy]+[^aeiouy]$"`.
-3. `"[^e]ed$"`
-4. `"(ing|ise)$"`
+2. `"[^e]ed$"`
+3. `"(ing|ise)$"`
 </p>
 </details>
@@ -369,6 +371,7 @@ str_view(x, "C{2,3}")
 a. Start with three consonants.
 b. Have three or more vowels in a row.
 c. Have two or more vowel-consonant pairs in a row.
+d. Contain only consonants (Hint: thinking about matching "not"-vowels).
 :::
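For the "contain only consonants" exercise, the result depends on whether "y" is counted as a vowel, which is what this commit unifies. A minimal sketch, assuming a hypothetical sample vector `x` rather than `stringr::words`:

```r
library(stringr)

# Hypothetical sample, not taken from stringr::words
x <- c("by", "dry", "tsk", "hmm", "sky", "tree")

# "y" counted as a vowel (the unified definition)
with_y <- str_subset(x, "^[^aeiouy]+$")
with_y
# "tsk" "hmm"

# "y" left out of the vowel set: more words qualify
without_y <- str_subset(x, "^[^aeiou]+$")
without_y
# "by" "dry" "tsk" "hmm" "sky"
```

The anchors `^` and `$` together with `+` force every character of the word to be a non-vowel, which is why this exercise needs the repetition operator.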
@@ -385,15 +388,16 @@ str_view(x, "C{2,3}")
 <p></p>
 2.
-a. `"^[^aeoiouy]{3}"`
-b. `"[aeiou]{3,}"`
-c. `"([aeiou][^aeiou]){2,}"`
+a. `"^[^aeiouy]{3}"`
+b. `"[aeiouy]{3,}"`
+c. `"([aeiouy][^aeiouy]){2,}"`
+d. `"^[^aeiouy]+$"`
 </p>
 </details>
-### Grouping
+### Capture group
 You learned about parentheses as a way to disambiguate complex expressions. Parentheses also create a numbered capturing group (number 1, 2 etc.). A capturing group stores the part of the string matched by the part of the regular expression inside the parentheses. You can refer to the same text as previously matched by a capturing group with back references, like `\1`, `\2` etc.
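As a small illustration of back references, the sketch below looks for a repeated pair of characters and for a doubled character. The vector `x` is a hypothetical sample; note that inside an R string the back reference `\1` has to be written `\\1`.

```r
library(stringr)

# Hypothetical sample words
x <- c("banana", "coconut", "apple", "papaya")

# (..) captures any two characters, \\1 requires the same two again
pairs <- str_extract(x, "(..)\\1")
pairs
# "anan" "coco" NA "papa"

# (.) captures one character, \\1 requires it to be doubled
doubles <- str_extract(x, "(.)\\1")
doubles
# NA NA "pp" NA
```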
@@ -459,7 +463,7 @@ sum(str_detect(words, "^t"))
 What proportion of common words ends with a vowel?
 ```{r str_view_match_c, eval=T, cache=T}
-mean(str_detect(words, "[aeiou]$"))
+mean(str_detect(words, "[aeiouy]$"))
 ```
 ### Combining detection
@@ -467,25 +471,21 @@ mean(str_detect(words, "[aeiou]$"))
 Find all words containing at least one vowel, and negate
 ```{r str_view_detection, eval=T, cache=T}
-no_vowels_1 <- !str_detect(words, "[aeiou]")
+no_vowels_1 <- !str_detect(words, "[aeiouy]")
 ```
 Find all words consisting only of consonants (non-vowels)
 ```{r str_view_detection_b, eval=T, cache=T}
-no_vowels_2 <- str_detect(words, "^[^aeiou]+$")
+no_vowels_2 <- str_detect(words, "^[^aeiouy]+$")
 identical(no_vowels_1, no_vowels_2)
 ```
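The two approaches agree on every non-empty string, but they can differ on an empty one, because `+` requires at least one character. The sketch below shows this on a hypothetical vector `x` (not `stringr::words`), with an empty string added on purpose:

```r
library(stringr)

# Hypothetical sample, including an empty string as an edge case
x <- c("tsk", "sky", "apple", "")

# Negated detection: "no vowel found" is also true for ""
no_vowels_1 <- !str_detect(x, "[aeiouy]")
no_vowels_1
# TRUE FALSE FALSE TRUE

# Positive pattern: needs one or more consonants, so "" does not match
no_vowels_2 <- str_detect(x, "^[^aeiouy]+$")
no_vowels_2
# TRUE FALSE FALSE FALSE

# Identical once the empty string is set aside
same_on_non_empty <- identical(no_vowels_1[x != ""], no_vowels_2[x != ""])
same_on_non_empty
# TRUE
```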
 ### With tibble
 ```{r str_detecttibble, eval=T, cache=T}
-df <- tibble(
-word = words,
-i = seq_along(word)
-)
-df %>%
-filter(str_detect(word, "x$"))
+df <- tibble(word = words) %>% mutate(i = rank(word))
+df %>% filter(str_detect(word, "x$"))
 ```
 ### Extract matches
@@ -502,14 +502,15 @@ colour_match <- str_c(colours, collapse = "|")
 colour_match
 ```
-### Extract matches
-We can select the sentences that contain a colour, and then extract the colour to figure out which one it is:
+We can select the sentences that contain a colour, and then extract the first colour from each sentence:
 ```{r color_regex_extract, eval=T, cache=T}
-has_colour <- str_subset(sentences, colour_match)
-matches <- str_extract(has_colour, colour_match)
-head(matches)
+sentences %>% str_subset(colour_match) %>% str_extract(colour_match)
+```
+We can also extract all colours from each selected sentence, as a list of vectors:
+```{r color_regex_extract_all, eval=F, cache=T}
+sentences %>% str_subset(colour_match) %>% str_extract_all(colour_match)
 ```
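The difference between `str_extract()` and `str_extract_all()` is easiest to see on a short input. The sketch below assumes a hypothetical vector `s` in place of `stringr::sentences` and a shortened colour list; it is meant as an illustration, not as the tutorial's own example.

```r
library(stringr)

colours <- c("red", "orange", "yellow", "green", "blue", "purple")
colour_match <- str_c(colours, collapse = "|")

# Hypothetical sentences standing in for stringr::sentences
s <- c("It is hard to erase blue or red ink.",
       "The sky was blue.",
       "A grey mare walked.")

# Keep only the sentences that mention at least one colour
has_colour <- str_subset(s, colour_match)

# str_extract(): one value per sentence, the first match only
first_colour <- str_extract(has_colour, colour_match)
first_colour
# "blue" "blue"

# str_extract_all(): a list with one character vector per sentence
all_colours <- str_extract_all(has_colour, colour_match)
lengths(all_colours)
# 2 1
```

`str_extract()` always returns a character vector of the same length as its input, whereas `str_extract_all()` returns a list because the number of matches can vary from one sentence to the next.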
### Grouped matches ### Grouped matches