The goal of this practical is to practice advanced features of ggplot2.
The objectives of this session are to learn about statistical transformations, position adjustments, and coordinate systems in ggplot2.

ggplot2 statistical transformations

We are going to use the diamonds data set included in tidyverse.
Use the help and View commands to explore this data set.
Use the str command: what information is displayed?
More diamonds are available with high quality cuts.
On the x-axis, the chart displays cut, a variable from diamonds. On the y-axis, it displays count, but count is not a variable in diamonds!
The algorithm used to calculate new values for a graph is called a stat, short for statistical transformation. The figure below describes how this process works with geom_bar().
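To make the stat visible, the values that geom_bar() draws can be inspected directly. The sketch below assumes ggplot2 is installed; `p` and `computed` are illustrative names, and layer_data() is the ggplot2 helper that returns the data a layer actually plots.

```r
library(ggplot2)

# Bar chart of cut: geom_bar() relies on stat_count() behind the scenes
p <- ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut))

# Data computed by the stat for the first layer; it contains a `count` column
computed <- layer_data(p, 1)
computed[, c("x", "count")]

# The same counts obtained by hand from the raw data
table(diamonds$cut)
```

The `count` column is created by the stat and does not exist in diamonds itself.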
You can generally use geoms and stats interchangeably. For example, you can recreate the previous plot using stat_count() instead of geom_bar().
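A minimal sketch of that equivalence, assuming the previous plot is the bar chart of cut; `p_stat` and `p_geom` are illustrative names.

```r
library(ggplot2)

# Written from the stat side: stat_count() uses a bar geom by default
p_stat <- ggplot(data = diamonds) +
  stat_count(mapping = aes(x = cut))

# Written from the geom side: geom_bar() uses stat_count() by default
p_geom <- ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut))

# Both objects draw the same chart
p_stat
```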
Every geom has a default stat, and every stat has a default geom. This means that you can typically use geoms without worrying about the underlying statistical transformation. There are three reasons you might need to use a stat explicitly:
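As one illustration of calling a stat explicitly (chosen here as an example, not taken from the practical's own list), stat_summary() can be used to put the transformation itself in the foreground. The sketch summarises depth for each cut; the choice of depth and the name `p_summary` are assumptions made for the example.

```r
library(ggplot2)

# stat_summary() computes one summary per x value:
# the median of depth as the point, min and max as the range
p_summary <- ggplot(data = diamonds) +
  stat_summary(
    mapping = aes(x = cut, y = depth),
    fun.min = min,
    fun.max = max,
    fun = median
  )

p_summary
```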
What does geom_col() do? How is it different to geom_bar()?
What does stat_smooth() compute? What parameters control its behaviour?
We need to set group = 1. Why? In other words, what is the problem with these two graphs?

You can colour a bar chart using either the colour aesthetic, or, more usefully, fill.
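The difference between the two aesthetics can be sketched as follows, assuming both are mapped to cut; `p_colour` and `p_fill` are illustrative names.

```r
library(ggplot2)

# colour only changes the outline of each bar
p_colour <- ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut, colour = cut))

# fill changes the inside of each bar
p_fill <- ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut, fill = cut))

p_colour
p_fill
```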
You can also use fill with another variable:
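A minimal sketch, assuming the other variable is clarity (the variable used in the later examples of this practical); `p_stack` is an illustrative name.

```r
library(ggplot2)

# Mapping fill to clarity splits every cut bar into stacked clarity segments
p_stack <- ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut, fill = clarity))

p_stack
```

Each bar keeps its total height, so the tallest stack still equals the count of the most frequent cut.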
The stacking is performed by the position adjustment specified by the position argument.
ggplot(data = diamonds,
       mapping = aes(x = cut, colour = clarity)) +
  geom_bar(fill = NA, position = "identity")
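Besides "identity", geom_bar() also accepts other position adjustments. The sketch below shows two of them, "fill" and "dodge", on the same data; `p_fill_pos` and `p_dodge` are illustrative names.

```r
library(ggplot2)

# "fill": stacks the bars and rescales every stack to height 1 (proportions)
p_fill_pos <- ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut, fill = clarity), position = "fill")

# "dodge": places the clarity bars side by side within each cut
p_dodge <- ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut, fill = clarity), position = "dodge")

p_fill_pos
p_dodge
```

"fill" makes proportions comparable across cuts, while "dodge" makes the individual counts comparable.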
What parameters to geom_jitter() control the amount of jittering?
Compare and contrast geom_jitter() with geom_count().
What is the default position adjustment for geom_boxplot()? Create a visualisation of the mpg dataset that demonstrates it.

The default coordinate system is the Cartesian coordinate system, where the x and y positions act independently to determine the location of each point. There are a number of other coordinate systems that are occasionally helpful.
bar <- ggplot(data = diamonds) +
  geom_bar(
    mapping = aes(x = cut, fill = cut),
    show.legend = FALSE,
    width = 1
  ) +
  theme(aspect.ratio = 1) +
  labs(x = NULL, y = NULL)
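A bar chart like this one can then be drawn in other coordinate systems. The sketch below is an assumption about how the stored plot is meant to be used: it rebuilds a compact copy (`bar_demo`, an illustrative name) so that it runs on its own, then applies coord_flip() and coord_polar().

```r
library(ggplot2)

# Compact copy of the bar chart so the example is self-contained
bar_demo <- ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut, fill = cut), show.legend = FALSE, width = 1) +
  theme(aspect.ratio = 1) +
  labs(x = NULL, y = NULL)

# Swap the x and y axes
bar_flipped <- bar_demo + coord_flip()

# Map x to the angle and y to the radius
bar_polar <- bar_demo + coord_polar()

bar_flipped
bar_polar
```

Adding a coord_*() function replaces the default Cartesian system without changing the data or the geom.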
ggplot(data = mpg) +
  geom_jitter(mapping = aes(x = cty, y = hwy)) +
  scale_y_log10() +
  scale_x_log10()
Convert a stacked bar chart to polar coordinates using coord_polar().
What does labs() do? Read the documentation.
What is the relationship between city and highway mpg? Why is coord_fixed() important? What does geom_abline() do?

ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut, fill = clarity),
           position = "fill") +
  coord_polar()