Data Visualization - Aesthetic Mappings and Facets
February 29, 2024
<-
Keyboard shortcuts on Mac:
command + ⬆️/⬇️/⬅️/➡️
shift + ⬆️/⬇️/⬅️/➡️
command + shift + ⬆️/⬇️/⬅️/➡️
command + PgUp/PgDn
shift + PgUp/PgDn
command + shift + PgUp/PgDn
Keyboard shortcuts on Windows:
Ctrl + ⬆️/⬇️/⬅️/➡️
Shift + ⬆️/⬇️/⬅️/➡️
Ctrl + Shift + ⬆️/⬇️/⬅️/➡️
Ctrl + PgUp/PgDn
Shift + PgUp/PgDn
Ctrl + Shift + PgUp/PgDn
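ggplot workflow
The building blocks of a ggplot() call are a data.frame, a geom function, and a collection of mappings such as x = VAR_1 and y = VAR_2.
A common problem when creating ggplot2 graphics is to put the + in the wrong place: it must come at the end of a line, not at the start.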
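The workflow above can be sketched as follows. This is a minimal example, assuming the mpg data.frame that ships with ggplot2; the columns displ and hwy are chosen here only for illustration.

```r
library(ggplot2)

# Template: ggplot(data = <DATA>) + <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>))
# Note that the + sits at the END of the first line, not at the start of the next.
p <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))

p  # printing the object draws the plot
```

If the + were moved to the start of the second line, R would treat the first line as a complete statement and the geom would not be added.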
In the plot above, one group of points (highlighted in red) seems to fall outside of the linear trend.
An aesthetic is a visual property (e.g., size, shape, color) of the objects in our plot, which we can map to a variable (e.g., class).
We can display a point in different ways by changing the values of its aesthetic properties.
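As a small illustration of mapping an aesthetic to a variable, the sketch below colors each point by class. It assumes the mpg data.frame from ggplot2, with displ and hwy as illustrative x and y variables.

```r
library(ggplot2)

# Map the color aesthetic to the class variable inside aes();
# ggplot2 assigns one color per class and adds a legend.
p_class <- ggplot(data = mpg,
                  mapping = aes(x = displ, y = hwy, color = class)) +
  geom_point()

p_class
```

Replacing color with shape or size inside aes() maps the same variable to a different visual property.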
We can add color, shape, size, or alpha (transparency) to the plot.
In a scatterplot of the mpg data.frame, many points overlap each other.
When points overlap, it’s hard to know how many data points are at a particular location.
Overplotting can obscure patterns and outliers, leading to potentially misleading conclusions.
We can set a transparency level (alpha) between 0 (full transparency) and 1 (no transparency).
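A minimal sketch of using alpha against overplotting, assuming the mpg data.frame; the value 0.3 and the columns displ and hwy are illustrative choices, not values from the slides.

```r
library(ggplot2)

# alpha = 0.3 makes each point 30% opaque, so locations where
# many points overlap appear darker than isolated points.
p_alpha <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point(alpha = 0.3)

p_alpha
```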
To set an aesthetic such as alpha or color manually, set it by name as an argument of the geom_*() function; i.e., it goes outside of aes().
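The difference between setting and mapping an aesthetic can be sketched as below, assuming the mpg data.frame. The specific values ("blue", size 3, shape 17) are illustrative assumptions.

```r
library(ggplot2)

# Set manually: the arguments go OUTSIDE aes(), so every point is blue,
# 3 mm in size, and drawn with shape number 17 (a filled triangle).
p_manual <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point(color = "blue", size = 3, shape = 17)

# Common mistake: inside aes(), "blue" is treated as a one-level
# categorical variable, so ggplot2 picks its own default color.
p_mapped <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point(mapping = aes(color = "blue"))
```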
We need to pick a value that makes sense for the aesthetic: a color as a character string, the size of a point in mm, or the shape of a point as a number, as shown above.
How a variable is mapped to color in the plot depends on its type.
A categorical variable is of type factor or character. Use as.factor(variable) to make a variable a factor.
A continuous variable is of type numeric. Use as.numeric(variable) to make a variable numeric.
For data visualization, integer-type variables could be treated as either categorical or continuous, depending on the context of analysis.
If the values of an integer-type variable represent an intensity or an order, the integer variable could be treated as continuous.
If not, the integer variable is categorical.
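To see the effect of variable type, the sketch below maps the same integer variable to color twice. It assumes the mpg data.frame and uses its integer column cyl as an illustrative example.

```r
library(ggplot2)

# cyl is stored as an integer, so ggplot2 treats it as continuous
# and uses a color gradient.
p_continuous <- ggplot(data = mpg,
                       mapping = aes(x = displ, y = hwy, color = cyl)) +
  geom_point()

# Converting with as.factor() makes it categorical:
# one distinct color per level.
p_categorical <- ggplot(data = mpg,
                        mapping = aes(x = displ, y = hwy,
                                      color = as.factor(cyl))) +
  geom_point()
```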
facet_wrap( VAR ~ . )
To facet our plot by a single variable, add facet_wrap( VAR ~ . ) to our plot call.
The first argument of facet_wrap() is a formula.
facet_grid( VAR_ROW ~ VAR_COL )
To facet our plot on the combination of two variables, add facet_grid( VAR_ROW ~ VAR_COL ) to our plot call.
The first argument of facet_grid() is also a formula: two variable names separated by a ~.
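A short sketch of both faceting functions, assuming the mpg data.frame. The faceting variables class, drv, and cyl are illustrative choices.

```r
library(ggplot2)

base <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point()

# One panel per value of a single variable.
p_wrap <- base + facet_wrap(class ~ .)

# Rows are values of drv, columns are values of cyl.
p_grid <- base + facet_grid(drv ~ cyl)
```

In facet_grid(), combinations of the two variables that have no observations still get a panel, which is simply left empty.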
scales in Faceting
The scales argument in facet_*() controls whether scales are shared across all facets ("fixed", the default), free in one dimension ("free_x", "free_y"), or free in both dimensions ("free").
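The effect of the scales argument can be sketched as below, assuming the mpg data.frame faceted by class; the choice of "free_y" is only one illustrative option.

```r
library(ggplot2)

base <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point()

# Default: every panel shares the same x and y axis ranges.
p_fixed <- base + facet_wrap(class ~ ., scales = "fixed")

# Each panel gets its own y axis range; the x axis stays shared.
p_free_y <- base + facet_wrap(class ~ ., scales = "free_y")
```

Fixed scales make panels directly comparable, while free scales make patterns within each panel easier to see.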
How are these two plots similar?
A geom_*() is the geometrical object that a plot uses to represent data.
Bar charts use geom_bar() or geom_col();
histograms and frequency polygons use geom_histogram() or geom_freqpoly();
line charts use geom_line();
boxplots use geom_boxplot();
scatterplots use geom_point();
smoothed lines use geom_smooth().
We can use different geom_*() to plot the same data.
Every geom function in ggplot2 takes a mapping argument.
However, not every aesthetic works with every geom.
We could set the shape of a point, but we could not set the shape of a line;
on the other hand, we could set the linetype of a line.
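A minimal sketch of plotting the same data with two different geoms, assuming the mpg data.frame; mapping linetype to drv is an illustrative choice.

```r
library(ggplot2)

# Same data and x/y mappings, represented as points.
p_points <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point()

# Same data, represented as smoothed lines.
# linetype works for lines: one line style per value of drv.
p_lines <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_smooth(mapping = aes(linetype = drv))
```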
geom_smooth() and geom_smooth(method = lm)
Specifying method = lm manually in geom_smooth() gives a straight line fitted to the data points.
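As a sketch of the straight-line fit, assuming the mpg data.frame with displ and hwy as illustrative variables:

```r
library(ggplot2)

# method = lm fits a linear model (hwy regressed on displ)
# and draws the fitted straight line over the points.
p_lm <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point() +
  geom_smooth(method = lm)

p_lm
```

Without the method argument, geom_smooth() chooses a flexible smoother, so the line is generally curved rather than straight.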
geom_smooth(aes(group = CATEGORICAL_VAR))
We can set the group aesthetic to a categorical variable to draw multiple objects.
ggplot2 will draw a separate object for each unique value of the grouping variable.
ggplot2 will automatically group the data for these geoms whenever we map an aesthetic to a categorical variable (as in the linetype example).
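The grouping behavior can be sketched as below, assuming the mpg data.frame and its categorical column drv as an illustrative grouping variable. Because group is an aesthetic, it goes inside aes().

```r
library(ggplot2)

# Explicit grouping: one smooth line per value of drv, no legend.
p_group <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_smooth(mapping = aes(group = drv))

# Implicit grouping: mapping color to drv also splits the data
# by drv, and adds a legend.
p_color <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_smooth(mapping = aes(color = drv))
```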
To display multiple geoms in the same plot, add multiple geom_*() functions to ggplot():
Using geom_point(), geom_smooth(), and geom_smooth(method = lm) together is an excellent option to visualize the relationship between the two variables.
If we place mappings in a geom function, ggplot2 will treat them as local mappings for the layer.
We can use the same idea to specify different data for each layer.
Here, our smooth line displays just a subset of the mpg data.frame, the subcompact cars.
filter() is the tidyverse way to filter observations in a data.frame.
The local data argument in geom_smooth() overrides the global data argument in ggplot() for that layer only.
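A sketch of local mappings and local data in a layered plot, assuming the mpg data.frame and the dplyr package for filter(). The x and y variables (displ, hwy) are illustrative choices.

```r
library(ggplot2)
library(dplyr)

# Local data: only the subcompact cars.
subcompact_cars <- filter(mpg, class == "subcompact")

p_layers <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +  # global
  geom_point(mapping = aes(color = class)) +  # local mapping: points only
  geom_smooth(data = subcompact_cars)         # local data: smooth line only

p_layers
```

The points still use all of mpg, because the data argument in geom_smooth() applies to that layer alone.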
The standard error (se) tells us how much the predicted values from a model might differ from the actual values we’re trying to predict.
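In geom_smooth(), the se argument controls whether a confidence band based on the standard error is drawn around the line. A minimal sketch, assuming the mpg data.frame and a linear fit chosen for illustration:

```r
library(ggplot2)

# se = TRUE is the default: a shaded confidence band surrounds the line.
p_band <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_smooth(method = lm, se = TRUE)

# se = FALSE draws the fitted line only.
p_line_only <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_smooth(method = lm, se = FALSE)
```

A wider band indicates more uncertainty about the fitted line in that region of the data.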