Show the right number
February 5, 2025
group aesthetic
gapminder data.frame.
What happened?
geom_line() joins up all the lines for each particular year in the order they appear in the dataset.
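To see the problem concretely, here is a minimal sketch. It assumes the gapminder package is installed and that the plot maps year to x and gdpPercap to y; those column choices are an assumption based on the axis labels used later in these slides, not something stated here.

```r
library(ggplot2)
library(gapminder)  # assumed source of the gapminder data frame

# No group aesthetic: geom_line() treats every row as part of one
# single series, so observations from different countries get joined.
# Assumption: x = year and y = gdpPercap.
p_ungrouped <- ggplot(data = gapminder,
                      mapping = aes(x = year, y = gdpPercap)) +
  geom_line()

p_ungrouped
```

Printing p_ungrouped should give the jagged, sawtooth-like picture that prompts the "What happened?" question.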
ggplot() does not know that the yearly observations in the data are grouped by country.
The group aesthetic is usually only needed when the grouping information we need to tell ggplot() about is not built into the variables being mapped.
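A minimal sketch of the fix, again assuming the gapminder package and the year / gdpPercap mapping: adding group = country tells ggplot() to draw one line per country.

```r
library(ggplot2)
library(gapminder)  # assumed source of the gapminder data frame

# group = country: one line is drawn for each country's yearly series.
# Assumption: x = year and y = gdpPercap.
p_grouped <- ggplot(data = gapminder,
                    mapping = aes(x = year, y = gdpPercap)) +
  geom_line(mapping = aes(group = country))

p_grouped
```

The group mapping changes only how rows are connected; it adds no legend and no color.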
What if we use the color aesthetic, instead of group?
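One way to explore this question is to map country to color and look at the result. This is only a sketch under the same assumptions as before (gapminder package, year and gdpPercap on the axes). Because country is a categorical variable, mapping it to color also splits the data into one group per country; the legend is switched off here only because it would list every country.

```r
library(ggplot2)
library(gapminder)  # assumed source of the gapminder data frame

# Mapping a categorical variable to color implicitly groups the data,
# so each country gets its own line (and its own color).
p_colored <- ggplot(data = gapminder,
                    mapping = aes(x = year, y = gdpPercap,
                                  color = country)) +
  geom_line() +
  guides(color = "none")  # hide the very long country legend

p_colored
```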
facet_wrap( VAR1 ~ . ) or facet_wrap( . ~ VAR1 )
facet_grid( VAR1 ~ . ): row-wise split
facet_grid( . ~ VAR1 ): column-wise split
facet_grid( VAR1 ~ VAR2 )
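To make the two-variable form concrete, here is a hedged sketch using the gss_sm data from the socviz package. The choice of age and childs for the axes, and of sex and race for the rows and columns, is an assumption made for illustration.

```r
library(ggplot2)
library(socviz)  # provides the gss_sm data frame

# facet_grid(ROWS ~ COLUMNS): one row of panels per level of sex,
# one column of panels per level of race.
# Assumption: age and childs are used purely as example variables.
p_facets <- ggplot(data = gss_sm,
                   mapping = aes(x = age, y = childs)) +
  geom_point(alpha = 0.2) +
  facet_grid(sex ~ race)

p_facets  # rows with missing age or childs are dropped with a warning
```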
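# p is assumed to have been defined earlier, for example as
# p <- ggplot(data = gapminder, mapping = aes(x = year, y = gdpPercap))
p +
  geom_line(color = "gray70",
            aes(group = country)) +
  # linewidth replaces size for lines in ggplot2 >= 3.4.0
  geom_smooth(linewidth = 1.1,
              method = "loess",
              se = FALSE) +
  facet_wrap(. ~ continent, nrow = 1) +
  scale_y_log10(labels = scales::dollar) +
  theme(axis.text.x = element_text(angle = 45),
        axis.title.x = element_text(margin = margin(t = 25))) +
  labs(x = "Year",
       y = "GDP per capita",
       title = "GDP per capita on Five Continents")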
The socviz package includes the gss_sm data frame.
gss_sm is a dataset containing an extract from the 2016 General Social Survey.
sex and race.
The facet_grid() function is best used when you cross-classify some data by two categorical variables.
If we want a chart of relative frequencies rather than counts, we will need to get the prop statistic instead.
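A call to a computed statistic from inside the aes() function generically looks like one of these:
<mapping> = <..statistic..>;
<mapping> = stat(statistic); or
<mapping> = after_stat(statistic).
The after_stat() form is the current notation; the first two are older spellings of the same idea.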
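As a minimal sketch of the after_stat() form, the bar chart below asks for the prop statistic instead of the default count. It assumes the gss_sm data from socviz and uses bigregion as the x variable. Setting group = 1 makes the proportions be computed over the whole dataset; without it, each bar is its own group and every proportion would be 1.

```r
library(ggplot2)
library(socviz)  # provides the gss_sm data frame

# y = after_stat(prop): plot the proportion computed by stat_count().
# group = 1: treat all rows as one group, so the bars sum to 1 overall.
p_prop <- ggplot(data = gss_sm,
                 mapping = aes(x = bigregion)) +
  geom_bar(mapping = aes(y = after_stat(prop), group = 1))

p_prop
```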
group = 1 inside the aes() call.
Let’s look at another question from the survey. The gss_sm data contains a religion variable derived from a question asking:
If we map religion to color, only the border lines of the bars will be assigned colors, and the insides will remain gray.
If the gray bars look boring and we want to fill them with color instead, we can map the religion variable to fill in addition to mapping it to x.
If we set guides(fill = "none"), the legend for the fill mapping is removed.
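The difference between color and fill on bars can be sketched as follows, assuming the gss_sm data from socviz. The object names are hypothetical.

```r
library(ggplot2)
library(socviz)  # provides the gss_sm data frame

# color = religion: only the bar outlines are colored; insides stay gray.
p_outline <- ggplot(data = gss_sm,
                    mapping = aes(x = religion, color = religion)) +
  geom_bar()

# fill = religion: the bars themselves are colored.
# guides(fill = "none") drops the legend, which would only repeat the x-axis.
p_filled <- ggplot(data = gss_sm,
                   mapping = aes(x = religion, fill = religion)) +
  geom_bar() +
  guides(fill = "none")

p_outline
p_filled
```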
A more appropriate use of the fill aesthetic with geom_bar() is to cross-classify two categorical variables.
The default output of geom_bar() is a stacked bar chart, with counts on the y-axis.
Set the position argument to "fill".
Use position = "dodge" to make the bars within each region of the country appear side by side.
religion, so we map religion to the group aesthetic.
How can we have a proportional bar chart such that the sum of all bars in each bigregion is 1?
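The position options can be compared side by side in a short sketch, assuming the gss_sm data from socviz with bigregion on the x-axis and religion as the fill. The last plot is one possible answer to the question above, not necessarily the intended one: facet by bigregion and group by bigregion, so that proportions are computed within each region's panel.

```r
library(ggplot2)
library(socviz)  # provides the gss_sm data frame

p_region <- ggplot(data = gss_sm,
                   mapping = aes(x = bigregion, fill = religion))

p_region + geom_bar()                    # default: stacked counts
p_region + geom_bar(position = "fill")   # each bar rescaled to height 1
p_region + geom_bar(position = "dodge")  # religions side by side

# One possible answer: a panel per region, with prop computed inside
# each panel so the bars within a region sum to 1.
p_within <- ggplot(data = gss_sm,
                   mapping = aes(x = religion)) +
  geom_bar(mapping = aes(y = after_stat(prop), group = bigregion),
           position = "dodge") +
  facet_wrap(~ bigregion, ncol = 1)

p_within
```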
midwest, containing information on counties in several midwestern states of the USA.
The geom_histogram() function will choose a bin size for us based on a rule of thumb.
bins and also optionally the origin of the x-axis.
geom_density().
color (for the lines) and fill (for the body of the density curve) for aesthetic mappings.
For geom_density(), the stat_density() function can return its default after_stat(density) statistic, or after_stat(scaled), which will give a proportional density estimate.
geom_bar() does its calculations on the fly using stat_count() behind the scenes to produce the counts or proportions it displays.
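The histogram and density layers described above also compute their statistics on the fly. A minimal sketch, assuming the midwest data that ships with ggplot2 and using its area column as the example variable (the choice of area, state, and 10 bins is an assumption):

```r
library(ggplot2)  # the midwest data frame ships with ggplot2

# Histogram with an explicit number of bins instead of the default rule.
p_hist <- ggplot(data = midwest, mapping = aes(x = area)) +
  geom_histogram(bins = 10)

# Density curves per state: color for the lines, fill for the bodies.
p_dens <- ggplot(data = midwest,
                 mapping = aes(x = area, fill = state, color = state)) +
  geom_density(alpha = 0.3)

# Same curves, but each one scaled so that its peak is 1.
p_scaled <- ggplot(data = midwest,
                   mapping = aes(x = area, fill = state, color = state)) +
  geom_density(mapping = aes(y = after_stat(scaled)), alpha = 0.3)

p_hist
p_dens
p_scaled
```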
But often, our data is in effect already a summary table.
Let’s consider the socviz::titanic data.frame.
fate and percent?
geom_col() works exactly the same as geom_bar() except that it assumes that stat = "identity".
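A short sketch of plotting an already-summarized table, assuming socviz::titanic has the columns fate, sex, and percent. Filling by sex and dodging the bars are illustrative choices, not something stated on the slide.

```r
library(ggplot2)
library(socviz)  # provides the titanic summary table

# geom_col() plots the y values as given; no counting is done.
# Assumption: fill = sex and position = "dodge" are chosen for illustration.
p_titanic <- ggplot(data = titanic,
                    mapping = aes(x = fate, y = percent, fill = sex)) +
  geom_col(position = "dodge")

# The same chart written with geom_bar() and an explicit stat.
p_titanic_bar <- ggplot(data = titanic,
                        mapping = aes(x = fate, y = percent, fill = sex)) +
  geom_bar(stat = "identity", position = "dodge")

p_titanic
```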
Let’s consider the socviz::oecd_sum data.frame.
Plot the difference (diff) over time, mapping fill = hi_lo.
p <- ggplot(data = socviz::oecd_sum,
            mapping = aes(x = year,
                          y = diff,
                          fill = hi_lo))
p +
  geom_col() +
  guides(fill = "none") +
  labs(x = NULL,
       y = "Difference in Years",
       title = "The US Life Expectancy Gap",
       subtitle = "Difference between US and OECD average life expectancies, 1960-2015",
       caption = "Data: OECD. After a chart by Christopher Ingraham, Washington Post, December 27th 2017.") +
  theme_minimal()