Lecture 18

Aesthetic Mappings; Clutter is Your Enemy!; Design with Colorblind in Mind

Byeong-Hak Choe

SUNY Geneseo

October 31, 2024

Happy Halloween, everyone! 🎃👻🎃👻🎃👻🎃👻🎃👻🎃👻

Aesthetic Mappings

Aesthetic Mappings

  • In the plot above, one group of points (highlighted in red) seems to fall outside of the linear trend.

    • How can you explain these cars? Are those hybrids?

Aesthetic Mappings

  • An aesthetic is a visual property (e.g., size, shape, color) of the objects (e.g., points) in your plot.

  • You can display a point in different ways by changing the values of its aesthetic properties.

Aesthetic Mappings

Adding a color to the plot

ggplot( data = mpg,
        mapping = 
          aes(x = displ,
              y = hwy, 
              color = class) ) + 
  geom_point()

Aesthetic Mappings

Adding a shape to the plot

ggplot( data = mpg,
        mapping = 
          aes(x = displ,
              y = hwy, 
              shape = class) ) + 
  geom_point()
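By default, ggplot2 offers only six shapes for a discrete variable, while class has seven levels, so one class is left unplotted (with a warning). A minimal sketch of one possible workaround follows. The shape codes 0 to 6 and the name p_shape are illustrative choices, not part of the lecture.

```r
library(ggplot2)

# class has seven levels, but the default shape palette has only six
n_classes <- length(unique(mpg$class))

# Supply one shape code per class so that no class is dropped
p_shape <- ggplot(data = mpg,
                  mapping = aes(x = displ, y = hwy, shape = class)) +
  geom_point() +
  scale_shape_manual(values = 0:6)  # illustrative choice of shape codes

p_shape
```

With many categories, shapes become hard to tell apart, so color or faceting may still be the better choice.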

Aesthetic Mappings

Adding a size to the plot

# Note: ggplot2 warns that size is not advised for a discrete variable such as class
ggplot( data = mpg,
        mapping = 
          aes(x = displ,
              y = hwy, 
              size = class) ) + 
  geom_point()

Aesthetic Mappings

Specifying a color to the plot, manually

ggplot(data = mpg,
       mapping = 
         aes(x = displ, 
             y = hwy) ) + 
  geom_point(color = "blue")

Aesthetic Mappings

Specifying a color to the plot, manually

ggplot(data = mpg,
       mapping = 
         aes(x = displ, 
             y = hwy) ) + 
  geom_smooth(color = "darkorange") 

Aesthetic Mappings

Specifying a fill to the plot, manually

ggplot(data = mpg,
       mapping = 
         aes(x = displ, 
             y = hwy) ) + 
  geom_smooth(color = "darkorange",
              fill = "darkorange") 

  • In general, each geom_*() has a different set of aesthetic parameters.
    • E.g., fill colors the confidence band of geom_smooth(), but has no visible effect on the default points of geom_point().
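One nuance: geom_point() does accept fill, but the fill is visible only for point shapes 21 to 25, which have both an outline and an interior. A minimal sketch, where the shape, colors, and the name p_fill are illustrative choices:

```r
library(ggplot2)

# Shapes 21-25 have both an outline (color) and an interior (fill)
p_fill <- ggplot(data = mpg,
                 mapping = aes(x = displ, y = hwy)) +
  geom_point(shape = 21,
             color = "black",
             fill = "darkorange",
             size = 3)

p_fill
```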

Aesthetic Mappings

Specifying a size to the plot, manually

ggplot(data = mpg,
       mapping = 
         aes(x = displ, 
             y = hwy) ) + 
  geom_point(size = 3)  # in *mm*

Aesthetic Mappings

Specifying an alpha to the plot, manually

ggplot(data = mpg,
       mapping = 
         aes(x = displ, 
             y = hwy) ) + 
  geom_point(alpha = .3) 

  • We've done this to address the issue of overplotting in the scatterplot.
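To see why transparency helps here, one can count how many observations share exactly the same (displ, hwy) position. A minimal sketch using base R; the variable names are illustrative:

```r
library(ggplot2)  # provides the mpg data

# Many cars share the same displ and hwy values, so their points coincide
positions <- mpg[, c("displ", "hwy")]

n_obs      <- nrow(positions)          # number of observations
n_distinct <- nrow(unique(positions))  # number of distinct plotted positions
n_hidden   <- n_obs - n_distinct       # points drawn exactly on top of another

c(observations = n_obs, distinct = n_distinct, hidden = n_hidden)
```

With alpha below 1, positions holding several observations appear darker than positions holding only one.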

Aesthetic Mappings

Mapping Aesthetics vs. Setting Them Manually

Aesthetic Mapping

  • Links data variables to visible properties on the graph
  • Different categories → different colors or shapes

Setting Aesthetics Manually

  • Customize visual properties directly in geom_*() outside of aes()
  • Useful for setting fixed colors, sizes, or transparency unrelated to data variables
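A common pitfall illustrates the difference: putting a fixed color inside aes() maps it as if it were data. A minimal sketch comparing the two; the object names p_set and p_mapped are illustrative:

```r
library(ggplot2)

# Setting: color is a fixed value, placed outside aes()
p_set <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point(color = "blue")

# Mapping: inside aes(), "blue" is treated as a one-level data value,
# so ggplot2 picks a color from its default scale and adds a legend
p_mapped <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point(mapping = aes(color = "blue"))

unique(layer_data(p_set)$colour)     # the fixed color that was set
unique(layer_data(p_mapped)$colour)  # a default-scale color, not "blue"
```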

Clutter is Your Enemy!

Visualization Principle

Cognitive Load

  • Every element added to a page or screen demands cognitive load
    • Cognitive load: the mental effort needed to process information
  • Extra elements = extra brain power for the audience to process
    • Example: Overly complex slides or graphs can overwhelm viewers
  • Excessive load can lead to disengagement and confusion
    • Goal: a graphic should display as much information as it can, with the lowest possible cognitive strain on the viewer.

Visualization Principle

Why Reduce Clutter?

  • Clutter: Visual elements that occupy space but do not improve understanding
  • Clutter makes information harder to process and can confuse the viewer
  • Strive for clarity: Simplified visuals encourage engagement and improve comprehension
    • Less clutter = clearer message, more focused audience
  • Tips
    • Avoid having the data all skewed to one side or the other of your graph.
    • Avoid too many superimposed elements, such as too many curves (>4) in the same graphing space.

Clutter is Your Enemy!

  • Which one do you prefer?

Log Transformation: Reducing Clutter in Scatterplots

  • Problem: Densely packed data points can obscure insights
    • Often due to extreme values or skewed distributions
    • Dense clusters of points become visual clutter, hiding patterns
  • Solution: Apply a log transformation!
    • Reduces clutter: Points become more evenly spread across the plot
      • Prevents overlapping data points and enhances readability
    • Reduces influence of outliers, clarifying patterns
      • Improves interpretability by revealing underlying relationships
    • Supports focused, informative data communication without extra elements
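A minimal sketch of the idea, assuming the diamonds data that ships with ggplot2 as a stand-in for skewed data (this dataset and the object names are illustrative, not the lecture's example):

```r
library(ggplot2)

# Raw scale: most diamonds are small and cheap, so points pile up in one corner
p_raw <- ggplot(data = diamonds, mapping = aes(x = carat, y = price)) +
  geom_point(alpha = .1)

# Log scale on both axes: same data, but the axes are spaced by ratios
p_log <- p_raw +
  scale_x_log10() +
  scale_y_log10()

p_log
```

scale_x_log10() and scale_y_log10() transform the axes while keeping the tick labels in the original units, which is often easier to read than plotting log10(carat) directly.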

Log Transformation: Reducing Clutter in Scatterplots

A Little Bit of Math for Logarithm

  • The logarithm function, \(y = \log_{b}\,(\,x\,)\), looks like ….

Log Transformation: Reducing Clutter in Scatterplots

A Little Bit of Math for Logarithm

  • \(\log_{10}\,(\,100\,)\): the base \(10\) logarithm of \(100\) is \(2\), because \(10^{2} = 100\)

  • \(\log_{e}\,(\,x\,)\): the base \(e\) logarithm is called the natural log, where \(e = 2.718\cdots\) is the mathematical constant known as Euler's number.

  • \(\log\,(\,x\,)\) or \(\ln\,(\,x\,)\): the natural log of \(x\).

  • \(\log_{e}\,(\,7.389\cdots\,)\): the natural log of \(7.389\cdots\) is \(2\), because \(e^{2} = 7.389\cdots\).

  • In R,

    • log(x): log of x with base e, called natural log.
    • log10(x): log of x with base 10.
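The examples above can be checked directly in the R console. A minimal sketch; the variable names are illustrative:

```r
# Natural log (base e): undoes exp()
ln_e2 <- log(exp(2))           # 2, because e^2 = 7.389...

# Base-10 log, written two equivalent ways
log10_100 <- log10(100)        # 2, because 10^2 = 100
log_b_100 <- log(100, base = 10)

c(ln_e2 = ln_e2, log10_100 = log10_100, log_b_100 = log_b_100)
```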

Log Transformation: Reducing Clutter in Scatterplots

The Use of Logarithm: Wide Range of Skewed Data

  1. We should consider using a log scale when a variable is heavily skewed.
  • It can help visualize both small and large values effectively.

Log Transformation: Reducing Clutter in Scatterplots

The Use of Logarithm: Percentage Change

  1. Consider using a logarithmic scale when percentage changes are more meaningful than changes in absolute units.
  • Percentage changes are widely used in various fields to better interpret relative differences. Examples include:
    • Stock prices: Percentage changes reflect the magnitude of gains or losses relative to the initial price.
    • Housing prices: Percentage changes show market trends consistently across different neighborhoods or regions.
    • GDP growth: Expressed as a percentage to indicate economic performance over time.
    • Income levels: A $1,000 increase has a greater impact on a lower-income individual than on someone with a significantly higher income.

Log Transformation: Reducing Clutter in Scatterplots

The Use of Logarithm: Percentage Change

  • For a small change in variable \(x\) from \(x_{0}\) to \(x_{1}\), we have:

\[ \Delta \log(x) = \log(x_{1}) - \log(x_{0}) \approx \frac{x_{1} - x_{0}}{x_{0}} = \frac{\Delta x}{x_{0}}. \]

  • This shows that a small change in \(\log(x)\) approximates the proportional change in \(x\); multiplied by 100, it approximates the percentage change!
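A quick numerical check of the approximation, and of how it degrades as the change gets larger. A minimal sketch; the starting value of 100 and the chosen changes are arbitrary illustrations:

```r
x0 <- 100

# Compare the exact log difference with the proportional change
pct_change <- c(0.01, 0.05, 0.10, 0.50)   # 1%, 5%, 10%, 50%
x1         <- x0 * (1 + pct_change)
log_diff   <- log(x1) - log(x0)

# gap is how far the log difference falls short of the proportional change
data.frame(pct_change, log_diff, gap = pct_change - log_diff)
```

The gap is tiny for a 1% change and grows with the size of the change, which is why the result is stated for small changes only.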

Log Transformation: Reducing Clutter in Scatterplots

The Use of Logarithm: Percentage Change

  • A one percent increase in GDP per capita is associated with an increase in life expectancy of 0.084 years (about 30.66 days)!
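The arithmetic behind this statement can be sketched as follows. The model form and the slope b_hat are assumptions made only for illustration: the slide does not report the fitted equation, and 8.4 is a hypothetical value chosen to be consistent with the 0.084 figure.

```r
# Convert the reported change from years to days
years_per_pct <- 0.084
days_per_pct  <- years_per_pct * 365   # 30.66 days

# ASSUMPTION: the figure comes from a fit of the form
#   life_exp = a + b * log(gdp_per_capita)
# b_hat is a hypothetical slope, not an estimate reported in the lecture
b_hat <- 8.4
change_for_1pct <- b_hat * log(1.01)   # change in life_exp for a 1% rise in GDP

c(days = days_per_pct, years_from_slope = change_for_1pct)
```

Under that assumed form, a 1% rise in x changes y by roughly b / 100, because log(1.01) is approximately 0.01.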

Design with Colorblind in Mind

Visualization Principle

Design with Colorblind in Mind

Types of Colorblindness

  • Roughly 8% of men and half a percent of women are colorblind.

  • There are several techniques to make visualization more colorblind-friendly:

    1. Use color palettes that are colorblind-friendly
    2. Use shape for scatterplots and linetype for line charts
    3. Have some additional visual cue to set the important numbers apart
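The second technique can be sketched by mapping the same variable to both color and shape, so the groups stay distinguishable even when the colors are hard to tell apart. The choice of drv (drive train) and the name p_redundant are illustrative:

```r
library(ggplot2)

# Redundant encoding: drv drives both the color and the shape of each point
p_redundant <- ggplot(data = mpg,
                      mapping = aes(x = displ, y = hwy,
                                    color = drv, shape = drv)) +
  geom_point(size = 3)

p_redundant
```

Because both aesthetics use the same variable, ggplot2 merges them into a single legend.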

Colorblind-Friendly Color Palettes

ggthemes package

  • The ggthemes package provides additional scales and themes for ggplot2 visualization:
    • Accessible color palettes, including those optimized for colorblind viewers.
      • E.g., scale_color_colorblind(), scale_color_tableau()
    • Unique, predefined themes for specific styles
      • E.g., theme_economist(), theme_wsj()

Colorblind-Friendly Color Palettes

ggthemes::scale_color_colorblind()

library(ggthemes)  # provides scale_color_colorblind()

ggplot( data = mpg,
        mapping = 
          aes(x = displ,
              y = hwy, 
              color = class) ) + 
  geom_point(size = 3) +
  scale_color_colorblind()

  • When mapping color in aes(), we can use scale_color_*() to control which colors represent the categories.
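If ggthemes is not installed, a similar effect can be sketched with scale_color_manual() and the Okabe-Ito palette that ships with base R. This is an alternative shown for illustration, not the lecture's method; it assumes R 4.0.0 or later, and the names okabe_ito and p_oi are hypothetical.

```r
library(ggplot2)

# Okabe-Ito palette from base R (grDevices, R >= 4.0.0)
# unname() so that colors are matched to classes by position, not by name
okabe_ito <- unname(palette.colors(n = 7, palette = "Okabe-Ito"))

p_oi <- ggplot(data = mpg,
               mapping = aes(x = displ, y = hwy, color = class)) +
  geom_point(size = 3) +
  scale_color_manual(values = okabe_ito)

p_oi
```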

Colorblind-Friendly Color Palettes

ggthemes::scale_color_tableau()

ggplot( data = mpg,
        mapping = 
          aes(x = displ,
              y = hwy, 
              color = class) ) + 
  geom_point(size = 3) +
  scale_color_tableau()

  • scale_color_tableau() provides color palettes used in Tableau.

ggplot Themes

ggthemes::theme_economist()

ggplot( data = mpg,
        mapping = 
          aes(x = displ,
              y = hwy, 
              color = class) ) + 
  geom_point(size = 3) +
  scale_color_colorblind() +
  theme_economist()

  • theme_economist() approximates the style of The Economist.

ggplot Themes

ggthemes::theme_wsj()

ggplot( data = mpg,
        mapping = 
          aes(x = displ,
              y = hwy, 
              color = class) ) + 
  geom_point(size = 3) +
  scale_color_colorblind() +
  theme_wsj()

  • theme_wsj() approximates the style of The Wall Street Journal.