Plots gallery

This article is a how-to-plot page that covers the most frequently used charts, all drawn with the Profinit color theme. We start with displaying distributions, then move on to proportions and relations. Each topic has an initial setup followed by a couple of collapsed sections describing various use-cases. Code for ggplot2 is provided throughout; some of the charts are also covered by base R graphics code.

If you find a bug or want to suggest edits or contributions, feel free to either create a pull request or raise an issue in the issue tracker.

This page does not aim to cover every use-case, though. For a more detailed guide on how to design a good chart, take a look at Fundamentals of Data Visualization (available either online or in Profinit’s library).

Setup

As a toy dataset, let’s use the dplyr::starwars dataset of Star Wars characters. Beware that it only contains information from the first 7 films in the series.

# load packages
library(tidyverse)
library(profiplots)
library(ggalluvial)
library(ggrepel)

# set the aesthetics (theme) of plots
profiplots::set_theme(pal_name = "blue-red", pal_name_discrete = "discrete")

movie_series <- c(
  "The Phantom Menace",
  "Attack of the Clones",
  "Revenge of the Sith",
  "A New Hope",
  "The Empire Strikes Back",
  "Return of the Jedi",
  "The Force Awakens"
)

get_movie_order <- function(movie_names) {
  purrr::map_dbl(movie_names, function(mn) which(mn == movie_series))
}

# prepare dataset: Star Wars characters
sw <- 
  dplyr::starwars %>% 
  mutate(
    bmi = mass/(height/100)^2,
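    # fct_na_value_to_level() replaces the deprecated fct_explicit_na() (forcats >= 1.0.0)
    is_droid = forcats::fct_na_value_to_level(if_else(sex == "none", "Droid", "Other"), level = "N/A"),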
    first_film = purrr::map_chr(films, function(movies) {
      movie_ord <- get_movie_order(movies)
      movies[which.min(movie_ord)]
    }),
    # order the factor levels by their position in movie_series
    first_film = factor(first_film, levels = movie_series, ordered = TRUE),
    been_in_jedi = purrr::map_lgl(films, ~"Return of the Jedi" %in% .),
    n_films = purrr::map_dbl(films, length)
  )

Distributions

Barplot

Use-case: Visualizing the distribution of a discrete variable.
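
As an illustration of this use-case, here is a minimal ggplot2 sketch that counts Star Wars characters per sex and draws one bar per category. The object names (sex_counts, p_bar) and the choice to show missing values as an explicit "unknown" bar are assumptions made for this example, not part of the package.

```r
library(dplyr)
library(ggplot2)

# Count characters per sex; missing values get an explicit "unknown" category
sex_counts <- dplyr::starwars %>%
  mutate(sex = coalesce(sex, "unknown")) %>%
  count(sex, name = "n_characters")

# One bar per category, sorted from the most to the least frequent
p_bar <- ggplot(sex_counts, aes(x = reorder(sex, -n_characters), y = n_characters)) +
  geom_col() +
  labs(x = "Sex", y = "Number of characters")
```

Printing p_bar renders the chart. Sorting the bars by frequency is a design choice that makes the ranking of categories easier to read.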

Histogram

Use-case: Visualizing the distribution of a continuous variable.
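
A histogram of character heights could be sketched as follows. The 10 cm bin width is an arbitrary choice for this example and the names heights and p_hist are hypothetical; a different bin width may suit other data better.

```r
library(dplyr)
library(ggplot2)

# Keep only characters with a known height
heights <- dplyr::starwars %>%
  filter(!is.na(height))

# 10 cm wide bins aligned to multiples of 10
p_hist <- ggplot(heights, aes(x = height)) +
  geom_histogram(binwidth = 10, boundary = 0) +
  labs(x = "Height [cm]", y = "Number of characters")
```

Setting binwidth and boundary explicitly keeps the bin edges on round numbers, which tends to make the axis easier to read than the default of 30 bins.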

KDE

Use-case: Visualizing the distribution of a continuous variable for an audience familiar with density estimates. Especially useful when multiple subgroups are plotted on one chart.
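
The subgroup comparison mentioned above might look like the sketch below, which overlays one kernel density estimate of height per gender. Dropping characters with a missing height or gender is an assumption of this example, as are the object names.

```r
library(dplyr)
library(ggplot2)

# Keep characters with both height and gender recorded
heights_by_gender <- dplyr::starwars %>%
  filter(!is.na(height), !is.na(gender))

# One semi-transparent density curve per gender on a shared axis
p_kde <- ggplot(heights_by_gender, aes(x = height, fill = gender, colour = gender)) +
  geom_density(alpha = 0.3) +
  labs(x = "Height [cm]", y = "Density", fill = "Gender", colour = "Gender")
```

The low alpha keeps overlapping curves visible, which is the main advantage over stacking several histograms.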

Proportions

Single Variable

Use-case: Visualizing the proportions of a category's levels (while avoiding a pie chart).
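
One possible pie-chart alternative is a single stacked bar whose segments add up to 100%. The sketch below assumes this is an acceptable reading of the use-case; the names gender_share and p_share are hypothetical.

```r
library(dplyr)
library(ggplot2)

# Share of each gender among all characters (missing values shown as "unknown")
gender_share <- dplyr::starwars %>%
  mutate(gender = coalesce(gender, "unknown")) %>%
  count(gender) %>%
  mutate(share = n / sum(n))

# A single horizontal stacked bar; segment length encodes the share
p_share <- ggplot(gender_share, aes(x = "All characters", y = share, fill = gender)) +
  geom_col(width = 0.5) +
  scale_y_continuous(labels = scales::percent) +
  coord_flip() +
  labs(x = NULL, y = "Share of characters", fill = "Gender")
```

Lengths along a common axis are generally easier to compare than pie-slice angles, which is the motivation for this layout.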

Two variables, proportion of two categories

Use-case: Visualizing the proportions of a category's levels in different subgroups defined by another variable.

In this case, side-by-side stacked barplots (with the fill position option, so that every bar is scaled to 100%) usually work best.
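
A minimal sketch of such a chart, assuming the two-level category is "appeared in Return of the Jedi" and the subgroups are genders. The data preparation mirrors the been_in_jedi column from the Setup section, recomputed here so the example runs on its own.

```r
library(dplyr)
library(ggplot2)

# Flag characters that appear in "Return of the Jedi"
sw_jedi <- dplyr::starwars %>%
  mutate(
    gender = coalesce(gender, "unknown"),
    been_in_jedi = purrr::map_lgl(films, ~ "Return of the Jedi" %in% .x)
  )

# position = "fill" rescales each bar to 100%, so proportions are comparable
p_fill <- ggplot(sw_jedi, aes(x = gender, fill = been_in_jedi)) +
  geom_bar(position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "Gender", y = "Share of characters", fill = "In Return of the Jedi")
```

Because every bar is rescaled to the same height, group sizes are no longer visible; consider annotating the counts if they differ a lot between subgroups.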

Two variables, proportion of 3+ categories

Relations

Scatterplot

Use-case: Visualizing the relationship of two numeric variables. Visualizing a trend (target ~ regressor).
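
The following sketch plots mass against height and adds a linear trend line (mass ~ height). Removing the single character heavier than 1000 kg is an assumption made here so that one extreme value does not dominate the axis; the names hm and p_scatter are hypothetical.

```r
library(dplyr)
library(ggplot2)

# Characters with known height and mass, without the one extreme outlier
hm <- dplyr::starwars %>%
  filter(!is.na(height), !is.na(mass), mass < 1000)

# Points show individual characters; the smoother shows a linear trend
p_scatter <- ggplot(hm, aes(x = height, y = mass)) +
  geom_point(alpha = 0.6) +
  geom_smooth(method = "lm", formula = y ~ x) +
  labs(x = "Height [cm]", y = "Mass [kg]")
```

A linear model is only one option for the trend; method = "loess" would draw a flexible curve instead.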

2D Density

Use-case: Visualizing the relationship of two numeric variables with too many observations.

With too many observations, the details get lost in a mass of overlapping points. You can try setting the point opacity (alpha) low enough and use a scatterplot anyway (see above), but a 2D density plot is often more convenient.
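
The starwars dataset is too small to show this problem, so the sketch below switches to ggplot2::diamonds (roughly 54 thousand rows) as a stand-in; that dataset choice is an assumption of this example. It draws filled contour bands of a 2D kernel density estimate.

```r
library(ggplot2)

# Filled contours of the estimated joint density of carat and price
p_density <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_density_2d_filled() +
  labs(x = "Carat", y = "Price [USD]", fill = "Density level")
```

The density estimate relies on the MASS package, which ships with standard R installations.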

Heatmap

Use-case: Visualizing the relationship of two numeric variables. Visualizing a trend (target ~ regressor).
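
One way to read this use-case is to bin both numeric variables into a grid and encode the number of observations per cell as color. The sketch below does that with ggplot2::diamonds; both the dataset and the choice of 40 bins per axis are assumptions of this example.

```r
library(ggplot2)

# Count observations in a 40 x 40 grid of rectangular bins
p_heat <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_bin_2d(bins = 40) +
  labs(x = "Carat", y = "Price [USD]", fill = "Number of diamonds")
```

Unlike a smoothed density, each tile here shows a raw count, so sparse regions remain visible as isolated cells.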

Extra: Odds ratio visualization