Data Visualization with ggplot2 : : CHEAT SHEET
Use a geom function to represent data points, use the geom’s aesthetic properties to represent variables. Basics Geoms Each function returns a layer. GRAPHICAL PRIMITIVES TWO VARIABLES ggplot2 is based on the grammar of graphics, the idea that you can build every graph from the same a <- ggplot(economics, aes(date, unemploy)) continuous x , continuous y continuous bivariate distribution components: a data set, a coordinate system, b <- ggplot(seals, aes(x = long, y = lat)) e <- ggplot(mpg, aes(cty, hwy)) h <- ggplot(diamonds, aes(carat, price)) and geoms—visual marks that represent data points. a + geom_blank() e + geom_label(aes(label = cty), nudge_x = 1, h + geom_bin2d(binwidth = c(0.25, 500)) (Useful for expanding limits) nudge_y = 1, check_overlap = TRUE) x, y, label, x, y, alpha, color, fill, linetype, size, weight alpha, angle, color, family, fontface, hjust, F M A b + geom_curve(aes(yend = lat + 1, lineheight, size, vjust h + geom_density2d() xend=long+1,curvature=z)) - x, xend, y, yend, x, y, alpha, colour, group, linetype, size + = alpha, angle, color, curvature, linetype, size e + geom_jitter(height = 2, width = 2) x, y, alpha, color, fill, shape, size data geom coordinate plot a + geom_path(lineend="butt", linejoin="round", h + geom_hex() x = F · y = A system linemitre=1) e + geom_point(), x, y, alpha, color, fill, shape, x, y, alpha, colour, fill, size x, y, alpha, color, group, linetype, size size, stroke
To display values, map variables in the data to visual a + geom_polygon(aes(group = group)) e + geom_quantile(), x, y, alpha, color, group, properties of the geom (aesthetics) like size, color, and x x, y, alpha, color, fill, group, linetype, size linetype, size, weight continuous function and y locations. i <- ggplot(economics, aes(date, unemploy)) b + geom_rect(aes(xmin = long, ymin=lat, xmax= F M A long + 1, ymax = lat + 1)) - xmax, xmin, ymax, e + geom_rug(sides = "bl"), x, y, alpha, color, i + geom_area() ymin, alpha, color, fill, linetype, size linetype, size x, y, alpha, color, fill, linetype, size + = a + geom_ribbon(aes(ymin=unemploy - 900, e + geom_smooth(method = lm), x, y, alpha, i + geom_line() ymax=unemploy + 900)) - x, ymax, ymin, data geom coordinate plot color, fill, group, linetype, size, weight x, y, alpha, color, group, linetype, size x = F · y = A system alpha, color, fill, group, linetype, size color = F e + geom_text(aes(label = cty), nudge_x = 1, i + geom_step(direction = "hv") size = A nudge_y = 1, check_overlap = TRUE), x, y, label, x, y, alpha, color, group, linetype, size alpha, angle, color, family, fontface, hjust, LINE SEGMENTS lineheight, size, vjust common aesthetics: x, y, alpha, color, linetype, size b + geom_abline(aes(intercept=0, slope=1)) visualizing error Complete the template below to build a graph. b + geom_hline(aes(yintercept = lat)) df <- data.frame(grp = c("A", "B"), fit = 4:5, se = 1:2) required b + geom_vline(aes(xintercept = long)) discrete x , continuous y j <- ggplot(df, aes(grp, fit, ymin = fit-se, ymax = fit+se)) ggplot (data = ) + f <- ggplot(mpg, aes(class, hwy))
RStudio® is a trademark of RStudio, Inc. • CC BY SA RStudio • info@rstudio.com • 844-448-1212 • rstudio.com • Learn more at http://ggplot2.tidyverse.org • ggplot2 3.1.0 • Updated: 2018-12 Stats An alternative way to build a layer Scales Coordinate Systems Faceting A stat builds new variables to plot (e.g., count, prop). Scales map data values to the visual values of an r <- d + geom_bar() Facets divide a plot into fl cty cyl aesthetic. To change a mapping, add a new scale. r + coord_cartesian(xlim = c(0, 5)) subplots based on the
x ..count.. xlim, ylim values of one or more (n <- d + geom_bar(aes(fill = fl))) The default cartesian coordinate system discrete variables. + = aesthetic prepackaged scale-specific r + coord_fixed(ratio = 1/2) scale_ to adjust scale to use arguments ratio, xlim, ylim data stat geom coordinate plot Cartesian coordinates with fixed aspect ratio t <- ggplot(mpg, aes(cty, hwy)) + geom_point() x = x · system n + scale_fill_manual( between x and y units y = ..count.. values = c("skyblue", "royalblue", "blue", “navy"), r + coord_flip() t + facet_grid(cols = vars(fl)) Visualize a stat by changing the default stat of a geom limits = c("d", "e", "p", "r"), breaks =c("d", "e", "p", “r"), xlim, ylim facet into columns based on fl function, geom_bar(stat="count") or by using a stat name = "fuel", labels = c("D", "E", "P", "R")) Flipped Cartesian coordinates t + facet_grid(rows = vars(year)) r + coord_polar(theta = "x", direction=1 ) facet into rows based on year function, stat_count(geom="bar"), which calls a default range of title to use in labels to use breaks to use in theta, start, direction values to include legend/axis in legend/axis legend/axis geom to make a layer (equivalent to a geom function). in mapping Polar coordinates t + facet_grid(rows = vars(year), cols = vars(fl)) Use ..name.. syntax to map stat variables to aesthetics. r + coord_trans(ytrans = “sqrt") facet into both rows and columns xtrans, ytrans, limx, limy GENERAL PURPOSE SCALES Transformed cartesian coordinates. Set xtrans and t + facet_wrap(vars(fl)) geom to use stat function geommappings wrap facets into a rectangular layout Use with most aesthetics ytrans to the name of a window function. i + stat_density2d(aes(fill = ..level..), Set scales to let axis limits vary across facets geom = "polygon") scale_*_continuous() - map cont’ values to visual ones 60 π + coord_quickmap() t + facet_grid(rows = vars(drv), cols = vars(fl), variable created by stat scale_*_discrete() - map discrete values to visual ones lat π + coord_map(projection = "ortho", orientation=c(41, -74, 0))projection, orienztation, scales = "free") scale_*_identity() - use data values as visual ones xlim, ylim long x and y axis limits adjust to individual facets c + stat_bin(binwidth = 1, origin = 10) scale_*_manual(values = c()) - map discrete values to Map projections from the mapproj package x, y | ..count.., ..ncount.., ..density.., ..ndensity.. manually chosen visual ones (mercator (default), azequalarea, lagrange, etc.) "free_x" - x axis limits adjust scale_*_date(date_labels = "%m/%d"), date_breaks = "2 "free_y" - y axis limits adjust c + stat_count(width = 1) x, y, | ..count.., ..prop.. weeks") - treat data values as dates. Set labeller to adjust facet labels c + stat_density(adjust = 1, kernel = “gaussian") scale_*_datetime() - treat data x values as date times. x, y, | ..count.., ..density.., ..scaled.. Use same arguments as scale_x_date(). See ?strptime for t + facet_grid(cols = vars(fl), labeller = label_both) label formats. Position Adjustments e + stat_bin_2d(bins = 30, drop = T) fl: c fl: d fl: e fl: p fl: r x, y, fill | ..count.., ..density.. Position adjustments determine how to arrange geoms t + facet_grid(rows = vars(fl), X & Y LOCATION SCALES e + stat_bin_hex(bins=30) x, y, fill | ..count.., ..density.. that would otherwise occupy the same space. labeller = label_bquote(alpha ^ .(fl))) c d e p r e + stat_density_2d(contour = TRUE, n = 100) Use with x or y aesthetics (x shown here) ↵ ↵ ↵ ↵ ↵ x, y, color, size | ..level.. scale_x_log10() - Plot x on log10 scale s <- ggplot(mpg, aes(fl, fill = drv))
e + stat_ellipse(level = 0.95, segments = 51, type = "t") scale_x_reverse() - Reverse direction of x axis s + geom_bar(position = "dodge") scale_x_sqrt() - Plot x on square root scale Arrange elements side by side l + stat_contour(aes(z = z)) x, y, z, order | ..level.. s + geom_bar(position = "fill") Labels Stack elements on top of one another, COLOR AND FILL SCALES (DISCRETE) normalize height t + labs( x = "New x axis label", y = "New y axis label", l + stat_summary_hex(aes(z = z), bins = 30, fun = max) title ="Add a title above the plot", x, y, z, fill | ..value.. Use scale functions n <- d + geom_bar(aes(fill = fl)) e + geom_point(position = "jitter") Add random noise to X and Y position of each subtitle = "Add a subtitle below title", to update legend l + stat_summary_2d(aes(z = z), bins = 30, fun = mean) caption = "Add a caption below plot", x, y, z, fill | ..value.. n + scale_fill_brewer(palette = "Blues") element to avoid overplotting labels For palette choices: A
h + stat_summary_bin(fun.y = "mean", geom = "bar") p <- e + geom_point(aes(shape = fl, size = cyl)) r + theme_dark() r + theme_void() Without clipping (preferred) dark for contrast Empty theme e + stat_unique() p + scale_shape() + scale_size() t + coord_cartesian( p + scale_shape_manual(values = c(3:7)) xlim = c(0, 100), ylim = c(10, 20)) With clipping (removes unseen data points) t + xlim(0, 100) + ylim(10, 20) p + scale_radius(range = c(1,6)) t + scale_x_continuous(limits = c(0, 100)) + p + scale_size_area(max_size = 6) scale_y_continuous(limits = c(0, 100))
RStudio® is a trademark of RStudio, Inc. • CC BY SA RStudio • [email protected] • 844-448-1212 • rstudio.com • Learn more at http://ggplot2.tidyverse.org • ggplot2 3.1.0 • Updated: 2018-12