Data Visualization with Ggplot2 : : CHEAT SHEET
Total Page:16
File Type:pdf, Size:1020Kb
Data Visualization with ggplot2 : : CHEAT SHEET Use a geom function to represent data points, use the geom’s aesthetic properties to represent variables. Basics Geoms Each function returns a layer. GRAPHICAL PRIMITIVES TWO VARIABLES ggplot2 is based on the grammar of graphics, the idea that you can build every graph from the same a <- ggplot(economics, aes(date, unemploy)) continuous x , continuous y continuous bivariate distribution components: a data set, a coordinate system, b <- ggplot(seals, aes(x = long, y = lat)) e <- ggplot(mpg, aes(cty, hwy)) h <- ggplot(diamonds, aes(carat, price)) and geoms—visual marks that represent data points. a + geom_blank() e + geom_label(aes(label = cty), nudge_x = 1, h + geom_bin2d(binwidth = c(0.25, 500)) (Useful for expanding limits) nudge_y = 1, check_overlap = TRUE) x, y, label, x, y, alpha, color, fill, linetype, size, weight alpha, angle, color, family, fontface, hjust, F M A b + geom_curve(aes(yend = lat + 1, lineheight, size, vjust h + geom_density2d() xend=long+1,curvature=z)) - x, xend, y, yend, x, y, alpha, colour, group, linetype, size + = alpha, angle, color, curvature, linetype, size e + geom_jitter(height = 2, width = 2) x, y, alpha, color, fill, shape, size data geom coordinate plot a + geom_path(lineend="butt", linejoin="round", h + geom_hex() x = F · y = A system linemitre=1) e + geom_point(), x, y, alpha, color, fill, shape, x, y, alpha, colour, fill, size x, y, alpha, color, group, linetype, size size, stroke To display values, map variables in the data to visual a + geom_polygon(aes(group = group)) e + geom_quantile(), x, y, alpha, color, group, properties of the geom (aesthetics) like size, color, and x x, y, alpha, color, fill, group, linetype, size linetype, size, weight continuous function and y locations. i <- ggplot(economics, aes(date, unemploy)) b + geom_rect(aes(xmin = long, ymin=lat, xmax= F M A long + 1, ymax = lat + 1)) - xmax, xmin, ymax, e + geom_rug(sides = "bl"), x, y, alpha, color, i + geom_area() ymin, alpha, color, fill, linetype, size linetype, size x, y, alpha, color, fill, linetype, size + = a + geom_ribbon(aes(ymin=unemploy - 900, e + geom_smooth(method = lm), x, y, alpha, i + geom_line() ymax=unemploy + 900)) - x, ymax, ymin, data geom coordinate plot color, fill, group, linetype, size, weight x, y, alpha, color, group, linetype, size x = F · y = A system alpha, color, fill, group, linetype, size color = F e + geom_text(aes(label = cty), nudge_x = 1, i + geom_step(direction = "hv") size = A nudge_y = 1, check_overlap = TRUE), x, y, label, x, y, alpha, color, group, linetype, size alpha, angle, color, family, fontface, hjust, LINE SEGMENTS lineheight, size, vjust common aesthetics: x, y, alpha, color, linetype, size b + geom_abline(aes(intercept=0, slope=1)) visualizing error Complete the template below to build a graph. b + geom_hline(aes(yintercept = lat)) df <- data.frame(grp = c("A", "B"), fit = 4:5, se = 1:2) required b + geom_vline(aes(xintercept = long)) discrete x , continuous y j <- ggplot(df, aes(grp, fit, ymin = fit-se, ymax = fit+se)) ggplot (data = <DATA> ) + f <- ggplot(mpg, aes(class, hwy)) <GEOM_FUNCTION> (mapping = aes( <MAPPINGS> ), b + geom_segment(aes(yend=lat+1, xend=long+1)) j + geom_crossbar(fatten = 2) b + geom_spoke(aes(angle = 1:1155, radius = 1)) f + geom_col(), x, y, alpha, color, fill, group, x, y, ymax, ymin, alpha, color, fill, group, linetype, stat = <STAT> , position = <POSITION> ) + Not linetype, size size required, <COORDINATE_FUNCTION> + sensible j + geom_errorbar(), x, ymax, ymin, alpha, color, f + geom_boxplot(), x, y, lower, middle, upper, group, linetype, size, width (also <FACET_FUNCTION> + defaults ymax, ymin, alpha, color, fill, group, linetype, supplied ONE VARIABLE continuous geom_errorbarh()) <SCALE_FUNCTION> + c <- ggplot(mpg, aes(hwy)); c2 <- ggplot(mpg) shape, size, weight j + geom_linerange() <THEME_FUNCTION> f + geom_dotplot(binaxis = "y", stackdir = x, ymin, ymax, alpha, color, group, linetype, size c + geom_area(stat = "bin") "center"), x, y, alpha, color, fill, group x, y, alpha, color, fill, linetype, size j + geom_pointrange() ggplot(data = mpg, aes(x = cty, y = hwy)) Begins a plot f + geom_violin(scale = "area"), x, y, alpha, color, x, y, ymin, ymax, alpha, color, fill, group, linetype, that you finish by adding layers to. Add one geom c + geom_density(kernel = "gaussian") fill, group, linetype, size, weight shape, size function per layer. x, y, alpha, color, fill, group, linetype, size, weight aesthetic mappings data geom c + geom_dotplot() maps qplot(x = cty, y = hwy, data = mpg, geom = “point") x, y, alpha, color, fill data <- data.frame(murder = USArrests$Murder, Creates a complete plot with given data, geom, and discrete x , discrete y state = tolower(rownames(USArrests))) mappings. Supplies many useful defaults. c + geom_freqpoly() x, y, alpha, color, group, g <- ggplot(diamonds, aes(cut, color)) map <- map_data("state") linetype, size k <- ggplot(data, aes(fill = murder)) last_plot() Returns the last plot c + geom_histogram(binwidth = 5) x, y, alpha, g + geom_count(), x, y, alpha, color, fill, shape, k + geom_map(aes(map_id = state), map = map) ggsave("plot.png", width = 5, height = 5) Saves last plot color, fill, linetype, size, weight size, stroke + expand_limits(x = map$long, y = map$lat), as 5’ x 5’ file named "plot.png" in working directory. map_id, alpha, color, fill, linetype, size Matches file type to file extension. c2 + geom_qq(aes(sample = hwy)) x, y, alpha, color, fill, linetype, size, weight THREE VARIABLES seals$z <- with(seals, sqrt(delta_long^2 + delta_lat^2))l <- ggplot(seals, aes(long, lat)) discrete l + geom_contour(aes(z = z)) l + geom_raster(aes(fill = z), hjust=0.5, vjust=0.5, d <- ggplot(mpg, aes(fl)) x, y, z, alpha, colour, group, linetype, interpolate=FALSE) size, weight x, y, alpha, fill d + geom_bar() x, alpha, color, fill, linetype, size, weight l + geom_tile(aes(fill = z)), x, y, alpha, color, fill, linetype, size, width RStudio® is a trademark of RStudio, Inc. • CC BY SA RStudio • [email protected] • 844-448-1212 • rstudio.com • Learn more at http://ggplot2.tidyverse.org • ggplot2 3.1.0 • Updated: 2018-12 Stats An alternative way to build a layer Scales Coordinate Systems Faceting A stat builds new variables to plot (e.g., count, prop). Scales map data values to the visual values of an r <- d + geom_bar() Facets divide a plot into fl cty cyl aesthetic. To change a mapping, add a new scale. r + coord_cartesian(xlim = c(0, 5)) subplots based on the x ..count.. xlim, ylim values of one or more (n <- d + geom_bar(aes(fill = fl))) The default cartesian coordinate system discrete variables. + = aesthetic prepackaged scale-specific r + coord_fixed(ratio = 1/2) scale_ to adjust scale to use arguments ratio, xlim, ylim data stat geom coordinate plot Cartesian coordinates with fixed aspect ratio t <- ggplot(mpg, aes(cty, hwy)) + geom_point() x = x · system n + scale_fill_manual( between x and y units y = ..count.. values = c("skyblue", "royalblue", "blue", “navy"), r + coord_flip() t + facet_grid(cols = vars(fl)) Visualize a stat by changing the default stat of a geom limits = c("d", "e", "p", "r"), breaks =c("d", "e", "p", “r"), xlim, ylim facet into columns based on fl function, geom_bar(stat="count") or by using a stat name = "fuel", labels = c("D", "E", "P", "R")) Flipped Cartesian coordinates t + facet_grid(rows = vars(year)) r + coord_polar(theta = "x", direction=1 ) facet into rows based on year function, stat_count(geom="bar"), which calls a default range of title to use in labels to use breaks to use in theta, start, direction values to include legend/axis in legend/axis legend/axis geom to make a layer (equivalent to a geom function). in mapping Polar coordinates t + facet_grid(rows = vars(year), cols = vars(fl)) Use ..name.. syntax to map stat variables to aesthetics. r + coord_trans(ytrans = “sqrt") facet into both rows and columns xtrans, ytrans, limx, limy GENERAL PURPOSE SCALES Transformed cartesian coordinates. Set xtrans and t + facet_wrap(vars(fl)) geom to use stat function geommappings wrap facets into a rectangular layout Use with most aesthetics ytrans to the name of a window function. i + stat_density2d(aes(fill = ..level..), Set scales to let axis limits vary across facets geom = "polygon") scale_*_continuous() - map cont’ values to visual ones 60 π + coord_quickmap() t + facet_grid(rows = vars(drv), cols = vars(fl), variable created by stat scale_*_discrete() - map discrete values to visual ones lat π + coord_map(projection = "ortho", orientation=c(41, -74, 0))projection, orienztation, scales = "free") scale_*_identity() - use data values as visual ones xlim, ylim long x and y axis limits adjust to individual facets c + stat_bin(binwidth = 1, origin = 10) scale_*_manual(values = c()) - map discrete values to Map projections from the mapproj package x, y | ..count.., ..ncount.., ..density.., ..ndensity.. manually chosen visual ones (mercator (default), azequalarea, lagrange, etc.) "free_x" - x axis limits adjust scale_*_date(date_labels = "%m/%d"), date_breaks = "2 "free_y" - y axis limits adjust c + stat_count(width = 1) x, y, | ..count.., ..prop.. weeks") - treat data values as dates. Set labeller to adjust facet labels c + stat_density(adjust = 1, kernel = “gaussian") scale_*_datetime() - treat data x values as date times. x, y, | ..count.., ..density.., ..scaled.. Use same arguments as scale_x_date(). See ?strptime for t + facet_grid(cols = vars(fl), labeller = label_both) label formats. Position Adjustments e + stat_bin_2d(bins =