Insert a code chunk and load 2 important libraries
Insert a new code chunk
Write source(here("scripts", "01_import.R")) in the chunk
Discuss what the code does and run it
Write a short headline for each code chunk
# Load libraries
library(tidyverse) # Data wrangling and plots
library(here) # File control in project

# Load the cleaned soldiers dataset
source(here("scripts", "01_import.R"))
1 Guess the output
Look at and discuss the code below.
You’ll need to guess a little because you haven’t seen all the datasets and functions yet, but use your common sense! See if you can predict what each plot will look like before running the code.
Use one or more of: ?mpg, head(), glimpse(), summary(), and/or skim()
?mpg
mpg %>% head()
mpg %>% glimpse()
mpg %>% skimr::skim() # or library(skimr) and then mpg %>% skim()
2.2 Experiment with the colour, shape, and size aesthetics.
Use this basic code, and add/change the colour, shape, and size aesthetics.
ggplot(mpg, aes(cty, hwy)) +
  geom_point()
What happens when you map continuous variables to one or more of the aesthetics?
What about categorical variables?
What happens when you use multiple aesthetics in a plot?
## Examples
# Categorical
ggplot(mpg, aes(cty, hwy, colour = class)) +
  geom_point()

# Continuous
ggplot(mpg, aes(cty, hwy, size = displ)) +
  geom_point()

# Continuous
ggplot(mpg, aes(cty, hwy, color = hwy)) + # Notice hwy is mapped to both y axis and color
  geom_point()

## A continuous variable doesn't work for shape
ggplot(mpg, aes(cty, hwy, shape = displ)) +
  geom_point()

# Multiple categorical - a legend for each aesthetic is created
ggplot(mpg, aes(cty, hwy, colour = class, shape = fl)) +
  geom_point()
2.3 What’s gone wrong with this code? Why are the points not blue?
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = "blue"))
# If you want all the points to be colored blue, then color = "blue" must be placed outside the aes() function.
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy), color = "blue")
3 geom_point() and geom_smooth()
Use your soldiers dataset
Explore the relationship between heightcm and weightkg using geom_point()
soldiers %>%
  ggplot(aes(x = heightcm, y = weightkg)) +
  geom_point()
Color the points according to BMI
soldiers %>%
  ggplot(aes(x = heightcm, y = weightkg, color = BMI)) +
  geom_point()
Color the points according to category
soldiers %>%
  ggplot(aes(x = heightcm, y = weightkg, color = category)) +
  geom_point()
Explore the relationship between heightcm and weightkg using geom_point() and geom_smooth()
soldiers %>%
  ggplot(aes(x = heightcm, y = weightkg)) +
  geom_point() +
  geom_smooth(method = "lm")
Color by sex
soldiers %>%
  ggplot(aes(x = heightcm, y = weightkg, color = sex)) +
  geom_point() +
  geom_smooth(method = "lm")
What arguments can you use to control how many rows and columns appear in the output?
What does the scales argument in facet_wrap() do? When might you use it?
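As a starting point for the two questions above, here is a minimal sketch using the mpg dataset that ships with ggplot2. The choice of mpg and of the variables displ, hwy, and drv is an assumption made for illustration; the same arguments apply to the soldiers data. In facet_wrap(), nrow and ncol set the panel layout, and scales decides whether panels share their axes.

```r
library(ggplot2)

# One column of panels, one panel per drive type.
# scales = "free_y" lets every panel choose its own y-axis range;
# the default, scales = "fixed", gives all panels the same axes.
p_facet <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  facet_wrap(~drv, ncol = 1, scales = "free_y")

p_facet
```

Switching between scales = "fixed" and scales = "free_y" on the same plot is a quick way to see when free scales help (groups with very different ranges) and when they hide differences between groups.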
Explore the relationship between heightcm and weightkg
Use geom_point() and geom_smooth()
facet by sex
color the points by category
soldiers %>%
  ggplot(aes(x = heightcm, y = weightkg)) +
  geom_point(aes(color = category)) +
  geom_smooth(method = "lm") +
  facet_wrap(~sex)
Use geom_bar() to explore how many soldiers there are of each race
soldiers %>%
  ggplot(aes(x = race)) +
  geom_bar()
Use geom_bar() to explore how many soldiers are at each Installation
soldiers %>%
  ggplot(aes(x = Installation)) +
  geom_bar()

# OR - What's the difference?
soldiers %>%
  ggplot(aes(y = Installation)) +
  geom_bar()
Use the fill aesthetic to color the Installation bars according to race
soldiers %>%
  ggplot(aes(y = Installation, fill = race)) +
  geom_bar()
Change something so that you can visually evaluate if race is equally distributed across the Installations
soldiers %>%
  ggplot(aes(y = Installation, fill = race)) +
  geom_bar(
    position = "fill", # All bars have full length/height - this makes it easier to see proportional differences between groups
    color = "black" # Adds a black line around each box.
  )
Use geom_boxplot to explore weightkg across the different Installations
soldiers %>%
  ggplot(aes(x = Installation, y = weightkg)) +
  geom_boxplot()

# OR
soldiers %>%
  ggplot(aes(y = Installation, x = weightkg)) +
  geom_boxplot()
Remove the outliers from the boxplot (read the documentation)
Add some jittered points to give an impression of the number of soldiers at each installation
Give the jittered points some transparency (decrease alpha) to avoid overplotting
soldiers %>%
  ggplot(aes(y = Installation, x = weightkg)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(height = 0.2, alpha = 0.1)
Use facet_wrap() to split on sex and place the plots in one column
To recreate the basic plot use: ggplot(mpg, aes(hwy, cty))+ geom_point()
Try another theme. Type theme_ and try some of the suggestions.
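A minimal sketch of swapping themes on the basic plot. theme_minimal() and theme_classic() are two of the complete themes built into ggplot2; picking these two is only an illustration, and the object names are hypothetical.

```r
library(ggplot2)

# Store the basic plot from the exercise so themes can be swapped easily
p_base <- ggplot(mpg, aes(hwy, cty)) +
  geom_point()

# Adding a complete theme changes the non-data elements (background, grid, fonts)
p_minimal <- p_base + theme_minimal()
p_classic <- p_base + theme_classic()

p_minimal
```

Storing the plot in an object first means each theme can be tried by adding a single term, without retyping the plot code.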
Want more?
Use install.packages() to download the ggthemes package. When you have done that add library(ggthemes) to the code chunk where you call your libraries and execute the line.
Watch the gallery
9 Explore soldiers further
Together with your neighbor
Come up with a research question for the dataset
Discuss what variables to map
Discuss what geoms to use
Fix the labels in your plot (x, y, title, etc.)
When you are done - show your plot to another group and take a look at their plot
Try to work out the code they must have used to create that plot
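For the label-fixing step in the list above, a minimal sketch with labs() may help. It uses mpg rather than soldiers, and the label texts and the object name are assumptions chosen for illustration.

```r
library(ggplot2)

# labs() sets axis titles, legend titles (by aesthetic name), and the plot title
p_labelled <- ggplot(mpg, aes(displ, hwy, colour = class)) +
  geom_point() +
  labs(
    x = "Engine displacement (litres)",
    y = "Highway fuel economy (miles per gallon)",
    colour = "Vehicle class",
    title = "Highway fuel economy by engine size"
  )

p_labelled
```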
10 Saving plots
In your project folder, create a new folder for saving plots and/or tables (e.g. “plots”)
Read the documentation for ggsave()
What do the arguments I have specified below do?
Are they different from the defaults?
ggsave(
  filename = here("[YOUR FOLDER]", "[THE NAME OF YOUR FILE].png"), # .png .pdf .jpg are typical options
  plot = [The name of the ggplot object],
  dpi = "retina",
  device = NULL # Why can we leave this as NULL?
)
Create a simple plot and save it to your folder using ggsave()
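One way this could look is sketched below. The folder, file name, and object names are hypothetical, and the sketch writes to a temporary directory so that it runs anywhere; in your project you would build the path with here("plots", ...) instead. The device argument is left unspecified, so ggsave() picks the graphics device from the file extension.

```r
library(ggplot2)

# Hypothetical output folder for this sketch; in the project, use here("plots") instead
plot_dir <- file.path(tempdir(), "plots")
dir.create(plot_dir, showWarnings = FALSE, recursive = TRUE)

# A simple plot stored in an object so it can be passed to ggsave()
p_simple <- ggplot(mpg, aes(cty, hwy)) +
  geom_point()

out_file <- file.path(plot_dir, "cty_vs_hwy.png")

ggsave(
  filename = out_file,
  plot = p_simple,
  width = 6,      # inches, the default unit
  height = 4,
  dpi = "retina"  # shorthand for 320 dpi
)

# Check that the file was written
file.exists(out_file)
```

Passing the plot explicitly is a deliberate choice: without the plot argument, ggsave() saves the last plot that was displayed, which is easy to get wrong in a longer script.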
mtcars %>%
  ggplot(aes(x = mpg, y = disp, color = hp)) +
  geom_point(size = 6)
mtcars %>%
  ggplot(aes(x = mpg, y = disp, color = hp)) +
  geom_point(size = 6) +
  scale_color_continuous(type = "viridis", option = "magma")
Using diamonds, recreate the R code necessary to generate the following graphs.
Using diamonds, recreate the R code necessary to generate the following graph.
Using diamonds, recreate the R code necessary to generate the following graphs.
HINT: Think position=?????
12 Overplotting
Fix this plot
change alpha
change shape
diamonds %>%
  filter(x > 3 & x < 9) %>%
  ggplot(aes(x = x, y = price)) +
  geom_point()
# alpha and point
diamonds %>%
  filter(x > 3 & x < 9) %>%
  ggplot(aes(x = x, y = price)) +
  geom_point(alpha = 0.2, shape = ".") # This shape gives each point the size of a pixel
13 Arranging Plots
Install the patchwork package and load it
Use install.packages() to download the patchwork package. When you have done that add library(patchwork) to the code chunk where you call your libraries and execute the line.
Run this code and then recreate the plot below
p <- ggplot(diamonds)
pf <- ggplot(diamonds %>% filter(carat < 3))

p1 <- p +
  geom_bar(aes(x = cut, fill = clarity), position = "fill") +
  labs(title = "p1")

p2 <- p +
  geom_bar(aes(x = cut, fill = clarity)) +
  labs(title = "p2")

p3 <- pf +
  geom_histogram(aes(x = carat, fill = clarity), binwidth = 0.1, position = "fill", na.rm = TRUE) +
  labs(title = "p3")

p4 <- pf +
  geom_histogram(aes(x = carat, fill = clarity), binwidth = 0.1, position = "dodge", na.rm = TRUE) +
  labs(title = "p4")
(p1 | p2 / (p3 + p4)) +
  plot_layout(guides = "collect")
Read the documentation for plot_annotation()
What does the function overall do?
What does the argument tag_levels = 'A' do?
Tag the plots with roman numerals, and suffix the numerals with a “.”
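A minimal sketch of tagging with plot_annotation(), assuming patchwork is installed. The two small mpg plots, the title text, and the object names are hypothetical stand-ins for the diamonds plots above.

```r
library(ggplot2)
library(patchwork)

# Two small stand-in plots to combine
p_a <- ggplot(mpg, aes(cty, hwy)) +
  geom_point()
p_b <- ggplot(mpg, aes(class)) +
  geom_bar()

# tag_levels = "I" requests upper-case roman numerals;
# tag_suffix appends the trailing "." to every tag
p_tagged <- (p_a | p_b) +
  plot_annotation(
    title = "Two views of mpg",
    tag_levels = "I",
    tag_suffix = "."
  )

p_tagged
```

The same plot_annotation() call can be added to the (p1 | p2 / (p3 + p4)) layout in exactly the same way.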