Hypothesis testing
ggstatsplot
1 ggstatsplot
Install the ggstatsplot
and rstatix
packages and add the library calls for these packages to your library code chunk
1.1 The bugs_long
dataset
bugs_long
provides information on the extent to which men and women want to kill arthropods that vary in disgustingness (low, high) and freighteningness (low, high) (four groups in total). Each participant rated their attitude towards all four kinds of anthropods. bugs_long
is a subset of the data reported by Ryan et al.(2013) .
Note that this is a repeated measures design because the same participant gave four different ratings across four different conditions (LDLF, LDHF, HDLF, HDHF).
desire
- The desire to kill an arthropod was indicated on a scale from 0 to 10gender
Male/Femaleregion
condition
- LDLF: low disgustingness and low freighteningness
- LDHF: low disgustingness and high freighteningness
- HDLF: high disgustingness and low freighteningness
- HDHF: high disgustingness and high freighteningness
- LDLF: low disgustingness and low freighteningness
1.1.0.1 In bugs_long
, is there a difference within the participants in their desire
to kill bugs from the four different conditions
?
- Should you use
ggwithinstats()
orggbetweenstats()
when comparing - Is it reasonable to assume normality?
Code
%>% group_by(condition) %>% shapiro_test(desire)
bugs_long
# qqplot
%>%
bugs_long ggplot(aes(sample = desire, group = condition)) +
geom_qq()+
geom_qq_line()
Code
# Density plot
%>%
bugs_long ggplot(aes(x = desire, fill = condition)) +
geom_density(alpha = 0.2)
- Make the appropriate test
Code
%>%
bugs_long ggwithinstats(x = condition,
y = desire,
type = "nonparametric")
Code
# Note that the ggstatstutorial actually runs this as a "parametric" test
- What is the name of the statistical test that was performed?
- What is your interpretation?
- What is the consequence if you change the type of test?
1.1.0.2 Is there a difference between men and women in the desire
to kill bugs that are LDHF (low disgustingness and high freighteningness).
- Create a filtered data frame of bugs_long
Code
<- bugs_long %>% filter(condition == "LDHF") bl_LDHF
- Should you use
ggwithinstats()
orggbetweenstats()
for this test? - Is it reasonable to assume normality?
Code
%>%
bl_LDHF filter(!is.na(gender), !is.na(desire)) %>%
group_by(gender) %>%
shapiro_test(desire)
# qqplot
%>%
bl_LDHF ggplot(aes(sample = desire, color = gender)) +
geom_qq()+
geom_qq_line()
Code
# Density plot
%>%
bl_LDHF ggplot(aes(x = desire, fill = gender)) +
geom_density(alpha = 0.2)
- Make the appropriate test
Code
%>%
bl_LDHF ggbetweenstats(x = gender,
y = desire,
type = "nonparametric")
- What is the name of the statistical test that was performed?
- What is your interpretation?
- What is the consequence if you change the type of test?
1.1.0.3 Is there a difference in the frequency of men and women between North America
and the remaining regions?.
- First, lump
region
togehter to two levels (fct_lump()
) - Reduce the data to one row pr subject ID, and discuss why this is a good idea.
Code
<- bugs_long %>%
bl_region mutate(region = fct_lump(region, 1)) %>%
group_by(subject) %>%
slice(1) %>%
ungroup()
- Should you use
ggwithinstats()
orggbetweenstats()
or perhapsggbarstats()
for this test? - Should you asses normality?
Code
# Both variables are factors (categorical). Normality has to do with continuous data
- Make the appropriate test
Code
%>%
bl_region ggbarstats(x = gender,
y = region)
- What is the name of the statistical test that was performed? (check the help page under the
paired
argument) - What is your interpretation?
- What is the consequence if you change the type of test?
1.2 The ToothGrowth
dataset
ToothGrowth
gives information on tooth length in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods, orange juice or ascorbic acid (a form of vitamin C and coded as VC).
len
Tooth lengthsupp
Supplement type- VC Vitamin C as ascorbic acid
- OJ Orange Juice
dose
Dose in milligrams/day (0.5, 1, or 2)
1.2.0.1 Is there a difference in Tooth length based on the type of supplement?
- Should you use
ggwithinstats()
orggbetweenstats()
when comparing - Is it reasonable to assume normality?
Code
%>% group_by(supp) %>% shapiro_test(len)
ToothGrowth
# qqplot
%>%
ToothGrowth ggplot(aes(sample = len, color = supp)) +
geom_qq()+
geom_qq_line()
Code
# Density plot
%>%
ToothGrowth ggplot(aes(x = len, fill = supp)) +
geom_density(alpha = 0.2)
- Make the appropriate test
Code
%>%
ToothGrowth ggbetweenstats(x = supp,
y = len,
type = "robust")
Code
# It is likely completely fine to run this as a "parametric" test
- What is the name of the statistical test that was performed?
- What is your interpretation?
- What is the consequence if you change the type of test?