RRR – Workflow #1

Published: January 10, 2024

1 Reproducible Research

Let’s say you have noticed a wrong entry in your data set … an invalid CPR number. You might decide to delete the observation from your data like this:

library(dplyr)  # provides %>% and filter()

data <- read.csv("my_data_file.csv")
# excluded as participant entered an invalid CPR number
data <- data %>% filter(id != "2321369-1212")

…but why not just delete that observation from the csv file?

The principles of Reproducible Research require that the entire process from raw data to output be:

  • scripted
  • reproducible
  • reversible
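As a rough illustration of these three properties, here is a minimal sketch in base R (used so it runs without extra packages; the filter() call shown earlier does the same job). The temporary file, the `score` column and the `id-…` values are made up for the example and stand in for a real raw export.

```r
# Hypothetical stand-in for the raw export; in practice this file already exists.
raw_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(id = c("id-001", "2321369-1212", "id-003"),
             score = c(12, 7, 9)),
  raw_path, row.names = FALSE
)

# Scripted: the exclusion is a line of code, with the reason as a comment.
data <- read.csv(raw_path)
# excluded as participant entered an invalid CPR number
data_clean <- data[data$id != "2321369-1212", ]

# Reproducible: rerunning the script from the raw file gives the same result.
# Reversible: the raw file on disk still holds every row, so deleting the
# exclusion line above and rerunning brings the observation back.
n_on_disk <- nrow(read.csv(raw_path))
n_clean   <- nrow(data_clean)
```

Had the row been deleted by hand in the csv file, `n_on_disk` and `n_clean` would be identical and nothing would record that an observation ever went missing.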

What happens when, 6 months later, someone notices that you downloaded 106 rows of data from REDCap, but your manuscript reports n=105? Do you think you’ll remember why by then?
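One way to make that question answerable is to let the script do the bookkeeping. The sketch below is only an assumption about how this could look: the data are simulated to mimic a 106-row download, and the `exclusions` table with its `reason` column is a hypothetical structure, not something REDCap provides.

```r
# Simulated stand-in for a 106-row export (all IDs are made up,
# except the invalid one quoted in the text).
raw <- data.frame(
  id = c(sprintf("id-%03d", 1:105), "2321369-1212"),
  stringsAsFactors = FALSE
)

# One row per exclusion, each with its reason (hypothetical structure).
exclusions <- data.frame(
  id = "2321369-1212",
  reason = "participant entered an invalid CPR number",
  stringsAsFactors = FALSE
)

# Drop every excluded id; drop = FALSE keeps a one-column data frame intact.
analysed <- raw[!raw$id %in% exclusions$id, , drop = FALSE]

# Fail loudly if the numbers do not add up.
stopifnot(nrow(raw) - nrow(exclusions) == nrow(analysed))

message("Downloaded: ", nrow(raw),
        " | excluded: ", nrow(exclusions),
        " | analysed: ", nrow(analysed))
```

With a record like this in the script, the gap between 106 downloaded rows and n=105 is explained by the code itself rather than by anyone's memory.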

Or, for a more dramatic example, watch this YouTube video when you have the time (~20 minutes).