RRR – Workflow #1
1 Reproducible Research
Let’s say you have noticed a wrong entry in your data set … an invalid CPR number. You might decide to delete the observation from your data like this:
library(dplyr)  # provides %>% and filter()
data <- read.csv("my_data_file.csv")
# excluded as participant entered an invalid CPR number
data <- data %>% filter(id != "2321369-1212")
…but why not just delete that observation from the CSV file?
Reproducible Research
The principles of Reproducible Research hold that the entire process from raw data to output should be:
- scripted
- reproducible
- reversible
What happens when, six months later, someone notices that you downloaded 106 rows of data from REDCap, but your manuscript reports n = 105? Do you think you’ll remember why?
Or, for a more dramatic example, look at this YouTube video when you have the time (~20 minutes).
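To make the 106-versus-105 question concrete, here is a minimal sketch of a scripted exclusion that records its own reason and reports the row counts. It is one possible approach, not a prescribed workflow. The data frame `raw`, the table `exclusions`, the placeholder ids, and the `score` column are all hypothetical stand-ins. In practice `raw` would come from `read.csv()` on the untouched export.

```r
library(dplyr)

# Hypothetical stand-in for the raw export (assumption: in a real project this
# would be read.csv("my_data_file.csv"), and that file is never edited by hand).
raw <- data.frame(
  id = c("id-001", "2321369-1212", "id-003"),
  score = c(12, 15, 9),
  stringsAsFactors = FALSE
)

# Hypothetical exclusion table: every dropped id is stored with its reason,
# so the decision can be audited later or reversed by deleting one row here.
exclusions <- data.frame(
  id = "2321369-1212",
  reason = "participant entered an invalid CPR number",
  stringsAsFactors = FALSE
)

# Apply the exclusions in the script; the raw data itself stays unchanged.
data <- raw %>% filter(!id %in% exclusions$id)

# Report the counts so a gap between download and manuscript explains itself.
message(nrow(raw), " rows read, ", nrow(raw) - nrow(data), " excluded, ",
        nrow(data), " analysed")
```

Because the exclusion lives in the script rather than in the data file, rerunning from the raw export reproduces the same analysed sample, and removing the row from `exclusions` restores the observation.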