Working towards research that is more reproducible and open

Luke W. Johnston

March 31, 2023

Who am I? 👋

  • Team Leader at Steno Diabetes Center Aarhus and Aarhus University, Denmark
  • Research/work:
    • Teach researchers how to be reproducible
    • Build software to automate/streamline research
    • Do epidemiological research

Goals of this talk 🔈

Spread awareness of tools and resources for doing reproducible research (plus examples)

  • Reproducible documents: Quarto
  • Track file changes: Git
  • Share code: GitHub
  • Run analysis/programming: R

What is reproducibility? 🤔

Question: If you had to explain reproducibility vs replicability to everyone, right now, could you?

Reproducibility: Same data + analysis = same result?

  • Like baking: Data = ingredients, analysis = recipe
  • HOW EXACTLY a result was determined in a study
  • Independent reproduction = success

We don’t share as much as we should

Why are reproducible and open practices important right now? 🤷🤔

They are part of multiple large trends like team science, computing, meta-research, higher quality/rigor

Both reproducibility and open science are on a spectrum

🔑 Independently get the same result with same data and analysis

  • Can someone else bake the same food without either the ingredients or the recipe?

Simple, tangible steps to being more reproducible (in R)

  • Use RStudio R Projects (.Rproj files) and the {here} package
  • Use standardized project folder/file templates (e.g. {prodigenr} package)
  • Use Quarto (or R Markdown)
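
As a rough illustration of the first step above, here is a minimal sketch of building file paths with {here} instead of `setwd()` or absolute paths. It assumes the {here} package is installed and that the code runs from inside an R Project. The `data` folder and the `example.csv` file are hypothetical names used only for this example.

```r
# Assumes {here} is installed: install.packages("here")
library(here)

# here() builds the path starting from the project root
# (e.g. the folder containing the .Rproj file), so the same
# code works on a collaborator's computer without changes.
# "data" and "example.csv" are hypothetical names.
data_path <- here("data", "example.csv")

# Create a tiny example dataset so the sketch is self-contained.
dir.create(dirname(data_path), showWarnings = FALSE, recursive = TRUE)
write.csv(
  data.frame(id = 1:3, value = c(2.5, 3.1, 4.8)),
  data_path,
  row.names = FALSE
)

# Read the data back using the same project-relative path.
example <- read.csv(data_path)
mean(example$value)
```

Because the path is relative to the project root, nobody has to edit it before rerunning the analysis on another machine.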

Think 💭 then share 🗣️ activity

1 minute to think, then we’ll share 🤩

  1. How do you work when doing data analysis? How do you try to be more open and reproducible?

  2. What are some benefits to being more open and reproducible?

  3. What are some challenges or barriers?

Relevant examples from my work

Reproducible Research in R courses

Main page at r-cubed.rostools.org

Collaborative UK Biobank project at Steno

  • Group of researchers at Steno Aarhus, using Trello to coordinate things.

  • R package to help automate/streamline common tasks.

  • Work on projects through GitHub, for instance, my mesh project.

Scoping review on current literature on open collaboration

  • Communication and co-working happen on Discord

  • We work on the project through GitHub.

Seedcase: Software to improve transparency and openness of managing data

  • Developing and coordinating tasks on GitHub.

  • Communication and virtual meetings over Discord.