Proceedings paper
Title:
R4R: Reproducibility for R
Authors:
P. Donat-Bouillud, F. Křikava, S. Krynski, J. Vitek
Publication:
Proceedings of the 3rd ACM Conference on Reproducibility and Replicability
Year:
2025
ISBN:
9798400719585
Abstract:
Ensuring reproducibility is a fundamental challenge in computational research. Reproducing results often requires reconstructing complex software environments involving data files, external tools, system libraries, and language-specific packages. While various tools aim to simplify this process, they often rely on user-provided metadata, overlook system dependencies, or produce unnecessarily large environments. We present r4r, a tool that automates the creation of minimal, user-inspectable, self-contained execution environments through dynamic program analysis techniques. r4r captures all runtime dependencies of a data analysis pipeline and produces a Docker image capable of reproducing the original execution. Although designed with first-class support for the R programming language, r4r also includes a generic fallback mechanism applicable to other languages. We evaluate r4r on a collection of R Markdown notebooks from Kaggle and find that it achieves exact reproducibility for 97.5% of deterministic notebooks.
BibTeX:
@inproceedings{donatbouillud_r4r_2025,
title = {{R4R: Reproducibility for R}},
author = {Donat-Bouillud, Pierre and Křikava, Filip and Krynski, Sebastian and Vitek, Jan},
year = {2025},
booktitle = {{Proceedings of the 3rd ACM Conference on Reproducibility and Replicability}},
publisher = {Association for Computing Machinery},
series = {{ACM REP '25}},
location = {New York, NY, USA},
doi = {10.1145/3736731.3746156},
isbn = {9798400719585},
pages = {132--142},
url = {https://doi.org/10.1145/3736731.3746156},
}