A Course On Reproducible research, Data pipelines and Scientific computing (CORDS)

Project lead

Mauro Werder
Ludovic Gustave Räss

Project staff

Victor Boussange
Mylène Jacquemart
Ludovic Gustave Räss
Ivan Utkin
Mauro Werder

Project duration

2023 - 2024


The ongoing digital transformation and the push towards open science set new challenges for data processing and reproducibility. The vast majority of projects in the natural sciences feed input data into models, which then output new data to be further interpreted or analysed. Keeping track of this data pipeline is crucial to ensure reproducibility. The continuous increase in the resolution and complexity of current models and datasets pushes manual handling of the data pipeline to its limits. Our project aims to address these issues by providing a course for WSL on the use of modern software engineering tools and high-performance computing to automate the data pipeline and ultimately facilitate reproducible and open science. The proposed course, which showcases the development of a fully automated data and simulation pipeline, will also include high-performance physics-based models running on GPUs and geographic information processing.