Let’s reproduce CVPR – #FAUHack
What:
We have all been there: we read a paper that promises the release of source code and data, we go to the authors’ webpage, and we find only a release date or cannot get the code to run. This is very frustrating and raises the question of how reproducible our results are. Whilst papers can be reproducible without a code release if they describe all components in enough detail, it is definitely much easier if the code is available. The promise to release code is typically appreciated by reviewers, but what does accountability look like afterwards?
Before we complain and make a point, we want to find out how big of a problem this actually is – together with you!
How:
Let’s focus on one of the strongest conferences in the field of computer vision, the Conference on Computer Vision and Pattern Recognition (CVPR). All papers since 2013 are available as open access via the Computer Vision Foundation, and many of them promise code or directly include GitHub links. Let’s investigate how easy it is to reproduce results from this conference. For the moment, we’d like to focus only on papers that do release code. We plan to do this project in stages:
- Stage 0 aims to determine, using state-of-the-art AI tools (LLMs), whether papers promised code and whether that code is actually available. For this, we parse all works presented at CVPR over the last ten years. This gives us a first overview and eliminates the papers that do not actually share code. Stage 0 is already completed.
- Stage 1 is a hackathon organized in a hybrid format with PhD students. The goal of this stage is to figure out how easy or hard it is to reproduce one or more result figures from a paper if the code is available. This will give us a first impression of how much time is needed per paper and what the common pitfalls could be.
- Stage 2 is a scalable course for students at master’s level that aims to reproduce as many works as possible to get a richer picture of the state of our field in terms of reproducibility. This course is planned to be held in the winter semester 2024 at Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) with the aim to reproduce CVPR 2024.
- The ultimate goal is to publish our findings in a report and share it with the world.
Why:
We believe that this project will be beneficial in multiple ways:
- Highlight strengths and weaknesses of recent papers in terms of reproducibility at a premier CV conference
- Identify best practices for ensuring reproducibility, thereby also improving accountability and providing guidelines for reviewers on how to judge it.
- Give early career researchers & students opportunities to dive deep into different research topics, beyond just “theory”, and engage with a worldwide community.
And most importantly: it will be a lot of fun!
Stage 2:
The project is currently at stage 2.
Students will submit their reports via a conference management system by January 25th, 2025.
Find information regarding the course and submission format here: https://www.studon.fau.de/studon/goto.php?target=crs_6038915
The Microsoft CMT service is used for managing the peer-reviewing process for this conference. This service was provided for free by Microsoft and they bore all expenses, including costs for Azure cloud services as well as for software development and support.
Our sponsors:
Silver Sponsor:
Who to blame:
Photo: Georg Pöhlein, FAU
Prof. Dr.-Ing. Katharina Breininger, Professorship for Artificial Intelligence in Medical Imaging
Photo: Kris Brewer, MIT
Prof. Dr. Bernhard Egger, Professorship for Cognitive Computer Vision
Photo: Kathrin Kist, Soulmate Photography
Prof. Dr. Andreas M. Kist, Professorship for Artificial Intelligence in Communication Disorders