This three-day online workshop provides an introduction to the validation of AI methods in image analysis. Reliable validation is essential to decide whether algorithms have the potential for real-world translation, beyond standard leaderboards. It ensures that model performance estimates are meaningful, reproducible, and reflective of the underlying research questions. The course introduces key principles and practical techniques for validating AI models.
Participants will learn how to select and interpret appropriate performance metrics (Day 1), quantify model performance uncertainty (Day 2), and assess the robustness of rankings in algorithm benchmarking (Day 3). The workshop combines theoretical lectures with guided hands-on exercises using provided datasets and Jupyter notebooks. We will use open tools such as Metrics Reloaded to map given image analysis problems to suitable metrics, and Rankings Reloaded to assess ranking stability, together with new concepts for assessing the probability of false outperformance claims in publications. In addition, short coding tasks will focus on the implementation and critical assessment of validation workflows; a minimal sketch of such a workflow is shown below.
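To make the three-day arc concrete, here is a minimal Python/NumPy sketch of the kind of validation workflow the exercises build up: a per-case metric (Day 1), bootstrap confidence intervals for mean performance (Day 2), and bootstrap ranking stability (Day 3). It uses synthetic data rather than the workshop datasets, and plain bootstrap resampling rather than the Metrics Reloaded or Rankings Reloaded tooling; the algorithms "A" and "B", their error rates, and all other specifics are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cases, n_boot = 50, 2000

def dice(pred, gt):
    """Dice similarity coefficient for a pair of binary masks (Day 1 metric)."""
    denom = pred.sum() + gt.sum()
    return 2.0 * np.logical_and(pred, gt).sum() / denom if denom > 0 else 1.0

# Synthetic test set: random ground-truth masks plus predictions that flip
# each pixel with an algorithm-specific error rate (purely illustrative).
gt = rng.random((n_cases, 32, 32)) > 0.5

def noisy_preds(error_rate):
    return np.logical_xor(gt, rng.random(gt.shape) < error_rate)

scores = {name: np.array([dice(p, g) for p, g in zip(noisy_preds(err), gt)])
          for name, err in [("A", 0.05), ("B", 0.08)]}

# Day 2: case-level bootstrap to quantify the uncertainty of the mean metric.
for name, s in scores.items():
    boot = [s[rng.integers(0, n_cases, n_cases)].mean() for _ in range(n_boot)]
    lo, hi = np.percentile(boot, [2.5, 97.5])
    print(f"{name}: mean Dice {s.mean():.3f}, 95% bootstrap CI [{lo:.3f}, {hi:.3f}]")

# Day 3: ranking stability -- how often does A outrank B when the test
# cases are resampled? Values near 50% indicate an unstable ranking.
idx = rng.integers(0, n_cases, (n_boot, n_cases))
wins_a = (scores["A"][idx].mean(axis=1) > scores["B"][idx].mean(axis=1)).mean()
print(f"A ranked above B in {wins_a:.1%} of bootstrap replicates")
```

Case-level resampling is used throughout so that the confidence intervals and the ranking stability estimate reflect the same source of variability: the finite test set.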