University of Bamberg

Information Systems and Applied Computer Sciences

Chair of Explainable Machine Learning

Dataset: MedMNIST-C: 12 corrupted benchmark datasets and augmentation APIs for robust medical image classification

xAILab Bamberg

We introduce MedMNIST-C, a comprehensive robustness benchmark for medical image classification based on the MedMNIST+ dataset collection. The benchmark covers 12 2D datasets and 9 imaging modalities, and provides modality-specific image corruptions at five severity levels to simulate realistic artifacts and distribution shifts encountered in medical imaging applications. In addition to the benchmark datasets themselves, MedMNIST-C includes software APIs for data augmentation, facilitating both robustness assessment and robustness-oriented model development.

The following is a summary of the publicly available dataset and accompanying codebase. For further details, please refer to the preprint, the Zenodo release, and the README of the current repository.

Corrupted datasets derived from MedMNIST+

  • Benchmark datasets derived from the MedMNIST+ collection at resolution 224x224 [link]
  • Coverage of 12 2D datasets spanning 9 imaging modalities
  • Corruptions designed to reflect modality-specific imaging artifacts
  • Five predefined corruption severity levels for controlled robustness evaluation
  • Public release of the corrupted datasets via Zenodo

Corruption types

MedMNIST-C organizes corruptions into five main categories:

  • digital corruptions, such as JPEG compression and pixelation
  • noise corruptions, including Gaussian, speckle, impulse, and shot noise
  • blur corruptions, including Gaussian, defocus, motion, and zoom blur
  • color corruptions, including brightness, contrast, saturation, and gamma shifts
  • task-specific corruptions, including stain deposits, bubbles, black corners, and acquisition overlays

Each corruption is applied at five increasing severity levels, enabling systematic robustness analysis across a broad range of medical imaging tasks.
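
To make the idea of severity-parameterized corruptions concrete, the following is a minimal sketch in plain Python. It is not the MedMNIST-C implementation: the function names, the `SEVERITY` table, and all numeric settings are hypothetical, and the pixelation is a simplified block-averaging variant. It only illustrates how one corruption type can map to five predefined intensity settings.

```python
import random

# Hypothetical intensity settings for severity levels 1..5.
# The benchmark's actual values are defined in its own corruption registry
# and may differ from these illustrative numbers.
SEVERITY = {
    "gaussian_noise": [0.04, 0.08, 0.12, 0.18, 0.26],  # noise std for pixels in [0, 1]
    "pixelate": [2, 3, 4, 6, 8],                        # block size in pixels
}


def gaussian_noise(image, severity, rng):
    """Add zero-mean Gaussian noise to a grayscale image (rows of floats in [0, 1])."""
    sigma = SEVERITY["gaussian_noise"][severity - 1]
    return [
        [min(1.0, max(0.0, pixel + rng.gauss(0.0, sigma))) for pixel in row]
        for row in image
    ]


def pixelate(image, severity):
    """Replace every block-by-block tile with its mean value (simplified pixelation)."""
    block = SEVERITY["pixelate"][severity - 1]
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for top in range(0, height, block):
        for left in range(0, width, block):
            ys = range(top, min(top + block, height))
            xs = range(left, min(left + block, width))
            tile = [image[y][x] for y in ys for x in xs]
            mean = sum(tile) / len(tile)
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out


def corrupt(image, name, severity, seed=0):
    """Apply the named corruption at a severity level between 1 and 5."""
    if not 1 <= severity <= 5:
        raise ValueError("severity must be between 1 and 5")
    if name == "gaussian_noise":
        return gaussian_noise(image, severity, random.Random(seed))
    if name == "pixelate":
        return pixelate(image, severity)
    raise KeyError(f"unknown corruption: {name}")


# Usage: corrupt a small synthetic gradient image at every severity level.
gradient = [[(x + y) / 14 for x in range(8)] for y in range(8)]
for level in range(1, 6):
    noisy = corrupt(gradient, "gaussian_noise", level, seed=42)
    blocky = corrupt(gradient, "pixelate", level)
```

Keeping the intensity settings in one table, separate from the corruption functions, is one way to make the five levels reproducible across experiments; the actual benchmark may organize this differently.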

Code and APIs

The repository provides the main components required to create, use, and evaluate the benchmark:

  • corruption registry with predefined intensity settings
  • dataset manager for generating corrupted datasets
  • dataset loaders for the corrupted benchmarks
  • visualization tools for inspecting corruption effects
  • augmentation APIs for corruption-based training (termed targeted augmentations)
  • PyTorch evaluation utilities for robustness experiments
  • normalization baselines for model evaluation
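
The role of a normalization baseline can be illustrated with the corruption error used by ImageNet-C, where a model's error for one corruption, summed over the five severity levels, is divided by the corresponding sum for a fixed baseline model. The sketch below assumes this convention; the exact metric and baseline used by MedMNIST-C are described in the preprint and may differ. The function names and all error values are hypothetical.

```python
def corruption_error(model_errors, baseline_errors):
    """ImageNet-C-style corruption error for a single corruption type.

    Both arguments hold one error rate per severity level. The model's
    summed error is divided by the baseline's summed error, so a value
    below 1.0 means the model is more robust than the baseline.
    """
    if not model_errors or len(model_errors) != len(baseline_errors):
        raise ValueError("need one error per severity level for both models")
    return sum(model_errors) / sum(baseline_errors)


def mean_corruption_error(model, baseline):
    """Average the normalized corruption error over all corruption types."""
    scores = [corruption_error(model[name], baseline[name]) for name in model]
    return sum(scores) / len(scores)


# Hypothetical error rates per severity level (1..5); not measured results.
model = {
    "gaussian_noise": [0.10, 0.15, 0.20, 0.30, 0.45],
    "jpeg_compression": [0.08, 0.10, 0.12, 0.16, 0.24],
}
baseline = {
    "gaussian_noise": [0.20, 0.30, 0.40, 0.60, 0.90],
    "jpeg_compression": [0.10, 0.14, 0.18, 0.26, 0.32],
}
score = mean_corruption_error(model, baseline)
```

Normalizing per corruption before averaging prevents a few intrinsically hard corruptions from dominating the aggregate score.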

License

  • Code: Apache-2.0
  • Dataset: CC BY 4.0, except for DermaMNIST-C: CC BY-NC 4.0

Citation

F. Di Salvo, S. Doerrich, and C. Ledig, “MedMNIST-C: Comprehensive benchmark and improved classifier robustness by simulating realistic image corruptions”, arXiv, 2024. [preprint] [code] [bib] [zenodo]

Please also cite MedMNIST, the underlying source datasets, and ImageNet-C, as recommended by the repository.
