MMRL4H@EurIPS 2025

Multimodal Representation Learning for Healthcare

EurIPS 2025 Workshop


About

Clinical decision making often draws on multiple diverse data sources, including, but not limited to, images, text reports, electronic health records (EHR), continuous physiological signals, and genomic profiles; yet most AI systems deployed in healthcare remain unimodal. Integrating diverse medical modalities presents significant challenges, including data heterogeneity, data scarcity, missing or asynchronous modalities, variable data quality, and the lack of standardized frameworks for aligned representation learning. This workshop brings the EurIPS community together to tackle the problem of fusing these heterogeneous inputs into coherent, interpretable patient representations that mirror the holistic reasoning of clinicians. We aim to bring together machine learning researchers, clinicians, and industry partners dedicated to the theory, methods, and translation of multimodal learning in healthcare.

Goals

In this workshop, we aim to:

  • Advance methods for learning joint representations from images, text, signals, and genomics.
  • Investigate foundation-model pretraining at scale on naturally paired modalities.
  • Address robustness, fairness, and missing-modality issues unique to healthcare fusion.
  • Foster clinician–ML collaboration and outline translational paths to deployment.

Key Info

Important Dates

Our Call for Papers is now open!

Please note that all deadlines are Anywhere on Earth (AoE).

  • Submission Deadline: October 15, 2025
  • Acceptance Notification: October 31, 2025
  • Camera-Ready Submission Deadline: November 15, 2025
  • Workshop Date: December 6, 2025

Call for Papers

Authors are invited to submit 4-page abstracts on topics relevant to multimodal representation learning in healthcare. These include, but are not limited to, vision-language models for radiology, temporal alignment of multimodal ICU streams, graph and transformer architectures for patient data fusion, cross-modal self-supervised objectives, and multimodal benchmarks with fairness and bias analysis.

Submission

  • Submission site: via OpenReview
  • Format: NeurIPS 2025 template
  • Length: max 4 pages excluding references
  • Review: Double-blind
  • Anonymization: Required; ensure that no names or affiliations appear in any part of the submission, including any code.

All accepted papers will be published on the workshop website. Please note that there will be no workshop proceedings (the workshop is non-archival).

Poster Format

All accepted workshop papers will be presented as physical posters during the MMRL4H@EurIPS 2025 workshop in Copenhagen.

  • Size: A1
  • Orientation: Portrait
  • Printing: Authors are responsible for printing and bringing their posters
  • Poster sessions: Opportunities for in-depth discussion and networking with attendees and invited speakers

Accepted Contributions

  1. EmoSLLM: Parameter-Efficient Adaptation of LLMs for Speech Emotion Recognition
    by Hugo Thimonier, Antony Perzo, Renaud Seguier
    PDF · OpenReview

  2. Position: Real-World Clinical AI Requires Multimodal, Longitudinal, and Privacy-Preserving Corpora
    by Azmine Toushik Wasi, Shahriyar Zaman Ridoy
    PDF · OpenReview

  3. When are radiology reports useful for training medical image classifiers?
    by Herman Bergström, Zhongqi Yue, Fredrik D. Johansson
    PDF · OpenReview

  4. MIND: Multimodal Integration with Neighbourhood-aware Distributions
    by Hanwen Xing, Christopher Yau
    PDF · OpenReview

  5. NAP: Attention-Based Late Fusion for Automatic Sleep Staging
    by Alvise Dei Rossi, Julia van der Meer, Markus Schmidt, Claudio L. A. Bassetti, Luigi Fiorillo, Francesca D. Faraci
    PDF · OpenReview

  6. Are Large Vision Language Models Truly Grounded in Medical Images? Evidence from Italian Clinical Visual Question Answering
    by Federico Felizzi, Olivia Riccomi, Michele Ferramola, Francesco Andrea Causio, Manuel Del Medico, De Vita Vittorio, Lorenzo De Mori, Alessandra Piscitelli, Pietro Eric Risuleo, Bianca Destro Castaniti, Antonio Cristiano, Alessia Longo, Luigi De Angelis, Mariapia Vassalli, Marcello Di Pumpo
    PDF · OpenReview

  7. Multimodal Alignment for Synthetic Clinical Time Series
    by Arinbjörn Kolbeinsson, Benedikt Kolbeinsson
    PDF · OpenReview

  8. iMML: A Python package for multi-modal learning with incomplete data
    by Alberto López, John Zobolas, Tanguy Dumontier, Tero Aittokallio
    PDF · OpenReview

  9. POEMS: Product of Experts for Interpretable Multi-omic Integration using Sparse Decoding
    by Mihriban Kocak Balik, Pekka Marttinen, Negar Safinianaini
    PDF · OpenReview

  10. Multi-Omic Transfer Learning for the Diagnosis & Prognosis of Blood Cancers
    by Leonardo P.A. Biral, Sandeep Dave
    PDF · OpenReview

  11. From Binning to Joint Embeddings: Robust Numeric Integration for EHR Transformers
    by Maria Elkjær Montgomery, Mads Nielsen
    PDF · OpenReview

  12. A tutorial on discovering and quantifying the effect of latent causal sources of multimodal EHR data
    by Marco Barbero Mota, Eric Strobl, John M Still, William W. Stead, Thomas A Lasko
    PDF · OpenReview

  13. Multi-Modal AI for Remote Patient Monitoring in Cancer Care
    by Yansong Liu, Ronnie Stafford, Pramit Khetrapal, Huriye Kocadag, Graca Carvalho, Patricia de Winter, Maryam Imran, Amelia Snook, Adamos Hadjivasiliou, D Vijay Anand, Weining Lin, John Kelly, Yukun Zhou, Ivana Drobnjak
    PDF · OpenReview

  14. VenusGT: A Trajectory-Aware Graph Transformer for Rare-Cell Discovery in Single-Cell Multi-Omics
    by Natalia Sikora, Rebecca Rees, Sean Righardt Holm, Hanchi Ren, Lewis W. Francis
    PDF · OpenReview

  15. Virtual Breath-Hold (VBH) for Free-Breathing CT/MRI: Segmentation-Guided Fusion with Image-Signal Alignment
    by Rian Atri
    PDF · OpenReview

  16. Towards Multimodal Representation Learning in Paediatric Kidney Disease
    by Ana Durica, John Booth, Ivana Drobnjak
    PDF · OpenReview

  17. A learning health system in Neurorehabilitation as a foundation for multimodal patient representation
    by Thomas Weikert, Eljas Roellin, Diego Paez-Granados, Chris Easthope Awai
    PDF · OpenReview


Schedule

Time         Session
9:00–9:15    Opening Remarks
9:15–9:35    Speaker Session I - Gunnar Rätsch
9:40–10:00   Speaker Session I - Sonali Parbhoo
10:00–10:10  Oral Session I - MIND: Multimodal Integration with Neighbourhood-aware Distributions
10:10–10:20  Oral Session I - Are Large Vision Language Models Truly Grounded in Medical Images? Evidence from Italian Clinical Visual Question Answering
10:20–11:00  Coffee & Poster Session I
11:00–11:20  Speaker Session II - Bianca Dumitrascu
11:25–11:45  Speaker Session II - Rajesh Ranganath
11:45–12:30  Panel (all speakers) - "Translating Multimodal ML to the Bedside"
12:30–14:00  Lunch & Networking
14:00–14:20  Speaker Session III - Desmond Elliott
14:25–14:45  Speaker Session III - Stephanie Hyland
14:45–14:55  Oral Session II - When are radiology reports useful for training medical image classifiers?
14:55–15:05  Oral Session II - POEMS: Product of Experts for Interpretable Multi-omic Integration using Sparse Decoding
15:05–15:45  Coffee & Poster Session II
15:45–16:15  Open Q&A
16:15–16:30  Closing Remarks & Awards

Confirmed Speakers

Desmond Elliott
University of Copenhagen, DK

Sonali Parbhoo
Imperial College London, UK

Stephanie Hyland
Microsoft Research, UK

Rajesh Ranganath
New York University, USA

Bianca Dumitrascu
Columbia University, USA

Gunnar Rätsch
ETH Zurich, CH


Organizers

Stephan Mandt
Associate Professor, UC Irvine, USA

Ece Özkan Elsen
Assistant Professor, University of Basel, CH

Samuel Ruiperez-Campillo
PhD Student, ETH Zurich, CH

Thomas Sutter
Postdoctoral Researcher, ETH Zurich, CH

Julia Vogt
Associate Professor, ETH Zurich, CH

Nikita Narayanan
PhD Student, Imperial College London, UK

Contact