MMRL4H@EurIPS 2025

Multimodal Representation Learning for Healthcare

EurIPS 2025 Workshop


About

Clinical decision making often draws on diverse data sources, including images, text reports, electronic health records (EHR), continuous physiological signals, and genomic profiles, yet most AI systems deployed in healthcare remain unimodal. Integrating these modalities presents significant challenges: data heterogeneity, data scarcity, missing or asynchronous modalities, variable data quality, and the lack of standardized frameworks for aligned representation learning. This workshop brings the EurIPS community together to tackle the problem of fusing heterogeneous inputs into coherent, interpretable patient representations that mirror the holistic reasoning of clinicians. We aim to connect machine learning researchers, clinicians, and industry partners dedicated to the theory, methods, and translation of multimodal learning in healthcare.

Goals

In this workshop, we aim to:

  • Advance methods for learning joint representations from images, text, signals, and genomics.
  • Investigate foundation-model pretraining at scale on naturally paired modalities.
  • Address robustness, fairness, and missing-modality issues unique to healthcare fusion.
  • Foster clinician–ML collaboration and outline translational paths to deployment.

Key Info

Important Dates

Our Call for Papers is now open!

Please note that all deadlines are Anywhere on Earth (AoE).

  • Submission Deadline: October 15, 2025
  • Acceptance Notification: October 31, 2025
  • Camera-Ready Submission Deadline: November 15, 2025
  • Workshop Date: December 6, 2025

Call for Papers

Authors are invited to submit 4-page abstracts on topics relevant to multimodal representation learning in healthcare. These include, but are not limited to, vision-language models for radiology, temporal alignment of multimodal ICU streams, graph and transformer architectures for patient data fusion, cross-modal self-supervised objectives, and multimodal benchmarks with fairness and bias analysis.

Submission

  • Submission site: via OpenReview
  • Format: NeurIPS 2025 template
  • Length: max 4 pages excluding references
  • Review: Double-blind
  • Anonymization: Required; ensure that no names or affiliations appear in any part of the submission, including code

All accepted papers will be published on the workshop website. Please note that the workshop is non-archival: there will be no formal proceedings.

Poster Format

All accepted workshop papers will be presented as physical posters during the MMRL4H@EurIPS 2025 workshop in Copenhagen.

  • Size: A1
  • Orientation: Portrait
  • Printing: Authors are responsible for printing and bringing their posters
  • Poster sessions: Provide opportunities for in-depth discussion and networking with attendees and invited speakers

Accepted Contributions

  1. EmoSLLM: Parameter-Efficient Adaptation of LLMs for Speech Emotion Recognition
    by Hugo Thimonier, Antony Perzo, Renaud Seguier
    PDF · OpenReview

  2. Position: Real-World Clinical AI Requires Multimodal, Longitudinal, and Privacy-Preserving Corpora
    by Azmine Toushik Wasi, Shahriyar Zaman Ridoy
    PDF · OpenReview

  3. When are radiology reports useful for training medical image classifiers?
    by Herman Bergström, Zhongqi Yue, Fredrik D. Johansson
    PDF · OpenReview

  4. MIND: Multimodal Integration with Neighbourhood-aware Distributions
    by Hanwen Xing, Christopher Yau
    PDF · OpenReview

  5. NAP: Attention-Based Late Fusion for Automatic Sleep Staging
    by Alvise Dei Rossi, Julia van der Meer, Markus Schmidt, Claudio L. A. Bassetti, Luigi Fiorillo, Francesca D. Faraci
    PDF · OpenReview

  6. Are Large Vision Language Models Truly Grounded in Medical Images? Evidence from Italian Clinical Visual Question Answering
    by Federico Felizzi, Olivia Riccomi, Michele Ferramola, Francesco Andrea Causio, Manuel Del Medico, De Vita Vittorio, Lorenzo De Mori, Alessandra Piscitelli, Pietro Eric Risuleo, Bianca Destro Castaniti, Antonio Cristiano, Alessia Longo, Luigi De Angelis, Mariapia Vassalli, Marcello Di Pumpo
    PDF · OpenReview

  7. Multimodal Alignment for Synthetic Clinical Time Series
    by Arinbjörn Kolbeinsson, Benedikt Kolbeinsson
    PDF · OpenReview

  8. iMML: A Python package for multi-modal learning with incomplete data
    by Alberto López, John Zobolas, Tanguy Dumontier, Tero Aittokallio
    PDF · OpenReview

  9. POEMS: Product of Experts for Interpretable Multi-omic Integration using Sparse Decoding
    by Mihriban Kocak Balik, Pekka Marttinen, Negar Safinianaini
    PDF · OpenReview

  10. Multi-Omic Transfer Learning for the Diagnosis & Prognosis of Blood Cancers
    by Leonardo P.A. Biral, Sandeep Dave
    PDF · OpenReview

  11. From Binning to Joint Embeddings: Robust Numeric Integration for EHR Transformers
    by Maria Elkjær Montgomery, Mads Nielsen
    PDF · OpenReview

  12. A tutorial on discovering and quantifying the effect of latent causal sources of multimodal EHR data
    by Marco Barbero Mota, Eric Strobl, John M Still, William W. Stead, Thomas A Lasko
    PDF · OpenReview

  13. Multi-Modal AI for Remote Patient Monitoring in Cancer Care
    by Yansong Liu, Ronnie Stafford, Pramit Khetrapal, Huriye Kocadag, Graca Carvalho, Patricia de Winter, Maryam Imran, Amelia Snook, Adamos Hadjivasiliou, D Vijay Anand, Weining Lin, John Kelly, Yukun Zhou, Ivana Drobnjak
    PDF · OpenReview

  14. VenusGT: A Trajectory-Aware Graph Transformer for Rare-Cell Discovery in Single-Cell Multi-Omics
    by Natalia Sikora, Rebecca Rees, Sean Righardt Holm, Hanchi Ren, Lewis W. Francis
    PDF · OpenReview

  15. Virtual Breath-Hold (VBH) for Free-Breathing CT/MRI: Segmentation-Guided Fusion with Image-Signal Alignment
    by Rian Atri
    PDF · OpenReview

  16. Towards Multimodal Representation Learning in Paediatric Kidney Disease
    by Ana Durica, John Booth, Ivana Drobnjak
    PDF · OpenReview

  17. A learning health system in Neurorehabilitation as a foundation for multimodal patient representation
    by Thomas Weikert, Eljas Roellin, Diego Paez-Granados, Chris Easthope Awai
    PDF · OpenReview


Schedule

Time         Session
9:00–9:15    Opening Remarks
9:15–9:35    Speaker Session I - Gunnar Rätsch
9:40–10:00   Speaker Session I - Sonali Parbhoo
10:00–10:10  Oral Session I - MIND: Multimodal Integration with Neighbourhood-aware Distributions
10:10–10:20  Oral Session I - Are Large Vision Language Models Truly Grounded in Medical Images? Evidence from Italian Clinical Visual Question Answering
10:20–11:00  Coffee & Poster Session I
11:00–11:20  Speaker Session II - Bianca Dumitrascu
11:25–11:45  Speaker Session II - Rajesh Ranganath
11:45–12:30  Panel (all speakers) - "Translating Multimodal ML to the Bedside"
12:30–14:00  Lunch & Networking
14:00–14:20  Speaker Session III - Desmond Elliott
14:25–14:45  Speaker Session III - Stephanie Hyland
14:45–14:55  Oral Session II - When are radiology reports useful for training medical image classifiers?
14:55–15:05  Oral Session II - POEMS: Product of Experts for Interpretable Multi-omic Integration using Sparse Decoding
15:05–15:45  Coffee & Poster Session II
15:45–16:15  Open Q&A
16:15–16:30  Closing Remarks & Awards

Confirmed Speakers

Desmond Elliott
University of Copenhagen, DK

Sonali Parbhoo
Imperial College London, UK

Stephanie Hyland
Microsoft Research, UK

Rajesh Ranganath
New York University, USA

Bianca Dumitrascu
Columbia University, USA

Gunnar Rätsch
ETH Zurich, CH


Organizers

Stephan Mandt
Associate Professor, UC Irvine, USA

Ece Özkan Elsen
Assistant Professor, University of Basel, CH

Samuel Ruiperez-Campillo
PhD Student, ETH Zurich, CH

Thomas Sutter
Postdoctoral Researcher, ETH Zurich, CH

Julia Vogt
Associate Professor, ETH Zurich, CH

Nikita Narayanan
PhD Student, Imperial College London, UK


Program Committee

Mario Wieser, Genedata AG
Florian Barkmann, ETH Zurich
Raphael Pisoni, Software Competence Center Hagenberg
Rian Atri, Wake Technical Community College
Ana Durica, University College London
Maxim Samarin, Swiss Data Science Center
Daphné Chopard, ETH Zurich
Jorge da Silva Gonçalves, ETH Zurich
Hugo Thimonier, Emobot
Khushboo Bhatia, Google
Moritz Vandenhirtz, ETH Zurich
Miguel Rodrigo, Universidad de Valencia
Fabricio Arend Torres, Rekonas GmbH
Olga Ovcharenko, Technische Universität Berlin
Sonia Laguna, ETH Zurich
Simon Böhi, University of Basel
Prasanth Ganesan, Stanford University
Alain Ryser, ETH Zurich
Alice Bizeul, ETH Zurich
Alvise Dei Rossi, Università della Svizzera Italiana
Federico Felizzi, SIIAM
Natalia Sikora, Swansea University
Marc Glettig, ETH Zurich
Alejandro Guerrero-López, University of Zurich
Leonardo P.A. Biral, Duke University
Arinbjörn Kolbeinsson, University of Virginia, Charlottesville
Michael Reiss, University of California, San Diego
Hanwen Xing, University of Oxford
Thomas Weikert, INRIA
Pratik Kumar, California Polytechnic State University, Pomona
Harika Mahapatra, Sree Vidyanikethan Engineering College
Alberto López, University of Oslo
Marco Barbero Mota, Vanderbilt University
Azmine Toushik Wasi, Computational Intelligence and Operations Laboratory
Maria Elkjær Montgomery, University of Copenhagen
Yang Meng, University of California, Irvine
Aayush Grover, ETH Zurich
Sergio Muñoz Gonzalez, University of Basel
Mahule Roy, National Institute of Technology Karnataka
Aditya Acharya, Hochschule für Technik und Wirtschaft des Saarlandes
Herman Bergström, Chalmers University of Technology
Tobias Scheithauer, ETH Zurich
Robin C. Geyer, ETH Zurich
Paul Fischer, Eberhard-Karls-Universität Tübingen
Marina Esteban-Medina, ETH Zurich
Agnieszka Kraft, ETH Zurich
Kalin Nonchev, ETH Zurich
Francesco Ignazio Re, ETH Zurich
Ahlem Aziz, Karabuk University

Contact