Multimodal Unlearning Across Vision, Language, Video, and Audio: Survey of Methods, Datasets, and Benchmarks
Overview of Survey
Multimodal unlearning requires identifying effective intervention points within the model pipeline. Figure 2 illustrates methods that intervene at the data-side, training-time, architecture-constrained, and decoding-time stages, each producing an updated model (MFM′). Training-free approaches instead edit parameters or representations directly (Δ).
Figure 2: System-level intervention points for multimodal unlearning across the model pipeline.
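To make the idea of a direct representation edit (Δ) concrete, here is a minimal sketch that removes the component of a hidden vector along a single "concept" direction. This is only an illustration under stated assumptions: the function names `project_out` and `edit_delta`, the toy vectors, and the choice of linear projection are all hypothetical and are not taken from the survey or from any specific method it covers.

```python
import math


def project_out(h, v):
    """Return a copy of h with its component along direction v removed.

    h: representation vector (list of floats), hypothetical hidden state.
    v: concept direction to suppress (list of floats), assumed to be given.
    """
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("direction v must be non-zero")
    u = [x / norm for x in v]                      # unit-length direction
    coeff = sum(a * b for a, b in zip(h, u))       # scalar projection of h onto u
    return [a - coeff * b for a, b in zip(h, u)]   # subtract the projected part


def edit_delta(h, v):
    """Return the edit (edited - original), i.e. the delta applied to h."""
    edited = project_out(h, v)
    return [e - a for e, a in zip(edited, h)]


if __name__ == "__main__":
    h = [3.0, 4.0]          # toy representation
    v = [1.0, 0.0]          # toy concept direction
    print(project_out(h, v))
    print(edit_delta(h, v))
```

No gradient step or retraining is involved here; the edit is computed in closed form from the representation and the chosen direction, which is the sense in which such an approach would be training-free.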
We organize multimodal unlearning via a system-first taxonomy across five intervention stages: Data-Side Interventions (Section 3.1); Training-Time Edits (Section 3.2); Architecture-Constrained Unlearning (Section 3.3); Training-Free Unlearning (Section 3.4); Decoding-Time Unlearning (Section 3.5).
Figure 1: Taxonomy of multimodal unlearning by intervention stage and control pathway.
Evaluation Metrics
Evaluation uses metric suites that assess forgetting, utility retention, robustness, and efficiency, as summarized in Figure 3. We defer detailed metric definitions and evaluation protocols to Appendix C.
Figure 3: Evaluation dimensions and representative metrics for multimodal unlearning.
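As a rough illustration of how forgetting and utility retention might be measured side by side, the sketch below compares accuracy on a forget set with accuracy on a retain set. It is an assumption-laden example, not the survey's protocol: the function names, the use of plain accuracy, and the toy labels are hypothetical, and the actual metric definitions are those given in Appendix C.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that exactly match their labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def unlearning_report(forget_preds, forget_labels, retain_preds, retain_labels):
    """Summarize forgetting and retention for an unlearned model.

    Lower forget_accuracy suggests the targeted content is less recoverable;
    higher retain_accuracy suggests general capability was preserved.
    """
    return {
        "forget_accuracy": accuracy(forget_preds, forget_labels),
        "retain_accuracy": accuracy(retain_preds, retain_labels),
    }


if __name__ == "__main__":
    # Toy outputs from a hypothetical unlearned model.
    report = unlearning_report(
        forget_preds=["a", "b", "c", "d"],
        forget_labels=["a", "x", "y", "z"],
        retain_preds=["cat", "dog", "car", "tree"],
        retain_labels=["cat", "dog", "car", "bird"],
    )
    print(report)
```

Reporting the two numbers together matters because either one alone can mislead: a model that outputs nothing useful would score well on forgetting while failing on retention. Robustness and efficiency would need separate measurements that this sketch does not cover.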
Applications of Multimodal Unlearning
Multimodal unlearning enables selective removal of specific identities, attributes, or concepts without full retraining while preserving overall capability and stability. Detailed use cases and representative studies are provided in Appendix F.
Figure 4: Core application scenarios of multimodal unlearning.
Contact
This repository is actively maintained and continuously updated 🚀.
If you notice any issues or would like your work included, please open an issue or contact us.
BibTeX
@article{sarwar2026mm-unlearning-survey,
title = {{Multimodal Unlearning Across Vision, Language, Video, and Audio: Survey of Methods, Datasets, and Benchmarks}},
author = {Sarwar, Nobin and Roy Dipta, Shubhashis and Liu, Zheyuan and Patil, Vaidehi},
year = {2026},
doi = {10.36227/techrxiv.176945748.88280394/v1},
url = {https://doi.org/10.36227/techrxiv.176945748.88280394/v1},
publisher = {Institute of Electrical and Electronics Engineers (IEEE)},
month = jan
}