Nobin Sarwar

I am a 3rd-year Computer Science Ph.D. student in the Language Understanding Lab at the University of Maryland, Baltimore County, advised by Prof. Francis Ferraro.

My current research studies post-training optimization and system-level methods for LLMs and multimodal foundation models, with a focus on agentic reasoning and controllable adaptation. Ongoing work spans:

Reasoning reliability and verification: retrieval-grounded structured inference and agentic retrieve-verify workflows for scientific claim and feasibility assessment, hallucination mitigation in multimodal QA (FilterRAG)
Privacy-preserving and controllable model adaptation: federated fine-tuning of LLMs with Differential Privacy (FedMentor, FedMentalCare), targeted multimodal unlearning (Multimodal Unlearning Survey)

Previously, I earned my MS in Computer Science from the University of Texas Rio Grande Valley, where I worked on privacy-preserving Federated Learning for biometrics and Differential Privacy, contributing to projects that received NSF funding.

I am open to collaborations on agentic reasoning, post-training, and LLM evaluation, and to mentoring undergraduate and graduate students who are new to research. Email me anytime.

Research News

Jul 07, 2026	🌍 Our paper Pluralis v0.1, introducing a multimodal benchmark for multicultural AI safety, is now available on arXiv.
May 24, 2026	🌍 Released Academic Research Opportunities for International Students — a curated collection of global research mentorship programs, fellowships, internships, and predoctoral opportunities.
May 14, 2026	🥐 Our paper Croissant Baker is now available on arXiv, with code released on GitHub.
Apr 06, 2026	🎉 Our Multimodal Unlearning Survey has been accepted as Findings of ACL 2026.
Feb 28, 2026	🚀 Released the repository and project page for our survey paper on Multimodal Unlearning.

Publications Spotlight

Full publication list on Google Scholar →

Preprint’26

Pluralis v0.1: Towards a Multicultural, Multimodal, Multilingual Benchmark for AI Risk and Reliability

Alicia Parrish, Rajat Shinde, ..., Nobin Sarwar, ..., and Lora Aroyo

arXiv preprint arXiv:2607.06196

@article{parrish2026pluralis,
  title = {Pluralis v0.1: Towards a Multicultural, Multimodal, Multilingual Benchmark for AI Risk and Reliability},
  author = {Parrish, Alicia and Shinde, Rajat and ... and Sarwar, Nobin and ... and Aroyo, Lora},
  journal = {arXiv preprint arXiv:2607.06196},
  year = {2026},
  dimensions = {true}
}

Coming Soon

CroissantMiner: Automated Extraction and Validation of Croissant Metadata for ML Datasets

Berke Arda, Mubashara Akhtar, Ahmetcan Yavuz, Paul Gerry, Sebastian Lobentanzer, Nobin Sarwar, Joan Giner-Miguelez, and 3 more authors

Coming soon

@article{arda2026croissantminer,
  title = {CroissantMiner: Automated Extraction and Validation of Croissant Metadata for ML Datasets},
  author = {Arda, Berke and Akhtar, Mubashara and Yavuz, Ahmetcan and Gerry, Paul and Lobentanzer, Sebastian and Sarwar, Nobin and Giner-Miguelez, Joan and Chen, Kongtao and Zhang, Luyao and Sachan, Mrinmaya},
  year = {2026},
  note = {Coming soon},
  dimensions = {true},
  demo = {https://huggingface.co/spaces/bearda/croissantminer},
  dataset = {https://huggingface.co/datasets/croissantminer/croissantminer}
}

Preprint’26

Croissant Baker: Metadata Generation for Discoverable, Governable, and Reusable ML Datasets

Rafi Al Attrach, Rajna Fani, Sebastian Lobentanzer, Joan Giner-Miguelez, Debanshu Das, Varuni H. K., Nobin Sarwar, and 13 more authors

arXiv preprint arXiv:2605.15079

@article{attrach2026croissantbaker,
  title = {Croissant Baker: Metadata Generation for Discoverable, Governable, and Reusable ML Datasets},
  author = {Al Attrach, Rafi and Fani, Rajna and Lobentanzer, Sebastian and Giner-Miguelez, Joan and Das, Debanshu and H. K., Varuni and Sarwar, Nobin and Ghosh, Rajat and Archit, Anwai and Motghare, Surbhi and Parry, Christina Conrad and Oala, Luis and Grosso, Lara and Vanschoren, Joaquin and Vogler, Steffen and Goswami, Sujata and Rosenthal, Eric S. and Ghassemi, Marzyeh and McDermott, Matthew and Pollard, Tom},
  journal = {arXiv preprint arXiv:2605.15079},
  year = {2026},
  dimensions = {true},
}

ACL’26

Multimodal Unlearning Across Vision, Language, Video, and Audio: Survey of Methods, Datasets, and Benchmarks

Nobin Sarwar, Shubhashis Roy Dipta, Zheyuan Liu, and Vaidehi Patil

In Findings of the Association for Computational Linguistics: ACL 2026

@inproceedings{sarwar-etal-2026-multimodal,
  title = {{Multimodal Unlearning Across Vision, Language, Video, and Audio: Survey of Methods, Datasets, and Benchmarks}},
  author = {Sarwar, Nobin and {Roy Dipta}, Shubhashis and Liu, Zheyuan and Patil, Vaidehi},
  booktitle = {Findings of the {A}ssociation for {C}omputational {L}inguistics: {ACL} 2026},
  year = {2026},
  month = jul,
  publisher = {Association for Computational Linguistics},
  dimensions = {true},
}

NeurIPS-W’25

FedMentor: Domain-Aware Differential Privacy for Heterogeneous Federated LLMs in Mental Health

Nobin Sarwar and Shubhashis Roy Dipta

In GenAI4Health Workshop, NeurIPS

@inproceedings{sarwar2025fedmentor,
  title = {FedMentor: Domain-Aware Differential Privacy for Heterogeneous Federated LLMs in Mental Health},
  author = {Sarwar, Nobin and {Roy Dipta}, Shubhashis},
  booktitle = {GenAI4Health Workshop, NeurIPS},
  year = {2025},
  url = {https://genai4health.github.io/},
  dimensions = {true}
}

ICCV-W’25

FilterRAG: Zero-Shot Informed Retrieval-Augmented Generation to Mitigate Hallucinations in VQA

Nobin Sarwar

In T2FM Workshop, ICCV

@inproceedings{sarwar2025filterrag,
  title = {FilterRAG: Zero-Shot Informed Retrieval-Augmented Generation to Mitigate Hallucinations in VQA},
  author = {Sarwar, Nobin},
  booktitle = {T2FM Workshop, ICCV},
  year = {2025},
  url = {https://t2fm-ws.github.io/T2FM-ICCV25/},
  dimensions = {true}
}