Xiangxi Shi

Benchmarking Out-of-Distribution Detection in Visual Question Answering

Xiangxi Shi, Stefan Lee

WACV 2024

When faced with an out-of-distribution (OOD) question or image, VQA systems may provide unreliable answers. This work benchmarks a suite of OOD detection approaches for the multimodal VQA task, using popular VQA datasets and composite settings to isolate different OOD factors (e.g., visual novelty, linguistic novelty, and image–question agreement). The results suggest that answer confidence alone is often a poor signal, while question-generation-based and attention-based methods can significantly improve detection, though ungrounded pairs and small image distribution shifts remain challenging.
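To make the "answer confidence" signal concrete, here is a minimal sketch of a confidence-based OOD score: one minus the maximum softmax probability over a VQA model's answer logits. This is an illustration under stated assumptions, not the paper's implementation. The function names, the example logits, and the 0.5 threshold are all hypothetical.

```python
import math
from typing import Sequence


def softmax(logits: Sequence[float]) -> list[float]:
    """Numerically stable softmax over a list of answer logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_ood_score(answer_logits: Sequence[float]) -> float:
    """Hypothetical OOD score: 1 - max softmax probability.

    Higher values mean the model is less confident in its top answer,
    which this baseline treats as evidence of an OOD input.
    """
    return 1.0 - max(softmax(answer_logits))


def is_ood(answer_logits: Sequence[float], threshold: float = 0.5) -> bool:
    """Flag an image-question pair as OOD if the score exceeds a threshold.

    The threshold is an assumed value; in practice it would be tuned
    on held-out data.
    """
    return confidence_ood_score(answer_logits) > threshold


# Made-up logits over four candidate answers.
peaked = [8.0, 0.5, 0.2, 0.1]   # one answer dominates
flat = [1.0, 0.9, 1.1, 1.0]     # no answer stands out

peaked_score = confidence_ood_score(peaked)
flat_score = confidence_ood_score(flat)
```

A score like this depends only on the answer distribution, so a model that is confidently wrong on an unfamiliar image or question will still look in-distribution. That is consistent with the finding above that confidence alone is often a poor signal.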


BibTeX

@InProceedings{Shi_2024_WACV,
  author = {Shi, Xiangxi and Lee, Stefan},
  title = {Benchmarking Out-of-Distribution Detection in Visual Question Answering},
  booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
  month = {January},
  year = {2024},
  pages = {5485-5495}
}