Master Assignment
[B] Why can’t Vision-Language Models like CLIP find OOD data in medical images? Or can they? How?
Type: Master EE/CS/ITC
Period: TBD
Student: (Unassigned)
If you are interested, please contact:
Background:
CLIP has been shown to achieve good out-of-distribution (OOD) detection performance for low-severity OOD cases; it can even perform well zero-shot, without any fine-tuning. However, this does not always hold for medical images, whose visual features can differ substantially from the natural images CLIP was trained on. For this reason, MediCLIP has been proposed.
The goal of this thesis is to investigate the limitations and possibilities of MediCLIP on a set of OOD cases of increasing complexity.
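To make the starting point concrete: a common zero-shot OOD score for CLIP-style models is the maximum softmax probability (MSP) over the image-to-prompt similarities. The sketch below shows only that scoring step, using random placeholder embeddings in place of a real CLIP image/text encoder; the function name, temperature value, and dummy data are illustrative assumptions, not part of any of the cited methods.

```python
import numpy as np

def zero_shot_ood_score(image_emb, text_embs, temperature=100.0):
    """MSP-style OOD score: cosine similarity of an image embedding to each
    class-prompt embedding, softmaxed; a low max-probability suggests OOD.
    (Placeholder sketch -- a real pipeline would get these embeddings from
    a CLIP/MediCLIP image and text encoder.)"""
    image_emb = image_emb / np.linalg.norm(image_emb)
    text_embs = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = temperature * (text_embs @ image_emb)   # scaled cosine similarities
    probs = np.exp(logits - logits.max())            # numerically stable softmax
    probs /= probs.sum()
    return probs.max()                               # high = in-distribution-like

# Dummy demonstration with random 512-d "embeddings" (CLIP's ViT-B/16 width):
rng = np.random.default_rng(0)
text_embs = rng.normal(size=(5, 512))                    # 5 fake class prompts
id_image = text_embs[2] + 0.1 * rng.normal(size=512)     # aligned with class 2
ood_image = rng.normal(size=512)                         # unrelated direction

print(zero_shot_ood_score(id_image, text_embs))
print(zero_shot_ood_score(ood_image, text_embs))
```

The in-distribution-like embedding should score higher than the unrelated one; the thesis would replace the dummy vectors with real encoder outputs and compare such baseline scores against MediCLIP's anomaly scores on medical OOD cases.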
Resources:
- CLIPN for Zero-Shot OOD Detection: Teaching CLIP to Say No https://openaccess.thecvf.com/content/ICCV2023/papers/Wang_CLIPN_for_Zero-Shot_OOD_Detection_Teaching_CLIP_to_Say_No_ICCV_2023_paper.pdf
- MediCLIP: Adapting CLIP for Few-shot Medical Image Anomaly Detection
- Adapting Contrastive Language-Image Pretrained (CLIP) Models for Out-of-Distribution Detection https://openreview.net/forum?id=YCgX7sJRF1
- BiomedCLIP https://huggingface.co/microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224