Abstract: With its powerful visual-language alignment capability, CLIP performs well in zero-shot and few-shot learning tasks. However, we found in experiments that CLIP’s logits suffer from serious ...
One-shot semantic segmentation aims to segment the object regions of unseen categories using only one annotated example as supervision. Existing methods often adopt the multimodal pre-trained model ...