General Image Descriptors For Open World Image Retrieval Using Vit CLIP

Marcos V. Conde, Ivan Aerlic, Simon Jégou. arXiv 2022 – 1 citation

Few Shot & Zero Shot Image Retrieval

The Google Universal Image Embedding (GUIE) Challenge is one of the first competitions in multi-domain image representations in the wild, covering a wide distribution of objects: landmarks, artwork, food, etc. This is a fundamental computer vision problem with notable applications in image retrieval, search engines and e-commerce. In this work, we explain our 4th place solution to the GUIE Challenge, and our “bag of tricks” to fine-tune zero-shot Vision Transformers (ViT) pre-trained using CLIP.
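To make the retrieval setting concrete, here is a minimal sketch of how a universal image embedding is typically used once it exists: gallery images are ranked by cosine similarity to a query embedding. This is not the authors' pipeline. The three-dimensional vectors, the gallery names, and the helper functions `l2_normalize` and `cosine_top_k` are hypothetical stand-ins for illustration; in practice the vectors would come from an image encoder such as a fine-tuned CLIP ViT.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_top_k(query, gallery, k=2):
    """Return the k gallery items most similar to the query as (name, score) pairs."""
    q = l2_normalize(query)
    scored = []
    for name, emb in gallery.items():
        g = l2_normalize(emb)
        score = sum(a * b for a, b in zip(q, g))
        scored.append((score, name))
    # Highest cosine similarity first.
    scored.sort(key=lambda t: t[0], reverse=True)
    return [(name, score) for score, name in scored[:k]]


# Hypothetical toy embeddings; real ones would be produced by an image encoder.
gallery = {
    "landmark_01": [0.9, 0.1, 0.0],
    "artwork_01": [0.1, 0.9, 0.1],
    "food_01": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]

results = cosine_top_k(query, gallery, k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

With these toy vectors the query ranks `landmark_01` first and `artwork_01` second. A single embedding space shared across domains is what lets one index serve landmarks, artwork, and food alike.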
