SLVideo: A Sign Language Video Moment Retrieval Framework

Gonçalo Vinagre Martins, João Magalhães, Afonso Quinaz, Carla Viegas, Sofia Cavaco. arXiv 2024 – 0 citations

Tags: Datasets, Few Shot & Zero Shot, Tools & Libraries

SLVideo is a video moment retrieval system for Sign Language videos that incorporates facial expressions, addressing a gap in existing technology. The system extracts embedding representations of the hand and face signs from video frames to capture the signs in their entirety, enabling users to search for a specific sign language video segment with text queries. The dataset is a collection of eight hours of annotated Portuguese Sign Language videos, and a CLIP model is used to generate the embeddings. The initial results are promising in a zero-shot setting. In addition, SLVideo incorporates a thesaurus that uses the video segment embeddings to let users search for signs similar to those retrieved, and it also supports editing and creating sign language video annotations. Project web page: https://novasearch.github.io/SLVideo/
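
The retrieval and thesaurus steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each video segment has already been reduced to a single embedding vector (for example, by a CLIP image encoder applied to hand and face crops), that the text query has been embedded into the same space by the matching text encoder, and that cosine similarity is the ranking score; the abstract does not state the scoring function. The names `rank_segments` and `similar_segments`, the segment identifiers, and the 3-dimensional toy vectors are all hypothetical placeholders.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_segments(
    query: Sequence[float],
    segments: Dict[str, Sequence[float]],
    top_k: int = 3,
    exclude: Optional[str] = None,
) -> List[Tuple[str, float]]:
    """Return the top_k (segment_id, score) pairs, best match first."""
    scored = [
        (seg_id, cosine_similarity(query, emb))
        for seg_id, emb in segments.items()
        if seg_id != exclude
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def similar_segments(
    segment_id: str,
    segments: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Thesaurus-style lookup: use a segment's own embedding as the query."""
    return rank_segments(segments[segment_id], segments, top_k, exclude=segment_id)


# Hypothetical toy vectors standing in for real segment embeddings.
SEGMENTS = {
    "seg_hello": [0.9, 0.1, 0.0],
    "seg_greeting": [0.8, 0.2, 0.1],
    "seg_house": [0.0, 0.1, 0.9],
}

# Hypothetical stand-in for the embedding of a text query.
QUERY = [1.0, 0.0, 0.0]

if __name__ == "__main__":
    print(rank_segments(QUERY, SEGMENTS, top_k=2))
    print(similar_segments("seg_hello", SEGMENTS, top_k=1))
```

Under these assumptions, text search and the thesaurus share one ranking routine: the only difference is whether the query vector comes from a text prompt or from a segment that was already retrieved.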

Similar Work