
Learning Multi-modal Similarity

Brian McFee, Gert Lanckriet. arXiv 2010 – 117 citations

Tags: Evaluation, Graph Based ANN, Recommender Systems

In many applications involving multi-media data, the definition of similarity between items is integral to several key tasks, e.g., nearest-neighbor retrieval, classification, and recommendation. Data in such regimes typically exhibits multiple modalities, such as the acoustic and visual content of video. Integrating such heterogeneous data to form a holistic similarity space is therefore a key challenge to be overcome in many real-world applications. We present a novel multiple kernel learning technique for integrating heterogeneous data into a single, unified similarity space. Our algorithm learns an optimal ensemble of kernel transformations which conform to measurements of human perceptual similarity, as expressed by relative comparisons. To cope with the ubiquitous problems of subjectivity and inconsistency in multimedia similarity, we develop graph-based techniques to filter similarity measurements, resulting in a simplified and robust training procedure.
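
The abstract bundles two algorithmic ideas: filtering noisy human comparisons via a comparison graph, and fitting a kernel combination to the surviving relative comparisons. The sketch below is an illustration under simplifying assumptions, not the authors' method: the paper learns full per-kernel transformations, whereas this stand-in learns scalar nonnegative weights on base kernels by projected subgradient descent on a triplet hinge loss, and it filters only length-2 inconsistency cycles rather than the paper's more general graph-based procedure. All names (`filter_comparisons`, `learn_kernel_weights`) are hypothetical.

```python
import numpy as np


def filter_comparisons(triplets):
    """Drop directly contradictory relative comparisons.

    Each triplet (i, j, k) asserts "i is more similar to j than to k".
    Triplets (i, j, k) and (i, k, j) contradict each other (a length-2
    cycle in the comparison graph), so both are discarded. The paper's
    graph-based filtering handles longer inconsistency cycles as well;
    this sketch keeps only the simplest case.
    """
    seen = set(map(tuple, triplets))
    return [t for t in seen if (t[0], t[2], t[1]) not in seen]


def kernel_distances(K):
    """Squared distances induced by a kernel matrix K."""
    d = np.diag(K)
    return d[:, None] + d[None, :] - 2.0 * K


def learn_kernel_weights(kernels, triplets, margin=1.0, lr=0.01, epochs=200):
    """Learn nonnegative kernel weights from relative comparisons.

    For each surviving triplet (i, j, k), the weighted distance should
    satisfy d_w(i, j) + margin <= d_w(i, k); the resulting hinge loss is
    minimized by projected subgradient descent. Projecting onto the
    nonnegative orthant keeps the combined kernel positive semidefinite.
    """
    dists = np.stack([kernel_distances(K) for K in kernels])  # (m, n, n)
    w = np.full(len(kernels), 1.0 / len(kernels))
    for _ in range(epochs):
        grad = np.zeros_like(w)
        for i, j, k in triplets:
            diff = dists[:, i, j] - dists[:, i, k]
            if diff @ w + margin > 0:  # constraint violated: accumulate subgradient
                grad += diff
        w = np.maximum(w - lr * grad / max(len(triplets), 1), 0.0)
    return w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_audio = rng.standard_normal((10, 5))   # toy "acoustic" features
    X_video = rng.standard_normal((10, 3))   # toy "visual" features
    kernels = [X_audio @ X_audio.T, X_video @ X_video.T]  # linear base kernels
    raw = [(0, 1, 2), (0, 2, 1), (3, 4, 5), (6, 7, 8)]    # first two conflict
    print(learn_kernel_weights(kernels, filter_comparisons(raw)))
```

The weighted-sum formulation is chosen here only because it keeps the sketch short; the key shared structure with the paper is that supervision enters purely through relative comparisons, and that inconsistent comparisons are removed before optimization rather than being absorbed as slack.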

Similar Work