[Paper]
We suggest a variation of the Hellerstein–Koutsoupias–Papadimitriou indexability model for datasets equipped with a similarity measure, with the aim of better understanding the structure of indexing schemes for similarity-based search and the geometry of similarity workloads. This in particular provides a unified approach to a great variety of schemes used to index into metric spaces and facilitates their transfer to more general similarity measures such as quasi-metrics. We discuss links between performance of indexing schemes and high-dimensional geometry. The concepts and results are illustrated on a very large concrete dataset of peptide fragments equipped with a biologically significant similarity measure.