Finding Needles In Emb(a)dding Haystacks: Legal Document Retrieval Via Bagging And SVR Ensembles

Kevin Bönisch, Alexander Mehler. arXiv 2025 – 0 citations

Datasets, Evaluation, Neural Hashing, Text Retrieval

We introduce a retrieval approach that combines Support Vector Regression (SVR) ensembles, bootstrap aggregation (bagging), and embedding spaces, evaluated on the German Dataset for Legal Information Retrieval (GerDaLIR). By framing retrieval as multiple binary needle-in-a-haystack subtasks, our voting ensemble improves recall over the baselines (0.849 versus 0.803 and 0.829) without training or fine-tuning any deep learning models, a promising initial result. The approach could be improved further, particularly by refining the encoding models and optimizing hyperparameters.
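The abstract does not spell out how the binary subtasks or the ensemble are built, so the following is only a minimal sketch of one possible reading, not the authors' implementation. It assumes that each bag treats the query embedding as the lone positive (target 1.0) and a bootstrap sample of document embeddings as negatives (target 0.0), fits a scikit-learn `SVR` on that bag, and averages the per-bag predictions as a simple stand-in for the voting step. The function names, the linear kernel, the hyperparameters, and the synthetic corpus are all illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVR


def bagged_svr_scores(query_vec, doc_vecs, n_bags=25, bag_size=50, seed=0):
    """Score every document for one query with a bagged ensemble of SVRs.

    Assumption: each bag is one binary needle-in-a-haystack subtask in which
    the query embedding is the only positive (target 1.0) and a bootstrap
    sample of document embeddings serves as negatives (target 0.0).
    """
    rng = np.random.default_rng(seed)
    n_docs = doc_vecs.shape[0]
    scores = np.zeros(n_docs)
    for _ in range(n_bags):
        # Bootstrap step: draw negatives with replacement from the corpus.
        idx = rng.integers(0, n_docs, size=bag_size)
        X = np.vstack([query_vec[None, :], doc_vecs[idx]])
        y = np.concatenate([[1.0], np.zeros(bag_size)])
        model = SVR(kernel="linear", C=1.0, epsilon=0.1)
        model.fit(X, y)
        # Documents that look like the query receive higher predictions.
        scores += model.predict(doc_vecs)
    # Averaging is used here in place of a vote (an assumption of this sketch).
    return scores / n_bags


def make_toy_corpus(n_docs=200, dim=32, needle_idx=17, seed=0):
    """Hypothetical data: random unit vectors plus one near-copy of the query."""
    rng = np.random.default_rng(seed)
    docs = rng.normal(size=(n_docs, dim))
    docs /= np.linalg.norm(docs, axis=1, keepdims=True)
    query = rng.normal(size=dim)
    query /= np.linalg.norm(query)
    needle = query + 0.02 * rng.normal(size=dim)
    docs[needle_idx] = needle / np.linalg.norm(needle)
    return query, docs


query, docs = make_toy_corpus()
scores = bagged_svr_scores(query, docs)
# Indices of the five highest-scoring documents for this query.
print(np.argsort(-scores)[:5])
```

In a real setting the random vectors would be replaced by embeddings from a pretrained encoder, which is never fine-tuned here; only the small SVR models are fitted per query.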
