[Paper]
ARXIV
Deep Learning
Supervised
Mobile visual search applications are emerging that enable users to sense their surroundings with smart phones. However, because of the particular challenges of mobile visual search, achieving a high recognition bitrate has becomes a consistent target of previous related works. In this paper, we propose a few-parameter, low-latency, and high-accuracy deep hashing approach for constructing binary hash codes for mobile visual search. First, we exploit the architecture of the MobileNet model, which significantly decreases the latency of deep feature extraction by reducing the number of model parameters while maintaining accuracy. Second, we add a hash-like layer into MobileNet to train the model on labeled mobile visual data. Evaluations show that the proposed system can exceed state-of-the-art accuracy performance in terms of the MAP. More importantly, the memory consumption is much less than that of other deep learning models. The proposed method requires only \(13\) MB of memory for the neural network and achieves a MAP of \(97.80\%\) on the mobile location recognition dataset used for testing.