Adaptive sparse attention module based on reciprocal nearest neighbors



Zhonggui Sun (a,b,*), Can Zhang (a), and Mingzhu Zhang (c)

(a) Liaocheng University, School of Mathematical Sciences, Liaocheng, China, 252000

(b) Xidian University, Video and Image Processing System Laboratory, School of Electronic Engineering, Xi'an, China, 710126

(c) Chongqing University of Posts and Telecommunications, College of Computer Science and Technology, Chongqing, China, 400065


 

Abstract

The attention mechanism has become a crucial technique in deep feature representation for computer vision tasks. Using a similarity matrix, it enhances the current feature point with global context from the feature map of the network. However, using all of this information indiscriminately can easily introduce irrelevant content, which hampers performance. In response to this challenge, sparsification, a common information filtering strategy, has been applied in many related studies. Regrettably, the filtering processes in these studies often lack reliability and adaptability. To address this issue, we first define an adaptive-reciprocal nearest neighbors (A-RNN) relationship. In identifying neighbors, it gains flexibility by learning adaptive thresholds. In addition, a reciprocity mechanism ensures the reliability of the neighbors. We then use A-RNN to rectify the similarity matrix in the conventional attention module. In the implementation, to treat non-local and local information separately, we introduce two blocks: the non-local sparse constraint block (NLSCB) and the local sparse constraint block (LSCB). The former uses A-RNN to sparsify non-local information, while the latter uses adaptive thresholds to sparsify local information. The result is an adaptive sparse attention (ASA) module that inherits the flexibility and reliability of A-RNN. To validate the proposed ASA module, we use it to replace the attention module in NLNet and conduct experiments on semantic segmentation benchmarks including Cityscapes, ADE20K and PASCAL VOC 2012. With the same backbone (ResNet101), our ASA module outperforms the conventional attention module and some of its state-of-the-art (SOTA) variants. The source code is available at http://www.diplab.net/lunwen/ASA.htm.
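To make the reciprocity idea in the abstract concrete, here is a minimal sketch of masking a similarity matrix with reciprocal neighbors. It is not the released implementation. It rests on several assumptions: the per-point thresholds are passed in as fixed numbers rather than learned, similarity is a plain dot product, the NLSCB/LSCB split is omitted, and the function names `arnn_mask` and `sparse_attention` are hypothetical.

```python
import math


def dot(u, v):
    """Dot-product similarity between two feature vectors."""
    return sum(a * b for a, b in zip(u, v))


def arnn_mask(sim, thresholds):
    """Return mask[i][j] == True when i and j are reciprocal neighbors.

    Assumption: thresholds[i] stands in for the adaptive threshold of
    point i; in the paper these are learned, here they are given.
    """
    n = len(sim)
    # j is a neighbor of i when its similarity clears i's own threshold.
    neighbor = [[sim[i][j] >= thresholds[i] for j in range(n)] for i in range(n)]
    # Reciprocity: keep the pair only if each point accepts the other.
    return [[neighbor[i][j] and neighbor[j][i] for j in range(n)] for i in range(n)]


def sparse_attention(features, thresholds):
    """Softmax attention restricted to reciprocal neighbors (sketch only)."""
    n = len(features)
    sim = [[dot(features[i], features[j]) for j in range(n)] for i in range(n)]
    mask = arnn_mask(sim, thresholds)
    out = []
    for i in range(n):
        kept = [j for j in range(n) if mask[i][j]]
        if not kept:
            # No reliable neighbor at all: pass the feature through unchanged.
            out.append(list(features[i]))
            continue
        # Numerically stable softmax over the kept entries only.
        peak = max(sim[i][j] for j in kept)
        weights = [math.exp(sim[i][j] - peak) for j in kept]
        total = sum(weights)
        dim = len(features[i])
        out.append([
            sum(w * features[j][c] for w, j in zip(weights, kept)) / total
            for c in range(dim)
        ])
    return out


features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
thresholds = [0.5, 0.5, 0.05]  # illustrative values, not learned
attended = sparse_attention(features, thresholds)
```

In this toy input, the third point accepts the second as a neighbor (similarity 0.1 against its threshold of 0.05), but the second does not accept the third (0.1 against 0.5). The pair is therefore dropped from both rows, and the third point attends only to itself. A one-sided threshold would not filter this way.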
 

 

Paper and Code

Z. Sun, C. Zhang, and M. Zhang. Adaptive sparse attention module based on reciprocal nearest neighbors. Journal of Electronic Imaging (JEI), vol. 33, no. 3, 033038, 2024. [code]

 

Architecture and Algorithm

 

 

 

Visualization Results