Adaptive sparse attention module based on reciprocal nearest neighbors
Zhonggui Sun (a, b, *), Can Zhang (a), and Mingzhu Zhang (c)
(a) Liaocheng University, School of Mathematical Sciences, Liaocheng, China, 252000
(b) Xidian University, Video and Image Processing System Laboratory, School of Electronic Engineering, Xi'an, China, 710126
(c) Chongqing University of Posts and Telecommunications, College of Computer Science and Technology, Chongqing, China, 400065
Abstract
The attention mechanism has become a crucial technique in deep feature representation
for computer vision tasks. Using a similarity matrix, it enhances the current
feature point with global context from the feature map of the network. However,
the indiscriminate use of all information can easily introduce irrelevant
content, which hampers performance. To address this challenge, sparsification,
a common information filtering strategy, has been applied in many related
studies. Unfortunately, the filtering processes in these studies often lack
reliability and adaptability. To address this issue, we first define an
adaptive-reciprocal nearest neighbors (A-RNN) relationship. In identifying
neighbors, it gains flexibility through learning adaptive thresholds.
Additionally, by introducing a reciprocity mechanism, the reliability of
neighbors is ensured. Then, we use A-RNN to rectify the similarity matrix in the
conventional attention module. In the implementation, to treat non-local and
local information separately, we introduce two blocks: the non-local
sparse constraint block (NLSCB) and the local sparse constraint block (LSCB).
The former utilizes A-RNN to sparsify non-local information, while the latter
uses adaptive thresholds to sparsify local information. As a result, an adaptive
sparse attention (ASA) module is obtained, inheriting the advantages of
flexibility and reliability from A-RNN. To validate the proposed ASA module, we
use it to replace the attention module in NLNet and conduct experiments on
semantic segmentation benchmarks including Cityscapes, ADE20K, and PASCAL VOC
2012. With the same backbone (ResNet101), our ASA module outperforms the
conventional attention module and some of its state-of-the-art (SOTA) variants.
The source code is available at http://www.diplab.net/lunwen/ASA.htm.
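To make the idea in the abstract concrete, the following is a minimal sketch of reciprocal, threshold-based sparsification of a similarity matrix. It is not the released implementation: the function names, the softmax-normalized dot-product similarity, the choice to always keep the diagonal, and the row renormalization are all assumptions made for illustration, and the per-point thresholds are passed in as plain numbers where the paper learns them.

```python
import math

def softmax_rows(scores):
    """Row-wise softmax of a score matrix given as a list of lists."""
    out = []
    for row in scores:
        m = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp(s - m) for s in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def similarity_matrix(features):
    """Dot-product similarity between all feature points, normalized per row."""
    scores = [[sum(a * b for a, b in zip(fi, fj)) for fj in features]
              for fi in features]
    return softmax_rows(scores)

def arnn_mask(sim, thresholds):
    """Hypothetical A-RNN mask: j is kept for i only if each selects the other.

    thresholds[i] stands in for the learned adaptive threshold of point i.
    The diagonal is always kept (an assumption) so that no row becomes empty.
    """
    n = len(sim)
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            i_picks_j = sim[i][j] >= thresholds[i]
            j_picks_i = sim[j][i] >= thresholds[j]
            mask[i][j] = (i == j) or (i_picks_j and j_picks_i)
    return mask

def sparse_attention(features, thresholds):
    """Aggregate features using the similarity matrix rectified by the mask."""
    sim = similarity_matrix(features)
    mask = arnn_mask(sim, thresholds)
    dim = len(features[0])
    outputs = []
    for i, row in enumerate(sim):
        kept = [s if mask[i][j] else 0.0 for j, s in enumerate(row)]
        total = sum(kept)  # positive, because the diagonal entry is kept
        weights = [k / total for k in kept]
        outputs.append([sum(w * f[d] for w, f in zip(weights, features))
                        for d in range(dim)])
    return outputs, mask

# Toy example: points 0 and 1 are similar, point 2 is different.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, mask = sparse_attention(features, thresholds=[0.3, 0.3, 0.3])
```

In this toy example, points 0 and 1 select each other and remain connected, while point 2 attends only to itself. Raising the threshold of point 1 alone (for example to 0.45) also removes the link from point 0 to point 1, because the relationship is no longer mutual. That one-sided rejection is the reciprocity property the abstract relies on for reliability.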
Paper and Code
Z. Sun, C. Zhang, and M. Zhang. Adaptive sparse attention module based on reciprocal nearest neighbors. Journal of Electronic Imaging (JEI), vol. 33, no. 3, 033038, 2024.
Code: http://www.diplab.net/lunwen/ASA.htm
Architecture and Algorithm
Visualization Results