
Author

  • Shangyin Gao, Lev Markhasin, Bi Wang

Company

  • Sony Europe B.V.

Venue

  • MMSP

Date

  • 2021

Spatial Cross-Attention RGB-D Fusion Module for Object Detection

Abstract

We investigate different RGB and depth fusion techniques for object detection, with the aim of improving detection accuracy over RGB-only systems. We consider recent proposal-free convolutional object detectors, which we modify for RGB-D data. We introduce a third, mixed branch in our network alongside the RGB and depth branches, and define a novel attention mechanism that extracts weighted features from the depth branch and applies them to the RGB feature map, thus fusing the branches adaptively. Our method, which we call the spatial Cross-Attention Fusion network, or CAF-Net, yields a state-of-the-art mean average precision of 60.3% on the SUN RGB-D dataset, outperforming all previous techniques by a significant margin.
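
The abstract does not give the exact form of the attention mechanism, so the following is only a minimal sketch of one generic way a depth-derived spatial attention map could be applied to an RGB feature map. It is not the CAF-Net implementation. The function name `spatial_cross_attention_fuse`, the 1x1-convolution weights `w_depth`, the sigmoid gate, and the residual connection are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def spatial_cross_attention_fuse(rgb_feat, depth_feat, w_depth, bias=0.0):
    """Hypothetical fusion step (an assumption, not the paper's definition).

    rgb_feat, depth_feat: feature maps of shape (C, H, W).
    w_depth: weights of shape (C,), standing in for an assumed 1x1
             convolution that reduces the depth features to one channel.
    Returns the fused ("mixed") features of shape (C, H, W) and the
    spatial attention map of shape (H, W).
    """
    if rgb_feat.shape != depth_feat.shape:
        raise ValueError("RGB and depth feature maps must share a shape")

    # Collapse the channel axis of the depth features into one score per pixel.
    logits = np.tensordot(w_depth, depth_feat, axes=([0], [0])) + bias

    # Squash the scores to (0, 1) to obtain a spatial attention map.
    attn = sigmoid(logits)

    # Reweight the RGB features per pixel; the residual term keeps the
    # original RGB signal where the attention is close to zero.
    mixed = rgb_feat + rgb_feat * attn[None, :, :]
    return mixed, attn


# Usage with random stand-in feature maps (8 channels, 4x5 spatial grid).
rng = np.random.default_rng(0)
rgb = rng.standard_normal((8, 4, 5))
depth = rng.standard_normal((8, 4, 5))
weights = rng.standard_normal(8)
mixed, attn = spatial_cross_attention_fuse(rgb, depth, weights)
```

In this sketch the attention map has a single channel and is broadcast across all RGB channels, so the depth branch decides where the RGB features are emphasized, not which channels. A learned network would train `w_depth` jointly with the detector.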
