About This Architecture
The Attentional Feature Fusion (AFF) architecture combines dilated and attention features through a multi-scale channel attention module that learns adaptive weights for feature integration. The two input streams (Dilated Features X1 and Attention Features X2) flow through global average and max pooling, a shared MLP with channel reduction, and sigmoid-gated element-wise multiplication to produce weighted feature maps. The design shows how channel attention can selectively emphasize informative features while suppressing noise, a key technique for improving model robustness in semantic segmentation and object detection. Fork this diagram on Diagrams.so to customize layer dimensions, explore alternative pooling strategies, or integrate AFF into your own encoder-decoder networks. The pattern is particularly effective in multi-scale vision tasks, where feature heterogeneity demands adaptive fusion rather than simple concatenation.
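As a rough illustration of the data flow described above, here is a minimal NumPy sketch of the fusion step: the two streams are merged, pooled globally (average and max), passed through a shared bottleneck MLP with reduction ratio `r`, and the resulting sigmoid gate blends X1 and X2 channel-wise. This is a simplified sketch under stated assumptions, not the reference implementation; the function name `aff_fuse`, the weight shapes, and the gated form `Z = w*X1 + (1-w)*X2` are illustrative choices, not taken from the diagram itself.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def aff_fuse(x1, x2, w1, w2):
    """Sketch of AFF-style channel-attention fusion (illustrative, not the paper's code).

    x1, x2 : feature maps of shape (C, H, W) -- the dilated and attention streams.
    w1     : (C//r, C) shared-MLP weights (channel reduction by ratio r).
    w2     : (C, C//r) shared-MLP weights (channel expansion back to C).
    """
    x = x1 + x2                              # merge the two input streams
    avg = x.mean(axis=(1, 2))                # global average pooling -> (C,)
    mx = x.max(axis=(1, 2))                  # global max pooling -> (C,)

    def mlp(v):
        # Shared bottleneck MLP with ReLU: C -> C//r -> C
        return w2 @ np.maximum(w1 @ v, 0.0)

    att = sigmoid(mlp(avg) + mlp(mx))        # per-channel gate in (0, 1)
    att = att[:, None, None]                 # broadcast over H and W
    # Sigmoid-gated soft selection between the two inputs
    return att * x1 + (1.0 - att) * x2

# Tiny usage example with random features (shapes are arbitrary)
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 4
x1 = rng.standard_normal((C, H, W))
x2 = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
z = aff_fuse(x1, x2, w1, w2)
print(z.shape)  # (8, 4, 4)
```

Because the gate lies in (0, 1), the output is a per-channel convex combination of the two streams, which is exactly the "adaptive fusion rather than simple concatenation" behavior described above.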