注意力机制 |
您所在的位置:网站首页 › cbamc3 › 注意力机制 |
CAM、SAM、CBAM详见:CBAM——即插即用的注意力模块(附代码) 目录 1.什么是注意力机制? 2.通道注意力机制——SE (1)Squeeze (2)Excitation (3)SE Block 3.CAM 4.SAM 5.CBAM 6.代码 参考 1.什么是注意力机制?从数学角度看,注意力机制即提供一种权重模式进行运算。 神经网络中,注意力机制即利用一些网络层计算得到特征图对应的权重值,对特征图进行”注意力机制“。 2.通道注意力机制——SE论文地址:论文 该论文于2018年发表于CVPR,是较早的将注意力机制引入卷积神经网络,并且该机制是一种即插即用的模块,可嵌入任意主流的卷积神经网络中,为卷积神经网络模型设计提供新思路——即插即用模块设计。 摘要核心: 背景介绍:卷积神经网络的核心是卷积操作,其通过局部感受野的方式融合空间和通道维度的特征;针对空间维度的特征提取方法已被广泛研究。本文内容:本文针对通道维度进行研究,探索通道之间的关系,并提出SE Block,它可自适应的调整通道维度上的特征。研究成果:SE Block可堆叠构成SENet,SENet在多个数据集上表现良好;SENet不仅可以大幅提升精度,同时仅需要增加少量的参数。比赛成绩:ILSVRC 2017分类冠军,top-5 error低至2.251%,相对于2016冠军下降了~25%代码开源SE Block模型图如下所示:由两部分组成Squeeze和Excitation
Squeeze(Global Information Embedding):全局信息低维嵌入 Squeeze操作:采用全局池化,即压缩H和W至1*1,利用1个像素来表示一个通道,实现低维嵌入。压缩后的特征本质是一个向量,无空间维度,只有通道维度。 Squeeze计算公式:
相对应模型的实现位置 Excitation(Adaptative Recalibration):适应变换 Excitation部分是用2个全连接来实现 ,第一个全连接把C个通道压缩成了C/r个通道来降低计算量(后面跟了RELU),第二个全连接再恢复回C个通道(后面跟了Sigmoid),r是指压缩的比例。作者尝试了r在各种取值下的性能 ,最后得出结论r=16时整体性能和计算量最平衡。 Excitation公式:
相对应模型的实现位置 分开看完之后,再整合起来看就是如下图这样的操作过程。 Squeeze:压缩特征图至向量形式Excitation:两个全连接对特征向量进行映射变换Scale:将得到的权重向量于通道的乘法SE Block的嵌入方式:只“重构”特征图,不改变原来结构。
空间注意力模块 import torch from torch import nn class SpatialAttention(nn.Module): def __init__(self, kernel_size=7): super(SpatialAttention, self).__init__() assert kernel_size in (3, 7), 'kernel size must be 3 or 7' padding = 3 if kernel_size == 7 else 1 self.conv1 = nn.Conv2d(2, 1, kernel_size, padding=padding, bias=False) # 7,3 3,1 self.sigmoid = nn.Sigmoid() def forward(self, x): avg_out = torch.mean(x, dim=1, keepdim=True) max_out, _ = torch.max(x, dim=1, keepdim=True) x = torch.cat([avg_out, max_out], dim=1) x = self.conv1(x) return self.sigmoid(x) if __name__ == '__main__': SA = SpatialAttention(7) data_in = torch.randn(8,32,300,300) data_out = SA(data_in) print(data_in.shape) # torch.Size([8, 32, 300, 300]) print(data_out.shape) # torch.Size([8, 1, 300, 300])通道注意力模块 import torch from torch import nn class ChannelAttention(nn.Module): def __init__(self, in_planes, ratio=16): super(ChannelAttention, self).__init__() self.avg_pool = nn.AdaptiveAvgPool2d(1) self.max_pool = nn.AdaptiveMaxPool2d(1) self.fc1 = nn.Conv2d(in_planes, in_planes // ratio, 1, bias=False) self.relu1 = nn.ReLU() self.fc2 = nn.Conv2d(in_planes // ratio, in_planes, 1, bias=False) self.sigmoid = nn.Sigmoid() def forward(self, x): avg_out = self.fc2(self.relu1(self.fc1(self.avg_pool(x)))) max_out = self.fc2(self.relu1(self.fc1(self.max_pool(x)))) out = avg_out + max_out return self.sigmoid(out) if __name__ == '__main__': CA = ChannelAttention(32) data_in = torch.randn(8,32,300,300) data_out = CA(data_in) print(data_in.shape) # torch.Size([8, 32, 300, 300]) print(data_out.shape) # torch.Size([8, 32, 1, 1])CBAM注意力机制 import torch from torch import nn class ChannelAttention(nn.Module): def __init__(self, in_planes, ratio=16): super(ChannelAttention, self).__init__() self.avg_pool = nn.AdaptiveAvgPool2d(1) self.max_pool = nn.AdaptiveMaxPool2d(1) self.fc1 = nn.Conv2d(in_planes, in_planes // ratio, 1, bias=False) self.relu1 = nn.ReLU() self.fc2 = nn.Conv2d(in_planes // ratio, in_planes, 1, bias=False) self.sigmoid = nn.Sigmoid() def forward(self, x): avg_out = self.fc2(self.relu1(self.fc1(self.avg_pool(x)))) max_out = self.fc2(self.relu1(self.fc1(self.max_pool(x)))) out = avg_out + max_out return self.sigmoid(out) class SpatialAttention(nn.Module): def __init__(self, kernel_size=7): super(SpatialAttention, self).__init__() assert kernel_size in (3, 7), 'kernel size must be 3 or 7' padding = 3 if kernel_size == 7 else 1 self.conv1 = nn.Conv2d(2, 1, kernel_size, padding=padding, bias=False) # 7,3 3,1 self.sigmoid = nn.Sigmoid() def forward(self, x): avg_out = torch.mean(x, dim=1, keepdim=True) max_out, _ = torch.max(x, dim=1, keepdim=True) x = torch.cat([avg_out, max_out], dim=1) x = self.conv1(x) return self.sigmoid(x) class CBAM(nn.Module): def __init__(self, in_planes, ratio=16, kernel_size=7): super(CBAM, self).__init__() self.ca = ChannelAttention(in_planes, ratio) self.sa = SpatialAttention(kernel_size) def forward(self, x): out = x * self.ca(x) result = out * self.sa(out) return result if __name__ == '__main__': print('testing ChannelAttention'.center(100,'-')) torch.manual_seed(seed=20200910) CA = ChannelAttention(32) data_in = torch.randn(8,32,300,300) data_out = CA(data_in) print(data_in.shape) # torch.Size([8, 32, 300, 300]) print(data_out.shape) # torch.Size([8, 32, 1, 1]) if __name__ == '__main__': print('testing SpatialAttention'.center(100,'-')) torch.manual_seed(seed=20200910) SA = SpatialAttention(7) data_in = torch.randn(8,32,300,300) data_out = SA(data_in) print(data_in.shape) # torch.Size([8, 32, 300, 300]) print(data_out.shape) # torch.Size([8, 1, 300, 300]) if __name__ == '__main__': print('testing CBAM'.center(100,'-')) torch.manual_seed(seed=20200910) cbam = CBAM(32, 16, 7) data_in = torch.randn(8,32,300,300) data_out = cbam(data_in) print(data_in.shape) # torch.Size([8, 32, 300, 300]) print(data_out.shape) # torch.Size([8, 1, 300, 300])SE注意力机制 from torch import nn import torch class SELayer(nn.Module): def __init__(self, channel, reduction=16): super(SELayer, self).__init__() self.avg_pool = nn.AdaptiveAvgPool2d(1) self.fc = nn.Sequential( nn.Linear(channel, channel // reduction, bias=False), nn.ReLU(inplace=True), nn.Linear(channel // reduction, channel, bias=False), nn.Sigmoid() ) def forward(self, x): b, c, _, _ = x.size() y = self.avg_pool(x).view(b, c) y = self.fc(y).view(b, c, 1, 1) return x * y.expand_as(x) # return x * y if __name__ == '__main__': torch.manual_seed(seed=20200910) data_in = torch.randn(8,32,300,300) SE = SELayer(32) data_out = SE(data_in) print(data_in.shape) # torch.Size([8, 32, 300, 300]) print(data_out.shape) # torch.Size([8, 32, 300, 300]) 参考注意力机制代码 SE模型详解 SENet |
CopyRight 2018-2019 办公设备维修网 版权所有 豫ICP备15022753号-3 |