Convolutional network single-pixel imaging with fusion attention mechanism

Wang Xiang; Zhou Yishen; Zhang Xuange; Chen Xihao

doi:10.7498/aps.74.20250010

School of Physics, Liaoning University, Shenyang 110036, China

Corresponding author: xi-haochen@163.com

PACS: 42.30.Va, 42.30.Wb, 89.20.Ff, 42.30.-d

Abstract: This paper presents a physics-driven single-pixel imaging method based on a convolutional neural network with a fused attention mechanism. A module that combines channel attention and spatial attention is incorporated into a randomly initialized convolutional network, and the physical model of single-pixel imaging is used to constrain the network, yielding high-quality image reconstruction. Specifically, the spatial and channel attention mechanisms are combined into a single module and introduced into the layers of a multi-scale U-net convolutional network. In the spatial attention mechanism, convolution is applied to the pooled feature map to extract an attention weight for each spatial region. In the channel attention mechanism, the three-dimensional feature map is pooled into a one-dimensional signal and fed into a two-layer fully connected network to obtain the attention weight of each channel. This approach not only uses the key weighting information that the attention mechanism provides over the three-dimensional data cube but also exploits the strong feature extraction capability of the U-net across different spatial frequencies. The method captures image details effectively, suppresses background noise, and improves reconstruction quality. In the experiments, a single-pixel imaging optical setup is used to acquire bucket signals for two target images, "snowflake" and "basket". An arbitrary noise image is fed into the randomly initialized network with the attention mechanism, and the mean square error between the simulated and measured bucket signals is used to physically constrain the convergence of the network, so that the reconstructed image conforms to the physical model. The experimental results show that, at low sampling rates and compared with a conventional untrained network, the attention-fused scheme not only reconstructs image details visibly better but also shows significant advantages in quantitative metrics such as peak signal-to-noise ratio (PSNR) and structural similarity (SSIM), confirming its effectiveness and application potential in single-pixel imaging.

Keywords:
- single-pixel imaging
- attention mechanisms
- convolutional neural networks
- image reconstruction
Figure 1. Experimental schematic diagram

Figure 2. Schematic diagram of the U-net convolutional neural network with integrated attention mechanism: (a) convolutional network with U-net architecture; (b) overview of the CBAM module; (c) spatial attention module; (d) channel attention module

Figure 3. Reconstruction results of the attention-fused scheme and the original U-net scheme at different sampling rates

Figure 4. Comparison of SSIM at different sampling rates

Figure 5. Comparison of PSNR at different sampling rates

Figure 6. PSNR and loss function versus number of iterations: (a) PSNR of the images reconstructed by the two schemes versus number of iterations; (b) loss function of the proposed scheme versus number of iterations for different initial learning rates

[1]	Kilcullen P, Ozaki T, Liang J 2022 Nat. Commun. 13 7879 doi: 10.1038/s41467-022-35585-8
[2]	Hahamovich E, Monin S, Hazan Y, Rosenthal A 2021 Nat. Commun. 12 4516 doi: 10.1038/s41467-021-24850-x
[3]	Shapiro J H 2008 Phys. Rev. A 78 061802 doi: 10.1103/PhysRevA.78.061802
[4]	Ferri F, Magatti D, Gatti A, Bache M, Brambilla E, Lugiato L 2005 Phys. Rev. Lett. 94 183602 doi: 10.1103/PhysRevLett.94.183602
[5]	Wang F, Wang C, Deng C, Han S, Situ G 2022 Photon. Res. 10 104 doi: 10.1364/PRJ.440123
[6]	Pan L, Shen Y, Qi J, Shi J, Feng X 2023 Opt. Express 31 13943 doi: 10.1364/OE.484874
[7]	Song K, Bian Y, Wang D, Li R, Wu K, Liu H, Qin C, Hu J, Xiao L 2024 Laser & Photonics Rev. published online 2401397
[8]	Zhao X S, Yu C, Wang C, Li T, Liu B, Lu H, Zhang R, Dou X, Zhang J, Pan J W 2024 Appl. Phys. Lett. 125 211103 doi: 10.1063/5.0232210
[9]	Karpowicz N, Zhong H, Xu J, Lin K I, Hwang J S, Zhang X C 2005 Semicond. Sci. Tech. 20 S293 doi: 10.1088/0268-1242/20/7/021
[10]	Simões M, Vaz P, Cortez A F V 2024 arXiv: 2411.03907[physics.ins-det]
[11]	Shwartz S 2021 Sci. Bull. 66 857 doi: 10.1016/j.scib.2021.01.019
[12]	Olbinado M P, Paganin D M, Cheng Y, Rack A 2021 Optica 8 1538 doi: 10.1364/OPTICA.437481
[13]	Clemente P, Durán V, Tajahuerce E, Andrés P, Climent V, Lancis J 2013 Opt. Lett. 38 2524 doi: 10.1364/OL.38.002524
[14]	Jiang W, Yin Y, Jiao J, Zhao X, Sun B 2022 Photon. Res. 10 2157 doi: 10.1364/PRJ.461064
[15]	Gibson G M, Sun B, Edgar M P, Phillips D B, Hempler N, Maker G T, Malcolm G P A, Padgett M J 2017 Opt. Express 25 2998 doi: 10.1364/OE.25.002998
[16]	Zhou L, Xiao Y, Chen W 2023 Opt. Express 31 23027 doi: 10.1364/OE.489808
[17]	Xu Y, Lu L, Saragadam V, Kelly K F 2024 Nat. Commun. 15 1456 doi: 10.1038/s41467-024-45856-1
[18]	Li J, Li X, Yardimci N T, Hu J, Li Y, Chen J, Hung Y C, Jarrahi M, Ozcan A 2023 Nat. Commun. 14 6791 doi: 10.1038/s41467-023-42554-2
[19]	Li S, Liu X, Xiao Y, Ma Y, Yang J, Zhu K, Tian X 2023 Opt. Express 31 4712 doi: 10.1364/OE.473659
[20]	Zheng P, Dai Q, Li Z, Ye Z, Xiong J, Liu H C, Zheng G, Zhang S 2021 Sci. Adv. 7 eabg0363 doi: 10.1126/sciadv.abg0363
[21]	Katz O, Bromberg Y, Silberberg Y 2009 Appl. Phys. Lett. 95 131110 doi: 10.1063/1.3238296
[22]	López-García L, Cruz-Santos W, García-Arellano A, Filio-Aguilar P, Cisneros-Martínez J A, Ramos-García R 2022 Opt. Express 30 13714 doi: 10.1364/OE.451656
[23]	Zhang Z, Ma X, Zhong J 2015 Nat. Commun. 6 6225 doi: 10.1038/ncomms7225
[24]	Donoho D 2006 IEEE Trans. Inf. Theory 52 1289 doi: 10.1109/TIT.2006.871582
[25]	Duarte M F, Davenport M A, Takhar D, Laska J N, Sun T, Kelly K F, Baraniuk R G 2008 IEEE Signal Process Mag. 25 83 doi: 10.1109/MSP.2007.914730
[26]	Huang L, Luo R, Liu X, Hao X 2022 Light Sci. Appl. 11 61 doi: 10.1038/s41377-022-00743-6
[27]	Figueiredo M A T, Nowak R D, Wright S J 2007 IEEE J. Sel. Top. Signal Process. 1 586
[28]	Pioneers A 2024 Nat. Mach. Intell. 6 1271 doi: 10.1038/s42256-024-00945-0
[29]	Zha W S, Li D L, Shen L H, Zhang W, Liu X L 2022 Chinese Journal of Theoretical and Applied Mechanics 54 543 (in Chinese) doi: 10.6052/0459-1879-21-617
[30]	Zhang H, Wang J, Zhang Y, Du X, Wu H, Zhang T 2024 Astronomical Techniques and Instruments 1 1
[31]	van Leeuwen C, Podareanu D, Codreanu V, Cai M X, Berg A, Zwart S P, Stoffer R, Veerman M, van Heerwaarden C, Otten S, Caron S, Geng C, Ambrosetti F, Bonvin A M J J 2020 arXiv: 2004.03454[cs.CE]
[32]	Barbastathis G, Ozcan A, Situ G 2019 Optica 6 921 doi: 10.1364/OPTICA.6.000921
[33]	Ruget A, Moodley C, Forbes A, Leach J 2024 Opt. Express 32 41057 doi: 10.1364/OE.533343
[34]	Wetzstein G, Ozcan A, Gigan S, Fan S, Englund D, Soljačić M, Denz C, Miller D A B, Psaltis D 2020 Nature 588 39 doi: 10.1038/s41586-020-2973-6
[35]	Lyu M, Wang W, Wang H, Wang H, Li G, Chen N, Situ G 2017 Sci. Rep. 7 17865 doi: 10.1038/s41598-017-18171-7
[36]	Zhang X, Deng C, Wang C, Wang F, Situ G 2023 ACS Photonics 10 2363 doi: 10.1021/acsphotonics.2c01537
[37]	Li J, Li Y, Li J, Zhang Q, Li J 2020 Opt. Express 28 22992 doi: 10.1364/OE.399065
[38]	Wang F, Wang C, Chen M, Gong W, Zhang Y, Han S, Situ G 2022 Light Sci. Appl. 11 1 doi: 10.1038/s41377-021-00680-w
[39]	Peng L, Xie S, Qin T, Cao L, Bian L 2023 Opt. Lett. 48 2527 doi: 10.1364/OL.486078
[40]	Liu H, Bian L, Zhang J 2023 Opt. Laser Technol. 157 108600 doi: 10.1016/j.optlastec.2022.108600
[41]	Liu X, Han T, Zhou C, Huang J, Ju M, Xu B, Song L 2023 Opt. Express 31 9945 doi: 10.1364/OE.481995
[42]	Hammernik K, Küstner T, Yaman B, Huang Z, Rueckert D, Knoll F, Akçakaya M 2023 IEEE Signal Process Mag. 40 98
[43]	Wang P, Chen P, Yuan Y, Liu D, Huang Z, Hou X, Cottrell G W 2017 arXiv: 1702.08502[cs.CV]
[44]	Ulyanov D, Vedaldi A, Lempitsky V 2020 IJCV 128 1867 doi: 10.1007/s11263-020-01303-4
[45]	Ren W, Nie X, Peng T, Scully M O 2022 Opt. Express 30 47921 doi: 10.1364/OE.478695
[46]	Zhang H, Sindagi V, Patel V M 2020 IEEE Trans. Circuits Syst. Video Technol. 30 3943 doi: 10.1109/TCSVT.2019.2920407
[47]	Lv W, Xiong J, Shi J, Huang Y, Qin S 2021 J. Intell. Manuf. 32 441 doi: 10.1007/s10845-020-01584-z
[48]	Zhang H, Wang Z, Liu D 2014 IEEE Transactions on Neural Networks and Learning Systems 25 1229 doi: 10.1109/TNNLS.2014.2317880
[49]	Baozhou Z, Hofstee P, Lee J, Al-Ars Z 2021 arXiv: 2108.08205[cs.CV]
[50]	Karim N, Rahnavard N 2021 arXiv: 2107.01330[cs.CV]
[51]	Hoshi I, Shimobaba T, Kakue T, Ito T 2020 Opt. Express 28 34069 doi: 10.1364/OE.410191
[52]	Stollenga M, Masci J, Gomez F, Schmidhuber J 2014 arXiv: 1407.3068[cs.CV]
[53]	Zhang Y, Li K, Li K, Wang L, Zhong B, Fu Y 2018 arXiv: 1807.02758[cs.CV]
[54]	Liao X, He L, Mao J, Xu M 2024 Remote Sensing 16 1688 doi: 10.3390/rs16101688
[55]	Yu W K, Wang S F, Shang K Q 2024 Sensors 24 1012 doi: 10.3390/s24031012
[56]	Ronneberger O, Fischer P, Brox T 2015 arXiv: 1505.04597[cs.CV]
[57]	Meng Z, Yu Z, Xu K, Yuan X 2021 arXiv: 2108.12654[eess.IV]
[58]	Ferri F, Magatti D, Lugiato L A, Gatti A 2010 Phys. Rev. Lett. 104 253603 doi: 10.1103/PhysRevLett.104.253603
[59]	Lin J, Yan Q, Lu S, Zheng Y, Sun S, Wei Z 2022 Photonics 9 343 doi: 10.3390/photonics9050343

1. Introduction

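Single-pixel imaging (SPI) is a computational imaging technique that has developed rapidly in recent years [1]. Unlike conventional imaging with array detectors, it is an indirect imaging technique: the target is illuminated by a light field modulated in time sequence by a set of spatial patterns, a single-pixel detector (SPD) without spatial resolution records the intensity of the light transmitted or reflected by the target, and the target image is then reconstructed from the preset modulation speckle patterns and the single-pixel measurements by an inversion algorithm [2]. Because SPI captures the signal with only a single-pixel detector [3], it has significant advantages over conventional imaging in detection sensitivity [4], spectral response range [5], and imaging cost [6]. Over more than a decade of SPI research, the working wavelength range has gradually expanded from the visible band to the ultraviolet [7], the infrared [8], and even the terahertz band [9]. In addition, applications of SPI in X-ray [10] and particle-source imaging [11] have been widely studied and explored. The excellent performance of SPI has also driven extensive application and research in medical imaging [12], biological imaging [13], three-dimensional imaging [14], gas imaging [15], see-through imaging [16], hyperspectral imaging [17], holographic imaging [13], defect detection [18], remote sensing [19], and optical encryption [20]. However, SPI faces the dual challenge of reducing computation time while maintaining image quality. High-quality imaging requires a large number of spatial patterns, which increases sampling and reconstruction time, particularly for large-scale or dynamic scenes. General-purpose modulation patterns (such as random speckle [21], Hadamard basis [22], and Fourier basis [23]) adapt poorly to the scene and need more patterns to ensure image quality, while optimization algorithms struggle to recover high-quality images from a limited number of patterns. Compressed sensing (CS) [24], which combines optics, mathematics, and optimization theory [25], speeds up SPI by using fewer patterns [26], but its iterative optimization framework is computationally expensive, its processing time depends on scene complexity, and the image degradation caused by sub-Nyquist sampling remains an open problem [27].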
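The measurement and inversion steps described in this paragraph can be sketched numerically. In the toy example below, each bucket value is modeled as the inner product of one illumination pattern with the object, and the image is estimated by correlating the bucket fluctuations with the patterns. The 16×16 object, the uniform random patterns, the number of patterns `M`, and the function names are all assumptions made for illustration; correlation is only one of the possible inversion algorithms and is not claimed to be the one used in this work.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 16                          # image side length (assumed)
N = n * n                       # number of pixels
obj = np.zeros((n, n))
obj[4:12, 6:10] = 1.0           # toy binary target

def measure(patterns, obj):
    """Bucket signal: total intensity collected for each pattern."""
    return patterns.reshape(len(patterns), -1) @ obj.ravel()

def correlate(patterns, bucket):
    """Correlation estimate <(B - <B>) P> averaged over all patterns."""
    fluctuation = bucket - bucket.mean()
    return np.tensordot(fluctuation, patterns, axes=(0, 0)) / len(bucket)

M = 4 * N                       # number of patterns; sampling rate M / N = 4
patterns = rng.random((M, n, n))
bucket = measure(patterns, obj)
recon = correlate(patterns, bucket)
```

Even with four times as many patterns as pixels, the correlation image is noisy, which illustrates why reconstruction at sampling rates below 1 calls for stronger priors such as compressed sensing or neural networks.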

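The 2024 Nobel Prize in Physics was awarded to John Hopfield and Geoffrey Hinton for their foundational contributions to artificial neural networks and machine learning [28]. In fact, physics not only underlies deep learning (DL) neural networks; neural networks have in turn advanced physics, including DL-assisted solution of physical equations [29], astrophysics and astronomical data analysis [30], numerical computation and simulation [31], and the combination of DL with optical imaging [32]. In SPI [33], compared with conventional reconstruction algorithms, DL not only accelerates reconstruction significantly but also delivers superior reconstruction quality at low sampling rates and in complex environments [34,35]. Such studies include super-resolution SPI [36], SPI through scattering media [37], photon-level SPI [38], SPI-based optical encryption [39], and image-free sensing [40]. DL-based SPI has also been applied successfully to complex perception tasks such as target classification [41] and image segmentation and object detection [38]. Applications of DL in SPI fall into two main categories: data-driven and physics-driven neural networks [42]. Data-driven networks perform well in SPI but usually depend on large-scale training data and perform poorly when noise is strong or data are missing, which limits their practical use [43]. To address these problems, physics-driven neural networks draw on deep image prior (DIP) theory [44] to achieve high-quality reconstruction without relying on large datasets. However, these methods still have difficulty handling complex images, capturing high-frequency details, and reducing reconstruction error at low sampling rates [45].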
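The physical constraint behind the physics-driven approach can be illustrated with a toy example. The sketch below minimizes the mean square error between a simulated bucket signal and a measured one by plain gradient descent directly on the pixel values; in an untrained-network scheme, the same loss would instead update the weights of a network whose output is the image. The pattern matrix `A`, the 8×8 image size, the sampling rate of 0.5, and the step-size rule are assumptions for illustration, and no network is involved here.

```python
import numpy as np

rng = np.random.default_rng(2)
n, M = 8, 32                    # 8x8 image, 32 patterns (sampling rate 0.5)
N = n * n
A = rng.random((M, N))          # each row is one flattened pattern (assumed)
truth = np.zeros(N)
truth[20:28] = 1.0              # toy object
bucket = A @ truth              # stands in for the measured bucket signal

def loss_and_grad(x):
    """MSE between simulated and measured bucket signals, and its gradient."""
    residual = A @ x - bucket
    return np.mean(residual ** 2), 2.0 * A.T @ residual / M

x = rng.random(N)               # random starting image
# Step 1/L, where L = 2 * sigma_max(A)^2 / M bounds the gradient's Lipschitz constant
step = M / (2.0 * np.linalg.norm(A, 2) ** 2)
losses = []
for _ in range(500):
    loss, grad = loss_and_grad(x)
    losses.append(loss)
    x = x - step * grad
```

Because there are fewer measurements than pixels, many different images reproduce the bucket signal equally well, so the loss alone does not single out the object. Supplying the missing prior is the role the network structure plays in deep-image-prior methods.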

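Deep learning models such as convolutional neural networks (CNNs) [46], generative adversarial networks (GANs) [47], and recurrent neural networks (RNNs) [48] have achieved some breakthroughs in SPI, but each has limitations. CNNs tend to overlook key information in complex scenes, which degrades reconstruction accuracy [49]; GANs perform well in super-resolution and anti-scattering imaging but recover detail poorly at low sampling rates [50]; RNNs suit dynamic scenes but have difficulty with large-scale data [51]. In recent years, attention mechanisms [52] have also been introduced into deep learning frameworks and have shown great advantages in models such as the Transformer. This paper explores the use of attention mechanisms in untrained, physics-driven SPI and proposes a scheme that fuses an attention mechanism with a U-net convolutional network. In SPI tasks, an attention mechanism dynamically adjusts the regions the network attends to, so that the network focuses more precisely on the key information in the image and thereby improves the quality and resolution of the reconstruction [53]. This adaptive focusing on the most important parts of the image has shown particular advantages in tasks such as resolution enhancement, noise suppression and deblurring [54], and optical encryption [55].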
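The abstract describes a spatial branch that convolves a pooled feature map and a channel branch that passes a pooled signal through a two-layer fully connected network. A minimal NumPy sketch of such a module follows, assuming the commonly used CBAM layout (average and max pooling in both branches, channel attention applied before spatial attention). The tensor sizes, the reduction ratio `r`, the 7×7 kernel, and the random weights are illustrative assumptions, not values taken from this paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """feat: (C, H, W). Returns one weight in (0, 1) per channel."""
    avg = feat.mean(axis=(1, 2))                 # spatial average pool -> (C,)
    mx = feat.max(axis=(1, 2))                   # spatial max pool -> (C,)
    def mlp(v):                                  # shared two-layer network
        return w2 @ np.maximum(w1 @ v, 0.0)
    return sigmoid(mlp(avg) + mlp(mx))

def spatial_attention(feat, kernel):
    """feat: (C, H, W); kernel: (2, k, k), k odd. Returns (H, W) weights."""
    pooled = np.stack([feat.mean(axis=0), feat.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    h, w = feat.shape[1:]
    out = np.zeros((h, w))
    for i in range(h):                           # plain "same" convolution
        for j in range(w):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out)

def attention_block(feat, w1, w2, kernel):
    """Reweight channels first, then spatial positions (assumed order)."""
    feat = feat * channel_attention(feat, w1, w2)[:, None, None]
    return feat * spatial_attention(feat, kernel)[None, :, :]

rng = np.random.default_rng(0)
C, H, W, r, k = 8, 16, 16, 4, 7                  # hypothetical sizes
feat = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
kernel = rng.standard_normal((2, k, k)) * 0.1
out = attention_block(feat, w1, w2, kernel)
```

The output has the same shape as the input, so a block of this kind can be placed after a convolutional stage of a U-net without changing the surrounding layer dimensions.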

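This paper proposes an SPI reconstruction scheme that fuses attention mechanisms into a convolutional neural network. By introducing information from two dimensions, spatial attention and channel attention, into each layer of the network, the scheme further improves the quality of images reconstructed by an untrained (non-pretrained) physics-driven neural network. Specifically, a module combining spatial and channel attention is integrated into a multi-scale U-net convolutional network [56], and high-quality reconstruction is achieved by using the weight information that the attention mechanism provides over the three-dimensional data cube together with the feature extraction capability of the U-net at each spatial frequency. Extensive experimental results show that, compared with conventional SPI reconstruction based on untrained networks, the proposed scheme offers a substantial advantage in metrics such as peak signal-to-noise ratio and structural similarity.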
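Of the two metrics named here, the peak signal-to-noise ratio is conventionally defined as 10·log10(peak²/MSE), where MSE is the mean square error between the reference and the reconstructed image. A small sketch follows, assuming images scaled to the range [0, 1] (hence `peak=1.0`); the function name and the scaling are assumptions, and the exact evaluation procedure used for the reported results is not specified in this section.

```python
import numpy as np

def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")      # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

reference = np.zeros((4, 4))
estimate = np.full((4, 4), 0.1)  # uniform error of 0.1 gives MSE = 0.01
value = psnr(reference, estimate)
```

A higher PSNR indicates a smaller pixel-wise error; it does not measure structural fidelity, which is why a metric such as SSIM is usually reported alongside it.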
