Object Detection Method with Spiking Neural Network Based on DT-LIF Neuron and SSD

ZHOU Ya; LI Xinyi; WU Xiyan; ZHAO Yufei; SONG Yong

doi:10.11999/JEIT221367

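Author biographies:
ZHOU Ya: female, associate professor. Research interest: intelligent optoelectronic information processing.

LI Xinyi: female, master's student. Research interest: brain-inspired computing.

WU Xiyan: female, PhD student. Research interests: spiking neural networks and their applications.

ZHAO Yufei: male, postdoctoral researcher. Research interest: brain-inspired computing for computer vision.

SONG Yong: male, professor. Research interests: brain-inspired computing, intelligent interaction, etc.

Corresponding author:
SONG Yong, yongsong@bit.edu.cn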

CLC number: TN911.73; TP391.41
Publication history
- Received: 2022-11-01
- Revised: 2023-05-11
- Available online: 2023-05-20
- Published: 2023-08-21

Object Detection Method with Spiking Neural Network Based on DT-LIF Neuron and SSD

School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China

Funds: The National Natural Science Foundation of China (82272130, U22A20103)

Abstract: Compared with traditional Artificial Neural Networks (ANNs), Spiking Neural Networks (SNNs) offer biological interpretability and high computational efficiency. However, for object detection tasks, SNNs suffer from high training difficulty and low accuracy. To address these problems, an SNN object detection method based on the Dynamic Threshold Leaky Integrate-and-Fire (DT-LIF) neuron and the Single Shot multibox Detector (SSD) is proposed. First, a DT-LIF neuron model is designed that dynamically adjusts the neuron's threshold according to the accumulated membrane potential, which drives spike activity in the deep layers of the network and improves inference speed. Meanwhile, with the DT-LIF neuron as the basic unit, a hybrid SNN based on SSD is constructed. The network uses the Spiking Visual Geometry Group network (Spiking VGG) and the Spiking Densely Connected Convolutional Network (Spiking DenseNet) as backbones, combined with an SSD prediction head and three additional layers composed of Batch Normalization (BN) layers, Spiking Convolution (SC) layers, and DT-LIF neurons. Experimental results show that, compared with the LIF neuron network, the DT-LIF neuron network improves object detection accuracy on the Prophesee GEN1 dataset by 25.2%. Compared with the AsyNet algorithm, the proposed method improves object detection accuracy by 17.9%.
- Computer vision
- Object detection
- Spiking Neural Network (SNN)
- Neuron

Figure 1 Equivalent circuit of the LIF neuron model

Figure 2 Structure of the object detection algorithm based on the DT-LIF neuron and SSD

Figure 3 Schematic diagram of the DT-LIF neuron model

Figure 4 Spiking VGG network structure (VGG11 as an example)

Figure 5 Spiking DenseNet network structure (DenseNet121 as an example)

Figure 6 Examples from the Prophesee GEN1 dataset

Figure 7 Training loss curves

Algorithm 1 DT-LIF spike firing process
Parameters: θ, p, q, V_th, τ_m
(1) θ = V_th = 1; V = 0; V_reset = 0 // initialization
(2) for t = 1 to timesteps do
(3)   for l = 2 to L do
(4)     for i = 1 to neurons do
(5)       $H_{i,t}^l$ = $V_{i,t-1}^l$ + ($X_{i,t}^l$ - ($V_{i,t-1}^l$ - V_reset)) * tau // $X_{i,t}^l$ is the feed-forward input
(6)       delta = $H_{i,t}^l$ - $V_{i,t-1}^l$
(7)       $\theta_{i,t}^l$ = p + q exp(-delta / c)
(8)       if $H_{i,t}^l$ ≥ $\theta_{i,t}^l$ then
(9)         $S_{i,t}^l$ = 1
(10)        $V_{i,t}^l$ = V_reset
(11)      end if
(12)    end for
(13)  end for
(14) end for
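
The update in steps (5) to (10) of Algorithm 1 can be sketched for a single neuron in plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the values of p, q and c are hypothetical (chosen so that the threshold equals 1 when delta = 0, matching the initialization θ = V_th = 1), `tau` in the listing is assumed to mean 1/τ_m, and the no-spike branch (S = 0, V = H), which the listing leaves implicit, is assumed to follow the usual LIF convention.

```python
import math
from dataclasses import dataclass


@dataclass
class DTLIFParams:
    # All numeric defaults are hypothetical, chosen only for illustration.
    p: float = 0.5          # threshold floor approached for large positive delta
    q: float = 0.5          # p + q = 1, so theta = 1 when delta = 0
    c: float = 1.0          # scale of the exponential term
    inv_tau_m: float = 0.5  # "tau" in Algorithm 1, assumed to be 1 / tau_m
    v_reset: float = 0.0    # reset potential


def dt_lif_step(v_prev, x, prm):
    """One time step for one neuron; returns (spike, new_v, theta)."""
    # Step (5): leaky integration of the feed-forward input x.
    h = v_prev + (x - (v_prev - prm.v_reset)) * prm.inv_tau_m
    # Step (6): change of the membrane potential within this step.
    delta = h - v_prev
    # Step (7): dynamic threshold.
    theta = prm.p + prm.q * math.exp(-delta / prm.c)
    # Steps (8)-(10): fire and reset when the threshold is reached.
    if h >= theta:
        return 1, prm.v_reset, theta
    # Assumed no-spike branch: keep the integrated potential.
    return 0, h, theta


def run_neuron(inputs, prm):
    """Feed a sequence of inputs to one neuron and collect its spikes."""
    v = 0.0
    spikes = []
    for x in inputs:
        s, v, _ = dt_lif_step(v, x, prm)
        spikes.append(s)
    return spikes


if __name__ == "__main__":
    print(run_neuron([0.0, 1.6, 0.2], DTLIFParams()))
```

With these hypothetical parameters, an input of 1.6 raises the potential to 0.8, which would not reach a fixed threshold of 1 but does exceed the lowered dynamic threshold, so the neuron fires.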


Table 1 Comparison results on the Prophesee GEN1 dataset

Method	mAP(0.5:0.95)
Spiking VGG11+LIF	0.127
Spiking VGG11+DT-LIF	0.159
Spiking DenseNet+LIF	0.148
Spiking DenseNet+DT-LIF	0.165
AsyNet^[31]	0.140
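
The percentages quoted in the abstract appear to be relative gains derived from the mAP values in Table 1. The short check below reproduces them under that reading, which is an inference from the numbers rather than something the page states. The helper name `relative_gain_percent` is hypothetical.

```python
# mAP(0.5:0.95) values copied from Table 1.
MAP = {
    "Spiking VGG11+LIF": 0.127,
    "Spiking VGG11+DT-LIF": 0.159,
    "Spiking DenseNet+LIF": 0.148,
    "Spiking DenseNet+DT-LIF": 0.165,
    "AsyNet": 0.140,
}


def relative_gain_percent(new, base):
    """Relative improvement of `new` over `base`, in percent."""
    return (new - base) / base * 100.0


# DT-LIF versus LIF on the Spiking VGG11 backbone.
vgg_gain = relative_gain_percent(MAP["Spiking VGG11+DT-LIF"], MAP["Spiking VGG11+LIF"])
# Best proposed configuration versus AsyNet.
asynet_gain = relative_gain_percent(MAP["Spiking DenseNet+DT-LIF"], MAP["AsyNet"])
# DT-LIF versus LIF on the Spiking DenseNet backbone.
densenet_gain = relative_gain_percent(MAP["Spiking DenseNet+DT-LIF"], MAP["Spiking DenseNet+LIF"])

if __name__ == "__main__":
    print(round(vgg_gain, 1), round(asynet_gain, 1), round(densenet_gain, 1))
```

Under this reading, the 25.2% figure corresponds to the Spiking VGG11 pair and the 17.9% figure to Spiking DenseNet+DT-LIF against AsyNet.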


References (31)

[1]	GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]. 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, USA, 2014: 580–587.
[2]	HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904–1916. doi: 10.1109/TPAMI.2015.2389824
[3]	REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: Unified, real-time object detection[C]. 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 779–788.
[4]	LIU Wei, ANGUELOV D, ERHAN D, et al. SSD: Single shot MultiBox detector[C]. The 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 21–37.
[5]	TAN Mingxing and LE Q. EfficientNet: Rethinking model scaling for convolutional neural networks[C]. The 36th International Conference on Machine Learning, Long Beach, USA, 2019: 6105–6114.
[6]	GERSTNER W and KISTLER W M. Spiking Neuron Models: Single Neurons, Populations, Plasticity[M]. Cambridge: Cambridge University Press, 2002: 421–454.
[7]	KIM S, PARK S, NA B, et al. Spiking-YOLO: Spiking neural network for energy-efficient object detection[C]. The 34th AAAI Conference on Artificial Intelligence, New York, USA, 2020: 11270–11277.
[8]	CHAKRABORTY B, SHE Xueyuan, and MUKHOPADHYAY S. A fully spiking hybrid neural network for energy-efficient object detection[J]. IEEE Transactions on Image Processing, 2021, 30: 9014–9029. doi: 10.1109/TIP.2021.3122092
[9]	LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]. 2017 IEEE International Conference on Computer Vision, Venice, Italy, 2017: 2999–3007.
[10]	KUGELE A, PFEIL T, PFEIFFER M, et al. Hybrid SNN-ANN: Energy-efficient classification and object detection for event-based vision[C]. 43rd DAGM German Conference on Pattern Recognition, Bonn, Germany, 2022: 297–312.
[11]	HU Yifan, LI Guoqi, WU Yujie, et al. Spiking neural networks: A survey on recent advances and new directions[J]. Control and Decision, 2021, 36(1): 1–26. doi: 10.13195/j.kzyjc.2020.1006
[12]	TOYOIZUMI T, PFISTER J P, AIHARA K, et al. Spike-timing dependent plasticity and mutual information maximization for a spiking neuron model[C]. The 17th International Conference on Neural Information Processing Systems, Vancouver, Canada, 2004: 1409–1416.
[13]	HEBB D O. The Organization of Behavior: A Neuropsychological Theory[M]. New York: Psychology Press, 2002.
[14]	KHERADPISHEH S R, GANJTABESH M, THORPE S J, et al. STDP-based spiking deep convolutional neural networks for object recognition[J]. Neural Networks, 2018, 99: 56–67. doi: 10.1016/j.neunet.2017.12.005
[15]	DIEHL P U, NEIL D, BINAS J, et al. Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing[C]. 2015 International Joint Conference on Neural Networks (IJCNN), Killarney, Ireland, 2015: 1–8.
[16]	NEFTCI E O, MOSTAFA H, and ZENKE F. Surrogate gradient learning in spiking neural networks: Bringing the power of gradient-based optimization to spiking neural networks[J]. IEEE Signal Processing Magazine, 2019, 36(6): 51–63. doi: 10.1109/msp.2019.2931595
[17]	WU Yujie, DENG Lei, LI Guoqi, et al. Spatio-temporal backpropagation for training high-performance spiking neural networks[J]. Frontiers in Neuroscience, 2018, 12: 331. doi: 10.3389/fnins.2018.00331
[18]	ZHENG Hanle, WU Yujie, DENG Lei, et al. Going deeper with directly-trained larger spiking neural networks[C]. The 35th AAAI Conference on Artificial Intelligence, Palo Alto, USA, 2021: 11062–11070.
[19]	FANG Wei, YU Zhaofei, CHEN Yanqi, et al. Incorporating learnable membrane time constant to enhance learning of spiking neural networks[C]. 2021 IEEE/CVF International Conference on Computer Vision, Montreal, Canada, 2021: 2641–2651.
[20]	GERSTNER W, KISTLER W M, NAUD R, et al. Neuronal Dynamics: From Single Neurons to Networks and Models of Cognition[M]. Cambridge: Cambridge University Press, 2014.
[21]	HE Fengshou, HE You, LIU Zhunga, et al. Research and development on applications of convolutional neural networks of radar automatic target recognition[J]. Journal of Electronics & Information Technology, 2020, 42(1): 119–131. doi: 10.11999/JEIT180899
[22]	DONG Xiaowei, HAN Yue, ZHANG Zheng, et al. Metro pedestrian detection algorithm based on multi-scale weighted feature fusion network[J]. Journal of Electronics & Information Technology, 2021, 43(7): 2113–2120. doi: 10.11999/JEIT200450
[23]	SIMONYAN K and ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[C]. 3rd International Conference on Learning Representations, San Diego, USA, 2015.
[24]	AZOUZ R and GRAY C M. Dynamic spike threshold reveals a mechanism for synaptic coincidence detection in cortical neurons in vivo[J]. Proceedings of the National Academy of Sciences of the United States of America, 2000, 97(14): 8110–8115. doi: 10.1073/PNAS.130200797
[25]	FONTAINE B, PEÑA J L, and BRETTE R. Spike-threshold adaptation predicted by membrane potential dynamics in vivo[J]. PLoS Computational Biology, 2014, 10(4): e1003560. doi: 10.1371/journal.PCBI.1003560
[26]	XIAO Rong, TANG Huajin, MA Yuhao, et al. An event-driven categorization model for AER image sensors using multispike encoding and learning[J]. IEEE Transactions on Neural Networks and Learning Systems, 2020, 31(9): 3649–3657. doi: 10.1109/tnnls.2019.2945630
[27]	FANG Wei, YU Zhaofei, CHEN Yanqi, et al. Deep residual learning in spiking neural networks[C/OL]. The 34th International Conference on Neural Information Processing Systems, 2021: 21056–21069.
[28]	HUANG Gao, LIU Zhuang, VAN DER MAATEN L, et al. Densely connected convolutional networks[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 2261–2269.
[29]	DE TOURNEMIRE P, NITTI D, PEROT E, et al. A large scale event-based detection dataset for automotive[EB/OL]. https://doi.org/10.48550/arXiv.2001.08499, 2020.
[30]	ZHANG Dexiang, WANG Jun, and YUAN Peicheng. Object detection method for multi-scale full-scene surveillance based on attention mechanism[J]. Journal of Electronics & Information Technology, 2022, 44(9): 3249–3257. doi: 10.11999/JEIT210664
[31]	MESSIKOMMER N, GEHRIG D, LOQUERCIO A, et al. Event-based asynchronous sparse convolutional networks[C]. 16th European Conference on Computer Vision, Glasgow, UK, 2020: 415–431.