Multi-scale Pedestrian Detection in Infrared Images with Salient Background-awareness
doi: 10.11999/JEIT190761
-
Department of Electronic and Optical Engineering, Shijiazhuang Campus of Army Engineering University, Shijiazhuang 050003, China
-
Abstract: Ultrawide Field Of View (U-FOV) infrared imaging systems offer a large detection range and are not limited by illumination, but their images contain objects at diverse scales, many of them small. This paper therefore proposes a multi-scale infrared pedestrian detection method with background-awareness, which improves detection performance on small objects while reducing redundant computation. First, a four-scale feature pyramid network is constructed in which each scale predicts objects independently, supplementing high-resolution detail features. Second, an attention module is integrated into the lateral connections of the feature pyramid to generate salient features, suppressing feature responses in irrelevant regions and highlighting local object features. Finally, an anchor-mask generation subnetwork is built on top of the saliency coefficients to constrain anchor locations, exclude flat background regions, and improve processing efficiency. Experimental results show that the saliency generation subnetwork increases processing time by only 5.94%, so it is lightweight; average precision on the U-FOV infrared pedestrian dataset reaches 93.20%, 26.49% higher than that of YOLOv3; and the anchor-constraint strategy saves 18.05% of processing time. The reconstructed model is both lightweight and accurate, making it well suited to detecting multi-scale infrared targets in ultrawide fields of view.
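The attention-gated lateral connection described above can be illustrated with a minimal PyTorch sketch, assuming a CBAM-style spatial attention block (CBAM appears among the methods the paper builds on); the module names (`SpatialAttention`, `SalientLateral`) and the exact gating design are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialAttention(nn.Module):
    """CBAM-style spatial attention: a per-pixel saliency map in [0, 1]."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        # Pool across channels, then predict a saliency coefficient per pixel.
        avg = x.mean(dim=1, keepdim=True)
        mx, _ = x.max(dim=1, keepdim=True)
        return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

class SalientLateral(nn.Module):
    """Lateral connection of the feature pyramid, gated by spatial attention."""
    def __init__(self, in_ch, out_ch=256):
        super().__init__()
        self.lateral = nn.Conv2d(in_ch, out_ch, 1)  # 1x1 lateral conv
        self.attn = SpatialAttention()

    def forward(self, c, p_top=None):
        l = self.lateral(c)
        s = self.attn(l)          # saliency coefficients
        l = l * s                 # suppress responses from flat background
        if p_top is not None:     # top-down pathway of the pyramid
            l = l + F.interpolate(p_top, size=l.shape[-2:], mode="nearest")
        return l, s               # s is reused downstream for anchor masking
```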
-
Keywords:
- infrared pedestrian detection /
- ultrawide field of view /
- convolutional neural network /
- background-awareness /
- multi-scale
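The abstract's final step, constraining anchors with the saliency coefficients, can be sketched as thresholding the saliency map into a binary mask and keeping only anchors whose grid cells are salient. The threshold value, tensor shapes, and helper name below are illustrative assumptions.

```python
import torch

def anchor_mask_from_saliency(s, threshold=0.5):
    """Binarize a saliency map (N, 1, H, W) into a per-cell anchor mask.

    Cells whose saliency coefficient falls below the threshold are treated
    as flat background and their anchors are skipped, which is where the
    processing-time saving comes from.
    """
    return s > threshold  # bool mask, broadcast over the anchors of a cell

# Usage sketch: evaluate only anchors at salient locations.
s = torch.rand(1, 1, 13, 13)                     # saliency map from a lateral block
scores = torch.rand(1, 3, 13, 13)                # e.g. 3 anchors per grid cell
mask = anchor_mask_from_saliency(s)              # (1, 1, 13, 13)
active_scores = scores[mask.expand_as(scores)]   # anchors kept for decoding
```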
Table 1 Average precision (AP) of pedestrian detection under different IoU thresholds

| Method | Backbone | Training set | IoU=0.3 | IoU=0.45 | IoU=0.5 | IoU=0.7 |
|---|---|---|---|---|---|---|
| Faster R-CNN | ResNet101 | U-FOV | – | – | 0.5932 | – |
| SSD | MobileNet_v1 | U-FOV | – | – | 0.5584 | – |
| R-FCN | ResNet101 | U-FOV | – | – | 0.6312 | – |
| CSP | ResNet50 | U-FOV | – | – | 0.8414 | – |
| YOLOv3 | Darknet53 | U-FOV | 0.6595 | 0.6671 | 0.6628 | 0.6461 |
| YOLOv3+FS | Darknet53 | U-FOV | 0.8880 | 0.8870 | 0.8828 | 0.8511 |
| YOLOv3+FS | Darknet53 | Caltech+U-FOV | 0.9057 | 0.9078 | 0.9084 | 0.8961 |
| Proposed | Darknet53 | Caltech+U-FOV | 0.9201 | 0.9320 | 0.9315 | 0.9107 |
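For reference, the IoU thresholds in Table 1 set how tightly a predicted box must overlap a ground-truth box to count as a true positive. A minimal computation of IoU between two axis-aligned boxes, assuming the common (x1, y1, x2, y2) corner convention:

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# A detection matching a ground-truth box with iou(...) >= 0.5 counts as
# correct in the IoU=0.5 column of Table 1.
```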
Table 2 Comparison of parameter counts

| Method | Total parameters | Trainable | Non-trainable |
|---|---|---|---|
| YOLOv3 | 61,576,342 | 61,523,734 | 52,608 |
| Proposed | 64,861,976 | 64,806,296 | 55,680 |
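Counts like those in Table 2 can be reproduced by iterating over a model's parameters. A minimal PyTorch-style sketch, where `model` is assumed to be the detector; note that which parameters count as "non-trainable" depends on framework conventions (e.g., Keras also counts batch-norm moving statistics), so the exact split may differ.

```python
def count_parameters(model):
    """Return (total, trainable, non_trainable) parameter counts."""
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    frozen = sum(p.numel() for p in model.parameters() if not p.requires_grad)
    return trainable + frozen, trainable, frozen
```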
Table 3 Total processing time and frame rate on the U-FOV test set

| Method | YOLOv3 | YOLOv3+Attention | FS+Attention | Proposed |
|---|---|---|---|---|
| Total time (s) | 90.35 | 95.72 | 125.39 | 107.25 |
| Frame rate (fps) | 7.32 | 6.91 | 5.27 | 6.16 |
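As a quick cross-check (an inference from the published numbers, not a figure reported in the paper), total time multiplied by frame rate is essentially constant across the four methods in Table 3, which is consistent with a test set of roughly 661 images:

```python
# (total_time_s, fps) pairs taken from Table 3
rows = {
    "YOLOv3": (90.35, 7.32),
    "YOLOv3+Attention": (95.72, 6.91),
    "FS+Attention": (125.39, 5.27),
    "Proposed": (107.25, 6.16),
}
for name, (t, fps) in rows.items():
    print(f"{name}: {t * fps:.0f} images")  # each prints ≈ 661
```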