Robust Visual Tracking Based on Spatial Reliability Constraint
doi: 10.11999/JEIT180780
1. Graduate College, Air Force Engineering University, Xi'an 710077, China
2. Institute of Information and Navigation, Air Force Engineering University, Xi'an 710077, China
3. School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an 710121, China
-
Abstract: To address the problem that the target is prone to drift against complex backgrounds, a robust tracking algorithm based on a spatial reliability constraint is proposed. First, a pre-trained Convolutional Neural Network (CNN) model is used to extract multi-layer deep features of the target, a correlation filter is trained on each layer, and the resulting response maps are fused by a weighted sum. Then, reliability region information of the target is extracted from the high-level feature map, yielding a binary attention matrix. Finally, this binary matrix is used to constrain the search range of the fused response map, and the maximum response value within that range gives the target centre position. In addition, to handle long-term occlusion, a random-selection model update strategy based on the first-frame template information is proposed. Experimental results show that the proposed algorithm performs well under similar-background interference, occlusion, out-of-view and other challenging scenarios.
Key words:
- Visual tracking /
- Spatial reliability constraint /
- Deep features /
- Correlation filter /
- Model update
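The core mechanism summarized in the abstract, per-layer correlation responses fused with fixed weights and an argmax restricted to a binary reliability mask, can be illustrated with a minimal Python/NumPy sketch. This is not the paper's implementation: the function name, the fusion weights, the channel-energy thresholding used to binarize the high-level feature map, and the assumption that the mask is already at response-map resolution are all choices made only for this example (the paper's exact formulas are its Eqs. (4)–(9), which are not reproduced in this excerpt).

```python
import numpy as np

def fuse_and_locate(response_maps, layer_weights, high_level_feat, thresh=0.5):
    """Fuse per-layer correlation responses and locate the target centre
    inside the spatially reliable region. Illustrative sketch only."""
    # Weighted sum of the per-layer correlation response maps
    # (stand-in for the paper's Eq. (6); the weights are assumed values).
    fused = sum(w * r for w, r in zip(layer_weights, response_maps))

    # Binary attention (reliability) mask from the high-level feature map:
    # channel-wise energy, thresholded. This is only a stand-in for the
    # paper's Eqs. (7)-(8) and assumes high_level_feat is H x W x C and has
    # already been resized to the resolution of the response maps.
    energy = np.abs(high_level_feat).sum(axis=2)
    mask = energy >= thresh * energy.max()

    # Constrain the search range to the reliable region; the maximum
    # response inside it gives the target centre (cf. Eq. (9)).
    constrained = np.where(mask, fused, -np.inf)
    cy, cx = np.unravel_index(np.argmax(constrained), constrained.shape)
    return (cx, cy), mask
```

For example, with three response maps computed from three convolutional layers and assumed weights (0.25, 0.5, 1.0), the call would be `fuse_and_locate([r3, r4, r5], (0.25, 0.5, 1.0), feat5)`; the returned coordinates are the constrained response peak.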
-
Table 1  Robust visual tracking algorithm based on spatial reliability constraint
Input: image sequence I1, I2, ···, In; initial target position p0 = (x0, y0); initial target scale s0 = (w0, h0).
Output: tracking result for every frame, pt = (xt, yt), st = (wt, ht).
For t = 1, 2, ···, n do:
(1) Locate the target centre
  (a) Determine the ROI in frame t from the previous position pt–1 and extract its hierarchical convolutional features;
  (b) For the convolutional features of each layer, compute the correlation response map with Eq. (4) and Eq. (5);
  (c) Fuse the per-layer response maps with Eq. (6) to obtain the final correlation response map;
  (d) Extract the spatial reliability region map with Eq. (7) and Eq. (8) and use it to constrain the search range of the response map;
  (e) Determine the target centre position pt in frame t with Eq. (9).
(2) Estimate the optimal target scale
  (a) Sample at multiple scales around pt using the previous scale st–1, obtaining the sample image set Is = {$I_{s_1},\ I_{s_2},\ \cdots,\ I_{s_m}$};
  (b) Determine the optimal scale st in frame t with the scale estimation method of [14].
(3) Update the model
  (a) Compute the maximum value of the obtained response map;
  (b) Update the filters according to this response value and Eq. (10)–Eq. (12).
End
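Step (3) of Table 1 only states that the filters are updated according to the peak response and Eqs. (10)–(12), which are not reproduced in this excerpt. The snippet below is a hedged sketch of one way such a confidence-gated, random-selection update using the first-frame template could look; the threshold `tau`, learning rate `eta`, probability `p_reset`, and the linear-interpolation form are all assumptions made for illustration, not the paper's exact rule.

```python
import numpy as np

def update_filter(model, new_filter, first_frame_filter, peak_response,
                  tau=0.35, eta=0.01, p_reset=0.5, rng=None):
    """Confidence-gated correlation filter update with a random fallback
    to the first-frame template (illustrative sketch, assumed values)."""
    rng = np.random.default_rng() if rng is None else rng
    if peak_response >= tau:
        # Reliable frame: standard linear-interpolation update.
        return (1.0 - eta) * model + eta * new_filter
    # Low peak response (e.g. long-term occlusion): randomly either keep
    # the current model unchanged or pull it back towards the clean
    # first-frame template, so the model neither drifts nor freezes.
    if rng.random() < p_reset:
        return (1.0 - eta) * model + eta * first_frame_filter
    return model
```

Gating the update on the peak response keeps occlusion-contaminated samples out of the model, while occasionally re-injecting the first-frame template counteracts slow drift, which matches the motivation stated in the abstract.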
Table 2  Comparison of tracking precision under different attributes

Algorithm  SV(60)  OCC(45)  IV(34)  BC(27)  DEF(42)  MB(29)  FM(37)  IPR(46)  OPR(57)  OV(13)  LR(8)
Proposed   0.827   0.799    0.855   0.872   0.801    0.813   0.800   0.879    0.844    0.756   0.870
HDT        0.811   0.753    0.803   0.855   0.817    0.764   0.800   0.851    0.804    0.663   0.749
HCF        0.800   0.748    0.805   0.857   0.788    0.772   0.788   0.863    0.807    0.680   0.778

Note: SV = scale variation, OCC = occlusion, IV = illumination variation, BC = background clutter, DEF = deformation, MB = motion blur, FM = fast motion, IPR = in-plane rotation, OPR = out-of-plane rotation, OV = out of view, LR = low resolution; the number in parentheses is the number of sequences carrying that attribute.
Table 3  Comparison of tracking success rate under different attributes

Algorithm  SV(60)  OCC(45)  IV(34)  BC(27)  DEF(42)  MB(29)  FM(37)  IPR(46)  OPR(57)  OV(13)  LR(8)
Proposed   0.580   0.594    0.635   0.627   0.570    0.624   0.609   0.605    0.597    0.556   0.510
HDT        0.491   0.528    0.540   0.593   0.546    0.545   0.549   0.557    0.533    0.541   0.376
HCF        0.490   0.526    0.547   0.602   0.532    0.557   0.550   0.599    0.534    0.542   0.383

(Attribute abbreviations as in Table 2.)
Table 4  Comparative experiment on the influence of each part of the algorithm on tracking performance

              SRCT   SRCT-S  SRCT-R  SRCT-S-R
Success rate  0.624  0.618   0.610   0.603
Precision     0.864  0.856   0.841   0.838
References
[1] SMEULDERS A W M, CHU D M, CUCCHIARA R, et al. Visual tracking: An experimental survey[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 36(7): 1442–1468. doi: 10.1109/TPAMI.2013.230.
[2] WANG Naiyan, SHI Jianping, YEUNG D Y, et al. Understanding and diagnosing visual tracking systems[C]. Proceedings of 2015 IEEE International Conference on Computer Vision, Santiago, Chile, 2015: 3101–3109. doi: 10.1109/ICCV.2015.355.
[3] RAWAT W and WANG Zenghui. Deep convolutional neural networks for image classification: A comprehensive review[J]. Neural Computation, 2017, 29(9): 2352–2449. doi: 10.1162/neco_a_00990.
[4] GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]. Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, USA, 2014: 580–587.
[5] SHELHAMER E, LONG J, and DARRELL T. Fully convolutional networks for semantic segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(4): 640–651. doi: 10.1109/TPAMI.2016.2572683.
[6] WANG Naiyan and YEUNG D Y. Learning a deep compact image representation for visual tracking[C]. Proceedings of the 26th International Conference on Neural Information Processing Systems, South Lake Tahoe, Nevada, USA, 2013: 809–817.
[7] HONG S, YOU T, KWAK S, et al. Online tracking by learning discriminative saliency map with convolutional neural network[C]. Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 2015: 597–606.
[8] NAM H and HAN B. Learning multi-domain convolutional neural networks for visual tracking[C]. Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 4293–4302.
[9] LI Huanyu, BI Duyan, YANG Yuan, et al. Research on visual tracking algorithm based on deep feature expression and learning[J]. Journal of Electronics & Information Technology, 2015, 37(9): 2033–2039. doi: 10.11999/JEIT150031.
[10] HOU Zhiqiang, DAI Bo, HU Dan, et al. Robust visual tracking via perceptive deep neural network[J]. Journal of Electronics & Information Technology, 2016, 38(7): 1616–1623. doi: 10.11999/JEIT151449.
[11] HENRIQUES J F, CASEIRO R, MARTINS P, et al. Exploiting the circulant structure of tracking-by-detection with kernels[C]. Proceedings of the 12th European Conference on Computer Vision, Florence, Italy, 2012: 702–715. doi: 10.1007/978-3-642-33765-9_50.
[12] DANELLJAN M, KHAN F S, FELSBERG M, et al. Adaptive color attributes for real-time visual tracking[C]. Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, USA, 2014: 1090–1097. doi: 10.1109/CVPR.2014.143.
[13] HENRIQUES J F, CASEIRO R, MARTINS P, et al. High-speed tracking with kernelized correlation filters[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(3): 583–596. doi: 10.1109/TPAMI.2014.2345390.
[14] DANELLJAN M, HÄGER G, KHAN F S, et al. Accurate scale estimation for robust visual tracking[C]. Proceedings of the British Machine Vision Conference, Nottingham, UK, 2014: 65.1–65.11. doi: 10.5244/C.28.65.
[15] DANELLJAN M, HÄGER G, KHAN F S, et al. Learning spatially regularized correlation filters for visual tracking[C]. Proceedings of 2015 IEEE International Conference on Computer Vision, Santiago, Chile, 2015: 4310–4318. doi: 10.1109/ICCV.2015.490.
[16] DANELLJAN M, ROBINSON A, KHAN F S, et al. Beyond correlation filters: Learning continuous convolution operators for visual tracking[C]. Proceedings of the 14th European Conference on Computer Vision, Amsterdam, the Netherlands, 2016: 472–488. doi: 10.1007/978-3-319-46454-1_29.
[17] RUSSAKOVSKY O, DENG Jia, SU Hao, et al. ImageNet large scale visual recognition challenge[J]. International Journal of Computer Vision, 2015, 115(3): 211–252. doi: 10.1007/s11263-015-0816-y.
[18] KRIZHEVSKY A, SUTSKEVER I, and HINTON G E. ImageNet classification with deep convolutional neural networks[C]. Proceedings of the 25th International Conference on Neural Information Processing Systems, Lake Tahoe, USA, 2012: 1097–1105. doi: 10.1145/3065386.
[19] SIMONYAN K and ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[C]. International Conference on Learning Representations, San Diego, USA, 2015.
[20] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Deep residual learning for image recognition[C]. Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 770–778. doi: 10.1109/CVPR.2016.90.
[21] VEDALDI A and LENC K. MatConvNet: Convolutional neural networks for MATLAB[C]. Proceedings of the 23rd ACM International Conference on Multimedia, Brisbane, Australia, 2015: 689–692. doi: 10.1145/2733373.2807412.
[22] WU Yi, LIM J, and YANG M H. Object tracking benchmark[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1834–1848. doi: 10.1109/TPAMI.2014.2388226.
[23] DANELLJAN M, HÄGER G, KHAN F S, et al. Convolutional features for correlation filter based visual tracking[C]. Proceedings of 2015 IEEE International Conference on Computer Vision Workshops, Santiago, Chile, 2015: 58–66. doi: 10.1109/ICCVW.2015.84.
[24] QI Yuankai, ZHANG Shengping, QIN Lei, et al. Hedged deep tracking[C]. Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 4303–4311. doi: 10.1109/CVPR.2016.466.
[25] MA Chao, HUANG Jiabin, YANG Xiaokang, et al. Hierarchical convolutional features for visual tracking[C]. Proceedings of 2015 IEEE International Conference on Computer Vision, Santiago, Chile, 2015: 3074–3082. doi: 10.1109/ICCV.2015.352.
[26] ZHANG Jianming, MA Shugao, and SCLAROFF S. MEEM: Robust tracking via multiple experts using entropy minimization[C]. Proceedings of the 13th European Conference on Computer Vision, Zurich, Switzerland, 2014: 188–203.
[27] LIANG Pengpeng, BLASCH E, and LING Haibin. Encoding color information for visual tracking: Algorithms and benchmark[J]. IEEE Transactions on Image Processing, 2015, 24(12): 5630–5644. doi: 10.1109/TIP.2015.2482905.
[28] TAO Ran, GAVVES E, and SMEULDERS A W M. Siamese instance search for tracking[C]. Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 1420–1429. doi: 10.1109/CVPR.2016.158.
[29] BERTINETTO L, VALMADRE J, HENRIQUES J F, et al. Fully-convolutional siamese networks for object tracking[C]. European Conference on Computer Vision, Amsterdam, the Netherlands, 2016: 850–865.
[30] HOU Zhiqiang, ZHANG Lang, YU Wangsheng, et al. Local patch tracking algorithm based on fast Fourier transform[J]. Journal of Electronics & Information Technology, 2015, 37(10): 2397–2404. doi: 10.11999/JEIT150183.