Pedestrian Recognition Method Based on Depth Hierarchical Feature Representation

doi: 10.11999/JEIT150982

Funds: The National Natural Science Foundation of China (61471154); Scientific Research Foundation for Returned Scholars, Ministry of Education of China
Abstract: To address feature representation in pedestrian recognition, a hierarchical feature representation method with a hybrid architecture is proposed; it combines the representational power of the bag-of-words model with the learning adaptability of a deep hierarchical structure. Local features are first extracted with the gradient-based HOG descriptor and then encoded by a deep hierarchical coding scheme composed of spatially aggregating Restricted Boltzmann Machines (RBMs). At each coding layer, unsupervised RBM learning is regularized for sparsity and selectivity, and supervised fine-tuning is applied to strengthen the visual feature representation for the classification task; max pooling and a spatial pyramid then yield the high-level image representation. Finally, a linear Support Vector Machine (SVM) performs pedestrian recognition. The extracted deep hierarchical features naturally separate occlusions and other target-irrelevant regions, effectively improving the accuracy of subsequent recognition. Experimental results show that the proposed method achieves a high recognition rate.

Keywords:
- pedestrian recognition
- hybrid architecture
- deep learning
- deep hierarchical coding
- Restricted Boltzmann Machine
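The pipeline the abstract describes (local gradient-histogram features → layer-wise RBM encoding → max pooling → linear classifier) can be sketched in miniature. This is a hedged illustration, not the authors' implementation: `hog_cells` is a simplified HOG, `TinyRBM` is a plain Bernoulli RBM trained with CD-1 (standing in for the paper's spatially aggregating RBMs, without the sparsity/selectivity regularization or supervised fine-tuning), the striped toy images are invented here, and a least-squares linear classifier stands in for the linear SVM.

```python
import numpy as np

rng = np.random.default_rng(0)

def hog_cells(img, cell=8, bins=9):
    """Simplified HOG: per-cell, magnitude-weighted orientation histograms."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)          # unsigned orientation in [0, pi)
    h, w = img.shape
    feats = []
    for i in range(0, h - cell + 1, cell):
        for j in range(0, w - cell + 1, cell):
            a = ang[i:i + cell, j:j + cell].ravel()
            m = mag[i:i + cell, j:j + cell].ravel()
            hist, _ = np.histogram(a, bins=bins, range=(0.0, np.pi), weights=m)
            feats.append(hist / (np.linalg.norm(hist) + 1e-6))
    return np.array(feats)                            # (n_cells, bins)

class TinyRBM:
    """Bernoulli RBM trained with one step of contrastive divergence (CD-1)."""
    def __init__(self, n_vis, n_hid, lr=0.05):
        self.W = rng.normal(0.0, 0.01, (n_vis, n_hid))
        self.b = np.zeros(n_vis)
        self.c = np.zeros(n_hid)
        self.lr = lr

    @staticmethod
    def _sig(x):
        return 1.0 / (1.0 + np.exp(-x))

    def hidden(self, v):
        return self._sig(v @ self.W + self.c)

    def fit(self, V, epochs=20):
        for _ in range(epochs):
            h = self.hidden(V)
            hs = (rng.random(h.shape) < h).astype(float)   # sample hidden states
            v1 = self._sig(hs @ self.W.T + self.b)         # reconstruction
            h1 = self.hidden(v1)
            self.W += self.lr * (V.T @ h - v1.T @ h1) / len(V)
            self.b += self.lr * (V - v1).mean(axis=0)
            self.c += self.lr * (h - h1).mean(axis=0)
        return self

def make_img(vertical):
    """Toy image: a horizontal or vertical intensity ramp plus noise."""
    ramp = np.tile(np.arange(16, dtype=float), (16, 1))
    img = ramp if vertical else ramp.T
    return img + 0.05 * rng.random((16, 16))

imgs = [make_img(i % 2 == 0) for i in range(40)]
labels = np.array([i % 2 for i in range(40)])

# Unsupervised RBM learning on all local descriptors from all images.
cells = np.vstack([hog_cells(im) for im in imgs])     # (160, 9)
rbm = TinyRBM(n_vis=9, n_hid=12).fit(cells)

def encode(img):
    """Encode per-cell features with the RBM, then max-pool over cells."""
    return rbm.hidden(hog_cells(img)).max(axis=0)

X = np.array([encode(im) for im in imgs])             # (40, 12) image features

# Least-squares linear classifier (stand-in for the linear SVM).
A = np.c_[X, np.ones(len(X))]
w = np.linalg.lstsq(A, 2.0 * labels - 1.0, rcond=None)[0]
acc = ((A @ w > 0).astype(int) == labels).mean()
print(f"training accuracy: {acc:.2f}")
```

The two toy classes differ only in dominant gradient orientation, so the pooled RBM codes are linearly separable; the real method additionally stacks coding layers, fine-tunes them with supervision, and pools over a spatial pyramid rather than a single level.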
計(jì)量
- 文章訪問(wèn)數(shù): 1725
- HTML全文瀏覽量: 111
- PDF下載量: 894
- 被引次數(shù): 0