Aurora Image Classification and Retrieval Method Based on Deep Hashing Algorithm
doi: 10.11999/JEIT190984

College of Communication and Information Technology, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
Abstract: Classifying and retrieving the vast amount of aurora data, with its varied forms and complex changes, is of great significance for further study of the physical mechanism of the geomagnetic field and of space information. Building on the good performance of Convolutional Neural Networks (CNN) in image feature extraction, and on the fact that hash coding can meet the retrieval-time requirements of large-scale image retrieval, an end-to-end deep hashing algorithm for aurora image classification and retrieval is proposed. Firstly, Spatial Pyramid Pooling (SPP) and Power Mean Transformation (PMT) are embedded in the CNN to extract multi-scale region information from the image. Secondly, a hash layer is added between the fully connected layers to map the high-dimensional semantic information that best represents the image into a compact binary hash code, and the Hamming distance is used to measure the similarity between image pairs in the low-dimensional space. Finally, a multi-task learning mechanism is introduced to design the loss function, making full use of the image label information and the similarity information between image pairs; the losses of the classification layer and the hash layer are combined as the optimization objective, so that better semantic similarity between hash codes is maintained and retrieval performance is effectively improved. Experimental results on an aurora dataset and the CIFAR-10 dataset show that the proposed method outperforms existing state-of-the-art retrieval methods and can also be used effectively for aurora image classification.

Keywords:
- aurora image
- classification and retrieval
- convolutional neural network
- hash coding
- multi-scale feature fusion
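The retrieval stage the abstract describes (binarizing hash-layer activations into compact codes, then ranking database images by Hamming distance to a query) can be sketched as follows. This is a minimal illustration under assumed conventions (sign thresholding at zero, 0/1 bit codes), not the authors' implementation; all function names are ours.

```python
import numpy as np

def binarize(activations):
    """Threshold real-valued hash-layer activations at zero to get {0,1} codes."""
    return (np.asarray(activations) > 0).astype(np.uint8)

def hamming_distance(code_a, code_b):
    """Count the bits on which two binary codes disagree."""
    return int(np.count_nonzero(code_a != code_b))

def retrieve(query_code, database_codes):
    """Rank database entries by ascending Hamming distance to the query."""
    dists = np.array([hamming_distance(query_code, c) for c in database_codes])
    order = np.argsort(dists, kind="stable")
    return order, dists

# Toy example with 4-bit codes (real codes would be 12-48 bits, as in the tables).
database = binarize([[0.7, -1.2, 0.3, 0.9],
                     [-0.5, 0.8, -0.1, 0.4],
                     [0.6, -0.9, 0.2, 1.1]])
query = binarize([0.5, -0.7, 0.4, 0.8])
order, dists = retrieve(query, database)  # order: [0, 2, 1]; dists: [0, 3, 0]
```

Because Hamming distances are small integers bounded by the code length, ranking can use counting sort in practice; with codes packed into machine words, XOR plus popcount gives the same distance in constant time per pair, which is what makes hash-based retrieval fast at scale.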
Table 3  MAP of the three methods under different hash code lengths, with model parameter size (MB) and training time (min) at bit = 48

| Method        | MAP (12 bit) | MAP (24 bit) | MAP (32 bit) | MAP (48 bit) | Parameter size (MB) | Training time (min) |
|---------------|--------------|--------------|--------------|--------------|---------------------|---------------------|
| AlexNet       | 0.8336       | 0.8450       | 0.8518       | 0.8554       | 218.20              | 158                 |
| AlexNet-SP    | 0.8729       | 0.9004       | 0.9066       | 0.8963       | 179.15              | 115                 |
| Im-AlexNet-SP | 0.8995       | 0.9072       | 0.9173       | 0.9095       | 100.77              | 80                  |
Table 4  Classification accuracy of the three methods under different hash code lengths

| Method        | Accuracy (12 bit) | Accuracy (24 bit) | Accuracy (32 bit) | Accuracy (48 bit) |
|---------------|-------------------|-------------------|-------------------|-------------------|
| AlexNet       | 0.8964            | 0.8995            | 0.8988            | 0.9073            |
| AlexNet-SP    | 0.9312            | 0.9298            | 0.9325            | 0.9367            |
| Im-AlexNet-SP | 0.9320            | 0.9305            | 0.9410            | 0.9384            |
Table 5  MAP and average query time (s) of the proposed method compared with other aurora retrieval algorithms

| Method                   | MAP    | Average query time (s) |
|--------------------------|--------|------------------------|
| HE                       | 0.5253 | 0.65                   |
| VLAD                     | 0.5868 | 0.52                   |
| MAC                      | 0.6558 | 1.22                   |
| MS-RMAC                  | 0.6901 | 2.89                   |
| Im-AlexNet-SP (proposed) | 0.9095 | 0.43                   |
Table 6  MAP of different hashing algorithms on CIFAR-10 under different hash code lengths

| Method                   | MAP (12 bit) | MAP (24 bit) | MAP (32 bit) | MAP (48 bit) |
|--------------------------|--------------|--------------|--------------|--------------|
| Im-AlexNet-SP (proposed) | 0.902        | 0.904        | 0.912        | 0.907        |
| DPSH                     | 0.713        | 0.727        | 0.744        | 0.757        |
| DSH                      | 0.673        | 0.685        | 0.690        | 0.694        |
| CNNH                     | 0.439        | 0.511        | 0.509        | 0.522        |
| KSH                      | 0.303        | 0.337        | 0.346        | 0.356        |
| ITQ                      | 0.162        | 0.169        | 0.172        | 0.175        |
| LSH                      | 0.127        | 0.137        | 0.141        | 0.149        |
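The MAP values reported in Tables 3, 5, and 6 follow the standard retrieval definition: the mean over all queries of the average precision along each Hamming-ranked result list. A minimal sketch of that computation (the function names are illustrative; the paper does not publish its evaluation code):

```python
import numpy as np

def average_precision(ranked_relevance):
    """AP for one query: average of precision@k over the ranks k of relevant hits.

    ranked_relevance is a 0/1 sequence ordered by ascending Hamming distance.
    """
    rel = np.asarray(ranked_relevance, dtype=float)
    if rel.sum() == 0:
        return 0.0
    hits = np.flatnonzero(rel)                       # 0-based ranks of relevant items
    precision_at_hits = np.cumsum(rel)[hits] / (hits + 1)
    return float(precision_at_hits.mean())

def mean_average_precision(relevance_lists):
    """MAP: mean of the per-query average precisions."""
    return float(np.mean([average_precision(r) for r in relevance_lists]))

# Relevant results at ranks 1 and 3 -> AP = (1/1 + 2/3) / 2 = 5/6
ap = average_precision([1, 0, 1])
```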
-
[1] WANG Qian, LIANG Jimin, HU Zejun, et al. Spatial texture based automatic classification of dayside aurora in all-sky images[J]. Journal of Atmospheric and Solar-Terrestrial Physics, 2010, 72(5/6): 498–508. doi: 10.1016/j.jastp.2010.01.011
[2] HAN Bing, YANG Chen, and GAO Xinbo. Aurora image classification based on LDA combining with saliency information[J]. Journal of Software, 2013, 24(11): 2758–2766. doi: 10.3724/SP.J.1001.2013.04481
[3] SYRJÄSUO M T, DONOVAN E F, and COGGER L L. Content-based retrieval of auroral images - thousands of irregular shapes[C]. The 4th IASTED International Conference on Visualization, Imaging, and Image Processing, Marbella, Spain, 2004.
[4] FU Rong, GAO Xinbo, LI Xuelong, et al. An integrated aurora image retrieval system: Aurora Eye[J]. Journal of Visual Communication and Image Representation, 2010, 21(8): 787–797. doi: 10.1016/j.jvcir.2010.06.002
[5] YANG Xi, GAO Xinbo, SONG Bin, et al. Aurora image search with contextual CNN feature[J]. Neurocomputing, 2018, 281: 67–77. doi: 10.1016/j.neucom.2017.11.059
[6] GE Yun, MA Lin, JIANG Shunliang, et al. The combination and pooling based on high-level feature map for high-resolution remote sensing image retrieval[J]. Journal of Electronics & Information Technology, 2019, 41(10): 2487–2494. doi: 10.11999/JEIT190017
[7] LIU Ye, PAN Yan, XIA Rongkai, et al. FP-CNNH: A fast image hashing algorithm based on deep convolutional neural network[J]. Computer Science, 2016, 43(9): 39–46, 51. doi: 10.11896/j.issn.1002-137X.2016.09.007
[8] LI Wujun, WANG Sheng, and KANG Wangcheng. Feature learning based deep supervised hashing with pairwise labels[C]. The 25th International Joint Conference on Artificial Intelligence, New York, USA, 2016: 1711–1717.
[9] LIU Haomiao, WANG Ruiping, SHAN Shiguang, et al. Deep supervised hashing for fast image retrieval[C]. 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 2064–2072. doi: 10.1109/CVPR.2016.227
[10] KRIZHEVSKY A, SUTSKEVER I, and HINTON G E. ImageNet classification with deep convolutional neural networks[C]. The 25th International Conference on Neural Information Processing Systems, Lake Tahoe, USA, 2012: 1097–1105.
[11] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904–1916. doi: 10.1109/TPAMI.2015.2389824
[12] ZHAO Fei, ZHANG Wenkai, YAN Zhiyuan, et al. Multi-feature map pyramid fusion deep network for semantic segmentation on remote sensing data[J]. Journal of Electronics & Information Technology, 2019, 41(10): 2525–2531. doi: 10.11999/JEIT190047
[13] ZHANG Chenlin and WU Jianxin. Improving CNN linear layers with power mean non-linearity[J]. Pattern Recognition, 2019, 89: 12–21. doi: 10.1016/j.patcog.2018.12.029
[14] JEGOU H, DOUZE M, and SCHMID C. Hamming embedding and weak geometric consistency for large scale image search[C]. The 10th European Conference on Computer Vision, Marseille, France, 2008: 304–317. doi: 10.1007/978-3-540-88682-2_24
[15] XIA Yan, HE Kaiming, WEN Fang, et al. Joint inverted indexing[C]. 2013 IEEE International Conference on Computer Vision, Sydney, Australia, 2013: 3416–3423. doi: 10.1109/ICCV.2013.424
[16] TOLIAS G, SICRE R, and JÉGOU H. Particular object retrieval with integral max-pooling of CNN activations[C]. The 4th International Conference on Learning Representations, San Juan, Puerto Rico, 2016: 1–12.
[17] LI Yang, XU Yulong, WANG Jiabao, et al. MS-RMAC: Multiscale regional maximum activation of convolutions for image retrieval[J]. IEEE Signal Processing Letters, 2017, 24(5): 609–613. doi: 10.1109/LSP.2017.2665522
[18] DATAR M, IMMORLICA N, INDYK P, et al. Locality-sensitive hashing scheme based on p-stable distributions[C]. The 20th Annual Symposium on Computational Geometry, Brooklyn, USA, 2004: 253–262. doi: 10.1145/997817.997857
[19] GONG Yunchao and LAZEBNIK S. Iterative quantization: A procrustean approach to learning binary codes[C]. The 24th IEEE Conference on Computer Vision and Pattern Recognition, Providence, USA, 2011: 817–824. doi: 10.1109/CVPR.2011.5995432
[20] LIU Wei, WANG Jun, JI Rongrong, et al. Supervised hashing with kernels[C]. 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, USA, 2012: 2074–2081. doi: 10.1109/CVPR.2012.6247912