Aurora Image Classification and Retrieval Method Based on Deep Hashing Algorithm
doi: 10.11999/JEIT190984

College of Communication and Information Technology, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
Abstract: Classifying and retrieving the vast amount of aurora data, with its varied forms and complex changes, is of great significance for further study of the physical mechanism of the geomagnetic field and of space information. Building on the good performance of Convolutional Neural Networks (CNN) in image feature extraction, and on the fact that hash coding can meet the retrieval-time requirements of large-scale image retrieval, an end-to-end deep hashing algorithm for aurora image classification and retrieval is proposed. Firstly, Spatial Pyramid Pooling (SPP) and Power Mean Transformation (PMT) are embedded in the CNN to extract multi-scale region information from the image. Secondly, a hash layer is added between the fully connected layers to map the high-dimensional semantic information that best represents the image into a compact binary hash code, and the Hamming distance is used to measure the similarity between image pairs in the low-dimensional space. Finally, a multi-task learning mechanism is introduced to design the loss function, making full use of the image label information and the similarity information between image pairs; the losses of the classification layer and the hash layer are combined as the optimization objective, so that better semantic similarity between hash codes is maintained and retrieval performance is effectively improved. Experimental results on an aurora dataset and the CIFAR-10 dataset show that the proposed method outperforms existing state-of-the-art retrieval methods and can also be used effectively for aurora image classification.

Keywords:
- aurora image
- classification and retrieval
- convolutional neural network
- hash coding
- multi-scale feature fusion
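The retrieval stage the abstract describes (binarizing hash-layer activations into compact codes, then ranking database images by Hamming distance to a query) can be sketched as follows. This is a minimal illustration under assumed conventions (sign thresholding at zero, 0/1 bit codes), not the authors' implementation; all function names are ours.

```python
import numpy as np

def binarize(activations):
    """Threshold real-valued hash-layer activations at zero to get {0,1} codes."""
    return (np.asarray(activations) > 0).astype(np.uint8)

def hamming_distance(code_a, code_b):
    """Count the bits on which two binary codes disagree."""
    return int(np.count_nonzero(code_a != code_b))

def retrieve(query_code, database_codes):
    """Rank database entries by ascending Hamming distance to the query."""
    dists = np.array([hamming_distance(query_code, c) for c in database_codes])
    order = np.argsort(dists, kind="stable")
    return order, dists

# Toy example with 4-bit codes (real codes would be 12-48 bits, as in the tables).
database = binarize([[0.7, -1.2, 0.3, 0.9],
                     [-0.5, 0.8, -0.1, 0.4],
                     [0.6, -0.9, 0.2, 1.1]])
query = binarize([0.5, -0.7, 0.4, 0.8])
order, dists = retrieve(query, database)  # order: [0, 2, 1]; dists: [0, 3, 0]
```

Because Hamming distances are small integers bounded by the code length, ranking can use counting sort in practice; with codes packed into machine words, XOR plus popcount gives the same distance in constant time per pair, which is what makes hash-based retrieval fast at scale.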
Table 3  MAP of the three methods under different hash code lengths, with model parameter size (MB) and training time (min) at bit = 48

| Method        | MAP (12 bit) | MAP (24 bit) | MAP (32 bit) | MAP (48 bit) | Parameter size (MB) | Training time (min) |
|---------------|--------------|--------------|--------------|--------------|---------------------|---------------------|
| AlexNet       | 0.8336       | 0.8450       | 0.8518       | 0.8554       | 218.20              | 158                 |
| AlexNet-SP    | 0.8729       | 0.9004       | 0.9066       | 0.8963       | 179.15              | 115                 |
| Im-AlexNet-SP | 0.8995       | 0.9072       | 0.9173       | 0.9095       | 100.77              | 80                  |
Table 4  Classification accuracy of the three methods under different hash code lengths

| Method        | Accuracy (12 bit) | Accuracy (24 bit) | Accuracy (32 bit) | Accuracy (48 bit) |
|---------------|-------------------|-------------------|-------------------|-------------------|
| AlexNet       | 0.8964            | 0.8995            | 0.8988            | 0.9073            |
| AlexNet-SP    | 0.9312            | 0.9298            | 0.9325            | 0.9367            |
| Im-AlexNet-SP | 0.9320            | 0.9305            | 0.9410            | 0.9384            |
Table 5  MAP and average query time (s) of the proposed method compared with other aurora retrieval algorithms

| Method                   | MAP    | Average query time (s) |
|--------------------------|--------|------------------------|
| HE                       | 0.5253 | 0.65                   |
| VLAD                     | 0.5868 | 0.52                   |
| MAC                      | 0.6558 | 1.22                   |
| MS-RMAC                  | 0.6901 | 2.89                   |
| Im-AlexNet-SP (proposed) | 0.9095 | 0.43                   |
Table 6  MAP of different hashing algorithms on CIFAR-10 under different hash code lengths

| Method                   | MAP (12 bit) | MAP (24 bit) | MAP (32 bit) | MAP (48 bit) |
|--------------------------|--------------|--------------|--------------|--------------|
| Im-AlexNet-SP (proposed) | 0.902        | 0.904        | 0.912        | 0.907        |
| DPSH                     | 0.713        | 0.727        | 0.744        | 0.757        |
| DSH                      | 0.673        | 0.685        | 0.690        | 0.694        |
| CNNH                     | 0.439        | 0.511        | 0.509        | 0.522        |
| KSH                      | 0.303        | 0.337        | 0.346        | 0.356        |
| ITQ                      | 0.162        | 0.169        | 0.172        | 0.175        |
| LSH                      | 0.127        | 0.137        | 0.141        | 0.149        |
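The MAP values reported in Tables 3, 5, and 6 follow the standard retrieval definition: the mean over all queries of the average precision along each Hamming-ranked result list. A minimal sketch of that computation (the function names are illustrative; the paper does not publish its evaluation code):

```python
import numpy as np

def average_precision(ranked_relevance):
    """AP for one query: average of precision@k over the ranks k of relevant hits.

    ranked_relevance is a 0/1 sequence ordered by ascending Hamming distance.
    """
    rel = np.asarray(ranked_relevance, dtype=float)
    if rel.sum() == 0:
        return 0.0
    hits = np.flatnonzero(rel)                       # 0-based ranks of relevant items
    precision_at_hits = np.cumsum(rel)[hits] / (hits + 1)
    return float(precision_at_hits.mean())

def mean_average_precision(relevance_lists):
    """MAP: mean of the per-query average precisions."""
    return float(np.mean([average_precision(r) for r in relevance_lists]))

# Relevant results at ranks 1 and 3 -> AP = (1/1 + 2/3) / 2 = 5/6
ap = average_precision([1, 0, 1])
```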
-
[1] WANG Qian, LIANG Jimin, HU Zejun, et al. Spatial texture based automatic classification of dayside aurora in all-sky images[J]. Journal of Atmospheric and Solar-Terrestrial Physics, 2010, 72(5/6): 498–508. doi: 10.1016/j.jastp.2010.01.011
[2] HAN Bing, YANG Chen, and GAO Xinbo. Aurora image classification based on LDA combining with saliency information[J]. Journal of Software, 2013, 24(11): 2758–2766. doi: 10.3724/SP.J.1001.2013.04481
[3] SYRJÄSUO M T, DONOVAN E F, and COGGER L L. Content-based retrieval of auroral images - thousands of irregular shapes[C]. The 4th IASTED International Conference on Visualization, Imaging, and Image Processing, Marbella, Spain, 2004.
[4] FU Rong, GAO Xinbo, LI Xuelong, et al. An integrated aurora image retrieval system: Aurora Eye[J]. Journal of Visual Communication and Image Representation, 2010, 21(8): 787–797. doi: 10.1016/j.jvcir.2010.06.002
[5] YANG Xi, GAO Xinbo, SONG Bin, et al. Aurora image search with contextual CNN feature[J]. Neurocomputing, 2018, 281: 67–77. doi: 10.1016/j.neucom.2017.11.059
[6] GE Yun, MA Lin, JIANG Shunliang, et al. The combination and pooling based on high-level feature map for high-resolution remote sensing image retrieval[J]. Journal of Electronics & Information Technology, 2019, 41(10): 2487–2494. doi: 10.11999/JEIT190017
[7] LIU Ye, PAN Yan, XIA Rongkai, et al. FP-CNNH: A fast image hashing algorithm based on deep convolutional neural network[J]. Computer Science, 2016, 43(9): 39–46, 51. doi: 10.11896/j.issn.1002-137X.2016.09.007
[8] LI Wujun, WANG Sheng, and KANG Wangcheng. Feature learning based deep supervised hashing with pairwise labels[C]. The 25th International Joint Conference on Artificial Intelligence, New York, USA, 2016: 1711–1717.
[9] LIU Haomiao, WANG Ruiping, SHAN Shiguang, et al. Deep supervised hashing for fast image retrieval[C]. 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 2064–2072. doi: 10.1109/CVPR.2016.227
[10] KRIZHEVSKY A, SUTSKEVER I, and HINTON G E. ImageNet classification with deep convolutional neural networks[C]. The 25th International Conference on Neural Information Processing Systems, Lake Tahoe, USA, 2012: 1097–1105.
[11] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904–1916. doi: 10.1109/TPAMI.2015.2389824
[12] ZHAO Fei, ZHANG Wenkai, YAN Zhiyuan, et al. Multi-feature map pyramid fusion deep network for semantic segmentation on remote sensing data[J]. Journal of Electronics & Information Technology, 2019, 41(10): 2525–2531. doi: 10.11999/JEIT190047
[13] ZHANG Chenlin and WU Jianxin. Improving CNN linear layers with power mean non-linearity[J]. Pattern Recognition, 2019, 89: 12–21. doi: 10.1016/j.patcog.2018.12.029
[14] JEGOU H, DOUZE M, and SCHMID C. Hamming embedding and weak geometric consistency for large scale image search[C]. The 10th European Conference on Computer Vision, Marseille, France, 2008: 304–317. doi: 10.1007/978-3-540-88682-2_24
[15] XIA Yan, HE Kaiming, WEN Fang, et al. Joint inverted indexing[C]. 2013 IEEE International Conference on Computer Vision, Sydney, Australia, 2013: 3416–3423. doi: 10.1109/ICCV.2013.424
[16] TOLIAS G, SICRE R, and JÉGOU H. Particular object retrieval with integral max-pooling of CNN activations[C]. The 4th International Conference on Learning Representations, San Juan, Puerto Rico, 2016: 1–12.
[17] LI Yang, XU Yulong, WANG Jiabao, et al. MS-RMAC: Multiscale regional maximum activation of convolutions for image retrieval[J]. IEEE Signal Processing Letters, 2017, 24(5): 609–613. doi: 10.1109/LSP.2017.2665522
[18] DATAR M, IMMORLICA N, INDYK P, et al. Locality-sensitive hashing scheme based on p-stable distributions[C]. The 20th Annual Symposium on Computational Geometry, Brooklyn, USA, 2004: 253–262. doi: 10.1145/997817.997857
[19] GONG Yunchao and LAZEBNIK S. Iterative quantization: A procrustean approach to learning binary codes[C]. The 24th IEEE Conference on Computer Vision and Pattern Recognition, Providence, USA, 2011: 817–824. doi: 10.1109/CVPR.2011.5995432
[20] LIU Wei, WANG Jun, JI Rongrong, et al. Supervised hashing with kernels[C]. 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, USA, 2012: 2074–2081. doi: 10.1109/CVPR.2012.6247912