Hessian Sparse Representation under n-words Model for Image Retrieval
Funds: The National Natural Science Foundation of China (61201323)
Keywords:
- Image retrieval
- Sparse coding
- Bag-Of-Visual-Words (BOVW) model
- n-words model
- Hessian energy function
Abstract: To address the Bag-Of-Visual-Words (BOVW) model's loss of image spatial structure, an image retrieval method based on Hessian sparse coding is introduced. First, an n-words model is built to obtain the local feature representation; it describes an image at a higher level through sequences of consecutive visual words. Experiments are run from n=1 to n=5 to find the most suitable n. Second, the Hessian sparse coding formulation is obtained by incorporating the second-order Hessian energy function into the objective function of standard sparse coding. Finally, with the n-words sequences as the encoding features, the optimal Hessian coefficients are computed by the feature-sign search algorithm, similarities are computed, and the retrieval results are returned. Experiments on two datasets show that the proposed method achieves higher retrieval accuracy than the BOVW model and existing methods.
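As an illustration of the first step, the following is a minimal sketch of how n-words could be formed from consecutive visual words. It assumes the local features of an image have already been quantized into visual-word indices and arranged in a one-dimensional sequence; the ordering rule, the function name `n_words`, and the sample data are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Sequence


def n_words(visual_words: Sequence[int], n: int) -> Counter:
    """Count every run of n consecutive visual words in the sequence.

    Hypothetical helper: the paper's own construction may order or
    group the visual words differently.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # Slide a window of length n over the sequence; each window is one n-word.
    windows = (
        tuple(visual_words[i:i + n])
        for i in range(len(visual_words) - n + 1)
    )
    return Counter(windows)


# Hypothetical image described by five quantized local features.
words = [3, 1, 3, 1, 2]
for n in range(1, 6):  # the abstract reports experiments from n=1 to n=5
    print(n, dict(n_words(words, n)))
```

With n=1 the counts reduce to an ordinary BOVW histogram; larger n keeps some of the ordering between neighbouring visual words, at the cost of a sparser and larger vocabulary.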