
Depth Estimation of Monocular Road Images Based on Pyramid Scene Analysis Network

Wujie ZHOU, Ting PAN, Pengli GU, Zhinian ZHAI

Citation: Wujie ZHOU, Ting PAN, Pengli GU, Zhinian ZHAI. Depth Estimation of Monocular Road Images Based on Pyramid Scene Analysis Network[J]. Journal of Electronics & Information Technology, 2019, 41(10): 2509-2515. doi: 10.11999/JEIT180957


doi: 10.11999/JEIT180957
Funds: The National Natural Science Foundation of China (61502429), The Zhejiang Provincial Natural Science Foundation (LY18F020012)
Article information
    Author biographies:

    Wujie ZHOU: male, born in 1983, Ph.D., associate professor; research interests include computer vision, pattern recognition, and deep learning

    Ting PAN: female, born in 1994, M.S.; research interests include computer vision and pattern recognition

    Pengli GU: male, born in 1989, M.S.; research interests include computer vision and pattern recognition

    Zhinian ZHAI: male, born in 1977, Ph.D., lecturer; research interests include deep learning

    Corresponding author:

    Wujie ZHOU, wujiezhou@163.com

  • CLC number: TP391.4

  • Abstract: To address the limited accuracy of depth estimation from monocular images, a road-scene depth estimation method based on a pyramid pooling network is proposed. The method extracts road-scene image features with a cascade of four residual network blocks and then gradually restores the feature maps to the original image size by upsampling; stacking multiple residual blocks increases the depth of the network model. To exploit the diversity of information at different scales during upsampling, the feature maps of each size produced during feature extraction are fused with the feature maps of the same size in the upsampling path, which improves the accuracy of the depth estimates. In addition, the high-level features extracted by the four residual blocks are parsed by a pyramid pooling block; the feature maps output by the pyramid pooling block are finally restored to the original image size and fed into the prediction layer together with the output of the upsampling module. Experiments on the KITTI dataset show that the proposed method outperforms existing depth estimation methods.
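    To make the pyramid pooling step concrete, the following is a minimal PyTorch sketch of a PSPNet-style pyramid pooling block of the kind introduced by Zhao et al. (reference [28] below). The bin sizes, channel counts, class name, and the use of bilinear resizing are illustrative assumptions, not the authors' exact configuration; in the proposed network, the block's output is additionally restored to the original image size and concatenated with the upsampling path before the prediction layer, as described in the abstract.

    ```python
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class PyramidPoolingBlock(nn.Module):
        """PSPNet-style pyramid pooling block (illustrative sketch only).

        Bin sizes (1, 2, 3, 6) follow the original PSPNet paper; the channel
        counts are assumptions, not the settings used in this article.
        """
        def __init__(self, in_channels=2048, bins=(1, 2, 3, 6)):
            super().__init__()
            branch_channels = in_channels // len(bins)
            self.branches = nn.ModuleList([
                nn.Sequential(
                    nn.AdaptiveAvgPool2d(bin_size),              # pool to bin_size x bin_size
                    nn.Conv2d(in_channels, branch_channels, 1),  # reduce channels
                    nn.BatchNorm2d(branch_channels),
                    nn.ReLU(inplace=True),
                )
                for bin_size in bins
            ])

        def forward(self, x):
            h, w = x.shape[2:]
            # Resize every pooled branch back to the input resolution and
            # concatenate it with the original feature map.
            pooled = [
                F.interpolate(branch(x), size=(h, w),
                              mode='bilinear', align_corners=False)
                for branch in self.branches
            ]
            return torch.cat([x] + pooled, dim=1)

    # Usage: high-level features from the last residual block (shapes are hypothetical)
    feats = torch.randn(1, 2048, 10, 30)
    out = PyramidPoolingBlock(2048)(feats)
    print(out.shape)  # (1, 4096, 10, 30): original channels + 4 x 512 pooled channels
    ```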
  • Fig. 1  Framework of the proposed neural network

    Fig. 2  Structure of the two types of residual network blocks

    Fig. 3  Upsampling scale-recovery module

    Fig. 4  Pyramid pooling module

    Table 1  Errors and correlations between predicted and ground-truth depth maps

    Method              RMSE     Lg      Lg_rms   a1      a2      a3
    Fine_coarse[17]     2.6440   0.272   0.167    0.488   0.948   0.972
    ResNet50[18]        2.4618   0.243   0.126    0.674   0.943   0.972
    ResNet_fcn50[19]    2.5284   0.247   0.134    0.636   0.950   0.979
    D_U[20]             2.8246   0.305   0.127    0.634   0.916   0.945
    UVD_fcn[21]         2.6507   0.264   0.145    0.566   0.945   0.970
    Proposed method     2.3504   0.230   0.120    0.684   0.949   0.975
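    For context on Table 1's columns, the sketch below computes the evaluation metrics that are standard in monocular depth estimation, under the assumption that RMSE, Lg, Lg_rms, and a1–a3 denote the root mean squared error, the mean absolute log10 error, the RMS log10 error, and the threshold accuracies δ < 1.25, 1.25², 1.25³. The table headers themselves do not spell this out, and some papers use the natural logarithm instead of log10, so treat this as one plausible reading.

    ```python
    import numpy as np

    def depth_metrics(pred, gt):
        """Common monocular depth metrics (assumed to match Table 1's columns)."""
        pred = np.asarray(pred, dtype=np.float64)
        gt = np.asarray(gt, dtype=np.float64)
        valid = gt > 0                      # ignore pixels without ground-truth depth
        pred, gt = pred[valid], gt[valid]

        rmse = np.sqrt(np.mean((pred - gt) ** 2))       # root mean squared error
        log_err = np.abs(np.log10(pred) - np.log10(gt))
        lg = np.mean(log_err)                           # mean absolute log10 error
        lg_rms = np.sqrt(np.mean(log_err ** 2))         # RMS log10 error

        ratio = np.maximum(pred / gt, gt / pred)        # per-pixel relative ratio
        a1 = np.mean(ratio < 1.25)                      # threshold accuracies
        a2 = np.mean(ratio < 1.25 ** 2)
        a3 = np.mean(ratio < 1.25 ** 3)
        return dict(RMSE=rmse, Lg=lg, Lg_rms=lg_rms, a1=a1, a2=a2, a3=a3)

    # Example with synthetic depths in metres (KITTI-like range, hypothetical values)
    rng = np.random.default_rng(0)
    gt = rng.uniform(1.0, 80.0, size=(128, 416))
    pred = gt * rng.uniform(0.8, 1.2, size=gt.shape)
    print(depth_metrics(pred, gt))
    ```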

    Table 2  Results of different scale-recovery methods

    Method                                 RMSE     Lg      Lg_rms   a1      a2      a3
    Scale recovery with deconvolution layers    2.3716   0.237   0.125    0.673   0.946   0.973
    Scale recovery with convolution blocks      2.4724   0.240   0.129    0.646   0.948   0.974
    Scale recovery with upsampling layers       2.3504   0.230   0.120    0.684   0.949   0.975
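    As a rough illustration of the three scale-recovery alternatives compared in Table 2, the snippet below sketches one plausible PyTorch form for each 2x step. The kernel sizes, channel handling, and in particular the interpretation of the "convolution block" variant are assumptions made for illustration only; the paper's exact layer configurations are not reproduced here.

    ```python
    import torch
    import torch.nn as nn

    def upscale_block(in_ch, out_ch, mode="upsample"):
        """One 2x scale-recovery step in three plausible forms (illustrative settings)."""
        if mode == "deconv":
            # Learned transposed convolution (the "deconvolution layer" row).
            return nn.Sequential(
                nn.ConvTranspose2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1),
                nn.ReLU(inplace=True),
            )
        if mode == "convblock":
            # Nearest-neighbour resize followed by a learned convolution block
            # (one common reading of the "convolution block" row).
            return nn.Sequential(
                nn.Upsample(scale_factor=2, mode="nearest"),
                nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
            )
        # Parameter-free bilinear upsampling layer, with a 1x1 convolution
        # only to match the channel count.
        return nn.Sequential(
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(in_ch, out_ch, kernel_size=1),
        )

    x = torch.randn(1, 256, 16, 48)
    for m in ("deconv", "convblock", "upsample"):
        print(m, upscale_block(256, 128, m)(x).shape)  # each yields (1, 128, 32, 96)
    ```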
References

    [1] LUO Yue, REN J, LIN Mude, et al. Single view stereo matching[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 155–163.
    [2] SILBERMAN N, HOIEM D, KOHLI P, et al. Indoor segmentation and support inference from RGBD images[C]. The 12th European Conference on Computer Vision, Florence, Italy, 2012: 746–760.
    [3] REN Xiaofeng, BO Liefeng, and FOX D. RGB-(D) scene labeling: Features and algorithms[C]. 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, USA, 2012: 2759–2766.
    [4] SHOTTON J, SHARP T, KIPMAN A, et al. Real-time human pose recognition in parts from single depth images[J]. Communications of the ACM, 2013, 56(1): 116–124. doi: 10.1145/2398356
    [5] ALP GÜLER R, NEVEROVA N, and KOKKINOS I. DensePose: Dense human pose estimation in the wild[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 7297–7306.
    [6] LUO Wenjie, SCHWING A G, and URTASUN R. Efficient deep learning for stereo matching[C]. 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 5695–5703.
    [7] FLINT A, MURRAY D, and REID I. Manhattan scene understanding using monocular, stereo, and 3D features[C]. 2011 International Conference on Computer Vision, Barcelona, Spain, 2011: 2228–2235.
    [8] KUNDU A, LI Yin, DELLAERT F, et al. Joint semantic segmentation and 3D reconstruction from monocular video[C]. The 13th European Conference on Computer Vision, Zurich, Switzerland, 2014: 703–718.
    [9] YAMAGUCHI K, MCALLESTER D, and URTASUN R. Efficient joint segmentation, occlusion labeling, stereo and flow estimation[C]. The 13th European Conference on Computer Vision, Zurich, Switzerland, 2014: 756–771.
    [10] BAIG M H and TORRESANI L. Coupled depth learning[C]. 2016 IEEE Winter Conference on Applications of Computer Vision, Lake Placid, USA, 2016: 1–10.
    [11] EIGEN D and FERGUS R. Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture[C]. 2015 IEEE International Conference on Computer Vision, Santiago, Chile, 2015: 2650–2658.
    [12] SCHARSTEIN D and SZELISKI R. A taxonomy and evaluation of dense two-frame stereo correspondence algorithms[J]. International Journal of Computer Vision, 2002, 47(1/3): 7–42. doi: 10.1023/A:1014573219977
    [13] UPTON K. A modern approach[J]. Manufacturing Engineer, 1995, 74(3): 111–113. doi: 10.1049/me:19950308
    [14] FLYNN J, NEULANDER I, PHILBIN J, et al. Deep stereo: Learning to predict new views from the world's imagery[C]. 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 5515–5524.
    [15] SAXENA A, CHUNG S H, and NG A Y. 3-D depth reconstruction from a single still image[J]. International Journal of Computer Vision, 2008, 76(1): 53–69.
    [16] KARSCH K, LIU Ce, and KANG S B. Depth transfer: Depth extraction from video using non-parametric sampling[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 36(11): 2144–2158. doi: 10.1109/TPAMI.2014.2316835
    [17] EIGEN D, PUHRSCH C, and FERGUS R. Depth map prediction from a single image using a multi-scale deep network[C]. The 27th International Conference on Neural Information Processing Systems, Montréal, Canada, 2014: 2366–2374.
    [18] LAINA I, RUPPRECHT C, BELAGIANNIS V, et al. Deeper depth prediction with fully convolutional residual networks[C]. The 4th International Conference on 3D Vision, Stanford, USA, 2016: 239–248.
    [19] FU Huan, GONG Mingming, WANG Chaohui, et al. Deep ordinal regression network for monocular depth estimation[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 2002–2011.
    [20] DIMITRIEVSKI M, GOOSSENS B, VEELAERT P, et al. High resolution depth reconstruction from monocular images and sparse point clouds using deep convolutional neural network[J]. SPIE, 2017, 10410: 104100H.
    [21] MANCINI M, COSTANTE G, VALIGI P, et al. Toward domain independence for learning-based monocular depth estimation[J]. IEEE Robotics and Automation Letters, 2017, 2(3): 1778–1785. doi: 10.1109/LRA.2017.2657002
    [22] GARG R, VIJAY KUMAR B G, CARNEIRO G, et al. Unsupervised CNN for single view depth estimation: Geometry to the rescue[C]. The 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 740–756.
    [23] KUZNIETSOV Y, STUCKLER J, and LEIBE B. Semi-supervised deep learning for monocular depth map prediction[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 6647–6655.
    [24] GODARD C, MAC AODHA O, and BROSTOW G J. Unsupervised monocular depth estimation with left-right consistency[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 6602–6611.
    [25] ZORAN D, ISOLA P, KRISHNAN D, et al. Learning ordinal relationships for mid-level vision[C]. 2015 IEEE International Conference on Computer Vision, Santiago, Chile, 2015: 388–396.
    [26] CHEN Weifeng, FU Zhao, YANG Dawei, et al. Single-image depth perception in the wild[C]. The 30th Conference on Neural Information Processing Systems, Barcelona, Spain, 2016: 730–738.
    [27] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Deep residual learning for image recognition[C]. 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 770–778.
    [28] ZHAO Hengshuang, SHI Jianping, QI Xiaojuan, et al. Pyramid scene parsing network[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 6230–6239.
    [29] ZHOU Bolei, KHOSLA A, LAPEDRIZA A, et al. Object detectors emerge in deep scene CNNs[J]. arXiv preprint arXiv:1412.6856, 2014.
    [30] SZEGEDY C, LIU Wei, JIA Yangqing, et al. Going deeper with convolutions[C]. 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, USA, 2015: 1–9.
    [31] UHRIG J, SCHNEIDER N, SCHNEIDER L, et al. Sparsity invariant CNNs[C]. 2017 International Conference on 3D Vision, Qingdao, China, 2017: 11–20.
    [32] KINGMA D P and BA J. Adam: A method for stochastic optimization[J]. arXiv preprint arXiv:1412.6980, 2014.
Publication history
  • Received: 2018-10-12
  • Revised: 2019-05-21
  • Available online: 2019-05-28
  • Published: 2019-10-01
