

Single-view 3D Reconstruction Algorithm Based on View-aware

Nian WANG, Xuyang HU, Fan ZHU, Jun TANG

Citation: Nian WANG, Xuyang HU, Fan ZHU, Jun TANG. Single-view 3D Reconstruction Algorithm Based on View-aware[J]. Journal of Electronics & Information Technology, 2020, 42(12): 3053-3060. doi: 10.11999/JEIT190986


doi: 10.11999/JEIT190986
Funds: The National Natural Science Foundation of China (61772032)
More Information
    Author Bios:

    Nian WANG: Male, born in 1966, Professor, Ph.D. His research interests include pattern recognition and image processing.

    Xuyang HU: Male, born in 1995, M.S. candidate. His research interests include image generation and 3D reconstruction.

    Fan ZHU: Male, born in 1987, Ph.D. His research interest is computer vision.

    Jun TANG: Male, born in 1977, Professor, Ph.D. His research interests include pattern recognition and computer vision.

    Corresponding author:

    Jun TANG, tangjunahu@163.com

  • CLC number: TN911.73; TP301.6

  • Abstract: Although projecting a 3D shape onto a 2D view appears irreversible because a dimension is discarded, interest in 3D reconstruction technology is growing rapidly across vertical industries, from visualization to computer-aided geometric design. Traditional 3D reconstruction algorithms based on depth maps or RGB images can achieve satisfactory results in some respects, but they still face several problems: (1) they learn the mapping between 2D views and 3D shapes crudely; (2) they cannot handle the appearance differences an object exhibits under different viewpoints; (3) they require images of the object from multiple viewpoints. This paper proposes an end-to-end View-Aware 3D (VA3D) reconstruction network that addresses these problems. Specifically, VA3D consists of a multi-neighboring-view synthesis subnetwork and a 3D reconstruction subnetwork. The multi-neighboring-view synthesis subnetwork generates images from several viewpoints adjacent to the object's source view, and introduces an adaptive fusion module to resolve the blur and distortion that arise during viewpoint translation. The 3D reconstruction subnetwork uses a recurrent neural network to recover the object's 3D shape from the synthesized multi-view sequence. Extensive qualitative and quantitative experiments on the ShapeNet dataset show that VA3D effectively improves single-view 3D reconstruction results.
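As a rough illustration of the two-stage pipeline the abstract describes, the sketch below uses numpy stubs for both subnetworks. The view count, image size, and voxel resolution are assumptions for illustration only; the "networks" merely demonstrate the data flow, not the paper's architecture.

```python
import numpy as np

def synthesize_neighboring_views(src, k=4):
    # Stand-in for the multi-neighboring-view synthesis subnetwork (MSN):
    # a trained model would render the object from k nearby viewpoints;
    # here the source view is simply replicated to show the tensor shapes.
    return np.stack([src] * k, axis=0)

def reconstruct_from_sequence(views, grid=32):
    # Stand-in for the RNN-based 3D reconstruction subnetwork, which would
    # fuse the view sequence into a voxel occupancy grid.
    return np.zeros((grid, grid, grid), dtype=np.float32)

src_view = np.zeros((127, 127, 3), dtype=np.float32)   # single RGB input view
sequence = np.concatenate([src_view[None], synthesize_neighboring_views(src_view)])
voxels = reconstruct_from_sequence(sequence)
print(sequence.shape, voxels.shape)   # (5, 127, 127, 3) (32, 32, 32)
```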
  • Figure 1. View-aware 3D reconstruction

    Figure 2. Generator architecture of the MSN

    Figure 3. Adaptive fusion

    Figure 4. Example samples for qualitative comparison

    Figure 5. Comparison of multi-view images generated by SA3D and VA3D

    Figure 6. IoU and F-score for different numbers of synthesized views

    Table 1. Quantitative comparison results

    Category    IoU                                F-score
                3D-R2N2_1   3D-R2N2_5   VA3D       3D-R2N2_1   3D-R2N2_5   VA3D
    Cabinet     0.7299      0.7839      0.7915     0.8267      0.8651      0.8694
    Car         0.8123      0.8551      0.8530     0.8923      0.9190      0.9178
    Chair       0.4958      0.5802      0.5643     0.6404      0.7155      0.6995
    Airplane    0.5560      0.6228      0.6385     0.7006      0.7561      0.7641
    Table       0.5297      0.6061      0.6128     0.6717      0.7362      0.7386
    Bench       0.4621      0.5566      0.5533     0.6115      0.6991      0.6936
    Mean        0.5976      0.6674      0.6689     0.7238      0.7818      0.7805
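Table 1's two metrics can be computed on binary voxel grids as follows. This is a generic sketch: the occupancy threshold of 0.5 and the toy grids are assumptions, not necessarily the paper's evaluation protocol.

```python
import numpy as np

def voxel_iou(pred, gt, thresh=0.5):
    # Intersection-over-Union between predicted and ground-truth occupancy.
    p, g = pred > thresh, gt > 0.5
    inter = np.logical_and(p, g).sum()
    union = np.logical_or(p, g).sum()
    return inter / union if union else 1.0

def voxel_fscore(pred, gt, thresh=0.5):
    # F-score: harmonic mean of voxel-wise precision and recall.
    p, g = pred > thresh, gt > 0.5
    tp = np.logical_and(p, g).sum()
    if tp == 0:
        return 0.0
    prec, rec = tp / p.sum(), tp / g.sum()
    return 2 * prec * rec / (prec + rec)

gt = np.zeros((32, 32, 32)); gt[8:24, 8:24, 8:24] = 1    # 16^3 occupied cube
pred = np.zeros_like(gt);    pred[10:24, 8:24, 8:24] = 1 # slightly clipped guess
print(round(float(voxel_iou(pred, gt)), 4), round(float(voxel_fscore(pred, gt)), 4))
```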

    Table 2. Comparison with the SA3D algorithm

    Algorithm   Mean IoU
    SA3D        0.6162
    VA3D        0.6741

    Table 3. Effect of different output strategies in the MSN

    Model                              SSIM      PSNR      IoU       F-score
    Only $\{\tilde{I}_r\}^{\rm C}$     0.8035    19.8042   0.6525    0.7649
    Only $\{\tilde{I}_f\}^{\rm C}$     0.8435    20.5273   0.6530    0.7646
    Adaptive fusion                    0.8488    20.6203   0.6554    0.7672
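Table 3 compares the two MSN outputs against their adaptive fusion. A minimal sketch of such a fusion, assuming a per-pixel soft mask in [0, 1] (in VA3D the mask would be predicted by the network; here it is supplied directly), together with the PSNR metric reported above:

```python
import numpy as np

def adaptive_fuse(i_r, i_f, mask):
    # Per-pixel convex combination of the two candidate images;
    # mask values near 1 favor i_r, values near 0 favor i_f.
    return mask * i_r + (1.0 - mask) * i_f

def psnr(a, b, peak=1.0):
    # Peak signal-to-noise ratio (dB) between two images scaled to [0, 1].
    mse = np.mean((a - b) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

i_r = np.full((8, 8, 3), 0.2)      # toy "raw" output
i_f = np.full((8, 8, 3), 0.6)      # toy "refined" output
mask = np.full((8, 8, 1), 0.5)     # uniform 50/50 blend for illustration
fused = adaptive_fuse(i_r, i_f, mask)
print(round(float(fused.mean()), 2))   # 0.4
```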

    Table 4. Variance of reconstruction results

    Model                       $\sigma_{\rm IoU}^2$    $\sigma_{F\text{-}{\rm score}}^2$
    Synthesized views = 0       0.0057                  0.0061
    Synthesized views = 4       0.0051                  0.0054
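The $\sigma^2$ values in Table 4 measure how stable reconstruction quality is across input viewpoints: lower variance means less sensitivity to the chosen view. With hypothetical per-viewpoint IoU scores (the numbers below are invented for illustration), the statistic is simply:

```python
import numpy as np

# Hypothetical IoU scores of one model over five input viewpoints.
ious = np.array([0.62, 0.68, 0.71, 0.65, 0.69])
var_iou = float(np.var(ious))   # population variance, matching sigma^2
print(round(var_iou, 6))        # 0.001
```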

    Table 5. Combinations of different loss functions

    Model                                               SSIM      PSNR      IoU       F-score
    Without reconstruction loss ${\cal L}_{\rm rec}$    0.8462    20.2693   0.6540    0.7658
    Without adversarial loss ${\cal L}_{\rm adv}$       0.8516    21.4385   0.6539    0.7651
    Without perceptual loss ${\cal L}_{\rm per}$        0.8416    20.3141   0.6525    0.7645
    All losses                                          0.8488    20.6203   0.6554    0.7672
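Table 5 ablates three loss terms. A toy sketch of how such a combined objective is typically assembled; the weights and the exact adversarial formulation below are generic assumptions, not the paper's values.

```python
import numpy as np

def l_rec(fake, real):
    # Reconstruction loss: pixel-wise L1 distance.
    return float(np.abs(fake - real).mean())

def l_adv(d_fake):
    # Generator-side adversarial loss given discriminator scores in (0, 1).
    return float(-np.log(d_fake + 1e-8).mean())

def l_per(feat_fake, feat_real):
    # Perceptual loss: L2 distance in a feature space (e.g. VGG activations).
    return float(((feat_fake - feat_real) ** 2).mean())

def total_loss(fake, real, d_fake, ff, fr, lam_adv=0.01, lam_per=0.1):
    # Weighted sum of the three terms ablated in Table 5.
    return l_rec(fake, real) + lam_adv * l_adv(d_fake) + lam_per * l_per(ff, fr)

fake, real = np.zeros(4), np.ones(4)
loss = total_loss(fake, real, np.array([0.9]), np.zeros(3), np.zeros(3))
print(round(loss, 3))   # 1.001
```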
  • EIGEN D, PUHRSCH C, and FERGUS R. Depth map prediction from a single image using a multi-scale deep network[C]. The 27th International Conference on Neural Information Processing Systems, Montreal, Canada, 2014: 2366–2374.
    WU Jiajun, WANG Yifan, XUE Tianfan, et al. MarrNet: 3D shape reconstruction via 2.5D sketches[C]. The 31st Conference on Neural Information Processing Systems, Long Beach, USA, 2017: 540–550.
    WANG Nanyang, ZHANG Yinda, LI Zhuwen, et al. Pixel2Mesh: Generating 3D mesh models from single RGB images[C]. The 15th European Conference on Computer Vision, Munich, Germany, 2018: 55–71. doi: 10.1007/978-3-030-01252-6_4.
    TANG Jiapeng, HAN Xiaoguang, PAN Junyi, et al. A skeleton-bridged deep learning approach for generating meshes of complex topologies from single RGB images[C]. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, USA, 2019: 4536–4545. doi: 10.1109/cvpr.2019.00467.
    CHOY C B, XU Danfei, GWAK J Y, et al. 3D-R2N2: A unified approach for single and multi-view 3D object reconstruction[C]. The 14th European Conference on Computer Vision, Amsterdam, the Netherlands, 2016: 628–644. doi: 10.1007/978-3-319-46484-8_38.
    HU Xuyang, ZHU Fan, LIU Li, et al. Structure-aware 3D shape synthesis from single-view images[C]. 2018 British Machine Vision Conference, Newcastle, UK, 2018.
    GOODFELLOW I J, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial nets[C]. The 27th International Conference on Neural Information Processing Systems, Montreal, Canada, 2014: 2672–2680.
    ZHANG Jinglei and HOU Yawei. Image-to-image translation based on improved cycle-consistent generative adversarial network[J]. Journal of Electronics & Information Technology, 2020, 42(5): 1216–1222. doi: 10.11999/JEIT190407
    CHEN Ying and CHEN Huangkang. Speaker recognition based on multimodal generative adversarial nets with triplet-loss[J]. Journal of Electronics & Information Technology, 2020, 42(2): 379–385. doi: 10.11999/JEIT190154
    WANG Tingchun, LIU Mingyu, ZHU Junyan, et al. High-resolution image synthesis and semantic manipulation with conditional gans[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 8798–8807. doi: 10.1109/cvpr.2018.00917.
    ULYANOV D, VEDALDI A, and LEMPITSKY V. Instance normalization: The missing ingredient for fast stylization[EB/OL]. https://arxiv.org/abs/1607.08022, 2016.
    XU Bing, WANG Naiyan, CHEN Tianqi, et al. Empirical evaluation of rectified activations in convolutional network[EB/OL]. https://arxiv.org/abs/1505.00853, 2015.
    GOKASLAN A, RAMANUJAN V, RITCHIE D, et al. Improving shape deformation in unsupervised image-to-image translation[C]. The 15th European Conference on Computer Vision, Munich, Germany, 2018: 662–678. doi: 10.1007/978-3-030-01258-8_40.
    MAO Xudong, LI Qing, XIE Haoran, et al. Least squares generative adversarial networks[C]. 2017 IEEE International Conference on Computer Vision, Venice, Italy, 2017: 2813–2821. doi: 10.1109/iccv.2017.304.
    GULRAJANI I, AHMED F, ARJOVSKY M, et al. Improved training of wasserstein GANs[C]. The 31st International Conference on Neural Information Processing Systems, Long Beach, USA, 2017: 5767–5777.
    LEDIG C, THEIS L, HUSZÁR F, et al. Photo-realistic single image super-resolution using a generative adversarial network[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 105–114. doi: 10.1109/CVPR.2017.19.
    SIMONYAN K and ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[EB/OL]. https://arxiv.org/abs/1409.1556, 2014.
    KINGMA D P and BA J. Adam: A method for stochastic optimization[EB/OL]. https://arxiv.org/abs/1412.6980, 2014.
    CHANG A X, FUNKHOUSER T, GUIBAS L, et al. ShapeNet: An information-rich 3D model repository[EB/OL]. https://arxiv.org/abs/1512.03012, 2015.
    GRABNER A, ROTH P M, and LEPETIT V. 3D pose estimation and 3D model retrieval for objects in the wild[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 3022–3031. doi: 10.1109/cvpr.2018.00319.
    HE Xinwei, ZHOU Yang, ZHOU Zhichao, et al. Triplet-center loss for multi-view 3D object retrieval[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 1945–1954. doi: 10.1109/cvpr.2018.00208.
Figures (6) / Tables (5)
Metrics
  • Article views: 2627
  • Full-text HTML views: 1423
  • PDF downloads: 137
  • Citations: 0
Publication History
  • Received: 2019-12-09
  • Revised: 2020-05-26
  • Available online: 2020-06-22
  • Issue published: 2020-12-08
