Image-to-image Translation Based on Improved Cycle-consistent Generative Adversarial Network
doi: 10.11999/JEIT190407
1. School of Electrical and Electronic Engineering, Tianjin University of Technology, Tianjin 300384, China
2. Tianjin Key Laboratory of Complex System Control Theory and Application, Tianjin 300384, China
Abstract:
Image-to-image translation is a class of methods for converting images between different domains. With the rapid development of Generative Adversarial Networks (GANs) in deep learning, their application to image-to-image translation has attracted increasing attention. Classical algorithms, however, suffer from two drawbacks: paired training data are hard to obtain, and the generated images are of poor quality. This paper proposes an improved cycle-consistent generative adversarial network (CycleGAN++). The new algorithm removes the cyclic network and, in the image generation stage, concatenates the prior information of the target and source domains with the corresponding images along the depth (channel) dimension. The loss function is also optimized, replacing the cycle-consistency loss with a classification loss, so that image-to-image translation no longer depends on a mapping between training data. Evaluations on the CelebA and Cityscapes datasets show that, under two classical criteria, Amazon Mechanical Turk perceptual studies (AMT perceptual studies) and the Fully Convolutional Network score (FCN score), the proposed algorithm achieves higher accuracy than classical algorithms such as CycleGAN, IcGAN, CoGAN, and DIAT.
Keywords:
- Image-to-image translation /
- Deep learning /
- Generative adversarial network /
- Loss function
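The two changes described in the abstract — conditioning the generator by depth-wise (channel) concatenation of domain prior information with the input image, and replacing the cycle-consistency loss with a classification loss — can be illustrated in a few lines. The PyTorch sketch below is a minimal illustration under stated assumptions, not the paper's implementation: it assumes the prior information is a one-hot domain code broadcast to constant spatial maps, and that a classifier head on the discriminator produces per-domain logits for generated images.

```python
import torch
import torch.nn as nn

def concat_domain_prior(image: torch.Tensor, domain_label: torch.Tensor) -> torch.Tensor:
    """Depth-wise (channel) concatenation of a domain prior with an image.

    image:        (N, 3, H, W) source-domain image
    domain_label: (N, C) one-hot code of the target domain (assumed encoding)
    returns:      (N, 3 + C, H, W) tensor fed to the generator
    """
    n, c = domain_label.shape
    h, w = image.shape[2], image.shape[3]
    # Broadcast the one-hot code into C constant spatial feature maps.
    maps = domain_label.view(n, c, 1, 1).expand(n, c, h, w)
    return torch.cat([image, maps], dim=1)

# Classification loss in place of cycle consistency: a domain classifier
# predicts which domain a generated image belongs to, and the generator is
# penalized when the prediction misses the intended target domain.
cls_criterion = nn.CrossEntropyLoss()

def generator_cls_loss(domain_logits: torch.Tensor, target_domain: torch.Tensor) -> torch.Tensor:
    """domain_logits: (N, C) logits from the (hypothetical) classifier head.
    target_domain:    (N,) integer index of the intended target domain."""
    return cls_criterion(domain_logits, target_domain)
```

Conditioning a single generator on a domain code in this way is what lets the architecture drop the second, "loop" generator: one network can map in either direction given the appropriate prior.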
Table 1  Comparison of AMT test results: CycleGAN+ vs. the original algorithm (%)
Method     male→female  female→male  photo→label  label→photo
CycleGAN   24.6±2.3     21.1±1.8     26.8±2.8     23.2±3.4
CycleGAN+  29.5±3.2     29.2±4.1     27.8±2.2     28.2±2.4
Table 2  Comparison of FCN scores: CycleGAN+ vs. the original algorithm
Method     Per-pixel accuracy  Per-class accuracy  Class IoU
CycleGAN   0.52                0.17                0.11
CycleGAN+  0.60                0.21                0.16
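For context, the three FCN-score columns follow the standard semantic-segmentation definitions: per-pixel accuracy is the fraction of correctly labeled pixels, per-class accuracy averages that fraction over classes, and class IoU averages intersection-over-union over classes. The NumPy sketch below is a generic implementation of these metrics from flattened integer label maps; the function name and arguments are illustrative, not the paper's evaluation script.

```python
import numpy as np

def fcn_score_metrics(pred: np.ndarray, gt: np.ndarray, num_classes: int):
    """Per-pixel accuracy, per-class accuracy, and class IoU from
    flattened predicted (pred) and ground-truth (gt) label arrays."""
    mask = (gt >= 0) & (gt < num_classes)
    # Confusion matrix: rows = ground truth, columns = prediction.
    cm = np.bincount(num_classes * gt[mask] + pred[mask],
                     minlength=num_classes ** 2).reshape(num_classes, num_classes)
    with np.errstate(invalid="ignore", divide="ignore"):
        per_pixel_acc = np.diag(cm).sum() / cm.sum()
        per_class_acc = np.nanmean(np.diag(cm) / cm.sum(axis=1))
        iou = np.diag(cm) / (cm.sum(axis=1) + cm.sum(axis=0) - np.diag(cm))
        class_iou = np.nanmean(iou)
    return per_pixel_acc, per_class_acc, class_iou
```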
Table 3  Comparison of AMT perceptual study results: CycleGAN++ vs. CycleGAN+ (%)
Method             male→female  female→male  photo→label  label→photo
CycleGAN+          29.5±3.2     29.2±4.1     27.8±2.2     28.2±2.4
CycleGAN++ (ours)  31.4±3.8     32.6±4.7     30.1±2.6     30.9±2.7
Table 4  Comparison of FCN scores: CycleGAN++ vs. CycleGAN+
Method             Per-pixel accuracy  Per-class accuracy  Class IoU
CycleGAN+          0.60                0.21                0.16
CycleGAN++ (ours)  0.69                0.27                0.23
References
[1] HERTZMANN A, JACOBS C E, OLIVER N, et al. Image analogies[C]. The 28th Annual Conference on Computer Graphics and Interactive Techniques, New York, USA, 2001: 327–340. doi: 10.1145/383259.383295.
[2] GOODFELLOW I J, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial nets[C]. The 27th International Conference on Neural Information Processing Systems, Montreal, Canada, 2014: 2672–2680.
[3] RADFORD A, METZ L, and CHINTALA S. Unsupervised representation learning with deep convolutional generative adversarial networks[EB/OL]. https://arxiv.org/abs/1511.06434, 2015.
[4] ARJOVSKY M, CHINTALA S, and BOTTOU L. Wasserstein GAN[EB/OL]. https://arxiv.org/abs/1701.07875, 2017.
[5] GULRAJANI I, AHMED F, ARJOVSKY M, et al. Improved training of Wasserstein GANs[C]. The 31st International Conference on Neural Information Processing Systems, Red Hook, USA, 2017: 5769–5779.
[6] ISOLA P, ZHU Junyan, ZHOU Tinghui, et al. Image-to-image translation with conditional adversarial networks[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 5967–5976. doi: 10.1109/CVPR.2017.632.
[7] MIRZA M and OSINDERO S. Conditional generative adversarial nets[EB/OL]. https://arxiv.org/abs/1411.1784, 2014.
[8] ROSALES R, ACHAN K, and FREY B. Unsupervised image translation[C]. The 9th IEEE International Conference on Computer Vision, Nice, France, 2003: 472–478. doi: 10.1109/ICCV.2003.1238384.
[9] LIU Mingyu, BREUEL T, KAUTZ J, et al. Unsupervised image-to-image translation networks[C]. The 31st Conference on Neural Information Processing Systems, Long Beach, USA, 2017: 700–708.
[10] LIU Mingyu and TUZEL O. Coupled generative adversarial networks[C]. The 30th Conference on Neural Information Processing Systems, Barcelona, Spain, 2016: 469–477.
[11] KINGMA D P and WELLING M. Auto-encoding variational Bayes[EB/OL]. https://arxiv.org/abs/1312.6114, 2013.
[12] ZHU Junyan, PARK T, ISOLA P, et al. Unpaired image-to-image translation using cycle-consistent adversarial networks[C]. 2017 IEEE International Conference on Computer Vision, Venice, Italy, 2017: 2242–2251. doi: 10.1109/ICCV.2017.244.
[13] KIM T, CHA M, KIM H, et al. Learning to discover cross-domain relations with generative adversarial networks[C]. The 34th International Conference on Machine Learning, Sydney, Australia, 2017: 1857–1865.
[14] YI Zili, ZHANG Hao, TAN Ping, et al. DualGAN: Unsupervised dual learning for image-to-image translation[C]. 2017 IEEE International Conference on Computer Vision, Venice, Italy, 2017: 2868–2876. doi: 10.1109/ICCV.2017.310.
[15] BOUSMALIS K, SILBERMAN N, DOHAN D, et al. Unsupervised pixel-level domain adaptation with generative adversarial networks[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 95–104. doi: 10.1109/CVPR.2017.18.
[16] SHRIVASTAVA A, PFISTER T, TUZEL O, et al. Learning from simulated and unsupervised images through adversarial training[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 2242–2251. doi: 10.1109/CVPR.2017.241.
[17] TAIGMAN Y, POLYAK A, and WOLF L. Unsupervised cross-domain image generation[EB/OL]. https://arxiv.org/abs/1611.02200, 2016.
[18] LI Chuan and WAND M. Precomputed real-time texture synthesis with Markovian generative adversarial networks[C]. The 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 702–716. doi: 10.1007/978-3-319-46487-9_43.
[19] LIU Ziwei, LUO Ping, WANG Xiaogang, et al. Deep learning face attributes in the wild[C]. 2015 IEEE International Conference on Computer Vision, Santiago, Chile, 2015: 3730–3738. doi: 10.1109/ICCV.2015.425.
[20] KINGMA D P and BA J. Adam: A method for stochastic optimization[EB/OL]. https://arxiv.org/abs/1412.6980, 2014.
[21] LI Mu, ZUO Wangmeng, and ZHANG D. Deep identity-aware transfer of facial attributes[EB/OL]. https://arxiv.org/abs/1610.05586, 2016.
[22] PERARNAU G, VAN DE WEIJER J, RADUCANU B, et al. Invertible conditional GANs for image editing[EB/OL]. https://arxiv.org/abs/1611.06355, 2016.