
Combinatorial Adversarial Defense for Environmental Sound Classification Based on GAN

ZHANG Qiang, YANG Jibin, ZHANG Xiongwei, CAO Tieyong, LI Yihao

Citation: ZHANG Qiang, YANG Jibin, ZHANG Xiongwei, CAO Tieyong, LI Yihao. Combinatorial Adversarial Defense for Environmental Sound Classification Based on GAN[J]. Journal of Electronics & Information Technology, 2023, 45(12): 4399-4410. doi: 10.11999/JEIT221251


doi: 10.11999/JEIT221251
Funds: The National Natural Science Foundation of China (62071484)
Detailed information
    Author biographies:

    ZHANG Qiang: Male, Ph.D. candidate. His research interests include information content security, artificial intelligence, signal processing, and adversarial example attacks and defenses.

    YANG Jibin: Male, Ph.D., Associate Professor. His research interests include acoustic signal processing, machine learning, pattern recognition, and artificial intelligence security.

    ZHANG Xiongwei: Male, Ph.D., Professor. His research interests include speech signal processing, machine learning, pattern recognition, and artificial intelligence security.

    CAO Tieyong: Male, Ph.D., Professor. His research interests include signal processing, machine learning, image processing, and artificial intelligence security.

    LI Yihao: Male, Ph.D. candidate. His research interests include information content security, artificial intelligence, speech signal processing, and adversarial example attacks and defenses.

    Corresponding author:

    YANG Jibin, yjbice@sina.com

  • CLC number: TN912

Combinatorial Adversarial Defense for Environmental Sound Classification Based on GAN

Funds: The National Natural Science Foundation of China (62071484)
  • Abstract: Although deep neural networks can effectively improve Environmental Sound Classification (ESC) performance, they remain vulnerable to adversarial example attacks. Existing adversarial defenses are usually effective only against specific attacks and cannot adapt to different attack scenarios such as white-box and black-box attacks. To improve the ability of ESC models to defend against various attacks in various scenarios, this paper proposes a combinatorial adversarial defense method for ESC that combines adversarial detection, adversarial training, and discriminative feature learning. The method uses an Adversarial Example Detector (AED) to screen the samples fed to the ESC model, and adversarially trains the AED and the ESC model jointly within a Generative Adversarial Network (GAN), with the AED serving as the GAN's discriminator. Meanwhile, a discriminative loss function is introduced into the adversarial training of the ESC model, driving the learned sample features to be more compact within classes and more separated between classes, which further improves adversarial robustness. Comparative defense experiments were conducted on two typical ESC datasets, against multiple models, under white-box, adaptive white-box, and black-box attack settings. The results show that the proposed method, which combines multiple defenses through a GAN, effectively improves the ability of ESC models to defend against adversarial attacks, raising ESC accuracy by more than 10% over competing methods. Experiments also verify that the effectiveness of the proposed method is not caused by obfuscated gradients.
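Two of the ingredients the abstract combines can be sketched in a few lines. The following is a minimal illustration only, not the paper's implementation: `fgsm_perturb` is the standard FGSM step from ref. [25] used to craft adversarial training samples, and `discriminative_loss` is a center-loss-style term in the spirit of ref. [29] that pulls each feature toward its class center to make classes compact; the function names, array shapes, and the specific center-loss form are assumptions for illustration.

```python
import numpy as np

def fgsm_perturb(x, grad, eps=0.03):
    """One FGSM step (ref. [25]): shift the input along the sign of the
    loss gradient by a budget eps to craft an adversarial example."""
    return x + eps * np.sign(grad)

def discriminative_loss(features, labels, centers):
    """Center-loss-style discriminative term (cf. ref. [29]): mean squared
    distance between each feature vector and its class center, which
    penalizes intra-class spread."""
    diffs = features - centers[labels]          # (N, D) residuals
    return 0.5 * float(np.mean(np.sum(diffs ** 2, axis=1)))
```

In the paper's setting these pieces are combined with a GAN: a generator produces diverse perturbations, the AED discriminates real from adversarial inputs, and the ESC model is trained on both, with the discriminative term added to its classification loss.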
  • Figure 1  Overall framework of the proposed combinatorial adversarial defense method for ESC

    Figure 2  Loss curves of the AED and the ESC model trained with the proposed method versus the number of iterations

    Figure 3  Model classification accuracy (%) for different combinations of components of the proposed method

    Figure 4  Classification accuracy of the VGGish model defended by the proposed method versus perturbation size

    Table 1  Summary of typical ESC datasets

    Dataset      | Classes | Samples | Training samples | Test samples | Sample duration | Channels
    ESC50        | 50      | 2,000   | 1,800            | 200          | 5 s             | 1
    UrbanSound8K | 10      | 8,732   | 7,858            | 874          | ≤4 s            | 2

    Table 2  Classification accuracy (%) of different models on typical ESC datasets

    Dataset      | GoogLeNet | AlexNet | ResNet18 | EnvNet-v2 | SoundNet8 | VGGish
    ESC50        | 84.0      | 80.5    | 82.0     | 80.5      | 81.0      | 82.5
    UrbanSound8K | 96.6      | 94.5    | 96.3     | 93.3      | 96.5      | 97.8

    Table 3  Performance comparison (%) of defense methods under white-box attacks on UrbanSound8K

                |                  GoogLeNet                   |                   AlexNet
    Attack      | Nature | MAD[11] | FGSM[12] | WNA[14] | Ours | Nature | MAD[11] | FGSM[12] | WNA[14] | Ours
    No attack   | 96.6   | 89.2    | 82.3     | 87.2    | 98.1 | 94.5   | 84.3    | 71.1     | 83.1    | 95.5
    FGSM attack | 32.4   | 77.8    | 40.2     | 38.5    | 92.7 | 27.3   | 73.5    | 34.6     | 45.2    | 92.3
    PGD attack  | 12.6   | 72.1    | 27.4     | 30.1    | 88.5 | 11.4   | 68.6    | 24.5     | 34.3    | 87.9
    BIM attack  | 13.8   | 73.2    | 28.5     | 31.3    | 89.7 | 13.2   | 69.3    | 25.1     | 35.1    | 88.4
    CW attack   | 13.3   | 71.4    | 26.7     | 59.2    | 88.1 | 10.3   | 67.9    | 23.8     | 60.4    | 87.6
    Minimum     | 12.6   | 71.4    | 26.7     | 30.1    | 88.1 | 10.3   | 67.9    | 23.8     | 34.3    | 87.6

    Table 4  Performance (%) of the proposed method under adaptive white-box attacks on UrbanSound8K

    Attack      | GoogLeNet | AlexNet
    FGSM attack | 92.5      | 92.0
    PGD attack  | 88.3      | 87.6
    BIM attack  | 89.4      | 88.2
    CW attack   | 87.8      | 87.3
    Minimum     | 87.8      | 87.3

    Table 5  Performance comparison (%) of defense methods under black-box attacks on ESC50

                |         SoundNet8          |           VGGish           |          EnvNet-v2
    Attack      | Nature | PGD  | CW   | Ours | Nature | PGD  | CW   | Ours | Nature | PGD  | CW   | Ours
    FGSM attack | 40.5   | 69.3 | 58.3 | 76.2 | 42.8   | 68.1 | 48.5 | 77.3 | 37.2   | 69.4 | 66.7 | 75.8
    PGD attack  | 27.4   | 59.2 | 40.3 | 72.3 | 24.2   | 58.5 | 38.4 | 70.5 | 25.1   | 57.4 | 55.8 | 69.5
    BIM attack  | 28.5   | 58.6 | 39.8 | 73.2 | 25.3   | 56.2 | 38.7 | 70.7 | 26.3   | 60.5 | 56.3 | 70.1
    CW attack   | 35.6   | 59.8 | 42.5 | 74.4 | 32.5   | 57.5 | 41.8 | 71.2 | 39.7   | 58.6 | 54.9 | 72.3
    Minimum     | 27.4   | 58.6 | 39.8 | 72.3 | 24.2   | 56.2 | 38.4 | 70.5 | 25.1   | 57.4 | 54.9 | 69.5

    Table 6  Effect of the detection threshold on the defense performance of the proposed method

    Detection | AED adversarial-sample | AED real-sample        | ESC accuracy,    | ESC accuracy,
    threshold | detection accuracy (%) | detection accuracy (%) | real samples (%) | adversarial samples (%)
    0.1       | 35.2                   | 94.0                   | 90.1             | 73.4
    0.3       | 53.3                   | 91.3                   | 93.4             | 76.2
    0.5       | 76.6                   | 88.2                   | 96.4             | 80.3
    0.7       | 87.6                   | 85.6                   | 95.7             | 79.4
    0.9       | 92.2                   | 81.7                   | 94.6             | 78.2
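The threshold trade-off in Table 6 comes from a simple decision rule. A minimal sketch, assuming a hypothetical interface in which the AED outputs a "realness" score in [0, 1] (the function name and score convention are illustrative, not taken from the paper):

```python
def aed_screen(realness_score, threshold=0.5):
    """Threshold rule for the adversarial example detector: reject the
    input as adversarial when its realness score falls below the
    threshold. Raising the threshold catches more adversarial samples
    but also rejects more real ones (the trade-off in Table 6)."""
    return "adversarial" if realness_score < threshold else "real"
```

For example, a sample scoring 0.6 passes at threshold 0.5 but is rejected at threshold 0.7, which is why the adversarial detection rate rises and the real-sample detection rate falls as the threshold increases.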

    Table 7  Performance (%) of the proposed method under white-box attacks on ESC50

    Attack      | SoundNet8 | VGGish | EnvNet-v2
    FGSM attack | 70.7      | 71.4   | 71.3
    PGD attack  | 65.2      | 64.7   | 65.4
    BIM attack  | 67.0      | 65.6   | 66.2
    CW attack   | 66.1      | 65.3   | 65.8
    Minimum     | 65.2      | 64.7   | 65.4
  • [1] PICZAK K J. ESC: Dataset for environmental sound classification[C]. The 23rd ACM Multimedia Conference, Brisbane, Australia, 2015: 1015–1018.
    [2] SALAMON J, JACOBY C, and BELLO J P. A dataset and taxonomy for urban sound research[C]. The 22nd ACM International Conference on Multimedia, Orlando, USA, 2014: 1041–1044.
    [3] GEMMEKE J F, ELLIS D P W, FREEDMAN D, et al. Audio set: An ontology and human-labeled dataset for audio events[C]. 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, New Orleans, USA, 2017: 776–780.
    [4] GONG Yuan, CHUNG Y A, and GLASS J. AST: Audio spectrogram transformer[C]. The 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 2021: 571–575.
    [5] AYTAR Y, VONDRICK C, and TORRALBA A. SoundNet: Learning sound representations from unlabeled video[C]. The 30th International Conference on Neural Information Processing Systems, Barcelona, Spain, 2016: 892–900.
    [6] HERSHEY S, CHAUDHURI S, ELLIS D P W, et al. CNN architectures for large-scale audio classification[C]. 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, New Orleans, USA, 2017: 131–135.
    [7] TOKOZUME Y, USHIKU Y, and HARADA T. Learning from between-class examples for deep sound recognition[C]. 6th International Conference on Learning Representations, Vancouver, Canada, 2018: 1–13.
    [8] ZEGHIDOUR N, TEBOUL O, DE CHAUMONT QUITRY F, et al. LEAF: A learnable frontend for audio classification[C]. The 9th International Conference on Learning Representations, Virtual Event, Austria, 2021: 1–16.
    [9] XIE Yi, LI Zhuohang, SHI Cong, et al. Enabling fast and universal audio adversarial attack using generative model[C/OL]. The 35th Conference on Artificial Intelligence, Virtual Event, 2021: 14129–14137.
    [10] ESMAEILPOUR M, CARDINAL P, and KOERICH A L. A robust approach for securing audio classification against adversarial attacks[J]. IEEE Transactions on Information Forensics and Security, 2020, 15: 2147–2159. doi: 10.1109/TIFS.2019.2956591
    [11] OLIVIER R, RAJ B, and SHAH M. High-frequency adversarial defense for speech and audio[C]. 2021 IEEE International Conference on Acoustics, Speech and Signal Processing, Toronto, Canada, 2021: 2995–2999.
    [12] SALLO R A, ESMAEILPOUR M, and CARDINAL P. Adversarially training for audio classifiers[C]. The 25th International Conference on Pattern Recognition, Milan, Italy, 2020: 9569–9576.
    [13] ESMAEILPOUR M, CARDINAL P, and KOERICH A L. Detection of adversarial attacks and characterization of adversarial subspace[C]. 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, Barcelona, Spain, 2020: 3097–3101.
    [14] SUBRAMANIAN V, BENETOS E, and SANDLER M B. Robustness of adversarial attacks in sound event classification[C]. The Workshop on Detection and Classification of Acoustic Scenes and Events 2019, New York City, USA, 2019: 239–243.
    [15] POURSAEED O, JIANG Tianxing, YANG H, et al. Robustness and generalization via generative adversarial training[C]. 2021 IEEE/CVF International Conference on Computer Vision, Montreal, Canada, 2021: 15711–15720.
    [16] LEE H, HAN S, and LEE J. Generative adversarial trainer: Defense to adversarial perturbations with GAN[EB/OL]. http://arxiv.org/abs/1705.03387v2, 2017.
    [17] JANG Y, ZHAO Tianchen, HONG S, et al. Adversarial defense via learning to generate diverse attacks[C]. 2019 IEEE/CVF International Conference on Computer Vision, Seoul, Korea (South), 2019: 2740–2749.
    [18] WANG Huaxia and YU C N. A direct approach to robust deep learning using adversarial networks[C]. The 7th International Conference on Learning Representations, New Orleans, USA, 2019: 1–15.
    [19] KONG Rui, CAI Jiachun, and HUANG Gang. Defense to adversarial attack with generative adversarial network[J/OL]. Acta Automatica Sinica, 2020. https://doi.org/10.16383/j.aas.c200033, 2020.
    [20] SAMANGOUEI P, KABKAB M, and CHELLAPPA R. Defense-GAN: Protecting classifiers against adversarial attacks using generative models[C]. The 6th International Conference on Learning Representations, Vancouver, Canada, 2018: 1–17.
    [21] WU Haibin, HSU P C, GAO Ji, et al. Adversarial sample detection for speaker verification by neural vocoders[C]. IEEE International Conference on Acoustics, Speech and Signal Processing, Singapore, 2022: 236–240.
    [22] AGARWAL C, NGUYEN A, and SCHONFELD D. Improving robustness to adversarial examples by encouraging discriminative features[C]. 2019 IEEE International Conference on Image Processing, Taipei, China, 2019: 3801–3805.
    [23] MUSTAFA A, KHAN S H, HAYAT M, et al. Deeply supervised discriminative learning for adversarial defense[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(9): 3154–3166. doi: 10.1109/TPAMI.2020.2978474
    [24] ATHALYE A, CARLINI N, and WAGNER D A. Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples[C]. The 35th International Conference on Machine Learning, Stockholm, Sweden, 2018: 274–283.
    [25] GOODFELLOW I J, SHLENS J, and SZEGEDY C. Explaining and harnessing adversarial examples[C]. The 3rd International Conference on Learning Representations, San Diego, USA, 2015: 1–11.
    [26] CARLINI N and WAGNER D. Towards evaluating the robustness of neural networks[C]. 2017 IEEE Symposium on Security and Privacy, San Jose, USA, 2017: 39–57.
    [27] KURAKIN A, GOODFELLOW I J, and BENGIO S. Adversarial examples in the physical world[C]. The 5th International Conference on Learning Representations, Toulon, France, 2017: 1–14.
    [28] LAN Jiahe, ZHANG Rui, YAN Zheng, et al. Adversarial attacks and defenses in speaker recognition systems: A survey[J]. Journal of Systems Architecture, 2022, 127: 102526. doi: 10.1016/j.sysarc.2022.102526
    [29] WEN Yandong, ZHANG Kaipeng, LI Zhifeng, et al. A discriminative feature learning approach for deep face recognition[C]. 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 499–515.
    [30] SCHROFF F, KALENICHENKO D, and PHILBIN J. FaceNet: a unified embedding for face recognition and clustering[C]. 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, USA, 2015: 815–823.
    [31] ZHANG Qiang, YANG Jibin, ZHANG Xiongwei, et al. CS-Softmax: A cosine similarity-based Softmax loss[J]. Journal of Computer Research and Development, 2022, 59(4): 936–949. doi: 10.7544/issn1000-1239.20200879
    [32] SALIMANS T, GOODFELLOW I, ZAREMBA W, et al. Improved techniques for training GANs[C]. The 30th International Conference on Neural Information Processing Systems, Barcelona, Spain, 2016, 29: 2234–2242.
    [33] YANG Dingdong, HONG S, JANG Y, et al. Diversity-sensitive conditional generative adversarial networks[C]. The 7th International Conference on Learning Representations, New Orleans, USA, 2019: 1–23.
    [34] SZEGEDY C, LIU Wei, JIA Yangqing, et al. Going deeper with convolutions[C]. 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, USA, 2015: 1–9.
    [35] KRIZHEVSKY A, SUTSKEVER I, and HINTON G E. ImageNet classification with deep convolutional neural networks[C]. The 25th International Conference on Neural Information Processing Systems, Lake Tahoe, USA, 2012: 1097–1105.
    [36] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Deep residual learning for image recognition[C]. 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 770–778.
    [37] LECUN Y, BOTTOU L, BENGIO Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278–2324. doi: 10.1109/5.726791
    [38] ENGSTROM L, ILYAS A, and ATHALYE A. Evaluating and understanding the robustness of adversarial logit pairing[EB/OL]. http://arxiv.org/abs/1807.10272, 2018.
    [39] MADRY A, MAKELOV A, SCHMIDT L, et al. Towards deep learning models resistant to adversarial attacks[C]. The 6th International Conference on Learning Representations, Vancouver, Canada, 2018: 1–28.
    [40] KIM H. Torchattacks: A PyTorch repository for adversarial attacks[EB/OL]. http://arxiv.org/abs/2010.01950v3, 2020.
    [41] TRAMÈR F, PAPERNOT N, GOODFELLOW I, et al. The space of transferable adversarial examples[EB/OL]. http://arxiv.org/abs/1704.03453, 2017.
    [42] TSIPRAS D, SANTURKAR S, ENGSTROM L, et al. Robustness may be at odds with accuracy[C]. The 7th International Conference on Learning Representations, New Orleans, USA, 2019: 1–24.
Figures (4) / Tables (7)
Publication history
  • Received: 2022-09-27
  • Revised: 2022-12-08
  • Accepted: 2022-12-20
  • Published online: 2022-12-23
  • Issue date: 2023-12-26
