
A Classifier Learning Method Based on Tree-Augmented Naïve Bayes

Xi CHEN, Kun ZHANG

Citation: Xi CHEN, Kun ZHANG. A Classifier Learning Method Based on Tree-Augmented Naïve Bayes[J]. Journal of Electronics & Information Technology, 2019, 41(8): 2001-2008. doi: 10.11999/JEIT180886


doi: 10.11999/JEIT180886
Funds: The National Natural Science Foundation of China (61772087)
詳細(xì)信息
    作者簡(jiǎn)介:

    陳曦:男,1963年生,教授,碩士生導(dǎo)師,研究方向?yàn)閿?shù)據(jù)挖掘

    張坤:男,1993年生,碩士生,研究方向?yàn)閿?shù)據(jù)挖掘

    通訊作者:

    張坤 zonkis2016@outlook.com

  • 中圖分類號(hào): TP311.1

  • Abstract: The tree-augmented naïve Bayes (TAN) structure forces every attribute node to have both the class node and one attribute node as parents, and it does not account for how differently individual attributes correlate with the class, which hurts classification accuracy. To improve the classification accuracy of TAN, this paper first extends the TAN structure so that an attribute node may have no parent or only a single attribute parent. It then proposes a learning method that builds a tree-shaped Bayesian classification model with a decomposable scoring function: low-order conditional independence (CI) tests first prune uninformative attributes, and a greedy search guided by an improved Bayesian information criterion (BIC) score then selects the parents of each attribute node, producing the classification model. Compared with naïve Bayes (NB) and TAN, the resulting classifier performs better on several classification metrics, which demonstrates the merit of the proposed method.
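For reference, the two quantities the abstract relies on can be written out as follows; the mutual information is the standard definition, and the score is shown in its usual Schwarz (BIC) form, since the paper's improved BIC variant is not reproduced on this page:

$$I(C;X_i) = \sum\limits_{c}\sum\limits_{x} P(c,x)\log \frac{P(c,x)}{P(c)P(x)}$$

$$ {\rm{BIC}}(G\,|\,D) = \sum\limits_{i}\sum\limits_{j,k} N_{ijk}\log \frac{N_{ijk}}{N_{ij}} - \frac{\log N}{2}\left| {\Theta _G} \right| $$

where $N_{ijk}$ is the number of samples in $D$ in which attribute $X_i$ takes its $k$-th value under the $j$-th configuration of its parents, $N_{ij} = \sum\nolimits_k N_{ijk}$, $N$ is the sample size, and $\left| {\Theta _G} \right|$ is the number of free parameters of the structure $G$.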
  • Figure 1  Schematic of the structures

    Figure 2  Classification accuracy under different thresholds $\varepsilon $

    Figure 3  Comparison of AUC polar plots on the multi-class datasets

    Figure 7  Schematic of the SETAN structure

    Figure 4  Comparison chart of average classification accuracy

    Figure 5  ROC curves on the binary-class datasets

    Figure 6  Comparison of average classification accuracy

    Table 1  Algorithm description

     Input: variable set V, sample data set D
     Output: SETAN structure
     Step 1 For each $X_i \in V\;$, $I(C,X_i) = {\rm{Calc\_MI}}(C,X_i)$ # compute the mutual information between each attribute and the class
     Step 2 Store every value $I(C,X_i)$ in List, sorted in descending order
     Step 3 For each $I(C,X_i) > \varepsilon $ in List
      $S_1 = S_1 \cup \{ X_i\} $
      Add path $C - X_i$ to graph $E$ # add an edge from class C to attribute $X_i$ if none exists
      $S_2 = S_2 \cup \{ X_j\} $, $X_j \in \{ I(C,X_j) < \varepsilon \} $
      Add path $X_i - X_j$ to graph $E$ # nodes whose mutual information is below the threshold $\varepsilon $ are added to set $S_2$
      Remove $I(C,X_i)$ from List
      End for
     Step 4 For each $E' \in E$
      ${\rm{Score}}(E') = {\rm{Calc\_BIC}}(E')$ # compute the improved BIC score
      K2-Search of the optimal SETAN structure # search for the optimal structure with the scoring function
      End for
     Step 5 Return $G = (V',E')$ with best BIC score
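To make the workflow of Table 1 concrete, here is a minimal Python sketch of the same loop. It assumes the data set is a list of dicts of discrete values; the helper names (mutual_information, bic_term, learn_setan) and the plain greedy parent search are illustrative stand-ins, and the scoring uses the standard BIC form rather than the paper's improved variant, so this is a sketch of the technique rather than the authors' implementation.

from collections import Counter
from math import log

def mutual_information(xs, ys):
    # Empirical mutual information I(X;Y) between two equally long discrete sequences.
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def bic_term(child, parents, data, n):
    # Standard BIC contribution of one node: conditional log-likelihood of the child
    # given its parent configuration, minus (log n)/2 times the number of free
    # parameters. This stands in for the paper's improved BIC score.
    child_vals = {row[child] for row in data}
    parent_counts = Counter(tuple(row[p] for p in parents) for row in data)
    joint_counts = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    loglik = sum(c * log(c / parent_counts[cfg]) for (cfg, _), c in joint_counts.items())
    n_params = len(parent_counts) * (len(child_vals) - 1)
    return loglik - 0.5 * log(n) * n_params

def learn_setan(data, class_var, attrs, eps=0.01):
    # Steps 1-2: mutual information of every attribute with the class, sorted descending.
    n = len(data)
    cls = [row[class_var] for row in data]
    mi = {a: mutual_information([row[a] for row in data], cls) for a in attrs}
    strong = [a for a in sorted(attrs, key=mi.get, reverse=True) if mi[a] > eps]
    weak = [a for a in attrs if mi[a] <= eps]
    parents = {}
    # Step 3: informative attributes get the class as a parent; steps 3-4: at most
    # one extra attribute parent is added greedily when it improves the BIC term.
    for a in strong:
        parents[a] = [class_var]
        best, best_score = None, bic_term(a, [class_var], data, n)
        for b in strong:
            if b == a:
                continue
            s = bic_term(a, [class_var, b], data, n)
            if s > best_score:
                best, best_score = b, s
        if best is not None:
            parents[a].append(best)
    # Attributes weakly correlated with the class keep no class parent and
    # receive at most one attribute parent.
    for a in weak:
        best, best_score = None, bic_term(a, [], data, n)
        for b in strong:
            s = bic_term(a, [b], data, n)
            if s > best_score:
                best, best_score = b, s
        parents[a] = [best] if best is not None else []
    # Step 5: the returned parent sets define the learned structure.
    return parents

A call such as learn_setan(data, 'class', attrs, eps=0.01) would return a parent set for every attribute, with eps playing the role of the threshold $\varepsilon $ whose effect is reported in Table 4.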

    Table 2  Experimental results of network structure learning

    $\xi $    Asia network (A/D/R)    Alarm network (A/D/R)
    0.0103502025
    0.0011173245
    0.000130815345

    Table 3  Information on the experimental data sets

    Dataset     Samples   Class distribution       Attributes   Classes   Missing values
    Balance     625       49/288/288               4            3
    Car         1728      1210/384/69/65           6            4
    Connect     67558     44473/16635/6449         42           3
    Mushroom    8124      4208/3916                22           2
    Nursery     12960     4320/2/328/4266/4044     8            5
    SPECT       80        40/40                    22           2
    Cancer      286       85/201                   9            2
    Votes       435       168/267                  16           2

    Table 4  Threshold $\varepsilon $ information

    Dataset    Threshold $\varepsilon $    Average accuracy
    Balance    0.01/0.05/0.10              0.915/0.914/0.910
    Connect    0.01/0.05/0.10              0.767/0.764/0.760
    SPECT      0.01/0.05/0.10              0.740/0.738/0.733
    Cancer     0.01/0.05/0.10              0.710/0.710/0.698

    Table 5  Comparison of classification metrics for NB, TAN and SETAN

    Dataset     Algorithm   Accuracy   F1      Recall   Precision   AUC
    Balance     NB          0.914      0.876   0.914    0.842       0.961
                TAN         0.861      0.834   0.861    0.836       0.904
                SETAN       0.914      0.876   0.914    0.842       0.962
    Car         NB          0.857      0.849   0.857    0.854       0.976
                TAN         0.908      0.911   0.908    0.920       0.983
                SETAN       0.946      0.947   0.946    0.947       0.988
    Connect     NB          0.721      0.681   0.721    0.681       0.807
                TAN         0.763      0.722   0.763    0.731       0.864
                SETAN       0.764      0.724   0.764    0.735       0.866
    Mushroom    NB          0.958      0.958   0.958    0.960       0.998
                TAN         0.999      1.000   0.999    1.000       1.000
                SETAN       1.000      1.000   1.000    1.000       1.000
    Nursery     NB          0.903      0.894   0.903    0.906       0.982
                TAN         0.928      0.920   0.928    0.929       0.991
                SETAN       0.937      0.927   0.937    0.937       0.993
    SPECT       NB          0.738      0.735   0.738    0.745       0.802
                TAN         0.713      0.709   0.713    0.724       0.668
                SETAN       0.738      0.736   0.738    0.741       0.755
    Cancer      NB          0.734      0.727   0.734    0.723       0.702
                TAN         0.706      0.692   0.706    0.687       0.667
                SETAN       0.710      0.700   0.710    0.695       0.624
    Votes       NB          0.901      0.902   0.901    0.905       0.973
                TAN         0.940      0.940   0.940    0.941       0.986
                SETAN       0.949      0.950   0.949    0.950       0.985
Publication history
  • Received:  2018-09-18
  • Revised:  2019-03-27
  • Published online:  2019-04-20
  • Issue date:  2019-08-01
