Ensemble One-class Classification Algorithm Based on Hybrid Diversity Generation and Pruning
doi: 10.11999/JEIT140161
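Funding:
Supported by the National Natural Science Foundation of China (61272280, 41271447, 61272195), the Program for New Century Excellent Talents in University of the Ministry of Education (NCET-12-0919), the Fundamental Research Funds for the Central Universities (K5051203020, K5051303016, K5051303018, BDY081422, K50513100006), and a project of the Xi'an Science and Technology Bureau (CXY1341(6)).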
Ensemble One-class Classifiers Based on Hybrid Diversity Generation and Pruning
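Abstract: Classical ensemble learning methods give unsatisfactory results when applied directly to one-class classifiers. To address this, this paper first proves that ensemble learning can improve the performance of one-class classifiers, and also proves that combining base classifiers without selection can degrade the performance of the ensemble. It then points out that classical ensemble methods, when applied directly to one-class classifier ensembles, suffer from a severe lack of diversity among the base classifiers, and proposes a hybrid generation strategy for base one-class classifiers that increases diversity. Finally, it decomposes the loss function of the ensemble one-class classifier into its constituent parts, constructs a pruning strategy targeted at those parts, and proposes a one-class classifier ensemble algorithm based on hybrid diversity generation and pruning, abbreviated PHD-EOC. Experimental results on UCI benchmark datasets and a malicious program behavior detection dataset show that PHD-EOC balances diversity with one-class classification performance, outperforms classical ensemble learning methods across the various one-class classifier evaluation metrics, and reduces the time complexity of the decision stage.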
Keywords:
- Machine learning
- One-class classification
- Ensemble one-class classification
- Classifier diversity
- Ensemble pruning
- Ensemble learning
Abstract: Combining one-class classifiers with classical ensemble methods gives unsatisfactory results. To address this problem, this paper first proves that although a classifier ensemble can improve one-class classification performance, performance can also degrade if the set of base classifiers is not selected carefully. On this basis, the study shows that a lack of diversity among the base classifiers largely accounts for the degradation, and it proposes a hybrid method for generating diverse base classifiers. Second, in the combining phase, the one-class ensemble loss is split and analyzed theoretically in order to identify the most useful diversity, which leads to a diversity-based pruning method. Finally, by combining these two steps, a novel ensemble one-class classifier named Pruned Hybrid Diverse Ensemble One-class Classifier (PHD-EOC) is proposed. Experimental results on the UCI datasets and a malicious software detection dataset show that PHD-EOC strikes a better balance between base classifier diversity and classification performance. It also outperforms classical ensemble methods and makes decisions faster.
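The abstract does not spell out how PHD-EOC generates or prunes its base classifiers, so the following is only a rough sketch of the general generate-then-prune idea and should not be read as the authors' algorithm. It assumes a toy centroid-distance one-class classifier, a hybrid of bootstrap sampling and random feature subsets for diversity, a labeled validation set containing artificial outliers, and a greedy selection rule with a hypothetical `diversity_weight`; all function names and parameters are illustrative.

```python
import math
import random

def train_centroid_occ(points, features, accept_rate=0.9):
    """Toy one-class model: centroid plus a distance threshold that
    accepts roughly `accept_rate` of the training targets."""
    centroid = [sum(p[f] for p in points) / len(points) for f in features]
    dists = sorted(math.dist([p[f] for f in features], centroid) for p in points)
    threshold = dists[min(len(dists) - 1, int(accept_rate * len(dists)))]
    return {"features": features, "centroid": centroid, "threshold": threshold}

def predict(model, x):
    """Return 1 if x is accepted as a target, 0 if rejected as an outlier."""
    v = [x[f] for f in model["features"]]
    return 1 if math.dist(v, model["centroid"]) <= model["threshold"] else 0

def generate_hybrid(targets, n_models, rng):
    """Assumed hybrid generation: every base model is trained on a
    bootstrap sample AND restricted to a random feature subset."""
    dim = len(targets[0])
    models = []
    for _ in range(n_models):
        sample = [rng.choice(targets) for _ in targets]
        features = sorted(rng.sample(range(dim), rng.randint(1, dim)))
        models.append(train_centroid_occ(sample, features))
    return models

def ensemble_predict(models, x):
    """Majority vote over the base models; ties are accepted."""
    votes = sum(predict(m, x) for m in models)
    return 1 if 2 * votes >= len(models) else 0

def vote_error(models, labeled):
    """Fraction of labeled validation points the majority vote gets wrong."""
    return sum(ensemble_predict(models, x) != y for x, y in labeled) / len(labeled)

def disagreement(a, b, labeled):
    """Fraction of validation points on which two base models differ."""
    return sum(predict(a, x) != predict(b, x) for x, _ in labeled) / len(labeled)

def prune(models, labeled, keep, diversity_weight=0.1):
    """Assumed pruning rule: greedy forward selection that minimizes
    validation error minus a small bonus for disagreeing with the
    members already selected."""
    selected, remaining = [], list(models)
    while remaining and len(selected) < keep:
        def score(m):
            err = vote_error(selected + [m], labeled)
            div = (sum(disagreement(m, s, labeled) for s in selected) / len(selected)
                   if selected else 0.0)
            return err - diversity_weight * div
        best = min(range(len(remaining)), key=lambda i: score(remaining[i]))
        selected.append(remaining.pop(best))
    return selected

rng = random.Random(0)
targets = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
# Hypothetical validation set: fresh targets (label 1) plus uniformly
# drawn artificial outliers (label 0).
validation = ([([rng.gauss(0, 1) for _ in range(3)], 1) for _ in range(50)]
              + [([rng.uniform(-6, 6) for _ in range(3)], 0) for _ in range(50)])
models = generate_hybrid(targets, n_models=15, rng=rng)
pruned = prune(models, validation, keep=5)
```

Only the `pruned` models are evaluated at decision time, which is one simple way a pruned ensemble can decide faster than the full one. A real system would replace the toy centroid model with proper one-class learners and use the pruning criterion derived in the paper.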