基于公理化模糊子集的改進(jìn)譜聚類算法
doi: 10.11999/IEIT170904
-
1.
蘭州理工大學(xué)電氣工程與信息工程學(xué)院 ??蘭州 ??730050
-
2.
甘肅省工業(yè)過(guò)程先進(jìn)控制重點(diǎn)實(shí)驗(yàn)室 ??蘭州 ??730050
-
3.
蘭州理工大學(xué)國(guó)家級(jí)電氣與控制工程實(shí)驗(yàn)教學(xué)中心 ??蘭州 ??730050
An Improved Spectral Clustering Algorithm Based on Axiomatic Fuzzy Set
-
1.
College of Electrical and Information Engineering, Lanzhou University of Technology, Lanzhou 730050, China
-
2.
Key Laboratory of Gansu Advanced Control for Industrial Processes, Lanzhou 730050, China
-
3.
National Experimental Teaching Center of Electrical and Control Engineering, Lanzhou University of Technology, Lanzhou 730050, China
-
摘要: 譜聚類算法通常是采用高斯核作為相似性度量,并利用所有可用的特征來(lái)構(gòu)建具有歐氏距離的相似度矩陣,數(shù)據(jù)集復(fù)雜度會(huì)影響其譜聚類性能,因此該文提出一種基于公理化模糊子集(AFS)的改進(jìn)譜聚類算法。首先結(jié)合AFS算法,利用識(shí)別特征來(lái)衡量更合適的數(shù)據(jù)成對(duì)相似性,生成更強(qiáng)大的親合矩陣;再有效地利用Nystr?m采樣算法,計(jì)算采樣點(diǎn)間以及采樣點(diǎn)和剩余點(diǎn)間的相似度矩陣去降低計(jì)算的復(fù)雜度;最后通過(guò)在不同數(shù)據(jù)集以及圖像分割上進(jìn)行實(shí)驗(yàn),證明了提出算法的有效性。
-
關(guān)鍵詞:
- 親和矩陣 /
- 譜聚類 /
- 公理化模糊子集 /
- Nystr?m采樣算法
Abstract: Gaussian kernel is usually used as the similarity measure in spectral clustering algorithm, and all the available features are used to construct the similarity matrix with Euclidean distance. The complexity of the data set would affect its spectral clustering performance. Therefore, an improved spectral clustering algorithm based on Axiomatic Fuzzy Set (AFS) is proposed. Firstly, AFS algorithm is combined to measure the similarity of more suitable data by recognizing features, and the stronger affinity matrix is generated. Then Nystr?m sampling algorithm is used to calculate the similarity matrix between the sampling points and the sampling points and the remaining points to reduce the computational complexity. Finally, the experiment is carried out by using different data sets and image segmentations, the effectiveness of the proposed algorithm are proved. -
表 1 數(shù)據(jù)集特征
數(shù)據(jù)集 數(shù)據(jù)總數(shù) 類數(shù) 維數(shù) Iris 150 3 4 Heart 270 2 13 Sonar 208 2 60 Wine 178 3 13 Protein 552 8 77 Hepatitis 155 2 19 Segmentation 2310 7 19 Pen digits 10992 10 16 下載: 導(dǎo)出CSV
表 2 數(shù)據(jù)集的CE(%)
數(shù)據(jù)集 SC STSC AFS 本文算法 Iris 10.71 7.46 9.72 7.63 Heart 20.96 22.13 30.63 12.42 Sonar 44.53 46.83 38.52 33.60 Wine 2.92 2.91 3.54 3.13 Protein 54.70 55.67 55.12 48.87 Hepatitis 30.76 38.73 32.34 23.20 Segmentation 22.08 21.35 31.17 18.63 Pen digits 25.37 24.25 – 22.16 下載: 導(dǎo)出CSV
表 3 數(shù)據(jù)集的NMI(%)
數(shù)據(jù)集 SC STSC AFS 本文算法 Iris 75.87 78.63 78.06 85.49 Heart 28.54 26.23 18.45 40.33 Sonar 7.32 1.83 15.47 22.38 Wine 89.30 89.34 85.67 87.96 Protein 54.43 48.24 36.62 65.80 Hepatitis 13.75 4.78 3.57 17.42 Segmentation 65.58 66.72 58.56 72.24 Pen digits 60.53 61.48 – 66.52 下載: 導(dǎo)出CSV
表 6 復(fù)雜度分析
計(jì)算步驟 復(fù)雜度 計(jì)算矩陣 ${{A}}$ O(n2) 計(jì)算矩陣 ${{B}}$ O(n(N–n)) 若矩陣 ${{A}}$正定 對(duì) ${{A}}$矩陣分解 O(n3) 求解矩陣 ${{p}}$ O(n2(N–n)) 矩陣分解 O(n3) 求解矩陣 ${{Y}}$ O(n2N) 若矩陣 ${{A}}$非正定 求解矩陣 ${{S}}$ O(n2N) 對(duì)矩陣 ${{S}$對(duì)角分解 O(n3) 求解矩陣 ${{Y}}$ O(n2N) K-means算法進(jìn)行聚類 O(nK2T) 下載: 導(dǎo)出CSV
-
JAIN A. Data clustering: a review[J]. ACM Computing Surveys, 1999, 31(3): 264–323. DOI: 10.1145/331499.331504 XU Rui. Survey of Clustering Algorithms[M]. New Jersey: IEEE Press, 2005: 645–678. doi: 10.1109/TNN.2005.845141. RAMON-GONEN R and GELBARD R. Cluster evolution analysis: Identification and detection of similar clusters and migration patterns[J]. Expert Systems with Applications, 2017, 83: 363-378. DOI: 10.1016/j.eswa.2017.04.007. WITTEN I H and FRANK E. Data Mining: Practical Machine Learning Tools and Techniques[M]. Massachusetts: Morgan Kaufmann, 2005: 81–82. doi: 0120884070, 9780120884070. 夏平, 任強(qiáng), 吳濤, 等.融合多尺度統(tǒng)計(jì)信息模糊C均值聚類與Markov隨機(jī)場(chǎng)的小波域聲納圖像分割[J]. 兵工學(xué)報(bào), 2017, 38(5): 940-948. DOI: 10.3969/j.issn.1000-1093.2017.05.014.XIA Ping, REN Qiang, WU Tao, et al. Sonar image segmentation fusion of multi-scale statistical information FCM clustering and MRF model in wavelet domain[J]. Acta Armamentaria, 2017, 38(5): 940-948. DOI: 10.3969/j.issn.1000-1093.2017.05.014. TREVOR H, ROBERT T, and FRIEDMAN J J H. The Elements of Statistical Learning[M]. New York: Springer, 2001: 460–514. doi: 10.1198/jasa.2004.s339. 李武, 趙嬌燕, 嚴(yán)太山.基于平均差異度優(yōu)選初始聚類中心的改進(jìn)K-均值聚類算法[J]. 控制與決策, 2017, 32(4): 759-762. DOI: 10.13195/j.kzyjc.2016.0274.LI Wu, ZHAO Jiaoyan, and YAN Taishan. Improved K-means clustering algorithm optimizing initial clustering centers based on average difference degree[J]. Control and Decision, 2017, 32(4): 759-762. DOI: 10.13195/j.kzyjc.2016.0274. CHEN Weijie and GIGER M L. A fuzzy c-means (fcm) based algorithm for intensity inhomogeneity correction and segmentation of MR images[C]. IEEE International Symposium on Biomedical Imaging, Nano to Macro Marriott Crystal Gateway, Arlington, 2005, 2: 1307–1310. doi: 10.1109/ISBI.2004.1398786. 蔡曉妍, 戴冠中, 楊黎斌.譜聚類算法綜述[J]. 計(jì)算機(jī)科學(xué), 2008, 35(7): 14-18. DOI: 10.3969/j.issn.1002-137X.2008.07.004.CAI Xiaoyan, DAI Guanzhong, and YANG Libin. Survey on spectral clustering algorithms[J]. Computer Science, 2008, 35(7): 14-18. DOI: 10.3969/j.issn.1002-137X.2008.07.004. ZELNIK-MANOR L and PERONA P. Self-tuning spectral clustering[C]. Advances in Neural Information Processing Systems, Vancouver, 2004: 1601–1608. YANG Peng, ZHU Qingsheng, and HUANG Biao. Spectral clustering with density sensitive similarity function[J]. Knowledge-Based Systems, 2011, 24(5): 621-628. DOI: 10.1016/j.knosys.2011.01.009. JAIN A K. Data Clustering: 50 Years Beyond K-means[M]. Machine Learning and Knowledge Discovery in Databases. Springer Berlin Heidelberg, 2008: 651–666. doi: 10.1007/978-3-540-87479-9_3. ZHU Xiatian, CHEN Change Loy, and GONG Shaogang. Constructing robust affinity graphs for spectral clustering[C]. Computer Vision and Pattern Recognition. IEEE, Ohio, 2014: 1450–1457. doi: 10.1109/CVPR.2014.188. GONG Shaogang, CHEN Change Loy, and XIANG Tao. Security and Surveillance[M]. Visual Analysis of Humans. Springer London, 2011: 455–472. doi: 10.1007/978-0-85729-997-0_23. PAVAN M and PELILLO M. Dominant sets and pairwise clustering[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2006, 29(1): 167-172. DOI: 10.1109/TPAMI.2007.10. 丁世飛, 賈洪杰, 史忠植. 基于自適應(yīng) 采樣的大數(shù)據(jù)譜聚類算法[J]. 軟件學(xué)報(bào), 2014, (9): 2037-2049. DOI: 10.13328/j.cnki.jos.004643.DING Shifei, JIA Hongjie, and SHI Zhongzhi. Spectral clustering algorithm based on adaptive Nystr?m sampling for big data analysis[J]. Journal of Software, 2014, (9): 2037-2049. DOI: 10.13328/j.cnki.jos.004643. LIU Xiaodong and REN Yan. Novel artificial intelligent techniques via AFS theory: Feature selection, concept categorization and characteristic description [J]. Applied Soft Computing, 2010, 10(3): 793-805. DOI: 10.1016/j.asoc.2009.09.009. LIH Xiaodong, WANG Xianchang, and PEDRYCZ W. Fuzzy clustering with semantic interpretation[J]. Applied Soft Computing, 2015, 26: 21-30. DOI: 10.1016/j.asoc.2014.09.037. BELONGIE S, FOWLKES C, FAN C, et al. Spectral partitioning with indefinite kernels using the Nystr?m extension[C]. European Conference on Computer Vision. Springer-Verlag, Denmark, 2002: 531–542. doi: 10.1007/3-540-47977-5_35. WU Mingrui. A local learning approach for clustering[C]. International Conference on Neural Information Processing Systems, Hong Kong, China, 2006: 1529–1536. STREHL A and GHOSH J. Cluster ensembles-aknowledge reuse framework for combining multiple partitions[J]. The Journal of Machine Learning Research, 2003, 3(3): 583-617. DOI: 10.1162/153244303321897735. HU Zhaoling, GUO Dazhi, and SHENG Yehua Extracting textural in formation of satellite SAR image based on wavelet decomposition[J]. Journa1 of Remote Sensing, 2001, 5: 423-427. -