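Clustering Algorithms for Large-scale Social Networks Based on Structural Similarity
doi: 10.11999/JEIT140512
Funding:
Supported by the National Natural Science Foundation of China (61105049, 61300166), the Open Project Foundation of the Information Technology Research Base of the Civil Aviation Administration of China (CAAC-ITRB-201303, CAAC-ITRB-201204), the Tianjin Science and Technology Plan Project (13ZCZDGX01098), and the Natural Science Foundation of Tianjin (14JCQNJC00600)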
Abstract: To cluster social networks that are both directed and large-scale, this paper proposes a Structural Clustering Algorithm for Directed Networks (DirSCAN) and a corresponding distributed parallel algorithm (PDirSCAN). DirSCAN takes the directed interactions between vertices into account, groups vertices whose behavioral structure is similar, and analyzes the function of each vertex. To handle the scale of social networks, PDirSCAN runs on the MapReduce framework and improves processing performance while producing the same clustering result as DirSCAN. Extensive experiments on real-world network datasets show that DirSCAN improves on the undirected network clustering algorithm SCAN by up to 2.34% in F1, and that PDirSCAN runs 1.67 times faster than DirSCAN, so it can effectively handle large-scale directed network clustering.
Key words:
- Social networks /
- Directed network clustering /
- Parallel algorithm /
- MapReduce
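As a rough illustration of what "structural similarity" means here, the sketch below computes the SCAN-style similarity between two vertices: the size of the overlap of their closed neighborhoods, normalized by the geometric mean of the neighborhood sizes. The abstract does not give DirSCAN's directed definition, so using out-neighborhoods is an assumption made only for this example, and the function names and the toy graph are hypothetical.

```python
import math

def closed_out_neighborhood(adj, v):
    """Return v together with the vertices v points to.

    Assumption: the directed neighborhood is taken as the out-neighborhood;
    the actual DirSCAN definition is not stated in the abstract.
    """
    return set(adj.get(v, ())) | {v}

def structural_similarity(adj, u, v):
    """SCAN-style similarity: |N(u) & N(v)| / sqrt(|N(u)| * |N(v)|)."""
    nu = closed_out_neighborhood(adj, u)
    nv = closed_out_neighborhood(adj, v)
    return len(nu & nv) / math.sqrt(len(nu) * len(nv))

# Hypothetical toy graph: a, b, c follow each other; d only follows c.
adj = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b"],
    "d": ["c"],
}

sim_ab = structural_similarity(adj, "a", "b")  # identical neighborhoods
sim_ad = structural_similarity(adj, "a", "d")  # share only vertex c
print(sim_ab, sim_ad)
```

Vertices whose similarity exceeds a chosen threshold would be candidates for the same cluster; in a MapReduce setting, the per-pair similarity is the kind of computation that can be distributed across mappers because each pair depends only on two adjacency lists.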