Recognizing Users Focuses on Social Network Based on Mixed-weight Combined Strategy
doi: 10.11999/JEIT161348
The National Natural Science Foundation of China (71490725, 71521001, 71371062, 91546114, 71501057), the National 973 Program of China (2013CB329603), the National Key Technology Support Program (2015BAH26F00), the MOE Youth Fund Project of Humanities and Social Sciences (15YJC630111)
Abstract: Topic models are an important tool for recognizing what users focus on across social network platforms such as blogs, online communities, and microblogs. Because topic recognition on the short texts typical of these platforms poses particular difficulties, this paper exploits the contextual relevance among short texts and proposes AW-LDA, a model based on a mixed-weight combined strategy. The model virtually merges short texts that satisfy a contextual-relevance condition and assigns each short text a weight according to its degree of contextual relevance, yielding a new method for short-text topic recognition. Experiments on two datasets, one from a BBS community and one from a Weibo community, show that the model effectively recognizes the focuses of social network users under different topics and offers a new approach to the short-text topic recognition problem.
Key words:
- Social network
- Topic model
- Focus recognition
- Mixed weight
- Neighboring user
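The abstract describes virtually merging contextually related short texts and weighting them by their degree of relevance, but it gives neither the merging condition nor the weighting formula. The Python sketch below is only an illustration of that general idea under loudly stated assumptions: "contextually related" is taken to mean "at most `window` positions apart in the same thread", and the weight is taken to be `decay ** distance`. The function name `weighted_virtual_merge` and both parameters are hypothetical and are not the AW-LDA definitions.

```python
from collections import Counter


def weighted_virtual_merge(posts, window=1, decay=0.5):
    """Build one weighted bag of words per post.

    Each post keeps its own words at weight 1.0 and borrows the words of
    posts at most `window` positions away, down-weighted by decay**distance.
    `window` and `decay` are illustrative assumptions, not values or rules
    taken from the paper.
    """
    merged = []
    for i in range(len(posts)):
        bag = Counter()
        lo = max(0, i - window)
        hi = min(len(posts), i + window + 1)
        for j in range(lo, hi):
            weight = decay ** abs(i - j)  # 1.0 for the post itself
            for token in posts[j].split():
                bag[token] += weight
        merged.append(dict(bag))
    return merged


# Toy thread of three short posts (made-up example data).
thread = ["price of new phone", "phone battery life", "battery charging speed"]
pseudo_docs = weighted_virtual_merge(thread)
```

Each entry of `pseudo_docs` is a weighted pseudo-document for one post; the original posts are left untouched, which is the sense in which the merge is "virtual". How such weighted bags would enter the topic model's inference is not specified in the abstract and is not covered by this sketch.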