An Efficient Association Rule Mining Algorithm Based on Prejudging and Screening
doi: 10.11999/JEIT151107
Funding: The National Natural Science Foundation of China (61373135, 61401225, 61502252, 61201160), Natural Science Foundation of Jiangsu Province of China (BK20140883, BK20140894, BK20131377), China Postdoctoral Science Foundation Funded Project (2015M581844), Jiangsu Planned Projects for Postdoctoral Research Funds (1501125B), Nanjing University of Posts and Telecommunications Scientific Research Foundation (NUPTSF) (NY214101, NY215147)
Abstract: Association rule analysis, one of the principal techniques of data mining, plays an important role in discovering valuable information hidden in massive transaction data. To address the inherent defects of the classic Apriori algorithm, this paper proposes the Apriori With Prejudging (AWP) algorithm. AWP adds a prejudging and screening step to the join and pruning steps of Apriori: it uses prior probability to reduce the set of candidate frequent k-itemsets, and it introduces a damping factor and a compensating factor to correct the error that prejudging introduces. This simplifies the process of mining frequent itemsets. Experimental results show that AWP effectively reduces the number of database scans and the running time of the algorithm.
Key words:
- Data mining
- Association rules
- Transaction database
- Prejudging
- Apriori
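The abstract describes a pipeline of join, prune, and prejudge steps followed by database counting. The following is a minimal Python sketch of that flow, not the paper's implementation. The abstract does not give the prejudging formula, so everything inside `prejudge` is an assumption: it estimates a candidate's support as the product of its single-item supports (an independence assumption) and keeps the candidate if the estimate reaches `damping * min_support`. The compensating factor is not modeled, and all function and parameter names are hypothetical.

```python
from itertools import combinations


def frequent_items(transactions, min_support):
    """Return {frozenset([item]): support} for single items meeting min_support."""
    n = len(transactions)
    counts = {}
    for t in transactions:
        for item in t:
            counts[item] = counts.get(item, 0) + 1
    return {frozenset([i]): c / n for i, c in counts.items() if c / n >= min_support}


def join_and_prune(prev_frequent, k):
    """Classic Apriori step: join (k-1)-itemsets into k-itemsets, then drop
    any candidate that has an infrequent (k-1)-subset."""
    prev = set(prev_frequent)
    candidates = set()
    for a in prev:
        for b in prev:
            union = a | b
            if len(union) == k and all(
                frozenset(s) in prev for s in combinations(union, k - 1)
            ):
                candidates.add(union)
    return candidates


def prejudge(candidates, item_support, min_support, damping):
    """ASSUMPTION, not the paper's formula: estimate support as the product of
    single-item supports and keep candidates whose estimate reaches
    damping * min_support."""
    kept = set()
    for c in candidates:
        estimate = 1.0
        for item in c:
            estimate *= item_support[frozenset([item])]
        if estimate >= damping * min_support:
            kept.add(c)
    return kept


def count_support(candidates, transactions):
    """One database scan: exact support of every surviving candidate."""
    n = len(transactions)
    return {c: sum(1 for t in transactions if c <= t) / n for c in candidates}


def awp_sketch(transactions, min_support, damping=0.5):
    """Join, prune, prejudge, then count. Returns {itemset: support}."""
    transactions = [frozenset(t) for t in transactions]
    item_support = frequent_items(transactions, min_support)
    frequent = dict(item_support)
    current = item_support
    k = 2
    while current:
        candidates = join_and_prune(current, k)
        candidates = prejudge(candidates, item_support, min_support, damping)
        supports = count_support(candidates, transactions)
        current = {c: s for c, s in supports.items() if s >= min_support}
        frequent.update(current)
        k += 1
    return frequent


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, support in awp_sketch(data, min_support=0.5).items():
        print(sorted(itemset), support)
```

In this sketch a smaller `damping` value filters fewer candidates, so fewer truly frequent itemsets are lost at the cost of more counting work. Because the filter runs before counting, it can discard itemsets that are actually frequent; per the abstract, correcting that kind of error is the role of the damping and compensating factors, whose exact definitions are not given here.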