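Deng Xiaoheng (邓晓衡)

Professor, Doctoral Supervisor, Master's Supervisor

Date of employment: 2006-01-05

Affiliation: School of Electronic Information

Position: Dean

Education: Doctoral graduate

Gender: Male

Contact: Email: dxh@csu.edu.cn

Degree: Doctorate

Employment status: Active

Main appointments: Vice Dean, School of Computer Science; Director, Hunan Provincial Engineering Center for Data Sensing and Switching Equipment; Chair, IEEE RS Chapter Changsha; Member, CCF Pervasive Computing Technical Committee; Executive Committee Member, CCF Changsha

Alma mater: Central South University

Disciplines: Computer Science and Technology
Information and Communication Engineering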


J. Zheng, X. Deng, H. Zhang. A novel method to generate frequent itemsets in distributed environment[C]//2018 IEEE 37th International Performance Computing and Communications Conference (IPCCC). IEEE, 2018: 1-8.

Posted: 2024-03-13


Published in: 2018 IEEE 37th International Performance Computing and Communications Conference (IPCCC)

Abstract: Frequent itemset mining (FIM) is an important topic in data mining that extracts knowledge about the relationships among items in a transaction dataset. The Apriori algorithm and its variants (Apriori-like algorithms) are widely used for FIM. In a big data environment, however, these algorithms are inefficient. Because they iteratively compute and modify intermediate results, applying an Apriori-like algorithm to a high-dimensional or large-scale dataset demands more memory than a single machine can provide. Parallel and distributed programming can address big data problems, but Apriori-like algorithms are poorly suited to parallel computing because they incur extra communication overhead to update intermediate results iteratively in cluster memory. To solve this problem, we propose a novel FIM algorithm, Distributed Apriori Based on Itemset-Encoding (DABIE). Compared with existing methods, DABIE has two main advantages. First, it stores intermediate results encoded as 0s and 1s to reduce memory usage. Second, it generates frequent itemsets through logical operations on the encodings, which reduces modification of data in cluster memory. These two advantages make DABIE better suited to cluster computing. We apply DABIE to datasets of different scales. Compared with other distributed Apriori-like algorithms, our experimental results show that DABIE efficiently improves multi-iteration FIM in a big data environment.
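The two ideas named in the abstract, storing intermediate results as 0/1 encodings and deriving larger itemsets by logical operations on those encodings, can be illustrated with a small single-machine sketch. This is not the DABIE algorithm itself, whose encoding scheme and distribution strategy are not given in the abstract; it is a generic bitmap-based Apriori, and the function names `encode_items` and `frequent_itemsets` are hypothetical.

```python
from itertools import combinations


def support(bits):
    """Count the 1 bits, i.e. the number of transactions covered."""
    return bin(bits).count("1")


def encode_items(transactions):
    """Map each item to an int whose bit t is 1 iff transaction t contains it."""
    bitmaps = {}
    for t, transaction in enumerate(transactions):
        for item in transaction:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << t)
    return bitmaps


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support.

    Assumption: everything fits in one process; a distributed version would
    partition the bitmaps across workers, which is not shown here.
    """
    # Level 1: keep only the single items that are frequent.
    current = {
        frozenset([item]): bits
        for item, bits in encode_items(transactions).items()
        if support(bits) >= min_support
    }
    result = {}
    while current:
        result.update({itemset: support(bits) for itemset, bits in current.items()})
        next_level = {}
        # Join pairs of frequent k-itemsets that differ in exactly one item.
        for (a, bits_a), (b, bits_b) in combinations(current.items(), 2):
            candidate = a | b
            if len(candidate) != len(a) + 1 or candidate in next_level:
                continue
            # A bitwise AND yields the transactions containing every item of
            # the candidate, so no rescan of the dataset is needed.
            bits = bits_a & bits_b
            if support(bits) >= min_support:
                next_level[candidate] = bits
        current = next_level
    return result
```

Each level here only reads the previous level's bitmaps and writes new ones, which is the property the abstract credits with reducing in-place modification of shared intermediate data.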

Note: http://faculty.csu.edu.cn/dengxiaoheng/zh_CN/lwcg/10445/content/49191.htm


Attachment:

  • 57-A_Novel_Method_to_Generate_Frequent_Itemsets_in_Distributed_Environment.pdf
