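Information & Communications, 2017, No. 9 (Sum. No. 177)

Research on Parallelization of the Apriori Algorithm Based on Hadoop and Medical Big Data

Fu Xiaoni
(North China University of Water Resources and Electric Power, Zhengzhou, Henan 450046)

Abstract (Chinese original): The rapid development of modern information technology has changed human life in every field, and it will undoubtedly bring a new revolution to medicine. Traditional methods of mining and analyzing medical data can no longer keep up with the growing volume of medical data, so a new medical big data mining technique is urgently needed. Based on the characteristics of medical big data, this paper addresses two problems of the traditional Apriori association rule algorithm: scanning the database takes too long, and too many candidate itemsets are generated. Using the open source framework Hadoop, it improves the traditional Apriori algorithm and proposes a MapReduce parallel implementation of a matrix-based Apriori algorithm for medical big data.

Keywords: medical big data; Apriori; Hadoop; matrix; MapReduce

CLC number: Document code: A Article ID: 1673-1131(2017)09-0030-02

Abstract: The rapid development of modern information technology has changed human life in various fields, and it will no doubt bring a new revolution to the medical field. Traditional medical data mining and analysis methods can no longer keep up with the growing volume of medical data, so a new medical big data mining technology is urgently needed to change this situation. This paper considers the characteristics of medical big data and addresses two problems of the traditional Apriori algorithm: scanning the database takes too long, and too many candidate itemsets are generated. Using the open source framewo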