Abstract: Real-world applications and engineering practice generate huge volumes of streaming data. Unlike traditional static data, data streams have special characteristics that make efficient processing and mining very challenging. A fundamental problem in data stream mining is how to mine frequent patterns quickly and approximately using limited storage: the techniques employed must be efficient in space usage and execution time while still producing high-quality results. Because of its research value and its growing importance in numerous applications, this problem has received considerable attention from researchers in China and abroad in the past few years. This paper reviews recent work on frequent pattern mining in data stream environments and gives a fairly comprehensive summary of its characteristics and algorithms. Using a taxonomy, we discuss the existing algorithms in two groups: those with probabilistic bounds on error and those with deterministic bounds on error. Comparisons and evaluations are made throughout the review. Finally, the main open problems and future directions in data stream mining research are discussed.
Key words: data mining / data streams / frequent pattern / approximate algorithm