An Evolving Fuzzy Inference Algorithm With Multi-dimensional Temporal Association Rules
Abstract: The purpose of mining temporal association rules is to discover interesting relationships between itemsets that carry temporal information. Because databases are updated dynamically, the mining of temporal association rules should adapt to these updates. However, most existing algorithms must re-mine the updated database, which wastes a great deal of time, and they cannot use the existing rules to quantitatively predict the trend of a given item. This paper proposes an evolving fuzzy inference model based on multidimensional temporal association rules (EFI-MTAR). Its main advantage is a fuzzy inference modeling algorithm based on multidimensional temporal association rules (FI-MTAR), which predicts time series quantitatively. In addition, to reduce the cost of updating rules and to speed up prediction, a concept drift detection strategy is proposed for processing time series data so that the model adapts to dynamic database updates. Experimental results show the effectiveness and accuracy of the proposed algorithm.
Key words:
- Multi-dimensional temporal association rules
- fuzzy inference
- evolving
- concept drift
1) Associate editor responsible for this paper: Zhang Min-Ling
Table 1 Comparison schemes
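Table 2 Segment patterns of the time series for the discretized items

| Time series variable | Discretized item | Mean slope | Angle corresponding to slope | Mean amplitude change | Mean normalized amplitude change |
|---|---|---|---|---|---|
| PT08.S2(NMHC) | 21 | 74.25 | 89.25 | 144.2 | 0.48 |
| NOx(GT) | 31 | -31.6 | -87.1 | -92.4 | -0.16 |
| PT08.S5(O3) | 61 | 80.8 | 89.3 | 911.1 | 0.45 |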
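Table 3 Semantic description of the segment patterns

| Angle range corresponding to slope (degrees) | Semantic description | Normalized amplitude change range | Semantic description |
|---|---|---|---|
| $[-90, -60]$ | Sharp decrease | $[-1, -0.6]$ | Large decrease |
| $[-60, -30]$ | Rapid decrease | $[-0.6, -0.3]$ | Moderate decrease |
| $[-30, 0]$ | Steady decrease | $[-0.3, 0]$ | Small decrease |
| $[0, 30]$ | Steady increase | $[0, 0.3]$ | Small increase |
| $[30, 60]$ | Rapid increase | $[0.3, 0.6]$ | Moderate increase |
| $[60, 90]$ | Sharp increase | $[0.6, 1]$ | Large increase |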
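The mapping in Table 3 from a segment's slope angle and normalized amplitude change to a semantic label can be sketched in a few lines of Python. This is only an illustration, not the paper's implementation: the function names are hypothetical, the angle is assumed to be the arctangent of the slope expressed in degrees, and because adjacent ranges in Table 3 share their endpoints, the sketch assumes half-open intervals so that a boundary value falls into the upper range.

```python
import math

# Ranges follow Table 3; the labels are English translations of the table entries.
ANGLE_LABELS = [
    (-90.0, -60.0, "sharp decrease"),
    (-60.0, -30.0, "rapid decrease"),
    (-30.0, 0.0, "steady decrease"),
    (0.0, 30.0, "steady increase"),
    (30.0, 60.0, "rapid increase"),
    (60.0, 90.0, "sharp increase"),
]
AMPLITUDE_LABELS = [
    (-1.0, -0.6, "large decrease"),
    (-0.6, -0.3, "moderate decrease"),
    (-0.3, 0.0, "small decrease"),
    (0.0, 0.3, "small increase"),
    (0.3, 0.6, "moderate increase"),
    (0.6, 1.0, "large increase"),
]


def slope_to_angle(slope):
    """Assumption: the angle is arctan(slope), converted to degrees."""
    return math.degrees(math.atan(slope))


def lookup_label(value, ranges):
    """Return the label whose range contains value.

    Assumption: intervals are treated as [low, high), except the last
    one, which also includes its upper endpoint.
    """
    for low, high, label in ranges:
        if low <= value < high:
            return label
    low, high, label = ranges[-1]
    if value == high:
        return label
    raise ValueError(f"value {value} is outside the tabulated ranges")


def describe_segment(angle_deg, normalized_change):
    """Pair the slope label with the amplitude label for one segment."""
    return (
        lookup_label(angle_deg, ANGLE_LABELS),
        lookup_label(normalized_change, AMPLITUDE_LABELS),
    )


if __name__ == "__main__":
    # Angle and normalized change taken from the PT08.S2(NMHC) row of Table 2.
    print(describe_segment(89.25, 0.48))
```

Under these assumptions, the PT08.S2(NMHC) pattern in Table 2 reads as a sharp increase in slope with a moderate increase in amplitude, and the NOx(GT) pattern as a sharp decrease with a small decrease in amplitude.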
Table 4 RMSE of the upper and lower bounds of the final prediction output

| $dx_{c3{j_3}}^L$ | $dx_{c3{j_3}}^U$ |
|---|---|
| 0.1656 | 0.1803 |
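Table 4 reports a root-mean-square error (RMSE) separately for the lower and upper bounds of the predicted output. A minimal sketch of that computation follows, using the standard RMSE definition. The function names are hypothetical, and the sketch assumes each bound is compared against its own reference series; the paper's exact evaluation procedure is not given here.

```python
import math


def rmse(predicted, actual):
    """Standard root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def interval_rmse(pred_lower, pred_upper, true_lower, true_upper):
    """Return (RMSE of lower bounds, RMSE of upper bounds).

    Assumption: each predicted bound is scored against its own reference
    series, which yields one value per bound as in Table 4.
    """
    return rmse(pred_lower, true_lower), rmse(pred_upper, true_upper)
```

Scoring the two bounds separately keeps an error in one bound from being hidden by a good fit in the other.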
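Table 5 Effect of evolving updates for different sliding window sizes

| Metric | Evolution number | Window 5 % | Window 10 % | Window 15 % | Window 20 % | Proposed method (7 %) |
|---|---|---|---|---|---|---|
| Rules | 0 | 16 | 17 | 19 | 19 | 17 |
| | 1 | 20 | 19 | 22 | 24 | 18 |
| | 2 | 20 | 21 | 24 | 26 | 20 |
| | 3 | 21 | 21 | 25 | 29 | 21 |
| | 4 | 24 | 27 | 29 | 33 | 21 |
| | 5 | 25 | 24 | 30 | 34 | 24 |
| | 6 | 25 | 29 | 31 | 33 | 23 |
| CA | 0 | 0.909 | 0.899 | 0.876 | 0.866 | 0.911 |
| | 1 | 0.879 | 0.904 | 0.851 | 0.851 | 0.906 |
| | 2 | 0.886 | 0.909 | 0.846 | 0.839 | 0.916 |
| | 3 | 0.891 | 0.896 | 0.849 | 0.832 | 0.919 |
| | 4 | 0.876 | 0.887 | 0.851 | 0.842 | 0.924 |
| | 5 | 0.869 | 0.879 | 0.854 | 0.847 | 0.896 |
| | 6 | 0.881 | 0.896 | 0.862 | 0.859 | 0.909 |
| RMSE | 0 | 0.144 | 0.152 | 0.187 | 0.197 | 0.137 |
| | 1 | 0.179 | 0.145 | 0.221 | 0.224 | 0.145 |
| | 2 | 0.168 | 0.139 | 0.234 | 0.264 | 0.121 |
| | 3 | 0.159 | 0.157 | 0.227 | 0.279 | 0.112 |
| | 4 | 0.187 | 0.165 | 0.224 | 0.241 | 0.104 |
| | 5 | 0.194 | 0.183 | 0.209 | 0.236 | 0.154 |
| | 6 | 0.175 | 0.156 | 0.201 | 0.214 | 0.146 |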
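Table 6 Comparison of validity and accuracy on different data sets

| Data set | Evolution number | Scheme 1 Rules | Scheme 1 CA | Scheme 2 Rules | Scheme 2 CA |
|---|---|---|---|---|---|
| Air Quality | 0 | 35 | 0.902 | 17 | 0.911 |
| | 1 | 49 | 0.897 | 18 | 0.906 |
| | 2 | 54 | 0.879 | 20 | 0.916 |
| | 3 | 62 | 0.887 | 21 | 0.919 |
| | 4 | 66 | 0.874 | 21 | 0.924 |
| | 5 | 71 | 0.867 | 24 | 0.896 |
| | 6 | 78 | 0.874 | 23 | 0.909 |
| | 7 | 87 | 0.894 | 20 | 0.921 |
| | 8 | 94 | 0.886 | 23 | 0.914 |
| | 9 | 99 | 0.879 | 25 | 0.900 |
| | 10 | 101 | 0.875 | 27 | 0.894 |
| | 11 | 109 | 0.874 | 24 | 0.898 |
| | 12 | 112 | 0.895 | 24 | 0.927 |
| | 13 | 119 | 0.9004 | 26 | 0.932 |
| Istanbul | 0 | 32 | 0.806 | 19 | 0.855 |
| | 1 | 36 | 0.814 | 17 | 0.881 |
| | 2 | 39 | 0.743 | 21 | 0.805 |
| | 3 | 51 | 0.807 | 21 | 0.801 |
| | 4 | 57 | 0.764 | 21 | 0.805 |
| | 5 | 67 | 0.794 | 25 | 0.801 |
| | 6 | 74 | 0.7778 | 23 | 0.889 |
| Synthetic Control Chart | 0 | 65 | 0.904 | 44 | 0.946 |
| | 1 | 78 | 0.882 | 49 | 0.891 |
| | 2 | 86 | 0.856 | 49 | 0.888 |
| | 3 | 99 | 0.862 | 56 | 0.875 |
| | 4 | 108 | 0.843 | 48 | 0.851 |
| | 5 | 110 | 0.856 | 53 | 0.956 |
| | 6 | 114 | 0.854 | 59 | 0.896 |
| | 7 | 124 | 0.889 | 48 | 0.926 |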
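Table 7 Fitting error

| Data set | Evolution number | Scheme 1 | Scheme 2 | Scheme 3 |
|---|---|---|---|---|
| Air Quality | 0 | 0.189 | 0.137 | 0.168 |
| | 1 | 0.195 | 0.145 | 0.172 |
| | 2 | 0.176 | 0.121 | 0.156 |
| | 3 | 0.169 | 0.112 | 0.144 |
| | 4 | 0.162 | 0.104 | 0.136 |
| | 5 | 0.201 | 0.154 | 0.159 |
| | 6 | 0.194 | 0.146 | 0.174 |
| | 7 | 0.156 | 0.108 | 0.144 |
| | 8 | 0.186 | 0.133 | 0.176 |
| | 9 | 0.197 | 0.148 | 0.185 |
| | 10 | 0.211 | 0.158 | 0.195 |
| | 11 | 0.197 | 0.155 | 0.186 |
| | 12 | 0.154 | 0.099 | 0.129 |
| | 13 | 0.146 | 0.091 | 0.132 |
| Istanbul | 0 | 0.144 | 0.094 | 0.129 |
| | 1 | 0.184 | 0.139 | 0.165 |
| | 2 | 0.226 | 0.172 | 0.208 |
| | 3 | 0.198 | 0.149 | 0.184 |
| | 4 | 0.188 | 0.139 | 0.171 |
| | 5 | 0.209 | 0.168 | 0.196 |
| | 6 | 0.148 | 0.103 | 0.139 |
| Synthetic Control Chart | 0 | 0.132 | 0.072 | 0.095 |
| | 1 | 0.154 | 0.094 | 0.124 |
| | 2 | 0.169 | 0.113 | 0.146 |
| | 3 | 0.187 | 0.132 | 0.167 |
| | 4 | 0.206 | 0.149 | 0.182 |
| | 5 | 0.129 | 0.076 | 0.109 |
| | 6 | 0.138 | 0.092 | 0.116 |
| | 7 | 0.155 | 0.087 | 0.126 |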