
A Sequential Recommendation Method Enhanced by Peer Knowledge

HU Kai-Xi, LI Lin, WU Xiao-Hua, XIE Qing, YUAN Jing-Ling

Citation: Hu Kai-Xi, Li Lin, Wu Xiao-Hua, Xie Qing, Yuan Jing-Ling. A sequential recommendation method enhanced by peer knowledge. Acta Automatica Sinica, 2023, 49(7): 1456−1470. doi: 10.16383/j.aas.c220347

doi: 10.16383/j.aas.c220347


Funds: Supported by National Natural Science Foundation of China (62276196, 61602353), Key Research and Development Program of Hubei Province (2021BAA030), Foundation of China Scholarship Council (202106950041, LiuJinMei [2020] 1509), and Ankang Municipal Science and Technology Bureau (AK2020-GY-08)
More Information
    Author Bio:

    HU Kai-Xi Ph.D. candidate at the School of Computer Science and Artificial Intelligence, Wuhan University of Technology. He received his master's degree in control engineering from Chongqing University in 2018. His main research interest is sequential prediction.

    LI Lin Professor at the School of Computer Science and Artificial Intelligence, Wuhan University of Technology. She received her Ph.D. degree from the University of Tokyo, Japan, in 2009. Her research interest covers information retrieval and recommender systems. Corresponding author of this paper.

    WU Xiao-Hua Ph.D. candidate at the School of Computer Science and Artificial Intelligence, Wuhan University of Technology. He received his master's degree in computer science and technology from Northwest University in 2019. His main research interest is explainable machine learning.

    XIE Qing Associate professor at the School of Computer Science and Artificial Intelligence, Wuhan University of Technology. He received his Ph.D. degree from the University of Queensland, Australia, in 2013. His research interest covers streaming data mining and pattern analysis.

    YUAN Jing-Ling Professor at the School of Computer Science and Artificial Intelligence, Wuhan University of Technology. She received her Ph.D. degree from Wuhan University of Technology in 2004. Her main research interest is distributed parallel computing.

  • Abstract: Sequential recommendation (SR) aims to model the dynamic interests in user behavior sequences and to predict the next behavior. Existing knowledge distillation (KD) based multi-model ensemble methods usually take the probability distribution predicted by a teacher model as the soft label for a student model on each training sample, which makes it hard to attend to the dynamic interests in low-confidence sequence samples. To address this, a sequential recommendation method enhanced by peer knowledge (PeerRec) is proposed, in which multiple differentiated peer networks learn from each other in two stages that follow the human easy-to-hard cognitive process. On top of the knowledge distillation in the first stage, the deliberate practice in the second stage coordinates the peers through a dynamic minimum-group strategy to mine, from low-confidence samples, potential samples whose training can be strengthened. Each trained network then uses the probability distributions its peers predict for such a potential sample to adjust its own learning weight on that sample, thereby exploring better interest representations in the solution space. Experimental results on three public datasets show that, compared with recent baselines, the proposed PeerRec not only achieves better recommendation accuracy on Top-k based metrics but also offers good online recommendation efficiency.
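    To make the two training stages described in the abstract concrete, the following is a minimal sketch of the per-peer loss. It is our own illustration, not the authors' implementation: the function name peer_losses, the temperature tau, and the confidence threshold are assumptions, and the paper's dynamic minimum-group strategy is approximated by a simple per-sample confidence test.

```python
# Illustrative sketch only (not the authors' code). Assumes >= 2 peer networks
# that each output next-item logits of shape (batch, num_items).
import torch
import torch.nn.functional as F

def peer_losses(logits_list, targets, tau=2.0, conf_threshold=0.5):
    """Return one loss per peer: hard-label cross-entropy, stage-1 mutual
    distillation against the other peers, and a crude stage-2 re-weighting
    that strengthens training on samples the peers are unconfident about."""
    with torch.no_grad():
        probs = [F.softmax(l, dim=-1) for l in logits_list]
    losses = []
    for i, logits in enumerate(logits_list):
        ce = F.cross_entropy(logits, targets, reduction="none")

        # Stage 1: distill the softened distributions of the other peers.
        log_p = F.log_softmax(logits / tau, dim=-1)
        kd = torch.zeros_like(ce)
        for j, other in enumerate(logits_list):
            if j == i:
                continue
            soft = F.softmax(other.detach() / tau, dim=-1)
            kd = kd + F.kl_div(log_p, soft, reduction="none").sum(-1)
        kd = kd * (tau ** 2) / (len(logits_list) - 1)

        # Stage 2 (stand-in for the dynamic minimum-group strategy): when the
        # peers assign low probability to the true next item, up-weight the
        # sample so its low-confidence dynamic interest gets extra training.
        idx = torch.arange(len(targets))
        peer_conf = torch.stack(
            [p[idx, targets] for j, p in enumerate(probs) if j != i]
        ).mean(0)
        weight = torch.where(peer_conf < conf_threshold,
                             2.0 - peer_conf, torch.ones_like(peer_conf))
        losses.append((weight * ce).mean() + kd.mean())
    return losses
```

    Each returned loss would be backpropagated through its own peer network only; there is no fixed teacher, matching the mutual-learning setup the abstract describes.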
    1) A metric, based on optimal transport theory, for measuring the distance between two distributions; closed-form solutions currently exist only for a few cases such as one-dimensional and Gaussian distributions.
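    The footnote's description matches the Wasserstein distance from optimal transport. As an example of the closed-form cases it mentions, the squared 2-Wasserstein distance between two Gaussians is

$$ W_2^2\big(\mathcal{N}(\mu_1,\Sigma_1),\ \mathcal{N}(\mu_2,\Sigma_2)\big) = \lVert \mu_1-\mu_2 \rVert_2^2 + \operatorname{tr}\Big(\Sigma_1+\Sigma_2-2\big(\Sigma_2^{1/2}\,\Sigma_1\,\Sigma_2^{1/2}\big)^{1/2}\Big), $$

    and for one-dimensional distributions with CDFs $F$ and $G$, $W_p^p(F,G)=\int_0^1 \lvert F^{-1}(u)-G^{-1}(u)\rvert^p\,\mathrm{d}u$.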
  • Fig. 1 The representation and inference of dynamic interests in latent representation spaces

    Fig. 2 The architecture of the proposed PeerRec

    Fig. 3 An illustration of deliberate practice based mutual learning

    Fig. 4 The comparison between the variants of PeerRec in terms of HR@1

    Fig. 5 The evaluation on different numbers of peer networks

    Fig. 6 The running speed of different models with batch size 256

    Fig. 7 Sensitivity analysis of hyper-parameters

    Table 1 Statistics of the datasets

                                       | ML-1m  | LastFM | Toys
    Number of users                    | 6040   | 1090   | 19412
    Number of behavior types           | 3416   | 3646   | 11924
    Behaviors in the longest sequence  | 2275   | 897    | 548
    Behaviors in the shortest sequence | 16     | 3      | 3
    Average behaviors per sequence     | 163.50 | 46.21  | 6.63
    Variance of behaviors per sequence | 192.53 | 77.69  | 8.50

    Table 2 The comparison with baselines in terms of accuracy-based metrics

    Dataset | Model            | HR@1   | HR@5   | HR@10  | NDCG@5 | NDCG@10 | MRR
    ML-1m   | POP              | 0.0407 | 0.1603 | 0.2775 | 0.1008 | 0.1383  | 0.1233
    ML-1m   | BERT4Rec [17]    | 0.3695 | 0.6851 | 0.7823 | 0.5375 | 0.5690  | 0.5108
    ML-1m   | S3-Rec [1]       | 0.2897 | 0.6575 | 0.7911 | 0.4557 | 0.5266  | 0.4535
    ML-1m   | HyperRec [9]     | 0.3180 | 0.6631 | 0.7738 | 0.5014 | 0.5375  | 0.4731
    ML-1m   | R-CE [11]        | 0.3988 | 0.6478 | 0.7404 | 0.5327 | 0.5627  | 0.5179
    ML-1m   | STOSA [14]       | 0.3222 | 0.6546 | 0.7844 | 0.4967 | 0.5389  | 0.4716
    ML-1m   | PeerRec (peer 1) | 0.4250 | 0.7197 | 0.8141 | 0.5843 | 0.6150  | 0.5600
    ML-1m   | PeerRec (peer 2) | 0.4252 | 0.7225 | 0.8141 | 0.5860 | 0.6157  | 0.5610
    LastFM  | POP              | 0.0202 | 0.0908 | 0.1780 | 0.0544 | 0.0825  | 0.0771
    LastFM  | BERT4Rec [17]    | 0.1091 | 0.3294 | 0.4614 | 0.2227 | 0.2648  | 0.2266
    LastFM  | S3-Rec [1]       | 0.1156 | 0.2844 | 0.4229 | 0.2003 | 0.2452  | 0.2148
    LastFM  | HyperRec [9]     | 0.1146 | 0.3147 | 0.4688 | 0.2150 | 0.2646  | 0.2241
    LastFM  | R-CE [11]        | 0.0651 | 0.1835 | 0.2862 | 0.1243 | 0.1570  | 0.1397
    LastFM  | STOSA [14]       | 0.0752 | 0.2165 | 0.3412 | 0.1458 | 0.1860  | 0.1556
    LastFM  | PeerRec (peer 1) | 0.1294 | 0.3495 | 0.4789 | 0.2339 | 0.2755  | 0.2341
    LastFM  | PeerRec (peer 2) | 0.1248 | 0.3358 | 0.4835 | 0.2318 | 0.2796  | 0.2378
    Toys    | POP              | 0.0260 | 0.1046 | 0.1848 | 0.0652 | 0.0909  | 0.0861
    Toys    | BERT4Rec [17]    | 0.1390 | 0.3379 | 0.4596 | 0.2409 | 0.2802  | 0.2444
    Toys    | S3-Rec [1]       | 0.0990 | 0.3023 | 0.4393 | 0.2021 | 0.2463  | 0.2081
    Toys    | HyperRec [9]     | 0.1147 | 0.2875 | 0.3909 | 0.2031 | 0.2365  | 0.2087
    Toys    | R-CE [11]        | 0.1130 | 0.3189 | 0.4529 | 0.2179 | 0.2611  | 0.2233
    Toys    | STOSA [14]       | 0.1838 | 0.3587 | 0.4550 | 0.2749 | 0.3059  | 0.2732
    Toys    | PeerRec (peer 1) | 0.1794 | 0.3703 | 0.4785 | 0.2785 | 0.3134  | 0.2810
    Toys    | PeerRec (peer 2) | 0.1782 | 0.3706 | 0.4778 | 0.2781 | 0.3127  | 0.2803
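    For reference, the Top-k metrics reported above have standard definitions for next-item evaluation with one ground-truth item per sequence. A generic sketch of their computation (ours, not the paper's evaluation script):

```python
import numpy as np

def topk_metrics(scores, targets, k=10):
    """HR@k, NDCG@k and MRR when each test sequence has one true next item.
    scores: (n, num_items) model scores; targets: (n,) true item ids."""
    idx = np.arange(len(targets))
    # Rank of the true item among all candidates (1 = ranked first).
    ranks = (scores > scores[idx, targets][:, None]).sum(axis=1) + 1
    hit = ranks <= k
    hr = hit.mean()                                              # HR@k
    ndcg = np.where(hit, 1.0 / np.log2(ranks + 1), 0.0).mean()   # NDCG@k
    mrr = (1.0 / ranks).mean()                                   # MRR
    return hr, ndcg, mrr
```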

    Table 3 The comparison between knowledge distillation and deliberate practice

    Method                        | Dataset | HR@1   | NDCG@5 | MRR
    Knowledge distillation [39]   | ML-1m   | 0.3952 | 0.5656 | 0.5386
    Knowledge distillation [39]   | LastFM  | 0.1119 | 0.2301 | 0.2314
    Knowledge distillation [39]   | Toys    | 0.1693 | 0.2761 | 0.2767
    Deliberate practice (PeerRec) | ML-1m   | 0.4251 | 0.5852 | 0.5605
    Deliberate practice (PeerRec) | LastFM  | 0.1271 | 0.2329 | 0.2360
    Deliberate practice (PeerRec) | Toys    | 0.1788 | 0.2783 | 0.2807
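    For context, the knowledge-distillation rows follow deep mutual learning [39], where each of the $K$ peer networks $\Theta_k$ minimizes its own cross-entropy plus the averaged KL divergence from the other peers' predicted distributions:

$$ \mathcal{L}_{\Theta_k} = \mathcal{L}_{CE}(\boldsymbol{p}_k, y) + \frac{1}{K-1}\sum_{l=1,\,l\neq k}^{K} \mathrm{KL}(\boldsymbol{p}_l \,\Vert\, \boldsymbol{p}_k). $$

    Deliberate practice keeps this mutual-learning backbone but, per the abstract, re-weights low-confidence samples using the peers' predictions, which accounts for the consistent gains in Table 3.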

    Table 4 The performance comparison between different initializations of PeerRec

    Dataset | Initialization | HR@1   | NDCG@5 | MRR
    ML-1m   | TND            | 0.4251 | 0.5852 | 0.5605
    ML-1m   | Xavier         | 0.4263 | 0.5852 | 0.5600
    ML-1m   | Kaiming        | 0.4278 | 0.5911 | 0.5652
    LastFM  | TND            | 0.1271 | 0.2329 | 0.2360
    LastFM  | Xavier         | 0.1294 | 0.2397 | 0.2424
    LastFM  | Kaiming        | 0.1247 | 0.2257 | 0.2342
    Toys    | TND            | 0.1788 | 0.2783 | 0.2807
    Toys    | Xavier         | 0.1775 | 0.2794 | 0.2811
    Toys    | Kaiming        | 0.1806 | 0.2776 | 0.2804
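    The three schemes in Table 4 correspond to standard initializers. A minimal PyTorch sketch, assuming TND stands for a truncated normal distribution (the std of 0.02 below is our assumption, a common choice in Transformer-style recommenders), with Xavier [44] and Kaiming [45] using library defaults:

```python
import torch.nn as nn

def init_weights(module, scheme="TND"):
    """Apply one of the compared initializations to Linear/Embedding weights
    (illustrative sketch, not the paper's code)."""
    if isinstance(module, (nn.Linear, nn.Embedding)):
        if scheme == "TND":
            nn.init.trunc_normal_(module.weight, std=0.02)
        elif scheme == "Xavier":
            nn.init.xavier_uniform_(module.weight)
        elif scheme == "Kaiming":
            nn.init.kaiming_uniform_(module.weight, nonlinearity="relu")
    if isinstance(module, nn.Linear) and module.bias is not None:
        nn.init.zeros_(module.bias)

# Usage: model.apply(lambda m: init_weights(m, scheme="Kaiming"))
```

    The small spread across rows in Table 4 suggests PeerRec is fairly insensitive to this choice.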
  • [1] Zhou K, Wang H, Zhao W X, Zhu Y T, Wang S R, Zhang F Z, et al. S3-Rec: Self-supervised learning for sequential recommendation with mutual information maximization. In: Proceedings of the 29th ACM International Conference on Information & Knowledge Management. New York, USA: ACM, 2020. 1893−1902
    [2] Rao Zi-Yun, Zhang Yi, Liu Jun-Tao, Cao Wan-Hua. Recommendation methods and systems using knowledge graph. Acta Automatica Sinica, 2021, 47(9): 2061-2077
    [3] Li X C, Liang J, Liu X L, Zhang Y. Adversarial filtering modeling on long-term user behavior sequences for click-through rate prediction. In: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval. Madrid, Spain: ACM, 2022. 1969−1973
    [4] Tang Wen-Bing, Ren Zheng-Yun, Han Fang. Attention-based collaborative convolutional dynamic network for recommendation. Acta Automatica Sinica, 2021, 47(10): 2438-2448
    [5] Guo Lei, Li Qiu-Ju, Liu Fang-Ai, Wang Xin-Hua. Shared-account cross-domain sequential recommendation with self-attention network. Journal of Computer Research and Development, 2021, 58(11): 2524-2537
    [6] Rao X, Chen L S, Liu Y, Shang S, Yao B, Han P. Graph-flashback network for next location recommendation. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. Washington, USA: ACM, 2022. 1463−1471
    [7] Meng Xiang-Wu, Liang Bi, Du Yu-Lu, Zhang Yu-Jie. A survey of evaluation for location-based mobile recommender systems. Chinese Journal of Computers, 2019, 42(12): 2695-2721
    [8] Hu K X, Li L, Liu J Q, Sun D. DuroNet: A dual-robust enhanced spatial-temporal learning network for urban crime prediction. ACM Transactions on Internet Technology, 2021, 21(1): Article No. 24
    [9] Wang J L, Ding K Z, Hong L J, Liu H, Caverlee J. Next-item recommendation with sequential hypergraphs. In: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. New York, USA: ACM, 2020. 1101−1110
    [10] Chen Cong, Zhang Wei, Wang Jun. Session-based sequential recommendation with auxiliary time prediction. Chinese Journal of Computers, 2021, 44(9): 1841-1853
    [11] Wang W J, Feng F L, He X N, Nie L Q, Chua T S. Denoising implicit feedback for recommendation. In: Proceedings of the 14th ACM International Conference on Web Search and Data Mining. New York, USA: ACM, 2021. 373−381
    [12] Neupane K P, Zheng E, Yu Q. MetaEDL: Meta evidential learning for uncertainty-aware cold-start recommendations. In: Proceedings of the IEEE International Conference on Data Mining (ICDM). Auckland, New Zealand: IEEE, 2021. 1258−1263
    [13] Fan Z W, Liu Z W, Wang S, Zheng L, Yu P S. Modeling sequences as distributions with uncertainty for sequential recommendation. In: Proceedings of the 30th ACM International Conference on Information and Knowledge Management. Queensland, Australia: ACM, 2021. 3019−3023
    [14] Fan Z W, Liu Z W, Wang Y, Wang A, Nazari Z, Zheng L, et al. Sequential recommendation via stochastic self-attention. In: Proceedings of the ACM Web Conference. Lyon, France: ACM, 2022. 2036−2047
    [15] Jiang J Y, Yang D Q, Xiao Y H, Shen C L. Convolutional Gaussian embeddings for personalized recommendation with uncertainty. In: Proceedings of the 28th International Joint Conference on Artificial Intelligence. Macao, China: AAAI Press, 2019. 2642−2648
    [16] Zhou X L, Liu H, Pourpanah F, Zeng T Y, Wang X Z. A survey on epistemic (model) uncertainty in supervised learning: Recent advances and applications. Neurocomputing, 2022, 489: 449-465 doi: 10.1016/j.neucom.2021.10.119
    [17] Sun F, Liu J, Wu J, Pei C H, Lin X, Ou W W, et al. BERT4Rec: Sequential recommendation with bidirectional encoder representations from transformer. In: Proceedings of the 28th ACM International Conference on Information and Knowledge Management. Beijing, China: ACM, 2019. 1441−1450
    [18] Ovadia Y, Fertig E, Ren J, Nado Z, Sculley D, Nowozin S, et al. Can you trust your model's uncertainty? Evaluating predictive uncertainty under dataset shift. In: Proceedings of the 33rd International Conference on Neural Information Processing Systems. Vancouver, Canada: Curran Associates Inc., 2019. Article No. 1254
    [19] Fort S, Hu H Y, Lakshminarayanan B. Deep ensembles: A loss landscape perspective. arXiv preprint arXiv: 1912.02757, 2019.
    [20] Lakshminarayanan B, Pritzel A, Blundell C. Simple and scalable predictive uncertainty estimation using deep ensembles. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach, USA: Curran Associates Inc., 2017. 6405−6416
    [21] Renda A, Barsacchi M, Bechini A, Marcelloni F. Comparing ensemble strategies for deep learning: An application to facial expression recognition. Expert Systems With Applications, 2019, 136: 1-11 doi: 10.1016/j.eswa.2019.06.025
    [22] Deng D D, Wu L, Shi B E. Iterative distillation for better uncertainty estimates in multitask emotion recognition. In: Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (ICCVW). Montreal, Canada: IEEE, 2021. 3550−3559
    [23] Reich S, Mueller D, Andrews N. Ensemble distillation for structured prediction: Calibrated, accurate, fast - choose three. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP). Stroudsburg, USA: ACL, 2020. 5583−5595
    [24] Jiao X Q, Yin Y C, Shang L F, Jiang X, Chen X, Li L L, et al. TinyBERT: Distilling BERT for natural language understanding. In: Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020. Stroudsburg, USA: ACL, 2020. 4163−4174
    [25] Zhu J M, Liu J Y, Li W Q, Lai J C, He X Q, Chen L, et al. Ensembled CTR prediction via knowledge distillation. In: Proceedings of the 29th ACM International Conference on Information and Knowledge Management. New York, USA: ACM, 2020. 2941−2958
    [26] Kang S K, Hwang J, Kweon W, Yu H. DE-RRD: A knowledge distillation framework for recommender system. In: Proceedings of the 29th ACM International Conference on Information and Knowledge Management. New York, USA: ACM, 2020. 605−614
    [27] Hinton G, Vinyals O, Dean J. Distilling the knowledge in a neural network. In: Proceedings of the 28th Conference on Neural Information Processing Systems. Montreal, Canada: Curran Associates, 2014. 1−9
    [28] Shen Z Q, Liu Z C, Xu D J, Chen Z T, Cheng K T, Savvides M. Is label smoothing truly incompatible with knowledge distillation: An empirical study. In: Proceedings of the 9th International Conference on Learning Representations. Virtual Event, Austria: ICLR, 2020. 1−17
    [29] Furlanello T, Lipton Z C, Tschannen M, Itti L, Anandkumar A. Born-again neural networks. In: Proceedings of the 35th International Conference on Machine Learning. Stockholm, Sweden: PMLR, 2018. 1602−1611
    [30] Romero A, Ballas N, Kahou S E, Chassang A, Gatta C, Bengio Y. FitNets: Hints for thin deep nets. In: Proceedings of the 3rd International Conference on Learning Representations. San Diego, USA: ICLR, 2015. 1−13
    [31] Lin T Y, Goyal P, Girshick R, He K M, Dollár P. Focal loss for dense object detection. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV). Venice, Italy: IEEE, 2017. 2999−3007
    [32] Bengio Y, Louradour J, Collobert R, Weston J. Curriculum learning. In: Proceedings of the 26th Annual International Conference on Machine Learning. Montreal, Canada: ACM, 2009. 41−48
    [33] Ericsson K A. Deliberate practice and acquisition of expert performance: A general overview. Academic Emergency Medicine, 2008, 15(11): 988-994 doi: 10.1111/j.1553-2712.2008.00227.x
    [34] Song W P, Shi C C, Xiao Z P, Duan Z J, Xu Y W, Zhang M, et al. AutoInt: Automatic feature interaction learning via self-attentive neural networks. In: Proceedings of the 28th ACM International Conference on Information and Knowledge Management. Beijing, China: ACM, 2019. 1161−1170
    [35] Qin Y Q, Wang P F, Li C L. The world is binary: Contrastive learning for denoising next basket recommendation. In: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval. New York, USA: ACM, 2021. 859−868
    [36] Huang Zhen-Hua, Yang Shun-Zhi, Lin Wei, Ni Juan, Sun Sheng-Li, Chen Yun-Wen, et al. Knowledge distillation: A survey. Chinese Journal of Computers, 2022, 45(3): 624-653
    [37] Pan Rui-Dong, Kong Wei-Jian, Qi Jie. Legal judgment prediction based on pre-training model and knowledge distillation. Control and Decision, 2022, 37(1): 67-76
    [38] Zhao B R, Cui Q, Song R J, Qiu Y Y, Liang J J. Decoupled knowledge distillation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, USA: IEEE, 2022. 11943−11952
    [39] Zhang Y, Xiang T, Hospedales T M, Lu H C. Deep mutual learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 4320−4328
    [40] Zhao H J, Yang G, Wang D, Lu H C. Deep mutual learning for visual object tracking. Pattern Recognition, 2021, 112: Article No. 107796 doi: 10.1016/j.patcog.2020.107796
    [41] Chen D F, Mei J P, Wang C, Feng Y, Chen C. Online knowledge distillation with diverse peers. In: Proceedings of the 34th AAAI Conference on Artificial Intelligence. New York, USA: AAAI, 2020. 3430−3437
    [42] Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez A N, et al. Attention is all you need. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach, USA: Curran Associates Inc., 2017. 6000−6010
    [43] Toneva M, Sordoni A, des Combes R T, Trischler A, Bengio Y, Gordon G J. An empirical study of example forgetting during deep neural network learning. In: Proceedings of the 7th International Conference on Learning Representations. New Orleans, USA: ICLR, 2019. 1−19
    [44] Glorot X, Bengio Y. Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the 13th International Conference on Artificial Intelligence and Statistics. Sardinia, Italy: JMLR, 2010. 249−256
    [45] He K M, Zhang X Y, Ren S Q, Sun J. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV). Santiago, Chile: IEEE, 2015. 1026−1034
Publication history
  • Received: 2022-04-28
  • Accepted: 2022-09-13
  • Available online: 2022-10-26
  • Issue date: 2023-07-20

目录

    /

    返回文章
    返回