• 中文核心
  • EI
  • 中国科技核心
  • Scopus
  • CSCD
  • 英国科学文摘

留言板

尊敬的读者、作者、审稿人, 关于本刊的投稿、审稿、编辑和出版的任何问题, 您可以本页添加留言。我们将尽快给您答复。谢谢您的支持!

姓名
邮箱
手机号码
标题
留言内容
验证码

社会心理学启发的多模态人格评分预测方法研究

李琳 周阳 王聪慧 汪志浩 田浩

李琳, 周阳, 王聪慧, 汪志浩, 田浩. 社会心理学启发的多模态人格评分预测方法研究. 自动化学报, xxxx, xx(x): x−xx doi: 10.16383/j.aas.c250374
引用本文: 李琳, 周阳, 王聪慧, 汪志浩, 田浩. 社会心理学启发的多模态人格评分预测方法研究. 自动化学报, xxxx, xx(x): x−xx doi: 10.16383/j.aas.c250374
Li Lin, Zhou Yang, Wang Cong-Hui, Wang Zhi-Hao, Tian Hao. Multimodal personality rating prediction method inspired by social psychology. Acta Automatica Sinica, xxxx, xx(x): x−xx doi: 10.16383/j.aas.c250374
Citation: Li Lin, Zhou Yang, Wang Cong-Hui, Wang Zhi-Hao, Tian Hao. Multimodal personality rating prediction method inspired by social psychology. Acta Automatica Sinica, xxxx, xx(x): x−xx doi: 10.16383/j.aas.c250374

社会心理学启发的多模态人格评分预测方法研究

doi: 10.16383/j.aas.c250374 cstr: 32138.14.j.aas.c250374
基金项目: 国家自然科学基金(62276196)资助
详细信息
    作者简介:

    李琳:武汉理工大学教授. 主要研究方向为信息检索与推荐系统, 数据挖掘与模式识别和多模态机器学习. E-mail: cathylilin@whut.edu.cn

    周阳:武汉理工大学硕士研究生. 主要研究方向为自然语言处理和情感计算. E-mail: ychow@whut.edu.cn

    王聪慧:武汉理工大学硕士研究生. 主要研究方向为自然语言处理和多模态机器学习. E-mail: wch6606@csepdi.com

    汪志浩:武汉理工大学硕士研究生. 主要研究方向为多模态机器学习和信息检索与推荐系统. E-mail: gm2wzh@gmail.com

    田浩:湖北经济学院教授. 主要研究方向为金融风险, 服务发现与推荐和机器学习. 本文通信作者. E-mail: th@hbue.edu.cn

Multimodal Personality Rating Prediction Method Inspired by Social Psychology

Funds: Supported by National Natural Science Foundation of China (62276196)
More Information
    Author Bio:

    LI Lin Professor at School of Computer Science and Artificial Intelligence, Wuhan University of Technology. Her research interest covers information retrieval and recommender systems, data mining and pattern recognition, and multimodal machine learning

    ZHOU Yang Master student at School of Computer Science and Artificial Intelligence, Wuhan University of Technology. His research interest covers natural language processing and affective computing

    WANG Cong-Hui Master student at School of Computer Science and Artificial Intelligence, Wuhan University of Technology. Her research interest covers natural language processing and multimodal machine learning

    WANG Zhi-Hao Master student at School of Computer Science and Artificial Intelligence, Wuhan University of Technology. His research interest covers multimodal machine learning, information retrieval, and recommender systems

    TIAN Hao Professor at Hubei Key Laboratory of Digital Finance Innovation, Hubei University of Economics. His research interest covers financial risk, service discovery and recommendation, and machine learning. Corresponding author of this paper

  • 摘要: 人格特质作为个体在思想、情感和行为模式上独特且相对稳定的心理特征, 是理解和预测人类行为的重要维度. 多模态人格评分预测研究已成为心理学、社会学与计算科学交叉融合的前沿热点. 然而, 现有评分预测方法在捕捉个体稳定人格特质时, 常因行为表现中的非典型成分(如停顿、思考或环境噪声)而产生偏差, 影响了人格特质多维度评分预测的准确性. 针对这一问题, 受认知−情感人格系统(Cognitive-Affective Personality System, CAPS)理论启发, 提出一种多模态人格评分预测框架EBPNet(Emotion-Behavior-based Personality Network). 该框架充分利用社会情境对人格表现的调节作用, 通过构建上下文情境感知模块, 系统整合视频数据中的动态情境发展过程, 减少了非典型行为对人格特质评分预测的影响. 同时, 框架融合视觉大模型的细粒度情感分析能力, 精确提取情绪演变轨迹与微表情特征, 并与语音转录文本形成多类型数据的协同评分预测, 提升了对个体情感-行为时序模式的建模能力. 通过显式建模社会情境与多模态行为数据的交互关系, 该框架实现了人格特质的多维度评分预测. 实验结果表明, EBPNet在目前广泛认可的多模态人格分析数据集First Impressions V2上的表现优于现有基线模型, 验证了社会心理学启发的多维度评分预测方法的有效性.
    1)  11https://chalearnlap.cvc.uab.cat/challenge/14/description/
    2)  22https://huggingface.co/openai/clip-vit-large-patch14
  • 图  1  个体特定行为模式生成过程

    Fig.  1  The generation process of individual-specific behavioral patterns

    图  2  EBPNet整体框架图

    Fig.  2  Overview of our EBPNet

    图  3  VideoLLaMA3生成摘要文本示例

    Fig.  3  An example of summary text generated by VideoLLaMA3

    图  4  人格特质空间对齐模块可解释性分析图

    Fig.  4  Interpretability analysis of the personality trait space alignment module

    表  1  First Impressions V2数据集统计信息

    Table  1  Statistics of the First Impressions V2 dataset

    统计项数量/信息
    数据集样本总数10 000个
    训练集样本总数6 000个
    验证集样本总数2 000个
    测试集样本总数2 000个
    标签个数5个
    数据模态视频和文本
    采集YouTube视频数约3 000条
    同源视频片段上限6条
    视频时长15秒
    平均转录文本单词个数43个
    下载: 导出CSV

    表  2  软硬件实验环境

    Table  2  Software and hardware experimental configuration

    实验环境参数配置
    硬件环境GPUNVIDIA TITAN Xp
    硬件环境CPUIntel(R) Xeon(R) CPU E5-2650 V4 @ 2.20GHz
    硬件环境内存容量512GB
    硬件环境显存容量12GB
    软件环境操作系统CentOS 7.2.1511(Core)
    软件环境Python3.10
    软件环境PyTorch2.4.0
    软件环境CUDA12.1
    软件环境cuDNN8.9.6
    下载: 导出CSV

    表  3  EBPNet框架与其他基线模型在Acc上效果对比表

    Table  3  A performance comparison among EBPNet and baselines at Acc

    模型 开放性$\uparrow$ 尽责性$\uparrow$ 外倾性$\uparrow$ 宜人性$\uparrow$ 神经质性$\uparrow$ 平均结果$\uparrow$
    NJU-LAMDA[14] 91.23 91.66 91.33 91.26 91.00 91.30
    Evolgen[15] 91.17 91.19 91.50 91.19 90.99 91.21
    DRN[16] 91.11 91.38 91.07 91.02 90.89 91.09
    CR-Net[12] 91.95 92.18 92.02 91.77 91.46 91.88
    EMP[20] 91.72 92.05 92.10 91.52 91.68 91.81
    PCENet[13] 92.15 92.33 92.21 92.38 92.19 92.25
    AMIF-Net[17] 92.04 91.79 92.01 92.24 92.19 92.05
    EBPNet(ours) 92.20 92.84 92.67 92.28 92.25 92.45
    下载: 导出CSV

    表  5  EBPNet框架与其他基线模型在PCC上效果对比表

    Table  5  A performance comparison among EBPNet and baselines at PCC

    模型 开放性$\uparrow$ 尽责性$\uparrow$ 外倾性$\uparrow$ 宜人性$\uparrow$ 神经质性$\uparrow$ 平均结果$\uparrow$
    NJU-LAMDA[14] 0.36 0.45 0.43 0.37 0.34 0.39
    DRN[16] 0.25 0.20 0.36 0.12 0.25 0.24
    CR-Net[12] 0.62 0.60 0.59 0.51 0.47 0.56
    EMP[20] 0.52 0.58 0.63 0.42 0.55 0.54
    PCENet[13] 0.65 0.65 0.69 0.57 0.68 0.65
    AMIF-Net[17] 0.58 0.61 0.61 0.49 0.61 0.58
    EBPNet(ours) 0.77 0.69 0.73 0.67 0.72 0.72
    下载: 导出CSV

    表  4  Acc多种子统计比较

    Table  4  Acc comparison across multiple seeds

    模型均值$\uparrow$标准差差值(vs PCENet)p值(vs PCENet)
    EBPNet(ours)92.45$\pm$0.08--
    PCENet[13]92.25$\pm$0.120.200.084
    下载: 导出CSV

    表  6  PCC多种子统计比较

    Table  6  PCC comparison across multiple seeds

    模型均值$\uparrow$标准差差值(vs PCENet)p值(vs PCENet)
    EBPNet(ours)0.720$\pm$0.008--
    PCENet[13]0.650$\pm$0.0130.0700.002
    下载: 导出CSV

    表  7  EBPNet框架在Acc上消融实验结果

    Table  7  Ablation experimental results of EBPNet at Acc

    模型 开放性$\uparrow$ 尽责性$\uparrow$ 外倾性$\uparrow$ 宜人性$\uparrow$ 神经质性$\uparrow$ 平均结果$\uparrow$
    EBPNet(ours) 92.20 92.84 92.67 92.28 92.25 92.45
    w/o co-label 92.10 92.68 92.50 92.15 92.05 92.30
    zero-context 91.98 92.45 92.20 92.05 91.82 92.06
    w/o caption 91.88 92.42 92.18 91.98 91.75 92.04
    w/o context 91.50 92.18 91.95 91.85 91.50 91.80
    下载: 导出CSV

    表  8  EBPNet框架在PCC上消融实验结果

    Table  8  Ablation experimental results of EBPNet at PCC

    模型 开放性$\uparrow$ 尽责性$\uparrow$ 外倾性$\uparrow$ 宜人性$\uparrow$ 神经质性$\uparrow$ 平均结果$\uparrow$
    EBPNet(ours) 0.77 0.69 0.73 0.67 0.72 0.72
    w/o co-label 0.64 0.60 0.58 0.59 0.66 0.61
    zero-context 0.65 0.57 0.59 0.52 0.66 0.60
    w/o caption 0.60 0.55 0.55 0.50 0.64 0.57
    w/o context 0.56 0.50 0.47 0.45 0.61 0.52
    下载: 导出CSV

    表  9  上下文模块引入前后的预测一致性对比

    Table  9  Intra-person prediction consistency comparison with and without the context-aware module

    模型开放性$\downarrow$尽责性$\downarrow$外倾性$\downarrow$宜人性$\downarrow$神经质性$\downarrow$平均结果$\downarrow$
    w/o context0.0670.0700.0770.0650.0800.072
    EBPNet(ours)0.0410.0380.0360.0430.0340.038
    下载: 导出CSV

    表  10  典型心理学预期关系的预测一致性验证

    Table  10  Verification of prediction consistency for typical psychological expected relationships

    心理学预期 预期方向 真实标签 模型预测 是否一致
    外倾性(E) $\leftrightarrow$神经质性(N) 负相关 −0.22 −0.27 一致
    尽责性(C) $\leftrightarrow$开放性(O) 正相关 +0.18 +0.22 一致
    宜人性(A) $\leftrightarrow$外倾性(E) 正相关 +0.18 +0.22 一致
    下载: 导出CSV

    表  11  基于不同Prompt的视觉大模型实验结果

    Table  11  Experimental results of visual large models based on different prompts

    模型PromptMicro-F1
    开放性$ \uparrow $尽责性$ \uparrow $外倾性$ \uparrow $宜人性$ \uparrow $神经质性$ \uparrow $平均结果$ \uparrow $
    Ovis2-8B[36]P1提示策略0.390.380.350.420.360.38
    Ovis2-8B[36]P2提示策略0.460.460.430.500.390.45
    Ovis2-8B[36]P3提示策略0.480.480.450.510.420.46
    InternVL2_5-8B[37]P1提示策略0.480.490.450.520.430.48
    InternVL2_5-8B[37]P2提示策略0.630.640.610.640.490.61
    InternVL2_5-8B[37]P3提示策略0.670.650.630.670.510.63
    VideoLLaMA3-7B[33]P1提示策略0.440.430.410.480.400.43
    VideoLLaMA3-7B[33]P2提示策略0.530.540.520.570.470.53
    VideoLLaMA3-7B[33]P3提示策略0.580.590.560.600.480.56
    Qwen2.5-VL-7B[38]P1提示策略0.560.480.410.630.460.51
    Qwen2.5-VL-7B[38]P2提示策略0.650.680.630.660.460.62
    Qwen2.5-VL-7B[38]P3提示策略0.760.710.690.700.510.67
    下载: 导出CSV

    A1  基于CAPS理论的三种Prompt提示策略设计示例

    A1  Design Examples of Three Prompt Strategies Based on CAPS Theory

    提示策略核心Prompt内容设计意图
    P1基础预测You are a personality assessment expert. Please watch this video carefully and predict the speaker's Big Five personality traits (Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism). For each trait, assign a level (Low / Medium / High) and briefly explain your reasoning based on what you observe in the video. Make sure to provide a rationale for each trait prediction.基准提示策略, 不引入情境或行为提示, 仅要求模型从视频中自主提取特征进行人格预测, 评估基础预测能力
    P2情境−行为
    引导
    You are a personality assessment expert. Please watch this video carefully. Pay close attention to the following aspects: (1) the social context and environment in which the speaker is situated; (2) the speaker's behavior patterns, including gestures, facial expressions, and speech style; (3) the speaker's responses to the context and any dynamic interactions. Based on these observations, assign a level (Low / Medium / High) for each of the Big Five personality traits and explain how each piece of evidence supports your prediction.在P1基础上增加显式情境与行为分析, 引导模型同时考虑情境和行为特征, 体现CAPS理论中情境对人格表现调节作用
    P3多轮渐进式
    分析
    Stage 1–Contextual Analysis: Please describe the social and environmental context of this video in detail. Identify the setting, the social situation, and any relevant background cues. Stage 2– Behavioral Analysis: Based on the context, analyze the speaker's behavior in detail. Include gestures, facial expressions, speech patterns, and how the speaker interacts with the environment or other individuals. Stage 3–Overall Personality Prediction: Integrate your analysis of both context and behavior to predict the speaker's Big Five personality traits (Low / Medium / High for each trait). For each trait, provide a detailed rationale explaining how the observed evidence supports your prediction.分三轮逐步引导模型分析: 第一轮提取情境特征, 第二轮分析行为, 第三轮整合情境和行为给出人格预测及详细依据, 较完整体现CAPS理论中情境–行为–人格推断链条
    下载: 导出CSV
  • [1] Masumura R, Orihashi S, Ihori M, Tanaka T, Makishima N, Suzuki S, et al. Multimodal fine-grained apparent personality trait recognition: Joint modeling of Big Five and questionnaire item-level scores. In: Proceedings of the 39th AAAI Conference on Artificial Intelligence. Philadelphia, USA: AAAI Press, 2025. 1456−1464
    [2] Alves G, Jannach D, Soares de Souza L, Garcia Manzato M. Towards personality-aware explanations for music recommendations using generative AI. In: Proceedings of the 19th ACM Conference on Recommender Systems. Prague, Czech Republic: ACM, 2025. 684−689
    [3] Wang X L, Li B, Dong J T, Lin Z J, Xing X J. PTDLRec: A recommendation model integrating personality traits and deep learning. Neurocomputing, 2025, 652: Article No. 131083 doi: 10.1016/j.neucom.2025.131083
    [4] Bi W H, Kou F F, Shi L, Li Y W, Li H S, Chen J P, et al. Leveraging the dual capabilities of LLM: LLM-enhanced text mapping model for personality detection. In: Proceedings of the 39th AAAI Conference on Artificial Intelligence. Philadelphia, USA: AAAI Press, 2025. 23487−23495
    [5] Zhang T Y, Qi T H, Koutsoumpis A, Zong Y, Zheng W M, Oostrom J K, et al. Assessing personality traits and interview performance from asynchronous video interviews. In: Proceedings of the 33rd ACM International Conference on Multimedia. Dublin, Ireland: ACM, 2025. 13895−13900
    [6] Li J, Wang Y, Qian W H, Hu J L, Hu Z Z, Hong R C, Wang M. Listening to the unspoken: Exploring “365” aspects of multimodal interview performance assessment. In: Proceedings of the 33rd ACM International Conference on Multimedia. Dublin, Ireland: ACM, 2025. 13909−13916
    [7] Carlyn M. An assessment of the Myers-Briggs type indicator. Journal of Personality Assessment, 1977, 41(5): 461−473
    [8] Matise M. The enneagram: An innovative approach. Journal of Professional Counseling: Practice, Theory & Research, 2007, 35(1): 38−58 doi: 10.1080/15566382.2007.12033832
    [9] Fiske D W. Consistency of the factorial structures of personality ratings from different sources. The Journal of Abnormal and Social Psychology, 1949, 44(3): 329−344 doi: 10.1037/h0057198
    [10] Tupes E C, Christal R E. Recurrent personality factors based on trait ratings. Journal of Personality, 1992, 60(2): 225−251 doi: 10.21236/ad0267778
    [11] Mocnik G, Rehberger A, Smogavc Z, Mlakar I, Smrke U, Mocnik S. Multimodal observable cues in mood, anxiety, and borderline personality disorders: A review of reviews to inform explainable AI in mental health. Frontiers in Artificial Intelligence, 2025, 8: Article No. 1696448 doi: 10.3389/frai.2025.1696448
    [12] Li Y N, Wan J, Miao Q G, Escalera S, Fang H J, Chen H Z, et al. CR-Net: A deep classification-regression network for multimodal apparent personality analysis. International Journal of Computer Vision, 2020, 128(12): 2763−2780 doi: 10.1007/s11263-020-01309-y
    [13] Zhu Y F, Wei Y T, Li M L, Zhang T T, Wei S Q, Wu B. PCENet: Psychological clues exploration network for multimodal personality assessment. In: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management. Birmingham, United Kingdom: ACM, 2023. 3667−3676
    [14] Zhang C L, Zhang H, Wei X S, Wu J X. Deep bimodal regression for apparent personality analysis. In: Proceedings of Computer Vision–ECCV 2016 Workshops. Cham: Springer, 2016. 311−324
    [15] Subramaniam A, Patel V, Mishra A, Balasubramanian P, Mittal A. Bi-modal first impressions recognition using temporally ordered deep audio and stochastic visual features. In: Proceedings of Computer Vision–ECCV 2016 Workshops. Cham: Springer, 2016. 337−348
    [16] Güçlütürk Y, Güçlü U, van Gerven M A J, van Lier R. Deep impression: Audiovisual deep residual networks for multimodal apparent personality trait recognition. In: Proceedings of Computer Vision–ECCV 2016 Workshops. Cham: Springer, 2016. 349−358
    [17] Bao Y T, Liu X, Qi Y, Liu R J, Li H J. Adaptive information fusion network for multi-modal personality recognition. Computer Animation and Virtual Worlds, 2024, 35(3): Article No. e2268 doi: 10.1002/cav.2268
    [18] Costa P T, McCrae R R. The revised NEO Personality Inventory (NEO-PI-R). The SAGE Handbook of Personality Theory and Assessment: Volume 2–Personality Measurement and Testing. Thousand Oaks, USA: SAGE Publications, 2008. 179−198
    [19] Digman J M. Higher-order factors of the Big Five. Journal of Personality and Social Psychology, 1997, 73(6): 1246−1256
    [20] Wang Y S, Li D Y, Funakoshi K, Okumura M. EMP: Emotion-guided multi-modal fusion and contrastive learning for personality traits recognition. In: Proceedings of the 2023 ACM International Conference on Multimedia Retrieval. Thessaloniki, Greece: ACM, 2023. 243−252
    [21] Wang R Q, Zhao X L, Xu X Y, Hao Y. A multimodal personality prediction framework based on adaptive graph transformer network and multi-task learning. Computer Graphics Forum, 2025, 44(2): Article No. e70030 doi: 10.1111/cgf.70030
    [22] Yang L J, Yu C, Huang C X, Zhang F Y, Liu R, Wen Z F, et al. Enhancing multimodal personality assessment with LLM-augmented hierarchical fusion. In: Proceedings of the 33rd ACM International Conference on Multimedia. Dublin, Ireland: ACM, 2025. 13917−13923
    [23] 郭浩, 李欣奕, 唐九阳, 郭延明, 赵翔. 自适应特征融合的多模态实体对齐研究. 自动化学报, 2024, 50(4): 758−770 doi: 10.16383/j.aas.c210518

    Guo Hao, Li Xin-Yi, Tang Jiu-Yang, Guo Yan-Ming, Zhao Xiang. Adaptive feature fusion for multi-modal entity alignment. Acta Automatica Sinica, 2024, 50(4): 758−770 doi: 10.16383/j.aas.c210518
    [24] Zhang L, Peng S, Winkler S. PersEmoN: A deep network for joint analysis of apparent personality, emotion and their relationship. IEEE Transactions on Affective Computing, 2022, 13(1): 298−305 doi: 10.1109/TAFFC.2019.2951656
    [25] Principi R D P, Palmero C, Junior J C S J, Escalera S. On the effect of observed subject biases in apparent personality analysis from audio-visual signals. IEEE Transactions on Affective Computing, 2021, 12(3): 607−621 doi: 10.1109/taffc.2019.2956030
    [26] Tang B, Pan K Q, Zheng M, Zhou N, Sui J L, Zhu D D, et al. Pose as a modality: A psychology-inspired network for personality recognition with a new multimodal dataset. In: Proceedings of the 39th AAAI Conference on Artificial Intelligence. Philadelphia, USA: AAAI Press, 2025. 1538−1546
    [27] Zatarain Cabada R, Cardenas Lopez H M, Escalante H J. Multimodal personality recognition for affective computing. Multimodal Affective Computing: Technologies and Applications in Learning Environments. Cham: Springer, 2023. 173−208
    [28] Sun X, Huang J, Zheng S X, Rao X H, Wang M. Personality assessment based on multimodal attention network learning with category-based mean square error. IEEE Transactions on Image Processing, 2022, 31: 2162−2174 doi: 10.1109/tip.2022.3152049
    [29] 张重生, 陈杰, 李岐龙, 邓斌权, 王杰, 陈承功. 深度对比学习综述. 自动化学报, 2023, 49(1): 15−39 doi: 10.16383/j.aas.c220421

    Zhang Chong-Sheng, Chen Jie, Li Qi-Long, Deng Bin-Quan, Wang Jie, Chen Cheng-Gong. Deep contrastive learning: A survey. Acta Automatica Sinica, 2023, 49(1): 15−39 doi: 10.16383/j.aas.c220421
    [30] 蒲志强, 易建强, 刘振, 丘腾海, 孙金林, 李飞漠. 知识与数据协同驱动的群体智能决策方法研究综述. 自动化学报, 2022, 48(3): 627−643 doi: 10.16383/j.aas.c210118

    Pu Zhi-Qiang, Yi Jian-Qiang, Liu Zhen, Qiu Teng-Hai, Sun Jin-Lin, Li Fei-Mo. Knowledge-based and data-driven integrating methodologies for collective intelligence decision making: A survey. Acta Automatica Sinica, 2022, 48(3): 627−643 doi: 10.16383/j.aas.c210118
    [31] 李霞, 卢官明, 闫静杰, 张正言. 多模态维度情感预测综述. 自动化学报, 2018, 44(12): 2142−2159

    Li Xia, Lu Guan-Ming, Yan Jing-Jie, Zhang Zheng-Yan. A survey of dimensional emotion prediction by multimodal cues. Acta Automatica Sinica, 2018, 44(12): 2142−2159
    [32] 权学良, 曾志刚, 蒋建华, 张亚倩, 吕宝粮, 伍冬睿. 基于生理信号的情感计算研究综述. 自动化学报, 2021, 47(8): 1769−1784 doi: 10.16383/j.aas.c200783

    Quan Xue-Liang, Zeng Zhi-Gang, Jiang Jian-Hua, Zhang Ya-Qian, Lv Bao-Liang, Wu Dong-Rui. Physiological signals based affective computing: A systematic review. Acta Automatica Sinica, 2021, 47(8): 1769−1784 doi: 10.16383/j.aas.c200783
    [33] Zhang B Q, Li K H, Cheng Z S, Hu Z Q, Yuan Y Q, Chen G Z, et al. VideoLLaMA 3: Frontier multimodal foundation models for image and video understanding. arXiv preprint arXiv: 2501.13106, 2025
    [34] John O P, Naumann L P, Soto C J. Paradigm shift to the integrative Big Five trait taxonomy. Handbook of Personality: Theory and Research. New York: Guilford Press, 2008. 114−158
    [35] Shen P, Wang D D, Xu Y Y, Zhang S Q, Zhao X M. PACMR: Progressive adaptive crossmodal reinforcement for multimodal apparent personality traits analysis. IEEE Signal Processing Letters, 2025, 32: 161−165 doi: 10.1109/LSP.2024.3505799
    [36] Lu S Y, Li Y, Chen Q G, Xu Z, Luo W H, Zhang K F, Ye H J. Ovis: Structural embedding alignment for multimodal large language model. arXiv preprint arXiv: 2405.20797, 2024
    [37] Chen Z, Wang W Y, Cao Y, Liu Y Z, Gao Z W, Cui E F, et al. Expanding performance boundaries of open-source multimodal models with model, data, and test-time scaling. arXiv preprint arXiv: 2412.05271, 2024
    [38] Bai S, Chen K Q, Liu X J, Wang J L, Ge W B, Song S B, et al. Qwen2.5-VL technical report. arXiv preprint arXiv: 2502.13923, 2025
  • 加载中
计量
  • 文章访问数:  7
  • HTML全文浏览量:  5
  • 被引次数: 0
出版历程
  • 收稿日期:  2025-08-14
  • 录用日期:  2026-05-13
  • 网络出版日期:  2026-07-02

目录

    /

    返回文章
    返回