前额叶皮层启发的Transformer模型应用及其进展

潘雨辰; 贾克斌; 张铁林

doi:10.16383/j.aas.c240538

前额叶皮层启发的Transformer模型应用及其进展

doi: 10.16383/j.aas.c240538 cstr: 32138.14.j.aas.c240538

潘雨辰^{1, 2,},
贾克斌^2,,
张铁林^{1, 3, 4,}

1.
中国科学院脑科学与智能技术卓越创新中心上海 200031
2.
北京工业大学信息科学技术学院北京 100124
3.
中国科学院大学北京 100049
4.
中国科学院自动化研究所北京100190

基金项目: 北京市科技新星(20230484369), 上海市市级科技重大专项(2021SHZDZX), 中国科学院青促会基金, 多模态人工智能系统全国重点实验室开放课题基金资助

详细信息

作者简介:
潘雨辰：北京工业大学信息科学技术学院硕士研究生, 中国科学院脑科学与智能技术卓越创新中心联合培养学生. 2019年获得北京工业大学工学学士学位. 主要研究方向为类脑模型算法.E-mail: 18201335023@sina.cn

贾克斌：博士, 北京工业大学信息科学技术学院教授. 主要研究方向为图像/视频处理技术与生物医学信息处理技术.E-mail: kebinj@bjut.edu.cn

张铁林：中国科学院脑科学与智能技术卓越创新中心研究员. 主要研究方向为类脑脉冲神经网络算法, 类脑芯片及AI for Neuroscience研究. 本文通信作者.E-mail: zhangtielin@ion.ac.cn

计量
- 文章访问数: 1177
- HTML全文浏览量: 856
- PDF下载量: 301
- 被引次数: 0
出版历程
- 收稿日期: 2024-07-30
- 录用日期: 2024-12-13
- 网络出版日期: 2025-03-03
- 刊出日期: 2025-07-29

The Application and Progress of Prefrontal Cortex-inspired Transformer Model

PAN Yu-Chen^{1, 2
,},
JIA Ke-Bin^2
,,
ZHANG Tie-Lin^{1, 3, 4
,}

1.
Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai 200031
2.
Schoolof Information Science and Technology, Beijing University of Technology, Beijing 100124
3.
University of Chinese Academy of Sciences, Beijing 100049
4.
Institute of Automation, Chinese Academy of Sciences, Beijing 100190

Funds: Supported by Beijing Nova Program (20230484369), Shanghai Municipal Science and Technology Major Project (2021SHZDZX), Youth Innovation Promotion Association of Chinese Academy of Sciences, and Open Projects Program of State Key Laboratory of Multimodal Artificial Intelligence Systems

More Information

Author Bio:
PAN Yu-Chen　Master student at the School of Information Science and Technology, Beijing University of Technology, co-supervised by the Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences. He received his bachelor degree in engineering from Beijing University of Technology in 2019. His main research interest is brain-inspired algorithms

JIA Ke-Bin　Ph.D., professor at the School of Information Science and Technology, Beijing University of Technology. His research interest covers image/video processing technology and biomedical information processing technology

ZHANG Tie-Lin　Professor at the Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences. He research interest covers research of brain-inspired spiking neural network algorithms, brain-inspired chips, and AI for Neuroscience. Corresponding author of this paper

摘要

摘要: 聚焦于生物结构与类脑智能的交叉研究方向, 探讨前额叶皮层的结构及其认知功能对人工智能领域Transformer模型的启发. 前额叶皮层在认知控制和决策制定中扮演着关键角色. 首先介绍前额叶皮层的注意力机制、生物编码、多感觉融合等相关生物研究进展, 然后探讨这些生物机制如何启发新型的类脑Transformer架构, 重点提升其在自注意力、位置编码、多模态整合等方面的生物合理性与计算高效性. 最后, 总结前额叶皮层启发的类脑新模型, 在支持多类型神经网络组合、多领域应用、世界模型构建等方面的发展与潜力, 为生物和人工智能两大领域之间交叉融合构建桥梁.
- 生物结构 /
- 类脑智能 /
- 前额叶皮层 /
- Transformer /
- 世界模型
Abstract: This article focuses on the integration of biological network connectomes and brain-inspired intelligence, exploring how the structure and cognitive functions of the prefrontal cortex can inspire transformer models in the field of artificial intelligence. The prefrontal cortex plays a critical role in cognitive control and decision-making. Firstly, the article introduces some advancements of biological research related to the prefrontal cortex＇s attention mechanisms, biological encoding, and multisensory integration. Then, it discusses how these biological mechanisms can inspire novel brain-inspired transformer architectures, with a focus on enhancing their biological plausibility and computational efficiency in self-attention, positional encoding, and multimodal integration. Finally, it summarizes the development and potential of new brain-inspired models influenced by the prefrontal cortex, highlighting their support for the integration of various neural network types, multi-domain applications, and the construction of world models in reinforcement learning, thereby building a bridge between the fields of biology and artificial intelligence.
- Biological structures /
- brain-inspired intelligence /
- prefrontal cortex /
- transformer /
- world model

HTML全文

图 1 PFC启发Transformer结构

Fig. 1 PFC-inspired transformer structure

下载: 全尺寸图片幻灯片

图 2 PFC生物功能启发生物功能模型与神经网络架构

Fig. 2 PFC biofunctional-inspired biofunctional model with neural network architecture

下载: 全尺寸图片幻灯片

图 3 PFC与类脑智能相互促进、共同进步

Fig. 3 PFC and brain-like intelligence promote each other and progress together

下载: 全尺寸图片幻灯片

图 4 PFC与Transformer注意力相关模型架构

Fig. 4 PFC and transformer attention-related model architecture

下载: 全尺寸图片幻灯片

图 5 PFC与Transformer以注意力机制为媒介相互启发

Fig. 5 PFC and transformer inspire each other through the medium of the attention mechanism

下载: 全尺寸图片幻灯片

图 6 标量相对位置编码 (SRPE) 原理

Fig. 6 Principle of scalar relative position encoding (SRPE)

下载: 全尺寸图片幻灯片

图 7 PFC生物编码过程启发Transformer位置编码

Fig. 7 PFC biological coding process inspired transformer position coding

下载: 全尺寸图片幻灯片

图 8 不同模态下EEG分类的多尺度卷积Transformer模型

Fig. 8 Multi-scale convolutional transformer model for EEG classification in different modalities

下载: 全尺寸图片幻灯片

图 9 PFC多感觉融合与多模态Transformer逻辑结构图

Fig. 9 PFC multisensory fusion with multimodal transformer logic structure diagram

下载: 全尺寸图片幻灯片

参考文献(134)

[1]	Ceccarelli F, Ferrucci L, Londei F, Ramawat S, Brunamonti E, Genovesio A. Static and dynamic coding in distinct cell types during associative learning in the prefrontal cortex. Nature Communications, 2023, 14(1): Article No. 8325 doi: 10.1038/s41467-023-43712-2
[2]	Watakabe A, Skibbe H, Nakae K, Abe H, Ichinohe N, Rachmadi M F, et al. Local and long-distance organization of prefrontal cortex circuits in the marmoset brain. Neuron, 2023, 111(14): 2258−2273 doi: 10.1016/j.neuron.2023.04.028
[3]	Passingham R E, Lau H. Do we understand the prefrontal cortex? Brain Structure and Function, 2023, 228(5): 1095−1105
[4]	Trapp N T, Bruss J E, Manzel K, Grafman J, Tranel D, Boes A D. Large-scale lesion symptom mapping of depression identifies brain regions for risk and resilience. Brain, 2023, 146(4): 1672−1685 doi: 10.1093/brain/awac361
[5]	Chafee M V, Heilbronner S R. Prefrontal cortex. Current Biology, 2022, 32(8): R346−R351 doi: 10.1016/j.cub.2022.02.071
[6]	Miller E K, Cohen J D. An integrative theory of prefrontal cortex function. Annual Review of Neuroscience, 2001, 24: 167−202 doi: 10.1146/annurev.neuro.24.1.167
[7]	Diehl G W, Redish A D. Differential processing of decision information in subregions of rodent medial prefrontal cortex. eLife, 2023, 12: Article No. e82833 doi: 10.7554/eLife.82833
[8]	Wang J X, Kurth-Nelson Z, Kumaran D, Tirumala D, Soyer H, Leibo J Z, et al. Prefrontal cortex as a meta-reinforcement learning system. Nature Neuroscience, 2018, 21(6): 860−868 doi: 10.1038/s41593-018-0147-8
[9]	Alexander W H, Brown J W. Medial prefrontal cortex as an action-outcome predictor. Nature Neuroscience, 2011, 14(10): 1338−1344 doi: 10.1038/nn.2921
[10]	Rikhye R V, Gilra A, Halassa M M. Thalamic regulation of switching between cortical representations enables cognitive flexibility. Nature Neuroscience, 2018, 21(12): 1753−1763 doi: 10.1038/s41593-018-0269-z
[11]	Gisiger T, Boukadoum M. Mechanisms gating the flow of information in the cortex: What they might look like and what their uses may be. Frontiers in Computational Neuroscience, 2011, 5: Article No. 1
[12]	Johnston K, Levin H M, Koval M J, Everling S. Top-down control-signal dynamics in anterior cingulate and prefrontal cortex neurons following task switching. Neuron, 2007, 53(3): 453−462 doi: 10.1016/j.neuron.2006.12.023
[13]	Tsuda B, Tye K M, Siegelmann H T, Sejnowski T J. A modeling framework for adaptive lifelong learning with transfer and savings through gating in the prefrontal cortex. Proceedings of the National Academy of Sciences of the United States of America, 2020, 117(47): 29872−29882
[14]	Wang Z H, Zhang J, Zhang X C, Chen P, Wang B. Transformer model for functional near-infrared spectroscopy classification. IEEE Journal of Biomedical and Health Informatics, 2022, 26(6): 2559−2569 doi: 10.1109/JBHI.2022.3140531
[15]	Choi S R, Lee M. Transformer architecture and attention mechanisms in genome data analysis: A comprehensive review. Biology, 2023, 12(7): Article No. 1033 doi: 10.3390/biology12071033
[16]	Li Q P, Zhuang Y. An efficient image-guided-based 3D point cloud moving object segmentation with transformer-attention in autonomous driving. International Journal of Applied Earth Observation and Geoinformation, 2023, 123: Article No. 103488 doi: 10.1016/j.jag.2023.103488
[17]	Brus J, Heng J A, Beliaeva V, Gonzalez Pinto F, Cassarà A M, Neufeld E, et al. Causal phase-dependent control of non-spatial attention in human prefrontal cortex. Nature Human Behaviour, 2024, 8(4): 743−757 doi: 10.1038/s41562-024-01820-z
[18]	Bichot N P, Heard M T, Degennaro E M, Desimone R. A source for feature-based attention in the prefrontal cortex. Neuron, 2015, 88(4): 832−844 doi: 10.1016/j.neuron.2015.10.001
[19]	Huang L, Wang J Y, He Q H, Li C, Sun Y L, Seger C A, et al. A source for category-induced global effects of feature-based attention in human prefrontal cortex. Cell Reports, 2023, 42(9): Article No. 113080 doi: 10.1016/j.celrep.2023.113080
[20]	Zhao M L, Xu D H, Gao T. From cognition to computation: A comparative review of human attention and transformer architectures. arXiv preprint arXiv: 2407.01548, 2024.
[21]	Kumar S, Sumers T R, Yamakoshi T, Goldstein A, Hasson U, Norman K A, et al. Shared functional specialization in transformer-based language models and the human brain. Nature Communications, 2024, 15(1): Article No. 5523 doi: 10.1038/s41467-024-49173-5
[22]	Muller L, Churchland P S, Sejnowski T J. Transformers and cortical waves: Encoders for pulling in context across time. Trends in Neurosciences, 2024, 47(10): 788−802 doi: 10.1016/j.tins.2024.08.006
[23]	Huang H M, Li R, Qiao X J, Li X R, Li Z Y, Chen S Y, et al. Attentional control influence habituation through modulation of connectivity patterns within the prefrontal cortex: Insights from stereo-EEG. NeuroImage, 2024, 294: Article No. 120640 doi: 10.1016/j.neuroimage.2024.120640
[24]	Li N N, Chen Y R, Li W F, Ding Z X, Zhao D B, Nie S. BViT: Broad attention-based vision transformer. IEEE Transactions on Neural Networks and Learning Systems, 2024, 35(9): 12772−12783 doi: 10.1109/TNNLS.2023.3264730
[25]	Shi Q Q, Fan J S, Wang Z R, Zhang Z X. Multimodal channel-wise attention transformer inspired by multisensory integration mechanisms of the brain. Pattern Recognition, 2022, 130: Article No. 108837 doi: 10.1016/j.patcog.2022.108837
[26]	Gong D Y, Zhang H T. Self-attention limits working memory capacity of transformer-based models. arXiv preprint arXiv: 2409.10715, 2024.
[27]	Maith O, Schwarz A, Hamker F H. Optimal attention tuning in a neuro-computational model of the visual cortex-basal ganglia-prefrontal cortex loop. Neural Networks, 2021, 142: 534−547 doi: 10.1016/j.neunet.2021.07.008
[28]	Spitale G, Biller-Andorno N, Germani F. AI model GPT-3 (dis) informs us better than humans. Science Advances, 2023, 9(26): Article No. eadh1850 doi: 10.1126/sciadv.adh1850
[29]	Brown T B, Mann B, Ryder N, Subbiah M, Kaplan J, Dhariwal P, et al. Language models are few-shot learners. In: Proceedings of the 34th International Conference on Neural Information Processing Systems. Vancouver, Canada: Curran Associates Inc., 2020. Article No. 159
[30]	Zhang S S, Roller S, Goyal N, Artetxe M, Chen M Y, Chen S H, et al. OPT: Open pre-trained transformer language models. arXiv preprint arXiv: 2205.01068, 2022.
[31]	Yue F P, Ko T. An investigation of positional encoding in transformer-based end-to-end speech recognition. In: Proceedings of the 12th International Symposium on Chinese Spoken Language Processing (ISCSLP). Hong Kong, China: IEEE, 2021. 1−5
[32]	Kazemnejad A, Padhi I, Natesan Ramamurthy K, Das P, Reddy S. The impact of positional encoding on length generalization in transformers. In: Proceedings of the 37th International Conference on Neural Information Processing Systems. New Orleans, USA: Curran Associates Inc., 2024. Article No. 1082
[33]	Chowdhery A, Narang S, Devlin J, Bosma M, Mishra G, Roberts A, et al. PaLM: Scaling language modeling with pathways. Journal of Machine Learning Research, 2023, 24(240): 1−113
[34]	Zhang R R, Han J M, Liu C, Gao P, Zhou A J, Hu X F, et al. LLaMA-adapter: Efficient fine-tuning of language models with zero-init attention. arXiv preprint arXiv: 2303.16199, 2023.
[35]	Touvron H, Lavril T, Izacard G, Martinet X, Lachaux M A, Lacroix T, et al. LLaMA: Open and efficient foundation language models. arXiv preprint arXiv: 2302.13971, 2023.
[36]	Wu J S, Zhang R C, Mao Y Y, Chen J F. On scalar embedding of relative positions in attention models. In: Proceedings of the 35th AAAI Conference on Artificial Intelligence. AAAI, 2021. 14050−14057
[37]	Wallis J D, Anderson K C, Miller E K. Single neurons in prefrontal cortex encode abstract rules. Nature, 2001, 411(6840): 953−956 doi: 10.1038/35082081
[38]	Oota S R, Arora J, Rowtula V, Gupta M, Bapi R S. Visio-linguistic brain encoding. In: Proceedings of the 29th International Conference on Computational Linguistics. Gyeongju, Republic of Korea: ACL, 2022. 116−133
[39]	Bocincova A, Buschman T J, Stokes M G, Manohar S G. Neural signature of flexible coding in prefrontal cortex. Proceedings of the National Academy of Sciences of the United States of America, 2022, 119(40): Article No. e2200400119
[40]	Zhang K, Hao W N, Yu X H, Shao T H. An interpretable image classification model combining a fuzzy neural network with a variational autoencoder inspired by the human brain. Information Sciences, 2024, 661: Article No. 119885 doi: 10.1016/j.ins.2023.119885
[41]	Aoi M C, Mante V, Pillow J W. Prefrontal cortex exhibits multidimensional dynamic encoding during decision-making. Nature Neuroscience, 2020, 23(11): 1410−1420 doi: 10.1038/s41593-020-0696-5
[42]	Zhang Z M, Gong X. Positional label for self-supervised vision transformer. In: Proceedings of the 37th AAAI Conference on Artificial Intelligence. Washington, USA: AAAI, 2023. 3516−3524
[43]	Caucheteux C, Gramfort A, King J R. Evidence of a predictive coding hierarchy in the human brain listening to speech. Nature Human Behaviour, 2023, 7(3): 430−441 doi: 10.1038/s41562-022-01516-2
[44]	Busch A, Roussy M, Luna R, Leavitt M L, Mofrad M H, Gulli R A, et al. Neuronal activation sequences in lateral prefrontal cortex encode visuospatial working memory during virtual navigation. Nature Communications, 2024, 15(1): Article No. 4471 doi: 10.1038/s41467-024-48664-9
[45]	Labaien J, Idé T, Chen P Y, Zugasti E, de Carlos X. Diagnostic spatio-temporal transformer with faithful encoding. Knowledge-Based Systems, 2023, 274: Article No. 110639 doi: 10.1016/j.knosys.2023.110639
[46]	Deihim A, Alonso E, Apostolopoulou D. STTRE: A spatio-temporal transformer with relative embeddings for multivariate time series forecasting. Neural Networks, 2023, 168: 549−559 doi: 10.1016/j.neunet.2023.09.039
[47]	Ma Y J, Wang R L. Relative-position embedding based spatially and temporally decoupled Transformer for action recognition. Pattern Recognition, 2024, 145: Article No. 109905 doi: 10.1016/j.patcog.2023.109905
[48]	Coen P, Sit T P H, Wells M J, Carandini M, Harris K D. Mouse frontal cortex mediates additive multisensory decisions. Neuron, 2023, 111(15): 2432−2447 doi: 10.1016/j.neuron.2023.05.008
[49]	Ferrari A, Noppeney U. Attention controls multisensory perception via two distinct mechanisms at different levels of the cortical hierarchy. PLoS Biology, 2021, 19(11): Article No. e3001465 doi: 10.1371/journal.pbio.3001465
[50]	Mihalik A, Noppeney U. Causal inference in audiovisual perception. The Journal of Neuroscience, 2020, 40(34): 6600−6612 doi: 10.1523/JNEUROSCI.0051-20.2020
[51]	Kang K, Rosenkranz R, Karan K, Altinsoy E, Li S C. Congruence-based contextual plausibility modulates cortical activity during vibrotactile perception in virtual multisensory environments. Communications Biology, 2022, 5(1): Article No. 1360 doi: 10.1038/s42003-022-04318-4
[52]	Cao Y N, Summerfield C, Park H, Giordano B L, Kayser C. Causal inference in the multisensory brain. Neuron, 2019, 102(5): 1076−1087 doi: 10.1016/j.neuron.2019.03.043
[53]	Giessing C, Thiel C M, Stephan K E, Rösler F, Fink G R. Visuospatial attention: How to measure effects of infrequent, unattended events in a blocked stimulus design. Neuroimage, 2004, 23(4): 1370−1381 doi: 10.1016/j.neuroimage.2004.08.008
[54]	Zheng Q H, Zhou L X, Gu Y. Temporal synchrony effects of optic flow and vestibular inputs on multisensory heading perception. Cell Reports, 2021, 37(7): Article No. 109999 doi: 10.1016/j.celrep.2021.109999
[55]	Liang P P, Zadeh A, Morency L P. Foundations & trends in multimodal machine learning: Principles, challenges, and open questions. ACM Computing Surveys, 2024, 56(10): Article No. 264
[56]	Klemen J, Chambers C D. Current perspectives and methods in studying neural mechanisms of multisensory interactions. Neuroscience and Biobehavioral Reviews, 2012, 36(1): 111−133
[57]	Paraskevopoulos G, Georgiou E, Potamianos A. Mmlatch: Bottom-up top-down fusion for multimodal sentiment analysis. In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Singapore: IEEE, 2022. 4573−4577
[58]	Sun H, Liu J Q, Chen Y W, Lin L F. Modality-invariant temporal representation learning for multimodal sentiment classification. Information Fusion, 2023, 91: 504−514 doi: 10.1016/j.inffus.2022.10.031
[59]	Wang Z L, Wan Z H, Wan X J. TransModality: An End2End fusion method with transformer for multimodal sentiment analysis. In: Proceedings of the Web Conference. Taipei, China: ACM, 2020. 2514−2520
[60]	Yu J F, Chen K, Xia R. Hierarchical interactive multimodal transformer for aspect-based multimodal sentiment analysis. IEEE Transactions on Affective Computing, 2023, 14(3): 1966−1978 doi: 10.1109/TAFFC.2022.3171091
[61]	Huang J H, Zhou J, Tang Z C, Lin J Y, Chen C Y C. TMBL: Transformer-based multimodal binding learning model for multimodal sentiment analysis. Knowledge-Based Systems, 2024, 285: Article No. 111346 doi: 10.1016/j.knosys.2023.111346
[62]	Yang D K, Liu Y, Huang C, Li M C, Zhao X, Wang Y Z, et al. Target and source modality co-reinforcement for emotion understanding from asynchronous multimodal sequences. Knowledge-Based Systems, 2023, 265: Article No. 110370 doi: 10.1016/j.knosys.2023.110370
[63]	Ahn H J, Lee D H, Jeong J H, Lee S W. Multiscale convolutional transformer for EEG classification of mental imagery in different modalities. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 2023, 31: 646−656 doi: 10.1109/TNSRE.2022.3229330
[64]	Li J, Chen N, Zhu H Q, Li G Q, Xu Z Y, Chen D X. Incongruity-aware multimodal physiology signals fusion for emotion recognition. Information Fusion, 2024, 105: Article No. 102220 doi: 10.1016/j.inffus.2023.102220
[65]	Asif M, Gupta A, Aditya A, Mishra S, Tiwary U S. Brain multi-region information fusion using attentional transformer for EEG based affective computing. In: Proceedings of the IEEE 20th India Council International Conference (INDICON). Hyderabad, India: IEEE, 2023. 771−775
[66]	Chen Z H, Han Y C, Ma Z, Wang X N, Xu S R, Tang Y, et al. A prefrontal-thalamic circuit encodes social information for social recognition. Nature Communications, 2024, 15(1): Article No. 1036 doi: 10.1038/s41467-024-45376-y
[67]	Yu J, Li J, Yu Z, Huang Q M. Multimodal transformer with multi-view visual representation for image captioning. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 30(12): 4467−4480 doi: 10.1109/TCSVT.2019.2947482
[68]	Hu B, Guan Z H, Chen G R, Chen C L P. Neuroscience and network dynamics toward brain-inspired intelligence. IEEE Transactions on Cybernetics, 2022, 52(10): 10214−10227 doi: 10.1109/TCYB.2021.3071110
[69]	Sucholutsky I, Muttenthaler L, Weller A, Peng A D, Bobu A, Kim B, et al. Getting aligned on representational alignment. arXiv preprint arXiv: 2310.13018, 2023.
[70]	Chersoni E, Santus E, Huang C R, Lenci A. Decoding word embeddings with brain-based semantic features. Computational Linguistics, 2021, 47(3): 663−698 doi: 10.1162/coli_a_00412
[71]	Toneva M, Wehbe L. Interpreting and improving natural-language processing (in machines) with natural language-processing (in the brain). In: Proceedings of the 33rd International Conference on Neural Information Processing Systems. Vancouver, Canada: Curran Associates Inc., 2019. Article No. 1339
[72]	Yu S Y, Gu C Y, Huang K X, Li P. Predicting the next sentence (not word) in large language models: What model-brain alignment tells us about discourse comprehension. Science Advances, 2024, 10(21): Article No. eadn7744 doi: 10.1126/sciadv.adn7744
[73]	Cambria E, Das D, Bandyopadhyay S, Feraco A. Affective computing and sentiment analysis. A Practical Guide to Sentiment Analysis. Cham: Springer, 2017. 1−10
[74]	Mishra A, Dey K, Bhattacharyya P. Learning cognitive features from gaze data for sentiment and sarcasm classification using convolutional neural network. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Vancouver, Canada: ACL, 2017. 377−387
[75]	Zhang Z H, Wu C H, Chen H Y, Chen H Y. CogAware: Cognition-aware framework for sentiment analysis with textual representations. Knowledge-Based Systems, 2024, 299: Article No. 112094 doi: 10.1016/j.knosys.2024.112094
[76]	Montejo-Ráez A, Molina-González M D, Jiménez-Zafra S M, García-Cumbreras M Á, García-López L J. A survey on detecting mental disorders with natural language processing: Literature review, trends and challenges. Computer Science Review, 2024, 53: Article No. 100654 doi: 10.1016/j.cosrev.2024.100654
[77]	Ramachandran G, Yang R. CortexCompile: Harnessing cortical-inspired architectures for enhanced multi-agent NLP code synthesis. arXiv preprint arXiv: 2409.02938, 2024.
[78]	Li Z J, Zhao B, Zhang G Y, Dang J W. Brain network features differentiate intentions from different emotional expressions of the same text. In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Rhodes Island, Greece: IEEE, 2023. 1−5
[79]	Squires M, Tao X H, Elangovan S, Gururajan R, Zhou X J, Acharya U R, et al. Deep learning and machine learning in psychiatry: A survey of current progress in depression detection, diagnosis and treatment. Brain Informatics, 2023, 10(1): Article No. 10 doi: 10.1186/s40708-023-00188-6
[80]	Song G Z, Huang D G, Xiao Z F. A study of multilingual toxic text detection approaches under imbalanced sample distribution. Information, 2021, 12(5): Article No. 205 doi: 10.3390/info12050205
[81]	Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X H, Unterthiner T, et al. An image is worth 16 × 16 words: Transformers for image recognition at scale. arXiv preprint arXiv: 2021.11929, 2020.
[82]	Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez A N, et al. Attention is all you need. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach, USA: Curran Associates Inc., 2017. 6000−6010
[83]	Bi Y D, Abrol A, Jia S H, Fu Z N, Calhoun V D. Gray matters: An efficient vision transformer GAN framework for predicting functional network connectivity biomarkers from brain structure. BioRxiv, 2024.
[84]	Dong S L, Gong Y H, Shi J G, Shang M, Tao X Y, Wei X, et al. Brain cognition-inspired dual-pathway CNN architecture for image classification. IEEE Transactions on Neural Networks and Learning Systems, 2024, 35(7): 9900−9914 doi: 10.1109/TNNLS.2023.3237962
[85]	Liu L, Wang F, Zhou K, Ding N, Luo H. Perceptual integration rapidly activates dorsal visual pathway to guide local processing in early visual areas. PLoS Biology, 2017, 15(11): Article No. e2003646 doi: 10.1371/journal.pbio.2003646
[86]	Bar M. The proactive brain: Using analogies and associations to generate predictions. Trends in Cognitive Sciences, 2007, 11(7): 280−289 doi: 10.1016/j.tics.2007.05.005
[87]	Baram A B, Muller T H, Nili H, Garvert M M, Behrens T E J. Entorhinal and ventromedial prefrontal cortices abstract and generalize the structure of reinforcement learning problems. Neuron, 2021, 109(4): 713−723 doi: 10.1016/j.neuron.2020.11.024
[88]	van Holstein M, Floresco S B. Dissociable roles for the ventral and dorsal medial prefrontal cortex in cue-guided risk/reward decision making. Neuropsychopharmacology, 2020, 45(4): 683−693 doi: 10.1038/s41386-019-0557-7
[89]	Averbeck B, O'Doherty J P. Reinforcement-learning in fronto-striatal circuits. Neuropsychopharmacology, 2022, 47(1): 147−162 doi: 10.1038/s41386-021-01108-0
[90]	Hu S C, Shen L, Zhang Y, Chen Y X, Tao D C. On transforming reinforcement learning with transformers: The development trajectory. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024, 46(12): 8580−8599 doi: 10.1109/TPAMI.2024.3408271
[91]	Zhang Y, Jia M, Chen T, Li M, Wang J Y, Hu X M, et al. A neuroergonomics model for evaluating nuclear power plants operators' performance under heat stress driven by ECG time-frequency spectrums and fNIRS prefrontal cortex network: A CNN-GAT fusion model. Advanced Engineering Informatics, 2024, 62: Article No. 102563 doi: 10.1016/j.aei.2024.102563
[92]	Law C K, Kolling N, Chan C C H, Chau B K H. Frontopolar cortex represents complex features and decision value during choice between environments. Cell Reports, 2023, 42(6): Article No. 112555 doi: 10.1016/j.celrep.2023.112555
[93]	Lee J, Jung M, Lustig N, Lee J H. Neural representations of the perception of handwritten digits and visual objects from a convolutional neural network compared to humans. Human Brain Mapping, 2023, 44(5): 2018−2038 doi: 10.1002/hbm.26189
[94]	Viswanathan K A, Mylavarapu G, Chen K, Thomas J P. A study of prefrontal cortex task switching using spiking neural networks. In: Proceedings of the 12th International Conference on Advanced Computational Intelligence (ICACI). Dali, China: IEEE, 2020. 199−206
[95]	Li B Z, Pun S H, Feng W, Vai M I, Klug A, Lei T C. A spiking neural network model mimicking the olfactory cortex for handwritten digit recognition. In: Proceedings of the 9th International IEEE/EMBS Conference on Neural Engineering (NER). San Francisco, USA: IEEE, 2019. 1167−1170
[96]	Hyafil A, Summerfield C, Koechlin E. Two mechanisms for task switching in the prefrontal cortex. The Journal of Neuroscience, 2009, 29(16): 5135−5142 doi: 10.1523/JNEUROSCI.2828-08.2009
[97]	Kushleyeva Y, Salvucci D D, Lee F J. Deciding when to switch tasks in time-critical multitasking. Cognitive Systems Research, 2005, 6(1): 41−49 doi: 10.1016/j.cogsys.2004.09.005
[98]	Brass M, von Cramon D Y. The role of the frontal cortex in task preparation. Cerebral Cortex, 2002, 12(9): 908−914 doi: 10.1093/cercor/12.9.908
[99]	Wei Q L, Han L Y, Zhang T L. Learning and controlling multiscale dynamics in spiking neural networks using recursive least square modifications. IEEE Transactions on Cybernetics, 2024, 54(8): 4603−4616 doi: 10.1109/TCYB.2023.3343430
[100]	Demir A, Koike-Akino T, Wang Y, Haruna M, Erdogmus D. EEG-GNN: Graph neural networks for classification of electroencephalogram (EEG) signals. In: Proceedings of the 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC). Mexico, Mexico: IEEE, 2021. 1061−1067
[101]	Balaji S S, Parhi K K. Classifying subjects with PFC lesions from healthy controls during working memory encoding via graph convolutional networks. In: Proceedings of the 11th International IEEE/EMBS Conference on Neural Engineering (NER). Baltimore, USA: IEEE, 2023. 1−4
[102]	Yang Y W, Ye C F, Ma T. A deep connectome learning network using graph convolution for connectome-disease association study. Neural Networks, 2023, 164: 91−104 doi: 10.1016/j.neunet.2023.04.025
[103]	Achterberg J, Akarca D, Strouse D J, Duncan J, Astle D E. Spatially embedded recurrent neural networks reveal widespread links between structural and functional neuroscience findings. Nature Machine Intelligence, 2023, 5(12): 1369−1381 doi: 10.1038/s42256-023-00748-9
[104]	Jensen K T, Hennequin G, Mattar M G. A recurrent network model of planning explains hippocampal replay and human behavior. Nature Neuroscience, 2024, 27(7): 1340−1348 doi: 10.1038/s41593-024-01675-7
[105]	Pratiwi M. Comparative analysis of brain waves for EEG-based depression detection in the prefrontal cortex lobe using LSTM. In: Proceedings of the 7th International Conference on New Media Studies (CONMEDIA). Bali, Indonesia: IEEE, 2023. 173−178
[106]	Pratiwi M. EEG-based depression detection in the prefrontal cortex lobe using mRMR feature selection and bidirectional LSTM. Ultima Computing: Jurnal Sistem Komputer, 2023, 15(2): 71−78 doi: 10.31937/sk.v15i2.3426
[107]	Sharma S, Sharma S, Athaiya A. Activation functions in neural networks. International Journal of Engineering Applied Sciences and Technology, 2020, 4(12): 310−316
[108]	Jagtap A D, Kawaguchi K, Karniadakis G E. Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics, 2020, 404: Article No. 109136 doi: 10.1016/j.jcp.2019.109136
[109]	Abbasi J, Andersen P Ø. Physical activation functions (PAFs): An approach for more efficient induction of physics into physics-informed neural networks (PINNs). Neurocomputing, 2024, 608: Article No. 128352 doi: 10.1016/j.neucom.2024.128352
[110]	Jagtap A D, Karniadakis G E. How important are activation functions in regression and classification? A survey, performance comparison, and future directions. Journal of Machine Learning for Modeling and Computing, 2023, 4(1): 21−75 doi: 10.1615/JMachLearnModelComput.2023047367
[111]	Manola L, Roelofsen B H, Holsheimer J, Marani E, Geelen J. Modelling motor cortex stimulation for chronic pain control: Electrical potential field, activating functions and responses of simple nerve fibre models. Medical and Biological Engineering and Computing, 2005, 43(3): 335−343 doi: 10.1007/BF02345810
[112]	Steinerberger S, Wu H T. Fundamental component enhancement via adaptive nonlinear activation functions. Applied and Computational Harmonic Analysis, 2023, 63: 135−143 doi: 10.1016/j.acha.2022.11.007
[113]	Pappas C, Kovaios S, Moralis-Pegios M, Tsakyridis A, Giamougiannis G, Kirtas M, et al. Programmable tanh-, ELU-, sigmoid-, and sin-based nonlinear activation functions for neuromorphic photonics. IEEE Journal of Selected Topics in Quantum Electronics, 2023, 29(6: Photonic Signal Processing): Article No. 6101210
[114]	Ha D, Schmidhuber J. World models. arXiv preprint arXiv: 1803.10122, 2018.
[115]	Eslami S M A, Jimenez Rezende D, Besse F, Viola F, Morcos A S, Garnelo M, et al. Neural scene representation and rendering. Science, 2018, 360(6394): 1204−1210 doi: 10.1126/science.aar6170
[116]	Yamins D L K, Hong H, Cadieu C F, Solomon E A, Seibert D, Dicarlo J J. Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proceedings of the National Academy of Sciences of the United States of America, 2014, 111(23): 8619−8624
[117]	Friston K, Moran R J, Nagai Y, Taniguchi T, Gomi H, Tenenbaum J. World model learning and inference. Neural Networks, 2021, 144: 573−590 doi: 10.1016/j.neunet.2021.09.011
[118]	Robine J, HöFtmann M, Uelwer T, Harmeling S. Transformer-based world models are happy with 100k interactions. In: Proceedings of the Eleventh International Conference on Learning Representations. Kigali, Rwanda: ICLR, 2023.
[119]	Micheli V, Alonso E, Fleuret F. Transformers are sample-efficient world models. In: Proceedings of the Eleventh International Conference on Learning Representations. Kigali, Rwanda: ICLR, 2023.
[120]	Chen C, Wu Y F, Yoon J, Ahn S. TransDreamer: Reinforcement learning with transformer world models. arXiv preprint arXiv: 2202.09481, 2022.
[121]	Zhang W P, Wang G, Sun J, Yuan Y T, Huang G. STORM: Efficient stochastic transformer based world models for reinforcement learning. In: Proceedings of the 37th International Conference on Neural Information Processing Systems. New Orleans, USA: Curran Associates Inc., 2024. Article No. 1182
[122]	Hafner D, Pasukonis J, Ba J, Lillicrap T. Mastering diverse domains through world models. arXiv preprint arXiv: 2301.04104, 2023.
[123]	Hafner D, Lillicrap T P, Norouzi M, Ba J. Mastering Atari with discrete world models. arXiv preprint arXiv: 2010.02193, 2020.
[124]	Barto A G, Sutton R S, Anderson C W. Looking back on the actor-critic architecture. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2021, 51(1): 40−50 doi: 10.1109/TSMC.2020.3041775
[125]	Kaiser L, Babaeizadeh M, Miłos P, Osiński B, Campbell R H, Czechowski K, et al. Model based reinforcement learning for Atari. In: Proceedings of the 8th International Conference on Learning Representations. Addis Ababa, Ethiopia: ICLR, 2020.
[126]	Moerland T M, Broekens J, Plaat A, Jonker C M. Model-based reinforcement learning: A survey. Foundations and Trends® in Machine Learning, 2023, 16(1): 1−118
[127]	Gu A, Goel K, Ré C. Efficiently modeling long sequences with structured state spaces. arXiv preprint arXiv: 2111.00396, 2021.
[128]	Smith J T H, Warrington A, Linderman S. Simplified state space layers for sequence modeling. In: Proceedings of the Eleventh International Conference on Learning Representations. Kigali, Rwanda: ICLR, 2023.
[129]	Deng F, Park J, Ahn S. Facing off world model backbones: RNNs, transformers, and S4. In: Proceedings of the 37th International Conference on Neural Information Processing Systems. New Orleans, USA: Curran Associates Inc., 2024. Article No. 3188
[130]	Chen J Y, Li S E, Tomizuka M. Interpretable end-to-end urban autonomous driving with latent deep reinforcement learning. IEEE Transactions on Intelligent Transportation Systems, 2022, 23(6): 5068−5078 doi: 10.1109/TITS.2020.3046646
[131]	Hafner D, Lillicrap T, Ba J, Norouzi M. Dream to control: Learning behaviors by latent imagination. In: Proceedings of the 8th International Conference on Learning Representations. Addis Ababa, Ethiopia: ICLR, 2020.
[132]	Zhang Y H, Mu Y, Yang Y J, Guan Y, Li S E, Sun Q, et al. Steadily learn to drive with virtual memory. arXiv preprint arXiv: 2102.08072, 2021.
[133]	Gao Z Y, Mu Y, Chen C, Duan J L, Luo P, Lu Y F, et al. Enhance sample efficiency and robustness of end-to-end urban autonomous driving via semantic masked world model. IEEE Transactions on Intelligent Transportation Systems, 2024, 25(10): 13067−13079 doi: 10.1109/TITS.2024.3400227
[134]	Yu N G, Lv Z X, Yan J H, Wang Z X. Spatial cognition and decision model based on hippocampus-prefrontal cortex interaction. In: Proceedings of the China Automation Congress (CAC). Chongqing, China: IEEE, 2023. 3754−3759