An Improved Criterion for Evaluating the Discernibility of a Feature Subset

Xie Juan-Ying, Wu Zhao-Zhong, Zheng Qing-Quan, Wang Ming-Zhao

Citation: Xie Juan-Ying, Wu Zhao-Zhong, Zheng Qing-Quan, Wang Ming-Zhao. An improved criterion for evaluating the discernibility of a feature subset. Acta Automatica Sinica, 2022, 48(5): 1292−1306 doi: 10.16383/j.aas.c200704

Funds: Supported by National Natural Science Foundation of China (62076159, 12031010, 61673251) and Fundamental Research Funds for the Central Universities (GK202105003)
More Information
    Author Bio:

    XIE Juan-Ying Professor at the School of Computer Science, Shaanxi Normal University. Her research interests cover machine learning, data mining, and biomedical big data analysis. Corresponding author of this paper. E-mail: xiejuany@snnu.edu.cn

    WU Zhao-Zhong Master's student at the School of Computer Science, Shaanxi Normal University. His research interests cover machine learning and biomedical data analysis. E-mail: wzz@snnu.edu.cn

    ZHENG Qing-Quan Master's student at the School of Computer Science, Shaanxi Normal University. His research interests cover data mining and biomedical data analysis. E-mail: zhengqingqsnnu@163.com

    WANG Ming-Zhao Ph.D. candidate at the College of Life Sciences, Shaanxi Normal University. He received his master's degree from the School of Computer Science, Shaanxi Normal University in 2017. His main research interest is bioinformatics. E-mail: wangmz2017@snnu.edu.cn

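  • Abstract: The discernibility of feature subsets (DFS) criterion does not account for the effect that the measurement scales of features have on the discriminative ability of a feature subset. To address this defect, the coefficient of variation is introduced and the generalized discernibility of feature subsets (GDFS) criterion is proposed. Combining GDFS with four search strategies, namely sequential forward search (SFS), sequential backward search (SBS), sequential forward floating search (SFFS), and sequential backward floating search (SBFS), and using the extreme learning machine (ELM) as the classifier, yields four hybrid feature selection algorithms. Experiments on UCI datasets and gene datasets, together with comparisons against DFS, Relief, DRJMIM, mRMR, LLE Score, AVC, SVM-RFE, VMInaive, AMID, AMID-DWSFS, CFR, and FSSC-SD and statistical significance tests, show that the proposed GDFS outperforms DFS and selects feature subsets with better classification ability.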
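The abstract motivates GDFS by the sensitivity of DFS to measurement scale and names the coefficient of variation as the remedy. The sketch below only illustrates why that statistic is scale-free; it is not the GDFS formula, which this page does not reproduce. The sample values and the helper name `coefficient_of_variation` are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Return the population standard deviation divided by the mean."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for zero mean")
    return statistics.pstdev(values) / mean


# Hypothetical feature: the same five measurements in metres and millimetres.
heights_m = [1.52, 1.60, 1.68, 1.75, 1.83]
heights_mm = [h * 1000 for h in heights_m]

std_m = statistics.pstdev(heights_m)
std_mm = statistics.pstdev(heights_mm)
cv_m = coefficient_of_variation(heights_m)
cv_mm = coefficient_of_variation(heights_mm)

# The standard deviation scales with the unit; the coefficient of variation does not.
print(f"std: {std_m:.4f} m vs {std_mm:.1f} mm")
print(f"CV:  {cv_m:.4f} vs {cv_mm:.4f}")
```

Changing the unit multiplies the standard deviation by 1000 but leaves the ratio to the mean unchanged, which is the property a scale-aware criterion can exploit.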
  • Fig. 1  The 5-fold cross-validation experimental results of DFS+SFS

    Fig. 2  The 5-fold cross-validation experimental results of DFS+SBS

    Fig. 3  The 5-fold cross-validation experimental results of DFS+SFFS

    Fig. 4  The 5-fold cross-validation experimental results of DFS+SBFS

    Fig. 5  Nemenyi test results of 13 feature selection algorithms in terms of performance metrics of ELM built on their selected features

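    Table 1  Descriptions of datasets from UCI

    | Dataset | #Samples | #Features | #Classes |
    |---|---|---|---|
    | iris | 150 | 4 | 3 |
    | thyroid-disease | 215 | 5 | 3 |
    | glass | 214 | 9 | 2 |
    | wine | 178 | 13 | 3 |
    | Heart Disease | 297 | 13 | 3 |
    | WDBC | 569 | 30 | 2 |
    | WPBC | 194 | 33 | 2 |
    | dermatology | 358 | 34 | 6 |
    | ionosphere | 351 | 34 | 2 |
    | Handwrite | 323 | 256 | 2 |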

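    Table 2  The 5-fold cross-validation experimental results of GDFS+SFS and DFS+SFS algorithms

    | Data sets | #Original features | #Selected features (GDFS) | #Selected features (DFS) | Test accuracy (GDFS) | Test accuracy (DFS) |
    |---|---|---|---|---|---|
    | iris | 4 | 2.2 | 3 | 0.9733 | 0.9667 |
    | thyroid-disease | 5 | 1.4 | 1.6 | 0.9163 | 0.9070 |
    | glass | 9 | 2.4 | 3.2 | 0.9346 | 0.9439 |
    | wine | 13 | 3.6 | 3.6 | 0.9272 | 0.8925 |
    | Heart Disease | 13 | 2.8 | 3.4 | 0.5889 | 0.5654 |
    | WDBC | 30 | 3.4 | 6.2 | 0.9227 | 0.9193 |
    | WPBC | 33 | 1.8 | 2 | 0.7835 | 0.7732 |
    | dermatology | 34 | 4.6 | 5 | 0.7151 | 0.6938 |
    | ionosphere | 34 | 4.4 | 3 | 0.9029 | 0.8717 |
    | Handwrite | 256 | 7.4 | 7.2 | 0.9657 | 0.9440 |
    | Average | 43.1 | 3.4 | 3.82 | 0.8630 | 0.8478 |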
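    Table 2 pairs each criterion with sequential forward search. As a rough sketch of how such a greedy search proceeds, the code below grows a subset one feature at a time under an arbitrary subset criterion. The function names, the toy criterion, and the stopping rule (stop when no extension improves the score) are assumptions for illustration; the page does not state how the paper combines the criterion with the ELM classifier or when its search stops.

```python
def sequential_forward_search(n_features, score, max_size=None):
    """Greedy SFS: repeatedly add the feature that raises `score` the most.

    `score` is any callable mapping a tuple of feature indices to a number
    (higher is better). It stands in for a subset criterion such as DFS or GDFS.
    """
    selected = []
    best = float("-inf")
    remaining = set(range(n_features))
    limit = n_features if max_size is None else max_size
    while remaining and len(selected) < limit:
        # Evaluate every one-feature extension of the current subset.
        candidate, candidate_score = max(
            ((f, score(tuple(selected + [f]))) for f in sorted(remaining)),
            key=lambda pair: pair[1],
        )
        if candidate_score <= best:
            break  # assumed stopping rule: no extension improves the criterion
        selected.append(candidate)
        remaining.remove(candidate)
        best = candidate_score
    return selected, best


# Toy criterion (hypothetical): features 1 and 3 are useful, each feature costs 0.1.
def toy_score(subset):
    useful = {1: 1.0, 3: 0.6}
    return sum(useful.get(f, 0.0) for f in subset) - 0.1 * len(subset)


subset, value = sequential_forward_search(5, toy_score)
print(subset, round(value, 2))
```

    Sequential backward search follows the same pattern in reverse, starting from the full feature set and removing one feature per step; the floating variants add conditional backtracking steps after each move.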

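    Table 5  The 5-fold cross-validation experimental results of GDFS+SBFS and DFS+SBFS algorithms

    | Data sets | #Original features | #Selected features (GDFS) | #Selected features (DFS) | Test accuracy (GDFS) | Test accuracy (DFS) |
    |---|---|---|---|---|---|
    | iris | 4 | 2.4 | 2.8 | 0.98 | 0.9667 |
    | thyroid-disease | 5 | 2.4 | 2.2 | 0.9395 | 0.9209 |
    | glass | 9 | 5.4 | 4 | 0.8979 | 0.9490 |
    | wine | 13 | 9.2 | 9.4 | 0.6519 | 0.6086 |
    | Heart Disease | 13 | 5.4 | 6.4 | 0.5757 | 0.5655 |
    | WDBC | 30 | 22.8 | 24.6 | 0.8911 | 0.8893 |
    | WPBC | 33 | 24.6 | 25.4 | 0.7681 | 0.7319 |
    | dermatology | 34 | 28.2 | 27.2 | 0.9444 | 0.9362 |
    | ionosphere | 34 | 28.4 | 26.2 | 0.9174 | 0.9087 |
    | Handwrite | 256 | 137.4 | 148 | 0.9938 | 0.9722 |
    | Average | 43.1 | 26.62 | 27.62 | 0.8560 | 0.8449 |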

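    Table 3  The 5-fold cross-validation experimental results of GDFS+SBS and DFS+SBS algorithms

    | Data sets | #Original features | #Selected features (GDFS) | #Selected features (DFS) | Test accuracy (GDFS) | Test accuracy (DFS) |
    |---|---|---|---|---|---|
    | iris | 4 | 2.6 | 3.2 | 0.9867 | 0.9733 |
    | thyroid-disease | 5 | 2.8 | 3.2 | 0.9269 | 0.9070 |
    | glass | 9 | 8.2 | 6.8 | 0.9580 | 0.9375 |
    | wine | 13 | 12 | 11.6 | 0.6855 | 0.6515 |
    | Heart Disease | 13 | 11.8 | 11.8 | 0.5490 | 0.5419 |
    | WDBC | 30 | 28 | 28.8 | 0.8981 | 0.8616 |
    | WPBC | 33 | 30.8 | 31.6 | 0.7785 | 0.7633 |
    | dermatology | 34 | 31 | 31 | 0.9443 | 0.9303 |
    | ionosphere | 34 | 31.8 | 32.2 | 0.9031 | 0.8947 |
    | Handwrite | 256 | 245 | 248.6 | 1 | 0.9936 |
    | Average | 43.1 | 40.4 | 40.88 | 0.8630 | 0.8455 |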

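    Table 4  The 5-fold cross-validation experimental results of GDFS+SFFS and DFS+SFFS algorithms

    | Data sets | #Original features | #Selected features (GDFS) | #Selected features (DFS) | Test accuracy (GDFS) | Test accuracy (DFS) |
    |---|---|---|---|---|---|
    | iris | 4 | 2.8 | 3 | 0.9867 | 0.9667 |
    | thyroid-disease | 5 | 2.2 | 2.2 | 0.9395 | 0.9349 |
    | glass | 9 | 4.2 | 4.4 | 0.9629 | 0.9442 |
    | wine | 13 | 4.2 | 4.4 | 0.9261 | 0.9041 |
    | Heart Disease | 13 | 4.4 | 4.8 | 0.5928 | 0.5757 |
    | WDBC | 30 | 11 | 11.4 | 0.9385 | 0.9074 |
    | WPBC | 33 | 5.8 | 4.4 | 0.7943 | 0.7886 |
    | dermatology | 34 | 16.8 | 17.4 | 0.9522 | 0.9552 |
    | ionosphere | 34 | 9.6 | 10.2 | 0.9173 | 0.9231 |
    | Handwrite | 256 | 42.2 | 40.8 | 0.9907 | 0.9846 |
    | Average | 43.1 | 10.32 | 10.3 | 0.8992 | 0.8885 |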

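    Table 6  Descriptions of gene datasets used in experiments

    | Dataset | #Samples | #Features | #Classes |
    |---|---|---|---|
    | Colon | 62 | 2000 | 2 |
    | Prostate | 102 | 12625 | 2 |
    | Myeloma | 173 | 12625 | 2 |
    | Gas2 | 124 | 22283 | 2 |
    | SRBCT | 83 | 2308 | 4 |
    | Carcinoma | 174 | 9182 | 11 |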

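    Table 7  The 5-fold cross-validation experimental results of all algorithms on datasets from Table 6

    | Data sets | Algorithm | #Features | Accuracy | AUC | recall | precision | F-measure | F2-measure |
    |---|---|---|---|---|---|---|---|---|
    | Colon | GDFS+SFFS | 5.2 | 0.7590 | 0.8925 | 0.9 | 0.7 | 0.78 | 0.4133 |
    | | DFS+SFFS | 5.4 | 0.7256 | 0.78 | 0.8250 | 0.6856 | 0.7352 | 0.2332 |
    | | Relief | 8 | 0.7231 | 0.7575 | 0.9 | 0.6291 | 0.7396 | 0.16 |
    | | DRJMIM | 13 | 0.7282 | 0.7825 | 0.8750 | 0.6642 | 0.7495 | 0.3250 |
    | | mRMR | 5 | 0.7602 | 0.7325 | 0.85 | 0.6281 | 0.7185 | 0.1578 |
    | | LLE Score | 7 | 0.7577 | 0.6563 | 0.8750 | 0.6537 | 0.7431 | 0.2057 |
    | | AVC | 2 | 0.7256 | 0.7297 | 0.86 | 0.6439 | 0.7256 | 0.2126 |
    | | SVM-RFE | 5 | 0.7577 | 0.7588 | 0.75 | 0.6273 | 0.6775 | 0.3260 |
    | | VMInaive | 2 | 0.7423 | 1 | 1 | 0.6462 | 0.7848 | 0 |
    | | AMID | 8 | 0.7436 | 0.95 | 0.95 | 0.6328 | 0.7581 | 0 |
    | | AMID-DWSFS | 2 | 0.8397 | 0.9875 | 0.9750 | 0.6688 | 0.7895 | 0.1436 |
    | | CFR | 3 | 0.7603 | 0.95 | 1 | 0.6462 | 0.7848 | 0 |
    | | FSSC-SD | 2 | 0.7269 | 0.9750 | 0.9750 | 0.6401 | 0.7721 | 0 |
    | Prostate | GDFS+SFFS | 6.4 | 0.9305 | 0.9029 | 0.8836 | 0.8836 | 0.8829 | 0.8818 |
    | | DFS+SFFS | 6.6 | 0.9105 | 0.9349 | 0.8816 | 0.8818 | 0.8529 | 0.8497 |
    | | Relief | 11 | 0.93 | 0.8525 | 0.8255 | 0.7824 | 0.7981 | 0.79 |
    | | DRJMIM | 9 | 0.94 | 0.8629 | 0.7891 | 0.8747 | 0.8216 | 0.83 |
    | | mRMR | 12 | 0.9414 | 0.7895 | 0.7327 | 0.7816 | 0.7520 | 0.7597 |
    | | LLE Score | 26 | 0.9119 | 0.6796 | 0.7291 | 0.6582 | 0.6847 | 0.6616 |
    | | AVC | 12 | 0.9514 | 0.8144 | 0.7655 | 0.7598 | 0.7592 | 0.7573 |
    | | SVM-RFE | 22 | 0.92 | 0.8453 | 0.6927 | 0.8474 | 0.7567 | 0.7824 |
    | | VMInaive | 9 | 0.9419 | 0.8605 | 0.7655 | 0.7418 | 0.7481 | 0.7580 |
    | | AMID | 27 | 0.9314 | 0.7929 | 0.7655 | 0.7936 | 0.7690 | 0.7797 |
    | | AMID-DWSFS | 4 | 0.9514 | 0.7251 | 0.7127 | 0.7171 | 0.7011 | 0.7098 |
    | | CFR | 7 | 0.9410 | 0.7840 | 0.88 | 0.7430 | 0.7922 | 0.7942 |
    | | FSSC-SD | 23 | 0.9024 | 0.7796 | 0.8018 | 0.8205 | 0.7892 | 0.8130 |
    | Myeloma | GDFS+SFFS | 9.6 | 0.7974 | 0.6805 | 0.8971 | 0.8230 | 0.8558 | 0.5463 |
    | | DFS+SFFS | 9.8 | 0.7744 | 0.6296 | 0.8971 | 0.8047 | 0.8474 | 0.3121 |
    | | Relief | 23 | 0.8616 | 0.6453 | 0.8693 | 0.8225 | 0.8415 | 0.4631 |
    | | DRJMIM | 36 | 0.8559 | 0.6210 | 0.8392 | 0.7881 | 0.8124 | 0.2682 |
    | | mRMR | 12 | 0.8436 | 0.6332 | 0.8095 | 0.8046 | 0.8067 | 0.3539 |
    | | LLE Score | 64 | 0.8492 | 0.6169 | 0.9127 | 0.7909 | 0.8461 | 0.2313 |
    | | AVC | 22 | 0.8329 | 0.5820 | 0.8974 | 0.8098 | 0.8501 | 0.3809 |
    | | SVM-RFE | 20 | 0.8330 | 0.6270 | 0.8971 | 0.7935 | 0.8416 | 0.3846 |
    | | VMInaive | 19 | 0.8383 | 0.5639 | 0.8847 | 0.7902 | 0.8331 | 0.2691 |
    | | AMID | 11 | 0.8325 | 0.6743 | 0.8979 | 0.8282 | 0.8603 | 0.5254 |
    | | AMID-DWSFS | 38 | 0.8381 | 0.6233 | 0.8381 | 0.8197 | 0.8249 | 0.5224 |
    | | CFR | 14 | 0.8504 | 0.5931 | 0.9124 | 0.8014 | 0.8523 | 0.3010 |
    | | FSSC-SD | 15 | 0.8381 | 0.6662 | 0.8754 | 0.8173 | 0.8438 | 0.4992 |
    | Gas2 | GDFS+SFFS | 7.4 | 0.9840 | 0.9704 | 0.9051 | 0.9846 | 0.9412 | 0.9474 |
    | | DFS+SFFS | 8.4 | 0.9429 | 0.9465 | 0.9064 | 0.9212 | 0.9203 | 0.9018 |
    | | Relief | 4 | 0.9763 | 0.9520 | 0.8577 | 0.9316 | 0.8911 | 0.9005 |
    | | DRJMIM | 19 | 0.9750 | 0.9004 | 0.8192 | 0.8848 | 0.8449 | 0.8584 |
    | | mRMR | 5 | 0.9756 | 0.9358 | 0.8551 | 0.9131 | 0.8815 | 0.8895 |
    | | LLE Score | 25 | 0.9769 | 0.9312 | 0.8659 | 0.8748 | 0.8449 | 0.8538 |
    | | AVC | 3 | 0.9840 | 0.9073 | 0.8897 | 0.9390 | 0.9122 | 0.9160 |
    | | SVM-RFE | 18 | 0.9756 | 0.9009 | 0.8205 | 0.9052 | 0.8503 | 0.8716 |
    | | VMInaive | 10 | 0.9763 | 0.9425 | 0.7372 | 0.9778 | 0.8311 | 0.8778 |
    | | AMID | 16 | 0.9833 | 0.9305 | 0.9205 | 0.8829 | 0.8968 | 0.9013 |
    | | AMID-DWSFS | 2 | 0.9840 | 0.9247 | 0.8359 | 0.9424 | 0.8839 | 0.8977 |
    | | CFR | 10 | 0.9917 | 0.9080 | 0.9013 | 0.8236 | 0.8432 | 0.8434 |
    | | FSSC-SD | 16 | 0.9596 | 0.9095 | 0.8538 | 0.8758 | 0.8555 | 0.8642 |
    | SRBCT | GDFS+SFFS | 11.6 | 0.9372 | 0.9749 | 0.9567 | 0.9684 | 0.9579 | 0.9573 |
    | | DFS+SFFS | 11.6 | 0.9034 | 0.9130 | 0.9356 | 0.9449 | 0.9452 | 0.9352 |
    | | Relief | 10 | 0.9631 | 0.9479 | 0.9439 | 0.9589 | 0.9467 | 0.9390 |
    | | DRJMIM | 4 | 0.9389 | 0.9363 | 0.9656 | 0.9511 | 0.9555 | 0.9503 |
    | | mRMR | 8 | 0.9528 | 0.9479 | 0.9283 | 0.9624 | 0.9275 | 0.9294 |
    | | LLE Score | 11 | 0.9271 | 0.8941 | 0.9333 | 0.9332 | 0.9247 | 0.9154 |
    | | AVC | 8 | 0.9042 | 0.9355 | 0.9139 | 0.9544 | 0.9223 | 0.9183 |
    | | SVM-RFE | 13 | 0.8421 | 0.9149 | 0.9128 | 0.9385 | 0.9159 | 0.8240 |
    | | VMInaive | 14 | 0.9409 | 0.9181 | 0.9250 | 0.9429 | 0.9269 | 0.9188 |
    | | AMID | 13 | 0.9387 | 0.8999 | 0.9567 | 0.9335 | 0.9407 | 0.9239 |
    | | AMID-DWSFS | 9 | 0.9167 | 0.8151 | 0.8178 | 0.8516 | 0.82 | 0.7466 |
    | | CFR | 8 | 0.9314 | 0.6839 | 0.8994 | 0.8570 | 0.8693 | 0.7150 |
    | | FSSC-SD | 6 | 0.8806 | 0.9096 | 0.9267 | 0.9422 | 0.9284 | 0.9160 |
    | Carcinoma | GDFS+SFFS | 23.4 | 0.7622 | 0.9037 | 0.7872 | 0.7879 | 0.7839 | 0.5570 |
    | | DFS+SFFS | 19.4 | 0.7469 | 0.8998 | 0.7808 | 0.7869 | 0.7801 | 0.6261 |
    | | Relief | 42 | 0.7351 | 0.8701 | 0.7687 | 0.7785 | 0.7680 | 0.5392 |
    | | DRJMIM | 13 | 0.7757 | 0.8991 | 0.6742 | 0.6621 | 0.6656 | 0.4557 |
    | | mRMR | 24 | 0.8079 | 0.9188 | 0.7613 | 0.7505 | 0.7533 | 0.5089 |
    | | LLE Score | 76 | 0.6682 | 0.8452 | 0.6689 | 0.6702 | 0.6663 | 0.4109 |
    | | AVC | 77 | 0.7227 | 0.8746 | 0.7872 | 0.7790 | 0.7796 | 0.5068 |
    | | SVM-RFE | 30 | 0.7213 | 0.87 | 0.7027 | 0.6933 | 0.6929 | 0.4065 |
    | | VMInaive | 33 | 0.7443 | 0.8784 | 0.7487 | 0.7527 | 0.7441 | 0.4731 |
    | | AMID | 42 | 0.7307 | 0.8878 | 0.7295 | 0.7165 | 0.7194 | 0.4841 |
    | | AMID-DWSFS | 38 | 0.7412 | 0.6231 | 0.7558 | 0.7447 | 0.7457 | 0.4255 |
    | | CFR | 33 | 0.7054 | 0.6216 | 0.7514 | 0.74 | 0.7410 | 0.5315 |
    | | FSSC-SD | 21 | 0.7306 | 0.8716 | 0.7039 | 0.7016 | 0.6992 | 0.4344 |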

    Table 8  The Friedman's test of the classification capability of feature subsets of all algorithms

    | | Accuracy | AUC | recall | precision | F-measure | F2-measure |
    |---|---|---|---|---|---|---|
    | $\chi^2$ | 23.4094 | 27.5527 | 22.1585 | 29.2936 | 26.7608 | 32.5446 |
    | df | 12 | 12 | 12 | 12 | 12 | 12 |
    | p | 0.0244 | 0.0064 | 0.0358 | 0.0036 | 0.0084 | 0.0011 |
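    The p row of Table 8 can be cross-checked from the reported statistics. The sketch below assumes the p-values come from the usual chi-square approximation of the Friedman statistic, with df = 12 corresponding to 13 compared algorithms minus one; the page does not confirm how the authors computed them. The helper `chi2_sf_even_df` is a hypothetical name, and it uses the closed-form upper tail that holds only for even degrees of freedom.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X >= x) of a chi-square variable with even df.

    For df = 2m the survival function has the closed form
    exp(-x/2) * sum_{i=0}^{m-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i  # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


# Chi-square statistics as reported in Table 8 (df = 12 for every metric).
reported = {
    "Accuracy": 23.4094,
    "AUC": 27.5527,
    "recall": 22.1585,
    "precision": 29.2936,
    "F-measure": 26.7608,
    "F2-measure": 32.5446,
}
for metric, stat in reported.items():
    print(f"{metric:>10}: p = {chi2_sf_even_df(stat, 12):.4f}")
```

    Each printed value can be compared with the corresponding entry in the p row; all six reported p-values fall below 0.05.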
Publication history
  • Received:  2020-09-01
  • Revised:  2021-03-02
  • Published online:  2021-04-25
  • Issue date:  2022-05-13
