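Entropy-weighting Multi-view Fuzzy C-means With Low Rank Constraint

ZHANG Jia-Xu^1, WANG Jun^{1, 2}, ZHANG Chun-Xiang^1, LIN De-Fu^1, ZHOU Ta^3, WANG Shi-Tong^1

doi: 10.16383/j.aas.c190350

1. School of Digital Media, Jiangnan University, Wuxi 214122
2. School of Communication and Information Engineering, Shanghai University, Shanghai 200444
3. School of Electronic Information, Jiangsu University of Science and Technology, Zhenjiang 212100

Funds: Supported by National Natural Science Foundation of China (61772239) and Natural Science Foundation of Jiangsu Province (BK20181339)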

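More Information

Author Bio:
ZHANG Jia-Xu: Master's student at the School of Digital Media, Jiangnan University. Main research interests: artificial intelligence and pattern recognition. E-mail: zhangjiaxu@hl.chinamobile.com

WANG Jun: Associate professor at the School of Communication and Information Engineering, Shanghai University. Main research interests: artificial intelligence, fuzzy clustering, and medical image classification. Corresponding author of this paper. E-mail: wangjun_sytu@hotmail.com

ZHANG Chun-Xiang: Master's student at the School of Digital Media, Jiangnan University. Main research interests: artificial intelligence and pattern recognition. E-mail: 17851308360@163.com

LIN De-Fu: Master's student at the School of Digital Media, Jiangnan University. Main research interests: artificial intelligence and pattern recognition. E-mail: jiangnandaxu_2022@yeah.net

ZHOU Ta: Associate professor at the School of Electronic Information, Jiangsu University of Science and Technology. Main research interests: artificial intelligence, pattern recognition, and intelligent systems. E-mail: jkdzhout@just.edu.cn

WANG Shi-Tong: Professor at the School of Digital Media, Jiangnan University. Main research interests: artificial intelligence and pattern recognition. E-mail: wxwangst@aliyun.com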

Publication history
- Received: 2019-05-09
- Accepted: 2019-07-17
- Published online: 2022-05-31
- Issue date: 2022-07-01


Abstract

Effectively mining both the consistency and the diversity within multi-view data are two key problems in building multi-view fuzzy clustering algorithms. Building on the Co-FKM framework, this paper proposes an entropy-weighting multi-view fuzzy C-means algorithm with low-rank constraint (LR-MVEWFCM). On the one hand, to capture the consistency across views, the nuclear norm is introduced to impose a low-rank constraint on the fuzzy membership matrices of the multiple views. On the other hand, an adaptive view-weighting strategy based on Shannon entropy is introduced, so that the algorithm handles the differences among views according to the importance of each view. The objective function is optimized with the alternating direction method of multipliers (ADMM). Experimental results on both artificial and UCI (University of California Irvine) datasets verify the effectiveness of the proposed method.
- Multi-view fuzzy clustering /
- Shannon entropy /
- low-rank constraint /
- nuclear norm /
- alternating direction method of multipliers (ADMM)
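
The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' update rule, which is not reproduced on this page. It assumes two textbook facts only: the proximal operator of the nuclear norm is singular value soft-thresholding (the step that typically appears inside ADMM for nuclear-norm terms), and minimizing a weighted cost plus a Shannon entropy regularizer over the simplex gives softmax-style weights. The function names, the stacking of per-view membership matrices, and the demo data are all hypothetical.

```python
import numpy as np


def singular_value_threshold(M, tau):
    """Proximal operator of tau * ||M||_*: soft-threshold the singular values."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    # Scale the columns of U by the shrunken singular values, then recombine.
    return (U * s_shrunk) @ Vt


def entropy_view_weights(view_costs, lam):
    """Minimizer of sum_k w_k * cost_k + lam * sum_k w_k * ln(w_k), sum_k w_k = 1.

    The closed form is w_k proportional to exp(-cost_k / lam); lam plays the
    role of a view-weight balance factor (assumed analogue of the paper's lambda).
    """
    costs = np.asarray(view_costs, dtype=float)
    # Subtract the minimum cost for numerical stability; the weights are unchanged.
    logits = -(costs - costs.min()) / lam
    w = np.exp(logits)
    return w / w.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical setup: K = 3 views, C = 2 clusters, N = 6 samples.
    memberships = []
    for _ in range(3):
        U_k = rng.random((2, 6))
        memberships.append(U_k / U_k.sum(axis=0))  # columns sum to 1
    stacked = np.vstack(memberships)  # assumed way of coupling the views
    low_rank = singular_value_threshold(stacked, tau=0.5)
    print("rank before:", np.linalg.matrix_rank(stacked))
    print("rank after :", np.linalg.matrix_rank(low_rank))
    print("view weights:", entropy_view_weights([1.0, 2.0, 3.0], lam=1.0))
```

A small `lam` concentrates the weight on the lowest-cost view, while a large `lam` pushes the weights toward uniform, which is one way to read the wide grid of balance-factor values used in the experiments.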

Fig. 1 Workflow of the Co-FKM algorithm on a multi-view clustering task

Fig. 2 Workflow of the LR-MVEWFCM algorithm on a multi-view clustering task

Fig. 3 Simulated dataset and the data under each view

Fig. 4 Influence of the low-rank constraint on algorithm performance (the x-axis is the dataset number and the y-axis is the clustering performance index)

Fig. 5 Convergence curve of the LR-MVEWFCM algorithm

Fig. 6 Parameter sensitivity analysis on simulated dataset 7

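Table 1 Parameter definitions and settings in the experiments

Algorithm	Description	Parameter settings
FCM	Classical single-view fuzzy clustering algorithm	Fuzzy index $m=\frac{\min (N, D-1)}{\min (N, D-1)-2}$, where $N$ is the number of samples and $D$ is the sample dimensionality
CombKM	Combined K-means algorithm	-
Co-FKM	Multi-view fuzzy clustering algorithm with collaborative partitioning	Fuzzy index $m=\frac{\min (N, D-1)}{\min (N, D-1)-2}$, collaborative learning coefficient $\eta \in \frac{K-1}{K}$, where $K$ is the number of views, step size $\rho=0.01$
Co-Clustering	Co-clustering algorithm based on the sample and feature spaces	Regularization coefficient $\lambda \in\left\{10^{-3}, 10^{-2}, \cdots, 10^{3}\right\}$, regularization coefficient $\mu \in\left\{10^{-3}, 10^{-2}, \cdots, 10^{3}\right\}$
LR-MVEWFCM	Entropy-weighting multi-view fuzzy C-means with low-rank constraint	View-weight balance factor $\lambda \in \left\{10^{-5}, 10^{-4}, \cdots, 10^{5}\right\}$, low-rank regularization coefficient $\theta \in \left\{10^{-3}, 10^{-2}, \cdots, 10^{3}\right\}$, fuzzy index $m=2$
MVEWFCM	LR-MVEWFCM with the low-rank regularization coefficient $\theta=0$	View-weight balance factor $\lambda \in \left\{10^{-5}, 10^{-4}, \cdots, 10^{5}\right\}$, fuzzy index $m=2$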
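
As a small worked illustration of Table 1, the sketch below evaluates the fuzzy-index formula listed for FCM and Co-FKM and enumerates the two search grids listed for LR-MVEWFCM. The function and variable names are hypothetical, and raising an error when $\min(N, D-1) \le 2$ is an assumption: the table does not say how that case is handled.

```python
def fuzzy_index(n_samples, n_features):
    """Fuzzy index m = min(N, D-1) / (min(N, D-1) - 2) from Table 1."""
    k = min(n_samples, n_features - 1)
    if k <= 2:
        # Assumption: the formula is treated as undefined here, because the
        # denominator would be zero or negative.
        raise ValueError("formula needs min(N, D-1) > 2")
    return k / (k - 2)


# Search grids for LR-MVEWFCM as listed in Table 1 (powers of ten).
lambda_grid = [10.0 ** p for p in range(-5, 6)]  # view-weight balance factor
theta_grid = [10.0 ** p for p in range(-3, 4)]   # low-rank regularization coefficient

if __name__ == "__main__":
    # Example: N = 150 samples with D = 4 features.
    print("m =", fuzzy_index(150, 4))
    print(len(lambda_grid), "lambda values,", len(theta_grid), "theta values")
```

With both grids searched jointly, this gives 11 x 7 = 77 parameter combinations per dataset.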


Table 2 Feature composition of the simulated dataset

View	Features
View 1	$x,y$
View 2	$y,z$
View 3	$x,z$
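
The view construction in Table 2 amounts to selecting pairs of coordinates from three-dimensional points. A minimal sketch follows, assuming the samples are stored as an array with columns ordered $(x, y, z)$. The names `VIEW_COLUMNS` and `split_views` and the random demo points are hypothetical, not the paper's data.

```python
import numpy as np

# Column indices for each view, assuming columns are ordered (x, y, z).
VIEW_COLUMNS = {
    "view1": [0, 1],  # x, y
    "view2": [1, 2],  # y, z
    "view3": [0, 2],  # x, z
}


def split_views(X):
    """Return a dict mapping each view name to its two-feature sub-matrix."""
    X = np.asarray(X)
    if X.ndim != 2 or X.shape[1] != 3:
        raise ValueError("expected an (N, 3) array of (x, y, z) samples")
    return {name: X[:, cols] for name, cols in VIEW_COLUMNS.items()}


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    points = rng.normal(size=(5, 3))  # placeholder data for illustration only
    for name, view in split_views(points).items():
        print(name, view.shape)
```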


Table 3 Performance of the proposed algorithm on the simulated datasets

No.	Views used	NMI	RI
1	View 1	1.0000 ± 0.0000	1.0000 ± 0.0000
2	View 2	0.7453 ± 0.0075	0.8796 ± 0.0081
3	View 3	0.8750 ± 0.0081	0.9555 ± 0.0006
4	View 1, View 2	1.0000 ± 0.0000	1.0000 ± 0.0000
5	View 1, View 3	1.0000 ± 0.0000	1.0000 ± 0.0000
6	View 2, View 3	0.9104 ± 0.0396	0.9634 ± 0.0192
7	View 1, View 2, View 3	1.0000 ± 0.0000	1.0000 ± 0.0000
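
The two indices reported in the tables, normalized mutual information (NMI) and the Rand index (RI), can be computed from a ground-truth labelling and a predicted clustering as sketched below. This is a plain NumPy sketch and not the authors' evaluation code. In particular, NMI has several normalizations in the literature; the geometric-mean form $I(U;V)/\sqrt{H(U)H(V)}$ used here is an assumption, since the page does not state which one the paper uses.

```python
from itertools import combinations

import numpy as np


def rand_index(y_true, y_pred):
    """Fraction of sample pairs on which the two labellings agree."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    n = len(y_true)
    agree = 0
    for i, j in combinations(range(n), 2):
        same_true = y_true[i] == y_true[j]
        same_pred = y_pred[i] == y_pred[j]
        agree += int(same_true == same_pred)
    return agree / (n * (n - 1) / 2)


def nmi(y_true, y_pred):
    """NMI with geometric-mean normalization (assumed variant)."""
    _, t = np.unique(np.asarray(y_true).ravel(), return_inverse=True)
    _, p = np.unique(np.asarray(y_pred).ravel(), return_inverse=True)
    # Contingency table of joint label counts.
    cont = np.zeros((t.max() + 1, p.max() + 1))
    np.add.at(cont, (t, p), 1)
    pij = cont / cont.sum()
    pi = pij.sum(axis=1)
    pj = pij.sum(axis=0)
    nz = pij > 0
    mi = np.sum(pij[nz] * np.log(pij[nz] / np.outer(pi, pj)[nz]))
    h_true = -np.sum(pi * np.log(pi))
    h_pred = -np.sum(pj * np.log(pj))
    if h_true == 0.0 or h_pred == 0.0:
        # Degenerate case: at least one labelling has a single cluster.
        return 1.0 if h_true == h_pred else 0.0
    return mi / np.sqrt(h_true * h_pred)


if __name__ == "__main__":
    truth = [0, 0, 1, 1, 2, 2]
    pred = [1, 1, 0, 0, 2, 2]  # same partition, permuted cluster ids
    print("RI :", rand_index(truth, pred))
    print("NMI:", nmi(truth, pred))
```

Both indices are invariant to a permutation of cluster labels, which is why they suit clustering results whose cluster ids carry no meaning.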


Table 4 Performance comparison of the algorithms on simulated dataset 7

Dataset	Metric	Co-Clustering	CombKM	FCM	Co-FKM	LR-MVEWFCM
A	NMI-mean	1.0000	0.9305	1.0000	1.0000	1.0000
	NMI-std	0.0000	0.1464	0.0000	0.0000	0.0000
	RI-mean	1.0000	0.9445	1.0000	1.0000	1.0000
	RI-std	0.0000	0.1171	0.0000	0.0000	0.0000

Table 5 Multi-view data constructed from UCI datasets

No.	Source dataset	Description	Features per view	Samples	Views	Classes
8	IS	Shape	9	2310	2	7
		RGB	9
9	Iris	Sepal length	2	150	2	3
		Sepal width	2
		Petal length	2
		Petal width	2
10	Balance	Left-arm weight	2	625	2	3
		Left-arm length	2
		Right-arm weight	2
		Right-arm length	2
11	Iris	Sepal length	1	150	4	3
		Sepal width	1
		Petal length	1
		Petal width	1
12	Balance	Left-arm weight	1	625	4	3
		Left-arm length	1
		Right-arm weight	1
		Right-arm length	1
13	Ionosphere	Each feature forms its own view	1	351	34	2
14	Wine	Each feature forms its own view	1	178	13	3

Table 6 Comparison of NMI values of the five clustering methods

No.	Co-Clustering		CombKM		FCM		Co-FKM		LR-MVEWFCM
	Mean ± std	P-value	Mean ± std	P-value	Mean ± std	P-value	Mean ± std	P-value	Mean ± std
8	0.5771 ± 0.0023	0.0019	0.5259 ± 0.0551	0.2056	0.5567 ± 0.0184	0.0044	0.5881 ± 0.0109	3.76×10⁻⁴	0.5828 ± 0.0044
9	0.7582 ± 7.4015×10⁻¹⁷	2.03×10⁻²⁴	0.7251 ± 0.0698	2.32×10⁻⁷	0.7578 ± 0.0698	1.93×10⁻²⁴	0.8317 ± 0.0064	8.88×10⁻¹⁶	0.9029 ± 0.0057
10	0.2455 ± 0.0559	0.0165	0.1562 ± 0.0749	3.47×10⁻⁵	0.1813 ± 0.1172	0.0061	0.2756 ± 0.0309	0.1037	0.3030 ± 0.0402
11	0.7582 ± 1.1703×10⁻¹⁶	2.28×10⁻¹⁶	0.7468 ± 0.0079	5.12×10⁻¹⁶	0.7578 ± 1.1703×10⁻¹⁶	5.04×10⁻¹⁶	0.8244 ± 1.1102×10⁻¹⁶	2.16×10⁻¹⁶	0.8768 ± 0.0097
12	0.2603 ± 0.0685	0.3825	0.1543 ± 0.0763	4.61×10⁻⁴	0.2264 ± 0.1127	0.1573	0.2283 ± 0.0294	0.0146	0.2863 ± 0.0611
13	0.1385 ± 0.0085	2.51×10⁻⁹	0.1349 ± 2.9257×10⁻¹⁷	2.35×10⁻¹³	0.1299 ± 0.0984	2.60×10⁻¹⁰	0.2097 ± 0.0329	0.0483	0.2608 ± 0.0251
14	0.4288 ± 1.1703×10⁻¹⁶	1.26×10⁻⁸	0.4215 ± 0.0095	7.97×10⁻⁹	0.4334 ± 5.8514×10⁻¹⁷	2.39×10⁻⁸	0.5295 ± 0.0301	0.4376	0.5413 ± 0.0364

Table 7 Comparison of RI values of the five clustering methods

No.	Co-Clustering		CombKM		FCM		Co-FKM		LR-MVEWFCM
	Mean ± std	P-value	Mean ± std	P-value	Mean ± std	P-value	Mean ± std	P-value	Mean ± std
8	0.8392 ± 0.0010	1.3475×10⁻¹⁴	0.8112 ± 0.0369	1.95×10⁻⁷	0.8390 ± 0.0115	0.0032	0.8571 ± 0.0019	0.0048	0.8508 ± 0.0013
9	0.8797 ± 0.0014	1.72×10⁻²⁶	0.8481 ± 0.0667	2.56×10⁻⁵	0.8859 ± 1.1703×10⁻¹⁶	6.49×10⁻²⁶	0.9358 ± 0.0037	3.29×10⁻¹⁴	0.9665 ± 0.0026
10	0.6515 ± 0.0231	3.13×10⁻⁴	0.6059 ± 0.0340	1.37×10⁻⁶	0.6186 ± 0.0624	0.0016	0.6772 ± 0.0227	0.0761	0.6958 ± 0.0215
11	0.8797 ± 0.0014	1.25×10⁻¹⁸	0.8755 ± 0.0029	5.99×10⁻¹²	0.8859 ± 0.0243	2.33×10⁻¹⁸	0.9267 ± 2.3406×10⁻¹⁶	5.19×10⁻¹⁸	0.9527 ± 0.0041
12	0.6511 ± 0.0279	0.0156	0.6024 ± 0.0322	2.24×10⁻⁵	0.6509 ± 0.0652	0.1139	0.6511 ± 0.0189	0.008	0.6902 ± 0.0370
13	0.5877 ± 0.0030	1.35×10⁻¹²	0.5888 ± 0.0292	2.10×10⁻¹⁴	0.5818 ± 1.1703×10⁻¹⁶	4.6351×10⁻¹³	0.6508 ± 0.0147	0.0358	0.6855 ± 0.0115
14	0.7187 ± 1.1703×10⁻¹⁶	3.82×10⁻⁶	0.7056 ± 0.0168	1.69×10⁻⁶	0.7099 ± 1.1703×10⁻¹⁶	8.45×10⁻⁷	0.7850 ± 0.0162	0.5905	0.7917 ± 0.0353
