Abstract: This paper proposes a kernel k-nearest neighbor algorithm, Kernel-kNN. First, an information energy metric for nearest neighbor learning is presented; it overcomes the difficulty of representing high-dimensional data with traditional distance metrics and improves the consistency between class similarity and distance. On this basis, the traditional kNN is extended to a nonlinear form, and semidefinite programming is used to learn a globally optimal metric matrix. The main characteristics of the algorithm are that it is well suited to high-dimensional data and effectively improves the classification performance of kNN. Experiments and analysis on multiple data sets show that, compared with the traditional kNN algorithm, Kernel-kNN achieves comparable classification accuracy on low-dimensional data and clearly better classification performance on high-dimensional data.
Key words:
- distance metric
- nonlinear transformation
- k-nearest neighbor (kNN)
- kernel method
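
To make the idea of a nonlinear kNN concrete, the following is a minimal sketch of nearest-neighbor voting with distances measured in a kernel-induced feature space, using the standard identity d²(x, y) = k(x, x) − 2k(x, y) + k(y, y). It is an illustration under stated assumptions, not the algorithm of this paper: it does not implement the information energy metric, and it does not learn a metric matrix by semidefinite programming. The function names (`rbf_kernel`, `kernel_distance_sq`, `kernel_knn_predict`), the `gamma` parameter, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel; gamma is an illustrative bandwidth parameter."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq)


def kernel_distance_sq(x, y, kernel):
    """Squared distance between phi(x) and phi(y), computed via the kernel trick."""
    return kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)


def kernel_knn_predict(train_X, train_y, query, k=3, kernel=rbf_kernel):
    """Majority vote among the k training points nearest to `query` in feature space."""
    scored = [
        (kernel_distance_sq(x, query, kernel), label)
        for x, label in zip(train_X, train_y)
    ]
    scored.sort(key=lambda pair: pair[0])  # nearest first
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Toy data (assumed for illustration only): two well-separated clusters.
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = ["a", "a", "a", "b", "b", "b"]
prediction = kernel_knn_predict(train_X, train_y, (0.15, 0.1), k=3)
```

With an RBF kernel the feature-space distance is a monotone function of the Euclidean distance, so the neighbor ranking matches ordinary kNN; the ranking can differ once another kernel is passed as `kernel` or the distance is replaced by a learned metric, as the abstract describes.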