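Abstract (translated from Chinese):
CAPTCHA is a network security mechanism that prevents bots from abusing resources intended for human users. Studying CAPTCHA recognition techniques helps expose weaknesses in CAPTCHA itself and thereby makes it more secure. For closely connected CAPTCHAs, which existing methods have difficulty recognizing, this paper proposes a new recognition algorithm. The algorithm first uses a recurrent neural network (RNN) to recognize the CAPTCHA. Then, to improve the reliability of the recognition results, it applies a new SVM-based rejection algorithm, with dimensionality reduction performed on the rejection features. Experimental results show that: 1) the proposed recognition algorithm can recognize closely connected CAPTCHAs, and its results are highly reliable; 2) the new rejection algorithm has a clear advantage over other rejection algorithms; 3) dimensionality reduction further improves the performance of the rejection algorithm, yielding higher reliability.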
Abstract:
CAPTCHA is a network security mechanism that prevents machines from abusing network resources intended for humans. Studying CAPTCHA recognition helps reveal its hidden defects and thus makes it more secure. To read closely connected CAPTCHAs, which state-of-the-art methods can hardly recognize, this paper proposes a new recognition algorithm based on rejection. The algorithm first uses a recurrent neural network (RNN) to recognize the unknown CAPTCHA. Then, to make the recognition results reliable, it applies a new SVM-based rejection algorithm and performs dimensionality reduction on the rejection features. Experimental results show three things. First, the proposed recognition algorithm can recognize closely connected CAPTCHAs with high reliability. Second, the new rejection algorithm is superior to other state-of-the-art rejection methods. Third, dimensionality reduction improves the performance of the rejection algorithm, making the recognition results more reliable.
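
The effect of rejection on reliability can be illustrated with a minimal sketch. It is not the SVM-based rejector proposed in this paper: a plain confidence threshold stands in for it, and the `Prediction` type, the confidence scores, and the sample data are all hypothetical. Reliability is assumed here to mean the fraction of accepted results that are correct, a common convention that the abstract itself does not define.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Prediction:
    label: str         # string output by the recognizer (hypothetical)
    confidence: float  # hypothetical confidence score in [0, 1]
    truth: str         # ground-truth CAPTCHA string


def evaluate_with_rejection(
    predictions: List[Prediction], threshold: float
) -> Tuple[float, Optional[float]]:
    """Return (rejection_rate, reliability) for a confidence threshold.

    A result is accepted when its confidence is at least `threshold`.
    Reliability is correct / accepted (an assumed definition); it is
    None when nothing is accepted.
    """
    if not predictions:
        return 0.0, None
    accepted = [p for p in predictions if p.confidence >= threshold]
    correct = sum(1 for p in accepted if p.label == p.truth)
    rejection_rate = (len(predictions) - len(accepted)) / len(predictions)
    reliability = correct / len(accepted) if accepted else None
    return rejection_rate, reliability


# Made-up sample: three correct readings and one low-confidence error.
samples = [
    Prediction("a7kq", 0.95, "a7kq"),
    Prediction("m2xd", 0.90, "m2xd"),
    Prediction("p9rt", 0.40, "p9rf"),
    Prediction("zz31", 0.55, "zz31"),
]

no_rejection = evaluate_with_rejection(samples, 0.0)
with_rejection = evaluate_with_rejection(samples, 0.5)
```

In this toy sample, rejecting the single low-confidence reading trades a higher rejection rate for a higher reliability among the accepted results. A learned rejector such as an SVM would replace the fixed threshold with a decision function over the rejection features.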