Abstract: Identifying and filtering users who send spam short messages (SMS) has significant research value and social importance. As spam messages take on new forms and content, traditional filters based on keyword matching and on sending rate and frequency can no longer handle the problem effectively. Building on a formal representation of the SMS sending/receiving network, this paper uses real SMS sending and receiving records together with call-relationship data to measure and analyze the characteristics of the SMS network. It then analyzes and mines the abnormal sending and receiving patterns and behaviors of spam senders in this network, and on that basis proposes a filtering algorithm based on the degree of voice association and the SMS reply ratio: the $N$-degree association spam filter algorithm (NASFA). Experiments and analysis show that the algorithm identifies spam senders efficiently while effectively controlling the rate at which normal users are misidentified as spam senders.
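The abstract names two signals, voice association and SMS reply ratio, without giving the algorithm's details. The sketch below is not NASFA itself. It is only a rough illustration of how those two signals might be computed per sender from edge lists, considering direct (one-hop) ties only. The function name `flag_suspected_spammers`, the input layout, and all threshold values are assumptions introduced for illustration and do not come from the paper.

```python
from collections import defaultdict


def flag_suspected_spammers(sms_edges, call_edges, min_recipients=20,
                            max_reply_ratio=0.05, max_voice_ratio=0.05):
    """Flag senders whose recipients rarely reply and rarely share calls.

    sms_edges:  iterable of (sender, receiver) pairs, one per message.
    call_edges: iterable of (caller, callee) pairs, one per call.
    All thresholds are hypothetical defaults, not values from the paper.
    """
    # Distinct recipients reached by each sender.
    recipients = defaultdict(set)
    for sender, receiver in sms_edges:
        recipients[sender].add(receiver)

    # A call in either direction counts as a voice tie between two users.
    voice_ties = set()
    for caller, callee in call_edges:
        voice_ties.add((caller, callee))
        voice_ties.add((callee, caller))

    flagged = {}
    for sender, targets in recipients.items():
        if len(targets) < min_recipients:
            continue  # too little activity to judge
        # Recipients who have sent at least one SMS back to the sender.
        replied = sum(1 for r in targets if sender in recipients.get(r, ()))
        # Recipients who also have a voice tie with the sender.
        voiced = sum(1 for r in targets if (sender, r) in voice_ties)
        reply_ratio = replied / len(targets)
        voice_ratio = voiced / len(targets)
        if reply_ratio <= max_reply_ratio and voice_ratio <= max_voice_ratio:
            flagged[sender] = (reply_ratio, voice_ratio)
    return flagged
```

Requiring both ratios to be low is one possible way to limit false positives: a normal user who messages many contacts will usually receive replies from, or exchange calls with, at least some of them.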