2012, Vol. 27, Issue (3): 506-514. doi: 10.1007/s11390-012-1239-7

Special Topic: Data Management and Data Mining



Spam Short Messages Detection via Mining Social Networks

Jian-Yun Liu1 (刘建芸), Yu-Hang Zhao1 (赵宇航), Member, CCF, Zhao-Xiang Zhang1 (张兆翔), Member, CCF, ACM, IEEE, Yun-Hong Wang1 (王蕴红), Member, CCF, ACM, IEEE, Xue-Mei Yuan2 (袁雪梅), Lei Hu2 (胡磊), Member, CCF, and Zhen-Jiang Dong2 (董振江)   

  1. School of Computer Science and Engineering, Beihang University, Beijing 100191, China;
  2. ZTE Corporation, Nanjing 210012, China
  • Received: 2011-12-31 Revised: 2012-03-13 Online: 2012-05-05 Published: 2012-05-05
  • About author: Jian-Yun Liu received the B.E. degree from China University of Geosciences in 2009. He is now pursuing the M.S. degree in the Laboratory of Intelligent Recognition and Image Processing at Beihang University. His research interests include data mining, video analysis and pattern recognition.
  • Supported by:

    This work is supported by the National Natural Science Foundation of China under Grant No. 60873158, the National Basic Research 973 Program of China under Grant No. 2010CB327902, the Fundamental Research Funds for the Central Universities of China, and the Opening Funding of the State Key Laboratory of Virtual Reality Technology and Systems of China.

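In recent years, as mobile phones have become ubiquitous, the short message service (SMS) has become an ever larger part of daily life and one of the most important means of modern communication. However, the large volume of commercial, harassing, fraudulent, and inflammatory messages that accompanies SMS is gradually becoming a public nuisance. The spam filtering techniques widely used by operators today fall mainly into traffic-based methods and keyword-matching methods; both have shortcomings and can be defeated by targeted evasion. This paper proposes a spam short message filtering method based on social network analysis. Building on a formal representation of the SMS sending/receiving network, it collects statistics on and analyzes the graph-structure properties of the message-sending network. Static features, dynamic features, and high-level graph features are used to describe the behavior of sending nodes in the network. Experimental results on a real dataset verify the effectiveness of the proposed method.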

Abstract: Short message service (SMS) has become an indispensable means of social communication, and the problem of mobile spam is growing increasingly serious. We propose a novel approach to spam message detection. Unlike conventional methods that rely on keyword or flow-rate filtering, our system mines a more robust structure: the social network constructed from SMS traffic. Several features, including static features, dynamic features and graph features, are proposed to describe the activities of nodes in the network from different perspectives. Experimental results on a real dataset demonstrate the validity of our approach.
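As a rough illustration of the idea in the abstract, the following minimal sketch builds a directed sender-to-receiver network from message records and derives a few per-node features. The record layout `(sender, receiver, timestamp_seconds)`, the function name `node_features`, and the specific features (out-degree, in-degree, reciprocity, messages per hour) are assumptions made for illustration only; the abstract does not define the paper's actual static, dynamic, or graph features.

```python
from collections import defaultdict


def node_features(records):
    """Compute illustrative per-sender features from SMS records.

    records: iterable of (sender, receiver, timestamp_seconds) tuples.
    The feature set here is hypothetical, not the paper's definition.
    """
    out_nbrs = defaultdict(set)     # sender -> distinct recipients
    in_nbrs = defaultdict(set)      # receiver -> distinct senders
    sent_times = defaultdict(list)  # sender -> timestamps of sent messages

    for sender, receiver, ts in records:
        out_nbrs[sender].add(receiver)
        in_nbrs[receiver].add(sender)
        sent_times[sender].append(ts)

    features = {}
    for node, recipients in out_nbrs.items():
        # Recipients who have ever sent a message back to this node.
        replied = sum(1 for r in recipients if node in out_nbrs.get(r, ()))
        times = sent_times[node]
        span_hours = (max(times) - min(times)) / 3600.0
        features[node] = {
            "out_degree": len(recipients),
            "in_degree": len(in_nbrs.get(node, ())),
            "reciprocity": replied / len(recipients),
            # Activity window is clamped to at least one hour.
            "msgs_per_hour": len(times) / max(span_hours, 1.0),
        }
    return features


# Toy data: A and B talk to each other; S broadcasts and gets no replies.
records = [
    ("A", "B", 0), ("B", "A", 60),
    ("S", "A", 0), ("S", "B", 10), ("S", "C", 20), ("S", "D", 30),
]
feats = node_features(records)
print(feats["S"])
print(feats["A"])
```

In this toy data the broadcasting node `S` has many distinct recipients and no replies, while `A` and `B` message each other. That contrast is the kind of structural signal a network-based filter could use, although any real decision rule would need to be learned and validated on labelled data.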

Copyright © Editorial Office of Journal of Computer Science and Technology