2015, Vol. 30, Issue (5): 942-956. doi: 10.1007/s11390-015-1573-7

Special Issue: Data Management and Data Mining

• Special Section on Software Systems •

Detecting Android Malware Using Clone Detection

Jian Chen¹ (陈健), Manar H. Alalfi² (Member, ACM, IEEE), Thomas R. Dean¹, Ying Zou¹ (邹颖)

  • 1. Department of Electrical and Computer Engineering, Queen's University, Kingston, K7L 3N6, Canada;
    2. School of Computing, Queen's University, Kingston, K7L 3N6, Canada
  • Received: 2015-03-27; Revised: 2015-07-27; Online: 2015-09-05; Published: 2015-09-05
  • About author: Jian Chen received his M.S. degree in computer science from Queen's University, Kingston, in 2014. He has worked as a software developer for many years. He is pursuing his Ph.D. degree at Queen's University.
  • Supported by: The research is supported by the Ontario Research Fund of Canada.

Android is currently one of the most popular smartphone operating systems. However, Android also accounts for the largest share of global mobile malware, and its security issues have drawn significant public attention. In this paper, we investigate the use of a clone detector to identify known Android malware. We collect a set of Android applications known to contain malware and a set of benign applications. We extract the Java source code from the binary code of the applications and use NiCad, a near-miss clone detector, to find the clone classes in a small subset of the malicious applications. We then use these clone classes as signatures to find similar source files in the rest of the malicious applications. The benign collection is used as a control group. In our evaluation, we successfully decompile more than 1,000 malicious apps in 19 malware families. Our results show that using a small portion of the malicious applications as a training set detects 95% of previously known malware with very few false positives and an accuracy of 96.88%. Our method can effectively and reliably pinpoint malicious applications that belong to certain malware families.
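
The workflow described in the abstract (derive clone classes from a small malicious training set, then use them as signatures against other applications) can be illustrated with a minimal sketch. This is not NiCad and not the authors' implementation: Python's difflib stands in for a near-miss clone detector, and the function names, the 0.7 threshold, and the tiny Java-like fragments are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical threshold; the abstract does not state the value used in the paper.
THRESHOLD = 0.7

def normalize(source: str) -> list[str]:
    """Drop blank lines and indentation so layout changes do not affect matching."""
    return [line.strip() for line in source.splitlines() if line.strip()]

def similarity(a: str, b: str) -> float:
    """Line-level similarity in [0, 1]; difflib is a stand-in for a clone detector."""
    return SequenceMatcher(None, normalize(a), normalize(b), autojunk=False).ratio()

def clone_classes(fragments: list[str], threshold: float = THRESHOLD) -> list[list[str]]:
    """Greedily group near-identical fragments; keep groups with at least two members."""
    classes: list[list[str]] = []
    for frag in fragments:
        for cls in classes:
            if similarity(frag, cls[0]) >= threshold:
                cls.append(frag)
                break
        else:
            classes.append([frag])
    return [cls for cls in classes if len(cls) >= 2]

def is_flagged(app_fragments: list[str], signatures: list[str],
               threshold: float = THRESHOLD) -> bool:
    """Flag an app if any of its fragments is a near-miss clone of a signature."""
    return any(similarity(f, s) >= threshold
               for f in app_fragments for s in signatures)

# Illustrative fragments (invented, not taken from any real malware family).
MAL_A = """
void send() {
    String num = "1066";
    SmsManager m = SmsManager.getDefault();
    m.sendTextMessage(num, null, "SUB", null, null);
}
"""
MAL_B = MAL_A.replace('"1066"', '"9999"')   # training variant: different number
MAL_C = MAL_A.replace('"SUB"', '"JOIN"')    # unseen variant: different message
BENIGN = """
int add(int a, int b) {
    return a + b;
}
"""

# Training: one exemplar per clone class found in the malicious subset.
signatures = [cls[0] for cls in clone_classes([MAL_A, MAL_B])]

# Detection: check an unseen malicious variant and a benign control.
flag_malicious = is_flagged([MAL_C], signatures)
flag_benign = is_flagged([BENIGN], signatures)
```

In this sketch, a fragment that differs from a signature in one line out of five still scores above the assumed threshold, while the unrelated benign fragment does not, which mirrors the role of the benign collection as a control group.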

ISSN 1000-9000(Print)

CN 11-2296/TP

Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences