2015, Vol. 30, Issue (6): 1201-1214. doi: 10.1007/s11390-015-1594-2

Special Issue: Data Management and Data Mining

• Special Section on Networking and Distributed Computing for Big Data

Enhancing Telco Service Quality with Big Data Enabled Churn Analysis: Infrastructure, Model, and Deployment

Hui Li(李慧), Di Wu*(吴迪), Senior Member, CCF, Gao-Xiang Li(李高翔), Yi-Hao Ke(柯毅豪), Wen-Jie Liu(刘文杰), Yuan-Huan Zheng(郑元欢), Xiao-La Lin(林小拉)   

  1. Department of Computer Science, Sun Yat-sen University, Guangzhou 510006, China
  • Received: 2015-05-15 Revised: 2015-08-26 Online: 2015-11-05 Published: 2015-11-05
  • About author: Hui Li received her B.S. degree in computer science from the University of Science and Technology of China, Hefei, in 1990, her M.S. degree in optics from South China Normal University, Guangzhou, in 1993, and her MBA degree from Beijing University of Posts and Telecommunications in 2013. She is now pursuing her Ph.D. degree in computer science at Sun Yat-sen University, Guangzhou. She is a big-data expert for e-government affairs in Guangdong Province, China, and an expert of Group Industry Application at China Unicom.
  • Supported by:

    This work was supported in part by the National Natural Science Foundation of China under Grant Nos. 61272397, 61472454, and 61572538, and the Guangdong Natural Science Funds for Distinguished Young Scholar under Grant No. S20120011187.

Mobile phone penetration is nearly saturated in both developing and developed regions. Under these conditions, preventing subscriber churn has become an important issue for telecom operators, because acquiring a new subscriber costs much more than retaining an existing one. In this paper, we propose to leverage big data to mitigate subscriber churn and enhance the service quality of telecom operators. As information hubs, telecom operators have accumulated a huge volume of valuable data on subscriber behavior, service usage, and network operations. To enable efficient big data processing, we first build a dedicated distributed cloud infrastructure that integrates both online and offline processing capabilities. Second, we develop a complete churn analysis model based on deep data mining techniques and use inter-subscriber influence to improve prediction accuracy. Finally, we use real data obtained from a large telecom operator in China to verify the accuracy of our churn analysis models. The dataset covers over 3.5 million subscribers, who generate over 600 million call detail records (CDRs) per month. The empirical results demonstrate that our proposed method achieves around 90% accuracy for T + 1 testing periods and successfully identifies subscribers with high negative influence.

ISSN 1000-9000(Print)

CN 11-2296/TP

Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences