|
计算机科学技术学报 ›› 2019,Vol. 34 ›› Issue (1): 234-252.doi: 10.1007/s11390-019-1907-y
所属专题: Artificial Intelligence and Pattern Recognition; Data Management and Data Mining
• • 上一篇
Jing-Ya Zhou1,2,3, Member, CCF, ACM, Jian-Xi Fan1, Member, CCF, Cheng-Kuan Lin1, Member, CCF, and Bao-Lei Cheng1, Member, CCF
Jing-Ya Zhou1,2,3, Member, CCF, ACM, Jian-Xi Fan1, Member, CCF, Cheng-Kuan Lin1, Member, CCF, and Bao-Lei Cheng1, Member, CCF
随着越来越多的用户通过社交媒体进行在线交流,在线社交网络正在快速发展壮大。面对用户产生的大数据,数据存储必须是分布式的、可扩展的并且经济有效。解决这一问题最重要的挑战是如何在不降低系统性能的前提下最小化成本代价。虽然目前许多存储系统采用分布式健值存储技术,但该技术并不适合直接应用到社交网络存储系统。这是因为社交用户的数据高度关联,哈希存储容易导致服务器间频繁通信,而高额的通信代价极大地降低了社交网络存储系统的可扩展性。以往研究提出对社交图采用网络划分,和对数据进行复制两种方法。然而,数据复制会增加存储成本,并影响通信成本。在本文中,我们关注于如何从数据存储的角度结合划分和复制以最小化成本。我们提出的经济有效的数据存储方法能够很好地支持社交网络存储系统的扩展。该方法利用网络划分和数据复制同步优化将经常发生交互的用户的数据放在一起,放置过程始终满足系统负载均衡的约束。我们在Facebook数据集上进行了大量实验以验证本文所提算法。
[1] Althoff T, Jindal P, Leskovec J. Online actions with offline impact:How online social networks influence online and offline user behavior. In Proc. the 10th ACM International Conference on Web Search and Data Mining, February 2017, pp.537-546. [2] Peng S C, Yang A M, Cao L H, Yu S, Xie D Q. Social influence modeling using information theory in mobile social networks. Information Sciences, 2017, 379:146-159. [3] Wang F, Wang H Y, Xu K, Wu J H, Jia X H. Characterizing information diffusion in online social networks with linear diffusive model. In Proc. the 33rd International Conference on Distributed Computing Systems, July 2013, pp.307-316. [4] Al-Fares M, Loukissas A, Vahdat, A. A scalable, commodity data center network architecture. In Proc. the ACM SIGCOMM Conference on Data communication, August 2008, pp.63-74. [5] Shvachko K, Kuang H, Radia S, Chansler R. The Hadoop distributed file system. In Proc. the 26th IEEE Symposium on Mass Storage Systems and Technologies, May 2010. [6] Lakshman A, Malik P. Cassandra:A decentralized structured storage system. ACM SIGOPS Operating Systems Review, 2010, 44(2):35-40. [7] Sumbaly R, Kreps J, Gao L, Feinberg A, Soman C, Shah S. Serving large-scale batch computed data with project Voldemort. In Proc. the 10th USENIX Conference on File and Storage Technologies, February 2012, pp.223-235. [8] Karypis G, Kumar V. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM Journal on Scientific Computing, 1998, 20(1):359-392. [9] Chen H H, Jin H, Wu S. Minimizing inter-server communications by exploiting self-similarity in online social networks. IEEE Transactions on Parallel and Distributed Systems, 2016, 27(4):1116-1130. [10] Liu G X, Shen H Y, Chandler H. Selective data replication for online social networks with distributed datacenters. IEEE Transactions on Parallel and Distributed Systems, 2016, 27(8):2377-2393. [11] Pujol J M, Erramilli V, Siganos G, Yang X, Laoutaris N, Chhabra P, Rodriguez P. The little engine(s) that could:Scaling online social networks. IEEE/ACM Transactions on Networking, 2012, 20(4):1162-1175. [12] Tran D A, Zhang T. S-PUT:An EA-based framework for socially aware data partitioning. Computer Networks, 2014, 75:504-518. [13] Yu B Y, Pan J P. Location-aware associated data placement for geo-distributed data intensive applications. In Proc. the 34th IEEE International Conference on Computer Communications, April 2015, pp.603-611. [14] Zhou J Y, Fan J X, Cheng B L, Jia J C. Optimizing interserver communications by exploiting overlapping communities in online social networks. In Proc. the 16th International Conference on Algorithms and Architectures for Parallel Processing, December 2016, pp.231-244. [15] Tran D A, Nguyen K, Pham C. S-CLONE:Socially-aware data replication for social networks. Computer Networks, 2012, 56(7):2001-2013 [16] Zhang J H, Chen J, Luo J Z, Song A B. Efficient locationaware data placement for data-intensive applications in geodistributed scientific data centers. Tsinghua Science and Technology, 2016, 21(5):471-481. [17] Jiao L, Lit J, Du W, Fu X M. Multi-objective data placement for multi-cloud socially aware services. In Proc. the 33rd IEEE International Conference on Computer Communications, April 2014, pp.28-36. [18] Gregory S. Finding overlapping communities in networks by label propagation. New Journal of Physics, 2010, 12(10):Article No. 103018. [19] Lancichinetti A, Fortunato S, Kertész J. Detecting the overlapping and hierarchical community structure in complex networks. New Journal of Physics, 2009, 11(3):Article No. 033015. [20] Qiao S, Han N, Zhang K, Zou L, Wang H, Alberto G L. Algorithm for detecting overlapping communities from complex network big data. Journal of Software, 2017, 28(3):631-647. (in Chinese) [21] Wilson C, Sala A, Puttaswamy K P N, Zhao B Y. Beyond social graphs:User interactions in online social networks and their implications. ACM Transactions on the Web, 2012, 6(4):Article No. 17. [22] Gjoka M, Kurant M, Butts C T, Markopoulou A. Walking in Facebook:A case study of unbiased sampling of OSNs. In Proc. the 29th IEEE International Conference on Computer Communications, March 2010, pp.2498-2506. [23] Jiang J, Wilson C, Wang X, Sha W P, Huang P, Dai Y F, Zhao B Y. Understanding latent interactions in online social networks. ACM Transactions on the Web, 2013, 7(4):Article No. 18. [24] Benevenuto F, Rodrigues T, Cha M, Almeida V A F. Characterizing user behavior in online social networks. In Proc. the 9th ACM SIGCOMM Conference on Internet Measurement, November 2009, pp.49-62. [25] Roy A, Zeng H, Bagga J, Porter G, Snoeren A C. Inside the social network's (datacenter) network. In Proc. the 2015 ACM Conference on Special Interest Group on Data Communication, August 2015, pp.123-137. |
[1] | Jing Jiang, Zi-Fei Shan, Xiao Wang, Li Zhang, Ya-Fei Dai. 真实环境下虚假团体的测量研究[J]. , 2015, 30(6): 1344-1357. |
[2] | Da-Wei Sun (孙大为), Student Member, CCF, ACM, Gui-Ran Chang (常桂然), Shang Gao (高尚), Li-Zhong Jin (靳立忠), and Xing-Wei Wang, (王兴伟), Senior Member, CCF, ACM. 云计算环境中一种实现高可用性的动态数据复制策略[J]. , 2012, (2): 256-272. |
版权所有 © 《计算机科学技术学报》编辑部 本系统由北京玛格泰克科技发展有限公司设计开发 技术支持:support@magtech.com.cn 总访问量: |