2015, Vol. 30, Issue (4): 917-932. doi: 10.1007/s11390-015-1570-x

Topics: Artificial Intelligence and Pattern Recognition; Data Management and Data Mining

• Special Section on Selected Paper from NPC 2011 •

When Factorization Meets Heterogeneous Latent Topics: An Interpretable Cross-Site Recommendation Framework

Xin Xin1*(辛欣), Member, CCF, ACM, IEEE, Chin-Yew Lin2(林钦佑), Member, ACM, IEEE, Xiao-Chi Wei1(魏骁驰), Member, CCF, ACM, He-Yan Huang1(黄河燕), Member, ACM   

  1. Beijing Engineering Research Center of High Volume Language Information Processing & Cloud Computing, School of Computer Science, Beijing Institute of Technology, Beijing 100081, China;
  2. Microsoft Research Asia, Beijing 100080, China
  • Received: 2014-11-15 Revised: 2015-03-20 Online: 2015-07-05 Published: 2015-07-05
  • Contact: Xin Xin, E-mail: xxin@bit.edu.cn
  • About author: Xin Xin is currently an assistant professor of the School of Computer Science, Beijing Institute of Technology. He received his B.S. and M.S. degrees from the Department of Computer Science and Technology, Tsinghua University, Beijing, in 2006 and 2008, respectively, and his Ph.D. degree in computer science from The Chinese University of Hong Kong in 2011. His research interests include data mining and machine learning. He is a member of CCF, ACM, and IEEE.
  • Supported by:

    This work was supported by the National Basic Research 973 Program of China under Grant No. 2013CB329605, the National Natural Science Foundation of China under Grant Nos. 61300076 and 61375045, the Ph.D. Programs Foundation of Ministry of Education of China under Grant No. 20131101120035, and the Excellent Young Scholars Research Fund of Beijing Institute of Technology.

Abstract: Data sparsity is a well-known challenge in applications of recommender systems. Previous work alleviates this problem by incorporating information from within the corresponding social media site. In this paper, we address the challenge by exploiting cross-site information. Specifically, we ask: 1) how can cross-site ratings and content features be used effectively and efficiently to improve recommendation performance? and 2) how can content features be used to make recommendations interpretable? We propose a joint model of matrix factorization and latent topic analysis as the recommendation framework. In this model, heterogeneous content features are modeled by multiple kinds of latent topics, which yields an accurate reduction of feature dimensionality and thereby improves recommendation performance. In addition, combining matrix factorization with latent topics makes the latent feature vectors, and hence the recommendation results, interpretable from several aspects. The two issues are therefore solved simultaneously. On a real-world dataset that collects user behavior on three social media sites, we demonstrate that the proposed model is effective in improving recommendation performance and in interpreting the rationale behind ratings. A complexity analysis shows that the computational cost grows linearly with the amount of data, so the algorithm is applicable to large-scale data.
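For readers unfamiliar with the matrix factorization half of the joint model, the following is a minimal illustrative sketch of plain regularized matrix factorization trained with stochastic gradient descent. It is not the authors' model: it omits the latent topic component and the cross-site structure entirely, and the function names (`factorize`, `rmse`), the hyperparameters, and the toy ratings are all assumptions made for illustration. It does show why such methods can scale linearly with the data, since each training pass visits every observed rating exactly once.

```python
import numpy as np

def factorize(ratings, n_users, n_items, k=8, lr=0.02, reg=0.05, epochs=300, seed=0):
    """Fit user/item latent factors to (user, item, rating) triples with SGD.

    Hypothetical helper for illustration; hyperparameters are arbitrary.
    """
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_users, k))  # user latent factors
    Q = 0.1 * rng.standard_normal((n_items, k))  # item latent factors
    for _ in range(epochs):
        # One pass costs O(len(ratings) * k): linear in the observed ratings.
        for u, i, r in ratings:
            err = r - P[u] @ Q[i]          # prediction error for this rating
            pu = P[u].copy()               # keep old user vector for the item update
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * pu - reg * Q[i])
    return P, Q

def rmse(ratings, P, Q):
    """Root-mean-square error of the factor model on the given triples."""
    errs = [r - P[u] @ Q[i] for u, i, r in ratings]
    return float(np.sqrt(np.mean(np.square(errs))))

# Toy data (made up): 3 users, 3 items, 6 observed ratings out of 9 cells.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
P, Q = factorize(ratings, n_users=3, n_items=3)
print("training RMSE:", rmse(ratings, P, Q))
print("predicted rating for user 0, item 2:", float(P[0] @ Q[2]))
```

In a joint model of the kind the abstract describes, the rows of `P` and `Q` would additionally be tied to topic distributions over content features, which is what makes the latent dimensions interpretable; that coupling is deliberately left out of this sketch.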

Copyright © Editorial Office of the Journal of Computer Science and Technology