2018, Vol. 33, Issue (4): 711-726. doi: 10.1007/s11390-018-1851-2

Topics: Artificial Intelligence and Pattern Recognition; Data Management and Data Mining

• Special Section on Computer Networks and Distributed Computing •

Hashtag Recommendation Based on Multi-Features of Microblogs

Fei-Fei Kou, Jun-Ping Du* (Distinguished Member, CCF), Cong-Xian Yang, Yan-Song Shi, Wan-Qiu Cui, Mei-Yu Liang, Yue Geng

  1. Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia, School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Received: 2018-01-14; Revised: 2018-05-11; Online: 2018-07-05; Published: 2018-07-05
  • Contact: Jun-Ping Du, E-mail: junpingd@bupt.edu.cn
  • About author: Fei-Fei Kou is currently a Ph.D. candidate in computer science and technology at Beijing University of Posts and Telecommunications, Beijing. She received her B.S. degree in electronic information engineering from Yantai University, Yantai, in 2010, and her M.S. degree in computer technology from Beijing Technology and Business University, Beijing, in 2013. Her major research interests include semantic learning and multimedia information retrieval and recommendation.
  • Supported by:

    This work was supported by the National Natural Science Foundation of China under Grant Nos. 61320106006, 61532006, 61772083, and 61502042, and the Fundamental Research Funds for the Central Universities of China under Grant No. 2017RC39.

Abstract: Hashtag recommendation for microblogs is an active research topic that benefits many microblog-related applications. However, the short text of microblogs and the low usage rate of hashtags cause a data sparsity problem, which makes it difficult for typical hashtag recommendation methods to recommend accurately. In light of this, we propose HRMF, a hashtag recommendation method based on multi-features of microblogs. First, HRMF expands short text into long text and then simultaneously models the multi-features of microblogs (i.e., user, hashtag, and text) with a newly designed topic model. To further alleviate the data sparsity problem, HRMF uses the hashtags of both similar users and similar microblogs as candidate hashtags. In particular, to find similar users, HRMF combines the designed topic model with a typical user-based collaborative filtering method. Finally, we realize hashtag recommendation by calculating a recommendation score for each candidate hashtag based on the generated topical representations of the multi-features. Experimental results on a real-world dataset crawled from Sina Weibo demonstrate the effectiveness of HRMF for hashtag recommendation.
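
The candidate-selection and scoring steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' HRMF implementation: it skips text expansion and topic-model training and simply assumes that topic distributions for users, microblogs, and hashtags are already available. The use of cosine similarity, the additive scoring rule, and all names (`cosine`, `top_k_similar`, `recommend`, and the toy data) are hypothetical choices made only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length topic vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_similar(query, vectors, k):
    """Return the k keys of `vectors` whose topic vectors are closest to `query`."""
    ranked = sorted(vectors, key=lambda key: cosine(query, vectors[key]), reverse=True)
    return ranked[:k]


def recommend(user_vec, post_vec, user_topics, post_topics,
              user_tags, post_tags, tag_topics, k=2, n=3):
    """Rank candidate hashtags for one (user, microblog) pair."""
    # Candidates come from both similar users and similar microblogs,
    # which is the sparsity-reduction idea described in the abstract.
    candidates = set()
    for u in top_k_similar(user_vec, user_topics, k):
        candidates.update(user_tags.get(u, ()))
    for p in top_k_similar(post_vec, post_topics, k):
        candidates.update(post_tags.get(p, ()))
    # Assumed scoring rule (not from the paper): a hashtag scores higher when
    # its topic vector is close to both the user's and the microblog's vector.
    scores = {t: cosine(tag_topics[t], user_vec) + cosine(tag_topics[t], post_vec)
              for t in candidates if t in tag_topics}
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:n]


# Toy data with two topics (sports, food); all values are made up.
user_topics = {"u1": [0.9, 0.1], "u2": [0.2, 0.8]}
post_topics = {"p1": [0.8, 0.2], "p2": [0.1, 0.9]}
user_tags = {"u1": ["#football"], "u2": ["#cooking"]}
post_tags = {"p1": ["#worldcup"], "p2": ["#recipe"]}
tag_topics = {
    "#football": [0.95, 0.05],
    "#worldcup": [0.85, 0.15],
    "#cooking": [0.10, 0.90],
    "#recipe": [0.05, 0.95],
}

result = recommend([0.9, 0.1], [0.85, 0.15], user_topics, post_topics,
                   user_tags, post_tags, tag_topics, k=1, n=2)
print(result)
```

In this toy setup, the query user and microblog are both sports-oriented, so only the hashtags attached to the sports-oriented neighbor user and neighbor microblog become candidates; the food-related hashtags are never scored.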

Copyright © Editorial Office of the Journal of Computer Science and Technology