Journal of Computer Science and Technology

   

SMEC: Scene Mining for E-Commerce

Gang Wang1, Xiang Li2, Zi-Yi Guo3, Da-Wei Yin4, and Shuai Ma1   

1 State Key Laboratory of Software Development Environment, Beihang University, Beijing 100191, China
2 School of Data Science and Engineering, East China Normal University, Shanghai 200062, China
3 JD.com, Beijing 100176, China
4 Baidu Inc., Beijing 100085, China

Scene-based recommendation has proven useful in E-commerce: it recommends commodities based on a given scene. However, scenes are typically unknown in advance, so they must first be discovered. In this article, we study scene discovery for E-commerce systems. We first formalize a scene as a set of commodity categories that occur simultaneously and frequently in real-world situations, and model an E-commerce platform as a heterogeneous information network (HIN), whose nodes and links represent different types of objects and different relationships between objects, respectively. We then formulate the scene mining problem for E-commerce as an unsupervised learning problem that finds overlapping clusters of commodity categories in the HIN. To solve the problem, we propose SMEC (Scene Mining for E-Commerce), a method based on non-negative matrix factorization, and theoretically prove its convergence. Finally, we conduct an extensive experimental study on 6 real-world E-commerce datasets to evaluate SMEC against 13 other methods, and show that SMEC consistently outperforms its competitors with regard to various evaluation measures.


Extended Abstract

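Background:
Scene-based commodity recommendation is receiving growing attention on E-commerce platforms and is one direction for improving commodity recommendation in the future. Such methods use given scene information to provide users with more satisfying personalized recommendations. Traditional recommendation methods mostly recommend commodities that are similar to, or substitutes for, those in a user's browsing history. However, seemingly unrelated commodities sometimes form a practically meaningful set in a specific scene. Introducing scenes into a recommender system can increase the diversity of recommendations, thereby stimulating more latent purchase intent and reducing shopping experiences that users find unpleasant. Although commodity recommendation has been widely studied, the difficulty of obtaining scenes has hindered research on scene-based recommendation. Mining scenes by hand is laborious and consumes substantial human and material resources. To support subsequent recommendation tasks, an effective automated method is needed to mine scenes suitable for E-commerce.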
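Objective:
Our goal is to develop an unsupervised method that extracts multi-source information from the complex network of an E-commerce platform, and thereby mines a set of stable, abstract, and reusable scenes suitable for formally defined scene-based commodity recommendation methods.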
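Method:
First, a scene used to assist commodity recommendation is formally defined as a set of commodity categories that occur simultaneously and frequently in real-world settings, and the many complex relationships in an E-commerce platform are modeled as a heterogeneous information network (HIN), whose nodes and links represent different types of objects and different relationships between objects, respectively. Then, the scene mining problem in E-commerce is defined as an unsupervised learning problem: overlapping clustering of commodity categories in the HIN. To solve this problem, we propose SMEC, a method based on non-negative matrix factorization that jointly factorizes the adjacency matrices of the sub-networks of the E-commerce HIN to obtain scenes. Finally, SMEC is compared with 13 other methods on 6 real-world E-commerce datasets, and its performance is evaluated through extensive experiments from multiple perspectives.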
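The joint factorization idea in the method paragraph can be illustrated with a minimal sketch. It is not the SMEC objective, which this abstract does not give; it assumes the simplest possible setup, in which every sub-network matrix has commodity categories as rows and all of them share one non-negative category-by-scene factor `W`, fitted with standard multiplicative updates for squared error. The names `joint_nmf` and `scenes_from_factor`, the toy matrices, and the 0.5 threshold are all hypothetical.

```python
import numpy as np

def joint_nmf(mats, k, iters=300, eps=1e-9, seed=0):
    """Approximate each A_i (n x m_i) by W @ H_i with one shared W >= 0.

    Assumption: plain squared-error loss with Lee-Seung style
    multiplicative updates; SMEC's actual objective may differ.
    """
    rng = np.random.default_rng(seed)
    n = mats[0].shape[0]
    W = rng.random((n, k))                      # category-by-scene weights
    Hs = [rng.random((k, A.shape[1])) for A in mats]
    losses = []
    for _ in range(iters):
        # Update every H_i while W is held fixed.
        Hs = [H * (W.T @ A) / (W.T @ W @ H + eps) for A, H in zip(mats, Hs)]
        # Update the shared W using all sub-networks at once.
        num = sum(A @ H.T for A, H in zip(mats, Hs))
        den = W @ sum(H @ H.T for H in Hs) + eps
        W = W * num / den
        losses.append(sum(np.linalg.norm(A - W @ H) ** 2
                          for A, H in zip(mats, Hs)))
    return W, Hs, losses

def scenes_from_factor(W, threshold=0.5):
    """Read overlapping clusters off W: a category joins every scene whose
    column-normalized weight reaches the (hypothetical) threshold."""
    scaled = W / (W.max(axis=0, keepdims=True) + 1e-12)
    return [set(np.flatnonzero(scaled[:, j] >= threshold).tolist())
            for j in range(W.shape[1])]

# Toy data: 6 categories, 2 planted scenes that overlap on category 3.
M = np.array([[1, 0], [1, 0], [1, 0], [1, 1], [0, 1], [0, 1]], dtype=float)
A_cc = M @ M.T                                        # category-category links
A_ci = M @ np.random.default_rng(1).random((2, 8))    # category-item links
W, Hs, losses = joint_nmf([A_cc, A_ci], k=2)
scenes = scenes_from_factor(W)
print(scenes, losses[0], losses[-1])
```

Because a category may exceed the threshold in more than one column of `W`, the resulting clusters can overlap, which matches the overlapping-clustering formulation. The recovered sets depend on the random initialization and the threshold, so they should be inspected rather than assumed to equal the planted scenes.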
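Results:
SMEC is an effective scene mining method. Compared with the best-performing baseline on the datasets used, SMEC improves the average F1 score, NMI, and Omega index by 5.2%, 22.4%, and 189.5% on average, respectively. The scenes discovered by SMEC are practically meaningful: most correspond to a specific real-world scene, and they also exhibit the multiple properties that a scene should have.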
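Of the three measures named in the results, the Omega index is the one designed specifically for overlapping clusterings. The sketch below follows its usual definition: two clusterings agree on a pair of items when the pair shares the same number of clusters in both, and the observed agreement rate is corrected for chance. The evaluation protocol used in the article is not stated in this abstract, so the function name `omega_index` and the convention that items are labeled `0..n_items-1` are assumptions.

```python
from collections import Counter
from itertools import combinations

def omega_index(clusters_a, clusters_b, n_items):
    """Chance-corrected pairwise agreement between two overlapping clusterings.

    Items are assumed to be integers 0..n_items-1; each clustering is a
    list of sets of items.
    """
    def pair_counts(clusters):
        # For every item pair, count how many clusters contain both items.
        counts = Counter()
        for c in clusters:
            for pair in combinations(sorted(c), 2):
                counts[pair] += 1
        return counts

    ca, cb = pair_counts(clusters_a), pair_counts(clusters_b)
    total = n_items * (n_items - 1) // 2

    # Observed agreement: pairs co-clustered the same number of times.
    agree = sum(1 for pair in combinations(range(n_items), 2)
                if ca.get(pair, 0) == cb.get(pair, 0))
    observed = agree / total

    # Expected agreement from the marginal histograms of co-occurrence counts.
    hist_a, hist_b = Counter(ca.values()), Counter(cb.values())
    hist_a[0] = total - len(ca)
    hist_b[0] = total - len(cb)
    expected = sum(hist_a[j] * hist_b[j] for j in hist_a) / total ** 2

    if expected == 1:
        return 1.0
    return (observed - expected) / (1 - expected)

truth = [{0, 1, 2, 3}, {3, 4, 5}]
found = [{0, 1, 2}, {3, 4, 5}]
same = omega_index(truth, truth, 6)
diff = omega_index(truth, found, 6)
print(same, diff)
```

Identical clusterings score 1, and dropping category 3 from the first scene lowers the score because three pairs no longer agree.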
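Conclusion:
For E-commerce systems, this article studies a novel scene mining problem in which a scene is defined as a set of commodity categories. It proposes a method that discovers scenes by non-negative matrix factorization on the E-commerce HIN, which effectively extracts multi-faceted information and improves the interpretability of the results. In addition, scene information is encoded directly into vectors to construct sparse representations, avoiding erroneous decisions caused by extra post-processing. Extensive experiments verify the effectiveness of the proposed method for scene mining and its scalability across different E-commerce networks. Future work is to introduce scenes into personalized recommendation for E-commerce, so that commodities can be recommended accurately based on user preferences within a given scene.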


Key words: graph clustering, E-commerce, heterogeneous information network, scene mining


ISSN 1000-9000 (Print), 1860-4749 (Online)
CN 11-2296/TP
