Citation: Ying Guo, Shu-Ting Hu, Min-Jing Yu, Ran Yi, Qi Wang, Jie Liu, Yong-Jin Liu. NR-CLIP: CLIP-Guided Multimodal News Recommendation via Multi-View Learning[J]. Journal of Computer Science and Technology. DOI: 10.1007/s11390-025-5356-5

NR-CLIP: CLIP-Guided Multimodal News Recommendation via Multi-View Learning

In the era of social media, news has diversified in format, incorporating text, images, and videos. However, most news recommendation methods focus solely on textual data, overlooking the substantial role of news images. This paper introduces NR-CLIP, a news recommendation method guided by CLIP (Contrastive Language-Image Pretraining) that enhances multimodal news recommendation via multi-view learning. Specifically, the method employs a CLIP encoder to embed textual and visual information into a shared feature space, where a unified textual representation of each news item is learned by treating its title, category, subcategory, and body as different views. Feature enhancement is then applied to fully fuse the textual and visual features. Finally, the click history is encoded into a user representation, which is used to predict the click probability of candidate news. Extensive comparisons with state-of-the-art news recommendation methods, together with in-depth analyses, are presented on V-MIND (Visual Microsoft News Dataset), which augments the classic MIND (Microsoft News Dataset) with visual information, demonstrating that the proposed method effectively improves news recommendation performance.
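To make the described pipeline concrete, the sketch below illustrates the general idea under stated assumptions: it uses the public Hugging Face CLIP API, while the multi-view fusion, feature enhancement, and user encoder are simple stand-ins (mean pooling and a dot product) for the paper's learned modules. The function names (`encode_news`, `click_probability`) and the checkpoint are illustrative choices, not taken from the paper.

```python
# Hypothetical sketch of the pipeline the abstract describes, NOT the
# authors' implementation: fusion and user encoding are simplified.
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def encode_news(views: list[str], image) -> torch.Tensor:
    """Embed textual views (title, category, subcategory, body) and the
    news image into CLIP's shared feature space, then fuse them."""
    text_in = processor(text=views, return_tensors="pt",
                        padding=True, truncation=True)
    text_feats = model.get_text_features(**text_in)    # (num_views, d)
    img_in = processor(images=image, return_tensors="pt")
    img_feat = model.get_image_features(**img_in)      # (1, d)
    # Stand-in for the paper's multi-view fusion and feature enhancement:
    # mean-pool the textual views, then combine with the image feature.
    fused_text = text_feats.mean(dim=0, keepdim=True)
    return torch.nn.functional.normalize(fused_text + img_feat, dim=-1)

def click_probability(clicked_news: list[torch.Tensor],
                      candidate: torch.Tensor) -> torch.Tensor:
    """Stand-in user encoder: average the embeddings of the user's click
    history, then score the candidate by dot product, squashed to [0, 1]."""
    user = torch.stack([n.squeeze(0) for n in clicked_news]).mean(dim=0)
    return torch.sigmoid(user @ candidate.squeeze(0))
```

In the paper's setting, the mean pooling above would be replaced by the learned multi-view attention and feature-enhancement modules; the sketch only shows where each component sits in the pipeline.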