
Chinese Sentiment Analysis Based on a Sentiment Knowledge Graph

EKBSA: A Chinese Sentiment Analysis Model by Enhancing K-BERT

  • Abstract:
    Background Text sentiment analysis is the process of identifying the subjective sentiment tendency carried by a text through three stages: analysis, processing, and summarization. Sentiment knowledge largely determines the sentiment tendency of a text and helps achieve more accurate sentiment analysis. In Chinese, historical stories and fables endow words with richer connotations and more complex sentiments; for example, the word "鸡肋" (literally "chicken ribs") carries an implicit negative connotation, yet a model cannot obtain such knowledge directly from the input text. Combining sentiment knowledge with sentiment analysis models is therefore essential. Pre-trained language models such as BERT have achieved good results on many natural language processing tasks, and many studies have attempted to integrate commonsense knowledge into pre-trained language models to better adapt them to various downstream tasks. However, sentiment knowledge has not yet been fully considered.
    Objective This paper aims to combine sentiment knowledge effectively with pre-trained language models, remedying the inability of existing methods to capture latent sentiment tendencies, and to further optimize the model to improve the accuracy of Chinese sentiment analysis.
    Methods This paper proposes EKBSA, a model for Chinese sentiment analysis that achieves good results on Chinese sentiment analysis tasks and consists of three main parts. (1) Sentiment knowledge graph construction: several existing sentiment lexica are merged to build Sen-Graph, a large-scale sentiment knowledge graph. (2) Type embedding: the model input is a sentence tree that combines the text with knowledge triples; type embeddings assign different markers to text tokens and knowledge triples to obtain better representations. (3) Local attention: a syntactic dependency tree is used to strengthen the attention weights between related words and improve the model's text understanding. In addition, context attention is introduced to model dialogue history, yielding Context EKBSA for sentiment analysis of Chinese conversations. A minimal sketch of how such model inputs might be assembled is given after this abstract.
    Results We conducted comparative experiments with each model on two Chinese datasets; the results are shown in Table 1. EKBSA achieves the best results on both datasets.
    Table 1 Results of Chinese sentiment analysis experiments. "HN", "CN", and "Sen" denote HowNet, CN-DBpedia, and Sen-Graph, respectively.
    Model \ Dataset        Chinese_metaphor          Book_excerpt
                           Accuracy (%)   F1 (%)     Accuracy (%)   F1 (%)
    BERT                   87.24          88.74      87.90          87.77
    RoBERTa                88.12          89.47      87.50          87.26
    ERNIE                  87.02          88.86      85.69          85.28
    BERT-BiLSTM            84.38          86.80      89.40          88.22
    K-BERT (HN)            86.47          87.76      89.30          89.33
    K-BERT (CN)            86.25          87.90      89.60          89.28
    K-BERT (HN+Sen)        87.90          88.96      90.10          90.03
    K-BERT (CN+Sen)        87.02          88.54      89.60          89.52
    EKBSA (HN+Sen)         88.12          89.72      90.80          90.89
    EKBSA (CN+Sen)         87.35          88.78      90.80          90.78
    EKBSA (HN+CN+Sen)      87.46          89.10      89.50          89.45
    Experiments were also conducted on a Chinese conversational sentiment analysis dataset. As shown in Table 2, Context EKBSA achieves good performance.
    Table 2 Results of Chinese conversational sentiment analysis experiments.
    Model \ Dataset               WBEmoDialog               WBEmoDialog (with label)
                                  Accuracy (%)   F1 (%)     Accuracy (%)   F1 (%)
    Context EKBSA (HN+CN+Sen)     89.17          91.16      88.64          90.87
    Conclusion The experimental results show that, by incorporating sentiment knowledge, type embeddings, and a local attention mechanism, EKBSA achieves good results on Chinese sentiment analysis tasks. Large language models have recently shown remarkable performance across a wide range of NLP tasks. For sentiment analysis, their strong generative ability enables more detailed and richer expressions of sentiment rather than simple label classification, and the performance of EKBSA could be further improved by incorporating more powerful models such as GPTs.
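The following is a minimal sketch, not the authors' implementation, of how the inputs described in the Methods section might be assembled: a sentence tree that appends sentiment-knowledge triples directly after the words they describe, type ids that separate original tokens from injected knowledge (to be fed to a type embedding layer), and a local-attention mask built from dependency edges. The names (`SentTriple`, `build_inputs`) and the toy triple for "鸡肋" are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class SentTriple:
    word: str       # entity appearing in the input text, e.g. "鸡肋"
    relation: str   # e.g. "sentiment"
    value: str      # e.g. "negative"

def build_inputs(tokens: List[str],
                 kg: Dict[str, SentTriple],
                 dep_edges: List[Tuple[int, int]]):
    """Return (sequence, type_ids, attn_mask).

    tokens    : word-level tokens of the original sentence
    kg        : sentiment knowledge graph lookup, word -> triple
    dep_edges : (head, dependent) index pairs from a dependency parse
    """
    seq, type_ids = [], []
    origin_pos = []                                   # position of each original token in seq
    for tok in tokens:
        origin_pos.append(len(seq))
        seq.append(tok)
        type_ids.append(0)                            # type 0: original text
        if tok in kg:                                 # inject the knowledge triple right after the word
            triple = kg[tok]
            for k_tok in (triple.relation, triple.value):
                seq.append(k_tok)
                type_ids.append(1)                    # type 1: injected knowledge

    n = len(seq)
    attn_mask = [[0] * n for _ in range(n)]
    for i in range(n):
        attn_mask[i][i] = 1                           # every token sees itself
    # local attention: syntactically related original tokens see each other
    for head, dep in dep_edges:
        hi, di = origin_pos[head], origin_pos[dep]
        attn_mask[hi][di] = attn_mask[di][hi] = 1
    # knowledge tokens are visible only to the word that introduced them
    for pos, tok in zip(origin_pos, tokens):
        if tok in kg:
            for j in (pos + 1, pos + 2):
                attn_mask[pos][j] = attn_mask[j][pos] = 1
    return seq, type_ids, attn_mask

# toy usage: "鸡肋" receives an implicit negative sentiment from the graph
tokens = ["这", "件", "事", "如同", "鸡肋"]
kg = {"鸡肋": SentTriple("鸡肋", "sentiment", "negative")}
dep_edges = [(2, 0), (2, 1), (2, 3), (3, 4)]          # toy dependency edges over `tokens`
seq, type_ids, mask = build_inputs(tokens, kg, dep_edges)
print(seq)       # ['这', '件', '事', '如同', '鸡肋', 'sentiment', 'negative']
print(type_ids)  # [0, 0, 0, 0, 0, 1, 1]
```

In the actual model the mask would modulate self-attention inside a K-BERT-style encoder; the sketch only illustrates how knowledge tokens stay local to the word that introduced them while syntactically related words remain visible to each other.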

     

    Abstract: Pre-trained language models (PLMs), such as BERT, have achieved good results on many natural language processing (NLP) tasks. Recently, some studies have attempted to integrate factual knowledge into PLMs to adapt them to various downstream tasks. For sentiment analysis tasks, sentiment knowledge, such as sentiment words, plays a significant role in determining the sentiment tendencies of texts. For Chinese sentiment analysis, historical stories and fables imbue words with richer connotations and more complex sentiments than those typically found in English, which makes sentiment knowledge injection necessary. However, such knowledge has not yet been fully considered. In this paper, we propose EKBSA, a Chinese sentiment analysis model that is based on K-BERT and utilizes a sentiment knowledge graph to achieve better results on sentiment analysis tasks. To construct a high-quality sentiment knowledge graph, we collect a large number of sentiment words by combining several existing sentiment lexica. Moreover, to understand texts better, we enhance local attention through syntactic analysis and direct EKBSA to focus more on syntactically relevant words. EKBSA is compatible with BERT and with existing structural knowledge. Experimental results show that EKBSA achieves better performance on Chinese sentiment analysis tasks. Building upon EKBSA, we further replace the general attention with context attention and propose Context EKBSA, so that the model can adapt to sentiment analysis tasks in Chinese conversations, where it also achieves good performance.
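As a rough illustration of the context attention mentioned above, the sketch below scores each history utterance against the current utterance and fuses them into one context-aware vector before classification. The function name, vector shapes, and the scaled dot-product scoring are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())
    return e / e.sum()

def context_attention(current: np.ndarray, history: np.ndarray) -> np.ndarray:
    """current: (d,) encoding of the current utterance (e.g. its [CLS] vector)
    history: (n, d) encodings of the n previous utterances
    returns a (2d,) context-aware vector for the sentiment classifier."""
    scores = history @ current / np.sqrt(current.shape[0])   # (n,) scaled dot products
    weights = softmax(scores)                                 # attention over the history
    context = weights @ history                               # (d,) weighted summary of the history
    return np.concatenate([current, context])                 # fuse current utterance and context

# toy usage with random 8-dimensional utterance encodings
rng = np.random.default_rng(0)
current = rng.normal(size=8)
history = rng.normal(size=(3, 8))
print(context_attention(current, history).shape)  # (16,)
```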

     

