Chinese Word Segmentation via BiLSTM+Semi-CRF with Relay Node

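  • Abstract (translated from the Chinese): Semi-Markov conditional random fields (Semi-CRFs) have been applied successfully to many segmentation tasks, including Chinese word segmentation. The advantage of using a Semi-CRF for word segmentation is its natural ability to model segments. Despite this theoretical advantage, the computational complexity of Semi-CRF is quadratic in the sentence length, so it is not the best choice for Chinese word segmentation. In this paper, we propose a simple and effective framework that helps Semi-CRF reach performance close to that of conditional random fields (CRFs) at a computational complexity linear in the sentence length. Specifically, we first use a bidirectional long short-term memory network (BiLSTM) to model character-level context, and then use a simple and effective aggregation layer to represent segment information, which serves as the feature input to the Semi-CRF. Then, so that Semi-CRF can model sentences of arbitrary length in linear time, we propose the relay semi-Markov conditional random field (Semi-CRF-Relay) model. Because Semi-CRF and Semi-CRF-Relay model segments directly, word features can be added to the model very easily, and good performance is obtained merely by adding pre-trained word embeddings. On four widely used Chinese word segmentation datasets, our method achieves clear performance improvements. Our code and the pre-trained word embeddings we use are released at https://github.com/fastnlp/fastNLP/.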

     

    Abstract: Semi-Markov conditional random fields (Semi-CRFs) have been successfully utilized in many segmentation problems, including Chinese word segmentation (CWS). The advantage of Semi-CRF lies in its inherent ability to exploit properties of segments instead of individual elements of sequences. Despite this theoretical advantage, Semi-CRF is still not the best choice for CWS because its computational complexity is quadratic in the sentence length. In this paper, we propose a simple yet effective framework to help Semi-CRF achieve performance comparable to CRF-based models at similar computational complexity. Specifically, we first adopt a bi-directional long short-term memory (BiLSTM) network at the character level to model the context information, and then use a simple but effective fusion layer to represent the segment information. In addition, to model arbitrarily long segments within linear time complexity, we propose a new model named Semi-CRF-Relay. The direct modeling of segments makes it easy to incorporate word features, and CWS performance can be enhanced merely by adding publicly available pre-trained word embeddings. Experiments on four popular CWS datasets show the effectiveness of our proposed methods. The source code and pre-trained embeddings of this paper are available at https://github.com/fastnlp/fastNLP/.
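
    The complexity claim can be made concrete with a minimal sketch of the standard Semi-CRF dynamic program over segmentations. This is a generic illustration, not the Semi-CRF-Relay model proposed in the paper. The names `segment_score`, `max_len`, and the toy lexicon are hypothetical; in the paper the segment scores would come from the BiLSTM and fusion layer rather than from a lookup table. With `max_len` equal to the sentence length `n`, each end position considers up to `n` start positions, which gives the quadratic cost; capping `max_len` at a constant makes the cost linear in `n`, at the price of never producing longer segments.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list of finite floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def semi_crf_log_partition(n, segment_score, max_len):
    """Log of the summed exp-scores of all segmentations of n characters
    into segments of length <= max_len. Evaluates O(n * max_len) segments."""
    alpha = [0.0] + [-math.inf] * n  # alpha[j]: log-sum over segmentations of the first j chars
    for end in range(1, n + 1):
        first_start = max(0, end - max_len)
        alpha[end] = logsumexp(
            [alpha[start] + segment_score(start, end) for start in range(first_start, end)]
        )
    return alpha[n]


def semi_crf_viterbi(n, segment_score, max_len):
    """Highest-scoring segmentation as (score, [(start, end), ...]) with half-open spans."""
    best = [0.0] + [-math.inf] * n
    back = [0] * (n + 1)  # back[j]: start of the last segment in the best path ending at j
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            candidate = best[start] + segment_score(start, end)
            if candidate > best[end]:
                best[end] = candidate
                back[end] = start
    segments = []
    end = n
    while end > 0:  # follow back-pointers from the end of the sentence
        segments.append((back[end], end))
        end = back[end]
    return best[n], segments[::-1]


# Hypothetical toy scorer: two "known words" get positive scores,
# every other segment is penalised by its length.
TOY_LEXICON = {(0, 2): 2.0, (2, 5): 3.0}


def toy_score(start, end):
    return TOY_LEXICON.get((start, end), -1.0 * (end - start))


if __name__ == "__main__":
    print(semi_crf_viterbi(5, toy_score, max_len=5))  # uncapped: quadratic in n
    print(semi_crf_viterbi(5, toy_score, max_len=2))  # capped: linear in n
```

    In the capped run the three-character word can no longer be recovered, which is the limitation that motivates a model able to handle arbitrarily long segments in linear time.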

     
