Journal of Computer Science and Technology, 2022, Vol. 37, Issue (2): 309-319. doi: 10.1007/s11390-020-0326-4

Special Topic: Artificial Intelligence and Pattern Recognition


Intent-Slot Correlation Modeling for Joint Intent Prediction and Slot Filling

Jun-Feng Fan (樊骏锋), Mei-Ling Wang (汪美玲), Chang-Liang Li* (李长亮), Senior Member, CCF, Zi-Qiang Zhu (朱自强), and Lu Mao (毛璐)        

  1. AI Laboratory, KingSoft Corporation, Beijing 100190, China
  • Received:2020-01-21 Revised:2020-09-04 Accepted:2020-09-20 Online:2022-03-31 Published:2022-03-31
  • Contact: Chang-Liang Li E-mail:lichangliang@kingsoft.com
  • About author:Chang-Liang Li received his Ph.D. degree in pattern recognition and intelligence systems from Institute of Automation, Chinese Academy of Sciences, Beijing, in 2015. He is currently the principal of AI Laboratory, KingSoft Corporation, Beijing. His research interests include knowledge graph and machine translation.

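1. Context
Spoken language understanding (SLU) is a common approach to parsing the semantic frame of human language, and it plays an important role in dialogue systems. An SLU system consists mainly of two basic tasks: intent recognition and slot filling. Intent recognition determines the intent that an utterance expresses. Slot filling extracts key information from the utterance to serve as constraints on the natural language query.
2. Objective
In semantic frame parsing, slot words and intent are strongly correlated. In each utterance, the indicative information carried by the slot words determines the intent the utterance expresses, while the intent category in turn determines how the slot words are tagged. However, few existing studies model the correlation between intent and slot words explicitly, so this correlation has not been fully exploited. This paper focuses on modeling the correlation between intent and slot words explicitly and proposes a joint intent prediction and slot filling method based on Intent-Slot correlation modeling. The proposed model is expected to advance the development of spoken language understanding systems.
3. Method
First, this paper studies the effect of slot words on intent by distinguishing slot words from other words. It treats slot word recognition as a sequence labeling task and solves that task with a bi-directional long short-term memory (BiLSTM) model. Second, the results of slot word recognition are introduced into attention-based intent prediction and slot filling to improve the semantic parsing results. In addition, a Slot-Gated mechanism is integrated into the slot filling task to model the dependency of slot words on intent. Finally, a joint optimization training method is used to carry out the three tasks of slot word recognition, intent recognition, and slot filling.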
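The slot word recognition step described above can be pictured as a binary sequence labeling problem. The following is a minimal sketch of how training labels for that step might be derived, assuming the slot annotations use BIO tags and that every token whose tag is not "O" counts as a slot word. The function name `slot_word_labels` and the example utterance are hypothetical and are not taken from the paper.

```python
from typing import List


def slot_word_labels(slot_tags: List[str]) -> List[int]:
    """Map BIO slot tags to binary labels: 1 for a slot word, 0 for any other word.

    Assumption: a token is a slot word exactly when its tag is not "O".
    """
    return [0 if tag == "O" else 1 for tag in slot_tags]


# Hypothetical ATIS-style utterance with BIO slot tags (illustrative only).
tokens = ["flights", "from", "boston", "to", "denver"]
slot_tags = ["O", "O", "B-fromloc.city_name", "O", "B-toloc.city_name"]

labels = slot_word_labels(slot_tags)
slot_words = [tok for tok, lab in zip(tokens, labels) if lab == 1]
print(labels)
print(slot_words)
```

In a full model, a BiLSTM tagger would be trained to predict these binary labels per token; that part is omitted here.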
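4. Results
Experiments were conducted on the ATIS and Snips datasets. The proposed model achieves semantic frame parsing accuracy of 86.4 and 78.5 on the two datasets, respectively. The experimental results show that the proposed model reaches the current best results on this task with a more lightweight model. In addition, the ablation results further show that slot word recognition effectively promotes the global optimization of the model.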
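The results above are reported as semantic frame parsing accuracy. A minimal sketch of how such a sentence-level metric is commonly computed follows, assuming an utterance counts as correct only when its predicted intent and its entire predicted slot tag sequence both match the gold annotation. The paper's exact evaluation script is not shown in this page, so the function `semantic_frame_accuracy` and the toy data are illustrative assumptions.

```python
from typing import List, Sequence


def semantic_frame_accuracy(
    gold_intents: Sequence[str],
    pred_intents: Sequence[str],
    gold_slots: Sequence[List[str]],
    pred_slots: Sequence[List[str]],
) -> float:
    """Fraction of utterances whose intent and full slot sequence are both correct."""
    if not gold_intents:
        raise ValueError("need at least one utterance")
    correct = 0
    for g_int, p_int, g_slt, p_slt in zip(gold_intents, pred_intents, gold_slots, pred_slots):
        # One wrong slot tag or a wrong intent makes the whole frame wrong.
        if g_int == p_int and g_slt == p_slt:
            correct += 1
    return correct / len(gold_intents)


# Toy data (hypothetical): utterance 2 has a wrong intent, utterance 3 a wrong slot tag.
gold_intents = ["flight", "airfare", "flight", "airline"]
pred_intents = ["flight", "flight", "flight", "airline"]
gold_slots = [["O", "B-city"], ["O", "O"], ["B-city", "O"], ["O"]]
pred_slots = [["O", "B-city"], ["O", "O"], ["O", "O"], ["O"]]

accuracy = semantic_frame_accuracy(gold_intents, pred_intents, gold_slots, pred_slots)
print(accuracy)
```

Because a single slot error invalidates the whole utterance, this metric is stricter than intent accuracy or slot F1 taken separately.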
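5. Conclusions

The proposed joint intent prediction and slot filling method based on Intent-Slot correlation modeling effectively improves semantic frame parsing performance, and the slot word recognition component designed into the model effectively benefits intent recognition, slot filling, and semantic frame parsing. The authors believe this work will promote the development and application of spoken language understanding systems and inspire further related research.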



Abstract:

Slot filling and intent prediction are basic tasks in capturing the semantic frame of human utterances. Slots and intent are strongly correlated in semantic frame parsing. For each utterance, a specific intent type is generally determined by the indicative information of words that carry slot tags (called slot words); conversely, the intent type determines which categories of words should be used to fill the slots. However, the Intent-Slot correlation is rarely modeled explicitly in existing studies, and hence may not be fully exploited. In this paper, we model the Intent-Slot correlation explicitly and propose a new framework for joint intent prediction and slot filling. First, we explore the effects of slot words on intent by differentiating them from the other words, and we recognize slot words by solving a sequence labeling task with a bi-directional long short-term memory (BiLSTM) model. Then, slot recognition information is introduced into attention-based intent prediction and slot filling to improve semantic results. In addition, we integrate the Slot-Gated mechanism into slot filling to model the dependency of slots on intent. Finally, we obtain slot recognition, intent prediction, and slot filling results by training with joint optimization. Experimental results on the benchmark Airline Travel Information System (ATIS) and Snips datasets show that our Intent-Slot correlation model achieves state-of-the-art semantic frame performance with a lightweight structure.
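Training with joint optimization, as mentioned in the abstract, typically means minimizing a single objective that combines the losses of all tasks. The sketch below assumes an unweighted sum of three cross-entropy terms (slot recognition, intent prediction, slot filling); the abstract does not state the actual weighting or loss form, so `joint_loss` and the toy probabilities are hypothetical.

```python
import math
from typing import List


def cross_entropy(probs: List[float], gold: int) -> float:
    """Negative log-likelihood of the gold class under a predicted distribution."""
    return -math.log(probs[gold])


def joint_loss(
    slot_word_probs: List[List[float]], slot_word_gold: List[int],
    intent_probs: List[float], intent_gold: int,
    slot_probs: List[List[float]], slot_gold: List[int],
) -> float:
    """Unweighted sum of the three task losses (the equal weighting is an assumption)."""
    recognition = sum(cross_entropy(p, g) for p, g in zip(slot_word_probs, slot_word_gold))
    intent = cross_entropy(intent_probs, intent_gold)
    filling = sum(cross_entropy(p, g) for p, g in zip(slot_probs, slot_gold))
    return recognition + intent + filling


# Toy two-token utterance: the slot-word tagger is maximally unsure on each token,
# while intent prediction and slot filling are both perfectly confident and correct.
loss = joint_loss(
    slot_word_probs=[[0.5, 0.5], [0.5, 0.5]], slot_word_gold=[0, 1],
    intent_probs=[1.0, 0.0], intent_gold=0,
    slot_probs=[[1.0, 0.0], [0.0, 1.0]], slot_gold=[0, 1],
)
print(loss)
```

With a shared encoder, gradients from all three terms update the same parameters, which is one way the slot recognition signal could influence intent prediction and slot filling.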


Key words: spoken language understanding, slot filling, intent prediction, Intent-Slot correlation, slot recognition
