Journal of Computer Science and Technology, 2022, Vol. 37, Issue (2): 309-319. doi: 10.1007/s11390-020-0326-4

Special Issue: Artificial Intelligence and Pattern Recognition

Intent-Slot Correlation Modeling for Joint Intent Prediction and Slot Filling

Jun-Feng Fan (樊骏锋), Mei-Ling Wang (汪美玲), Chang-Liang Li* (李长亮), Senior Member, CCF, Zi-Qiang Zhu (朱自强), and Lu Mao (毛璐)        

  1. AI Laboratory, KingSoft Corporation, Beijing 100190, China
  • Received: 2020-01-21 Revised: 2020-09-04 Accepted: 2020-09-20 Online: 2022-03-31 Published: 2022-03-31
  • Contact: Chang-Liang Li, E-mail: lichangliang@kingsoft.com
  • About author: Chang-Liang Li received his Ph.D. degree in pattern recognition and intelligent systems from the Institute of Automation, Chinese Academy of Sciences, Beijing, in 2015. He is currently the principal of the AI Laboratory, KingSoft Corporation, Beijing. His research interests include knowledge graphs and machine translation.

Slot filling and intent prediction are basic tasks in capturing the semantic frame of human utterances. Slots and intent are strongly correlated in semantic frame parsing. For each utterance, a specific intent type is generally determined by the indicative information carried by words that have slot tags (called slot words); conversely, the intent type determines which categories of words should be used to fill slots. However, the Intent-Slot correlation is rarely modeled explicitly in existing studies, and hence may not be fully exploited. In this paper, we model the Intent-Slot correlation explicitly and propose a new framework for joint intent prediction and slot filling. Firstly, we explore the effects of slot words on intent by differentiating them from the other words, and we recognize slot words by solving a sequence labeling task with the bi-directional long short-term memory (BiLSTM) model. Then, slot recognition information is introduced into attention-based intent prediction and slot filling to improve semantic results. In addition, we integrate the Slot-Gated mechanism into slot filling to model the dependency of slots on intent. Finally, we obtain slot recognition, intent prediction, and slot filling results by training with joint optimization. Experimental results on the benchmark Airline Travel Information System (ATIS) and Snips datasets show that our Intent-Slot correlation model achieves state-of-the-art semantic frame performance with a lightweight structure.
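As a rough illustration of two ideas in the abstract, the sketch below derives binary slot-word labels from slot tags and combines three task losses into one joint objective. It is not the authors' implementation: the BIO tag format, the function names (`slot_word_labels`, `cross_entropy`, `joint_loss`), the example utterance, and the unweighted sum of losses are all assumptions made for this example, since the abstract does not specify them.

```python
import math


def slot_word_labels(slot_tags):
    """Map slot tags to binary labels: 1 for a slot word, 0 otherwise.

    Assumes BIO-style tags where "O" marks a word outside any slot.
    """
    return [0 if tag == "O" else 1 for tag in slot_tags]


def cross_entropy(probs, gold_index):
    """Negative log-likelihood of the gold class under a probability vector."""
    return -math.log(probs[gold_index])


def joint_loss(recog_probs, recog_gold, intent_probs, intent_gold,
               slot_probs, slot_gold):
    """Sum the losses of slot recognition, intent prediction and slot filling.

    The unweighted sum is an assumption; the abstract only states that the
    three tasks are trained with joint optimization.
    """
    recog = sum(cross_entropy(p, g) for p, g in zip(recog_probs, recog_gold))
    intent = cross_entropy(intent_probs, intent_gold)
    slots = sum(cross_entropy(p, g) for p, g in zip(slot_probs, slot_gold))
    return recog + intent + slots


# Hypothetical utterance with illustrative ATIS-style slot tags.
words = ["flights", "from", "boston", "to", "denver"]
tags = ["O", "O", "B-fromloc.city_name", "O", "B-toloc.city_name"]
labels = slot_word_labels(tags)
print(list(zip(words, labels)))
```

In this reading, the binary labels serve as the targets of the slot recognition task, which the abstract describes as a sequence labeling problem, while the model's predicted distributions for all three tasks feed `joint_loss`.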


Key words: spoken language understanding; slot filling; intent prediction; Intent-Slot correlation; slot recognition
