Journal of Computer Science and Technology ›› 2022, Vol. 37 ›› Issue (3): 527-538. DOI: 10.1007/s11390-022-2029-5

Special Topic: Artificial Intelligence and Pattern Recognition

Self-Supervised Task Augmentation for Few-Shot Intent Detection

Peng-Fei Sun (孙鹏飞), Ya-Wen Ouyang (欧阳亚文), Ding-Jie Song (宋定杰), and Xin-Yu Dai* (戴新宇), Member, CCF        

  1. National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093, China
  • Received: 2021-11-19; Revised: 2022-03-10; Accepted: 2022-04-08; Online: 2022-05-30; Published: 2022-05-30
  • Contact: Xin-Yu Dai, E-mail: daixinyu@nju.edu.cn
  • About author: Xin-Yu Dai received his B.E. and Ph.D. degrees in computer science from Nanjing University, Nanjing, in 1999 and 2005, respectively. He was on leave from Nanjing University from August 2010 to September 2011, visiting the EECS and Statistics Departments at UC Berkeley. He is currently a professor with the National Key Laboratory for Novel Software Technology and the Department of Computer Science and Technology at Nanjing University, Nanjing. His research interests include natural language processing and knowledge engineering.
  • Supported by:
    This work was partially supported by the National Natural Science Foundation of China under Grant Nos. 61936012 and 61976114.

1. Research Background (Context): In recent years, meta-learning has become the mainstream paradigm for few-shot learning. However, the strong performance of these methods relies on a sufficient number of meta-training tasks. When meta-training tasks are scarce, the meta-learner tends to overfit them, which limits its generalization ability. Unfortunately, this problem is even more pronounced in few-shot intent detection: real-world applications involve a large number of intents, and it is impractical to collect enough data for every intent and construct a large set of meta-training tasks.
2. Objective: This paper proposes STAM, a novel self-supervised task augmentation model. By generating a large number of meta-training tasks during the meta-training stage and adding two extra learning objectives, STAM combines self-supervised learning with meta-learning to improve generalization when meta-training tasks are insufficient.
3. Method: STAM first generates meta-training tasks that are semantically similar but have different features through static augmentation and dynamic augmentation (a schematic sketch follows this structured abstract). Beyond the task augmentation strategy, we add two auxiliary losses to STAM that integrate self-supervised learning into meta-learning. The goal is to use the structural information of the data itself as a supervision signal to learn transferable features, thereby improving the generalization of meta-learning when meta-training tasks are scarce.
4. Results & Findings: We conduct experiments on four datasets. The results show that STAM outperforms other state-of-the-art methods. Specifically, compared with the best baseline on each dataset, STAM improves the average accuracy by 1.3% and 2.2% under the 1-shot and 5-shot settings, respectively.
5. Conclusions: This paper proposes the STAM model to overcome the potential overfitting caused by insufficient meta-training tasks. Extensive experiments show that STAM achieves state-of-the-art performance on four public datasets. In future work, we will explore the effectiveness of STAM in other domains.
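
The static/dynamic augmentation mentioned in the Method item is not detailed on this page, so the following is only a minimal sketch of what episode-level task augmentation could look like, assuming the static branch substitutes precomputed paraphrases (back-translation style) and the dynamic branch perturbs utterances on the fly via token dropout; the function make_episode, its parameters, and the toy data are illustrative assumptions rather than the authors' exact procedure.

```python
import random
from collections import defaultdict

def make_episode(data, n_way, k_shot, q_query, paraphrases=None, p_drop=0.1):
    """Sample one N-way K-shot episode, optionally swapping in paraphrases
    (static augmentation) and randomly dropping tokens (dynamic augmentation)."""
    by_intent = defaultdict(list)
    for text, intent in data:
        by_intent[intent].append(text)
    intents = random.sample(sorted(by_intent), n_way)

    def dynamic(text):
        # Crude stand-in for an on-the-fly perturbation: random token dropout.
        kept = [tok for tok in text.split() if random.random() > p_drop]
        return " ".join(kept) or text

    support, query = [], []
    for label, intent in enumerate(intents):
        texts = random.sample(by_intent[intent], k_shot + q_query)
        for i, t in enumerate(texts):
            if paraphrases and t in paraphrases:          # static augmentation
                t = random.choice([t, paraphrases[t]])
            (support if i < k_shot else query).append((dynamic(t), label))
    return support, query

# Toy usage: extra "tasks" come simply from re-sampling episodes with augmentation on.
data = [("play some jazz", "play_music"), ("put on rock music", "play_music"),
        ("what's the weather like", "get_weather"), ("will it rain today", "get_weather"),
        ("book a table for two", "book_restaurant"), ("reserve dinner at seven", "book_restaurant")]
support, query = make_episode(data, n_way=2, k_shot=1, q_query=1)
print(support, query)
```

Because every call re-samples intents, utterances, and perturbations, repeatedly invoking make_episode on the same labeled pool yields many semantically similar but non-identical meta-training tasks, which is the general effect task augmentation aims for.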

Abstract: Few-shot intent detection is a practical yet challenging task, because new intents emerge frequently and collecting large-scale data for them can be costly. Meta-learning, a promising technique for leveraging data from previous tasks to enable efficient learning of new tasks, has been a popular way to tackle this problem. However, existing meta-learning models have been shown to overfit when the meta-training tasks are insufficient. To overcome this challenge, we present STAM, a novel framework that combines self-supervised task augmentation with meta-learning. First, we introduce task augmentation, which explores two different strategies and combines them to extend the meta-training tasks. Second, we devise two auxiliary losses that integrate self-supervised learning into meta-learning to learn more generalizable and transferable features. Experimental results show that STAM achieves consistent and considerable performance improvements over existing state-of-the-art methods on four datasets.
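
The abstract says two auxiliary losses integrate self-supervised learning into meta-learning, but their exact form is not given on this page. The sketch below is a PyTorch-style illustration under assumptions: it only shows the common pattern of adding weighted self-supervised terms (a contrastive InfoNCE term and a view-consistency term between two augmented views) to an episodic, prototypical-style classification loss. The loss choices, the weights lambda1/lambda2, and the encoder interface are assumed for illustration, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.07):
    """Generic contrastive (InfoNCE) loss between two views of the same batch."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / tau                        # (B, B) cosine similarities
    targets = torch.arange(z1.size(0))                # positives on the diagonal
    return F.cross_entropy(logits, targets)

def episode_loss(encoder, support_x, support_y, query_x, query_y,
                 view1_x, view2_x, lambda1=0.1, lambda2=0.1):
    """Episodic (prototypical-style) loss plus two assumed auxiliary terms.
    `encoder` maps a batch of inputs to (B, d) embeddings; labels are long tensors."""
    s, q = encoder(support_x), encoder(query_x)
    n_way = int(support_y.max().item()) + 1
    protos = torch.stack([s[support_y == c].mean(0) for c in range(n_way)])
    meta = F.cross_entropy(-torch.cdist(q, protos), query_y)  # nearest-prototype classification
    z1, z2 = encoder(view1_x), encoder(view2_x)       # two augmented views of the same utterances
    aux1 = info_nce(z1, z2)                           # assumed contrastive term
    aux2 = F.mse_loss(z1, z2)                         # assumed view-consistency term
    return meta + lambda1 * aux1 + lambda2 * aux2
```

In a setup like this, the meta term drives episodic intent classification, while the auxiliary terms act as self-supervised regularizers that encourage augmentation-invariant and therefore more transferable sentence representations.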

Key words: self-supervised learning, task augmentation, meta-learning, few-shot intent detection
