Ayana, Shi-Qi Shen, Yan-Kai Lin, Cun-Chao Tu, Yu Zhao, Zhi-Yuan Liu, Mao-Song Sun. Recent Advances on Neural Headline Generation[J]. Journal of Computer Science and Technology, 2017, 32(4): 768-784. DOI: 10.1007/s11390-017-1758-3

Recent Advances on Neural Headline Generation

Funds: This work is supported by the National Basic Research 973 Program of China under Grant No. 2014CB340501, the National Natural Science Foundation of China under Grant Nos. 61572273, 61532010, and Microsoft Research Asia under Grant No. FY17-RESTHEME-017.
More Information
  • Corresponding author:

    Zhi-Yuan Liu E-mail: liuzy@tsinghua.edu.cn

  • Received Date: December 19, 2016
  • Revised Date: May 17, 2017
  • Published Date: July 04, 2017
Abstract: Recently, neural models have been proposed for headline generation, learning to map documents to headlines with recurrent neural networks. In this work, we give a detailed introduction to and comparison of existing work and recent improvements in neural headline generation, with particular attention to how encoders, decoders, and training strategies alter the overall performance of a headline generation system. Furthermore, we perform quantitative analysis of most existing neural headline generation systems and summarize several key factors that impact their performance. We also carry out a detailed error analysis of typical neural headline generation systems to gain deeper insight. We hope our results and conclusions will benefit future research.
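The quantitative comparisons discussed above are typically reported with the ROUGE family of metrics, which score a generated headline by its n-gram overlap with a reference headline. As an illustrative sketch (not the evaluation code used in this work), ROUGE-1 recall can be computed as follows; the example sentences are invented for demonstration:

```python
from collections import Counter

def rouge_1_recall(reference: str, candidate: str) -> float:
    """ROUGE-1 recall: fraction of reference unigrams covered by the candidate.

    Clipped counts (min of the two multiplicities) prevent a repeated
    candidate word from matching the same reference word twice.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    overlap = sum(min(n, cand_counts[w]) for w, n in ref_counts.items())
    return overlap / max(sum(ref_counts.values()), 1)

# 4 of the 5 reference unigrams appear in the candidate headline.
print(rouge_1_recall("police arrest suspect in robbery",
                     "police arrest robbery suspect"))  # 0.8
```

Full ROUGE toolkits additionally report precision and F-measure, and variants such as ROUGE-2 and ROUGE-L over bigrams and longest common subsequences.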