Journal of Computer Science and Technology ›› 2022, Vol. 37 ›› Issue (3): 699-718. DOI: 10.1007/s11390-021-1076-7
Special Issue: Artificial Intelligence and Pattern Recognition
• Regular Paper •
Rong-Yu Cao1,2 (曹荣禹), Student Member, CCF, Yi-Xuan Cao1,2 (曹逸轩), Member, CCF, IEEE, Gan-Bin Zhou3 (周干斌), and Ping Luo1,2,4 (罗平), Senior Member, CCF, Member, IEEE