Journal of Computer Science and Technology, 2017, Vol. 32, Issue 4: 683-700. DOI: 10.1007/s11390-017-1751-x
Special Issue on Deep Learning
Objectness Region Enhancement Networks for Scene Parsing
Xin-Yu Ou1,2,3, Member, CCF, IEEE, Ping Li1,*, He-Fei Ling1, Member, CCF, ACM, IEEE, Si Liu2, Member, CCF, ACM, IEEE, Tian-Jiang Wang1, Member, CCF, ACM, IEEE, Dan Li1
1 School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China;
2 Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100091, China;
3 Cadres Online Learning Institute of Yunnan Province, Yunnan Open University, Kunming 650223, China

Abstract: Semantic segmentation has recently witnessed rapid progress, but existing methods focus mainly on how to better identify and parse objects or instances. In this work, we aim to address the task of semantic understanding of scenes with deep learning. Different from many existing methods, our goal is not to build a whole new parsing network, but to put forward several effective techniques that advance existing scene parsing algorithms. Objectness region enhancement is the first effective technique. It exploits a detection module to produce object region proposals with category probabilities, and these regions are used to weight the parsing feature map directly. The "extra background" category is a special category often attached to the category space to collect hard pixels and objects, improving results in semantic and instance segmentation tasks. In scene parsing, the extra background category is still beneficial for improving the model during training; however, some pixels may be assigned to this nonexistent category at inference. A black-hole filling technique is proposed to avoid this misclassification. To verify the two techniques, we integrate them into a parsing framework that generates the final parsing result. We call this unified framework the Objectness Enhancement Network (OENet). Compared with previous work, the proposed OENet effectively improves performance over the original model on the SceneParse150 scene parsing dataset, reaching 38.4 mIoU (mean intersection-over-union) and 77.9% pixel accuracy on the validation set without assembling multiple models. Its effectiveness is also verified on the Cityscapes dataset.
Keywords: objectness region enhancement; black-hole filling; scene parsing; instance enhancement; objectness region proposal
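
The abstract describes the two techniques only at a high level. The following NumPy sketch is one plausible reading of them, not the authors' implementation: the function names, array shapes, and the mixing weight alpha are assumptions made purely for illustration.

# Illustrative sketch only, not the authors' code: one plausible reading of
# objectness region enhancement and black-hole filling as described above.
import numpy as np

def objectness_enhancement(parsing_scores, region_boxes, region_probs, alpha=1.0):
    """Weight the parsing score map with detection region proposals.

    parsing_scores: (C, H, W) per-class scores from the parsing branch.
    region_boxes:   list of (x1, y1, x2, y2) boxes from the detection module.
    region_probs:   list of length-C category-probability vectors, one per box.
    alpha:          hypothetical mixing weight (not specified in the abstract).
    """
    weight = np.ones_like(parsing_scores)
    for (x1, y1, x2, y2), probs in zip(region_boxes, region_probs):
        # Boost every class score inside the proposal in proportion to the
        # detector's confidence for that class.
        weight[:, y1:y2, x1:x2] += alpha * np.asarray(probs)[:, None, None]
    return parsing_scores * weight

def black_hole_filling(scores, extra_bg_index):
    """Re-label pixels predicted as the nonexistent 'extra background' class.

    Pixels whose arg-max falls on the extra background channel ("black holes")
    are assigned their best-scoring real category instead.
    """
    labels = scores.argmax(axis=0)                     # (H, W) raw prediction
    real = np.delete(scores, extra_bg_index, axis=0)   # drop the extra channel
    fallback = real.argmax(axis=0)
    # Indices in `real` skip the removed channel; map them back to original ids.
    fallback = fallback + (fallback >= extra_bg_index)
    return np.where(labels == extra_bg_index, fallback, labels)

In the actual OENet pipeline these operations are part of the unified network; the sketch only mirrors the data flow that the abstract describes.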
Received: 2016-12-20
Funding:

This work was supported by the Joint Funds of the National Natural Science Foundation of China under Grant No. U1536203, the National Natural Science Foundation of China under Grant Nos. 61572493, 61572214, and 61502185, the Major Scientific and Technological Innovation Project of Hubei Province of China under Grant No. 2015AAA013, the Open Project Program of the National Laboratory of Pattern Recognition of China under Grant No. 201600035, the Key Program of the Natural Science Foundation of the Open University of China under Grant No. G16F3702Z, and the Young Scientists Fund of the Natural Science Foundation of the Open University of China under Grant No. G16F2505Q.

Corresponding author: Ping Li. Email: lpshome@hust.edu.cn
About the author: Xin-Yu Ou received his B.E. degree in electronic information science and technology and his M.S. degree in software engineering from Yunnan University (YNU), Kunming, in 2004 and 2009, respectively. He is a Ph.D. candidate in computer science and technology at Huazhong University of Science and Technology (HUST), Wuhan, and a visiting Ph.D. student at the Institute of Information Engineering, Chinese Academy of Sciences (CASIIE), Beijing. He is an associate professor at Yunnan Open University (YNOU), Kunming. His research interests include deep learning, image retrieval, and object detection and recognition.
Cite this article:
Xin-Yu Ou, Ping Li, He-Fei Ling, Si Liu, Tian-Jiang Wang, Dan Li. Objectness Region Enhancement Networks for Scene Parsing[J]. Journal of Computer Science and Technology, 2017, 32(4): 683-700.
Link to this article:
http://jcst.ict.ac.cn:8080/jcst/CN/10.1007/s11390-017-1751-x