Weakly- and Semi-Supervised Fast Region-Based CNN for Object Detection
-
Abstract
Learning an effective object detector with little supervision is an essential but challenging problem in computer vision applications. In this paper, we consider the problem of learning a deep convolutional neural network (CNN) based object detector using weakly-supervised and semi-supervised information in the framework of fast region-based CNN (Fast R-CNN). The target is to obtain an object detector as accurate as the fully-supervised Fast R-CNN, but it requires less image annotation effort. To solve this problem, we use weakly-supervised training images (i.e., only the image-level annotation is given) and a few proportions of fully-supervised training images (i.e., the bounding box level annotation is given), that is a weakly- and semi-supervised (WASS) object detection setting. The proposed solution is termed as WASS R-CNN, in which there are two main components. At first, a weakly-supervised R-CNN is firstly trained; after that semi-supervised data are used for finetuning the weakly-supervised detector. We perform object detection experiments on the PASCAL VOC 2007 dataset. The proposed WASS R-CNN achieves more than 85% of a fully-supervised Fast R-CNN's performance (measured using mean average precision) with only 10% of fully-supervised annotations together with weak supervision for all training images. The results show that the proposed learning framework can significantly reduce the labeling efforts for obtaining reliable object detectors.
-
-