2016, Vol. 31, Issue (3): 489-500. doi: 10.1007/s11390-016-1642-6

Topic: Artificial Intelligence and Pattern Recognition

• Special Section on Selected Paper from NPC 2011

Multi-Task Learning for Food Identification and Analysis with Deep Convolutional Neural Networks

Xi-Jin Zhang, Student Member, CCF, Yi-Fan Lu, Student Member, CCF, and Song-Hai Zhang*, Member, CCF, ACM

  1. Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
  • Received: 2015-12-01 Revised: 2016-03-09 Online: 2016-05-05 Published: 2016-05-05
  • Contact: Song-Hai Zhang E-mail: shz@tsinghua.edu.cn
  • About the author: Xi-Jin Zhang is currently a Ph.D. student at Tsinghua University, Beijing. He received his B.S. degree in electronic and information engineering from Xidian University, Xi'an, in 2014. His research interests include image and video processing, computer vision, and machine learning.
  • Supported by:

    This work was supported by the National High Technology Research and Development 863 Program of China under Grant No. 2013AA013903, the National Natural Science Foundation of China under Grant No. 61373069, the Research Grant of Beijing Higher Institution Engineering Research Center, and the Tsinghua University Initiative Scientific Research Program.

Abstract: In this paper, we propose a multi-task system that identifies dish types, food ingredients, and cooking methods from food images using deep convolutional neural networks. We built a dataset of 360 food classes with at least 500 images per class. To reduce noise in the data, which were collected from the Internet, outlier images were detected and eliminated with a one-class SVM trained on deep convolutional features. We simultaneously trained a dish identifier, a cooking method recognizer, and a multi-label ingredient detector, which share a few low-level layers of the deep network architecture. The proposed framework achieves higher accuracy than traditional methods based on handcrafted features, and the cooking method recognizer and ingredient detector can be applied to dishes not included in the training dataset to provide reference information for users.
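The shared-layer arrangement described in the abstract can be illustrated with a minimal sketch. It is not the authors' network: a single fully connected layer stands in for the shared low-level convolutional layers, and the feature size, hidden size, number of cooking methods, number of ingredients, and loss weights are all assumptions chosen for illustration. Only the 360 dish classes and the three task heads (softmax for dish type and cooking method, independent sigmoids for multi-label ingredients) come from the abstract.

```python
import math
import random


def dense(x, w, b):
    """Fully connected layer; w holds one weight row per output unit."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def relu(v):
    return [max(0.0, a) for a in v]


def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases (hypothetical initialisation)."""
    w = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out


def build(rng, n_in, n_hidden, n_dish, n_method, n_ingredient):
    """One shared layer plus three task-specific heads."""
    return {
        "shared": init_layer(rng, n_hidden, n_in),
        "dish": init_layer(rng, n_dish, n_hidden),
        "method": init_layer(rng, n_method, n_hidden),
        "ingredient": init_layer(rng, n_ingredient, n_hidden),
    }


def forward(params, x):
    # The shared representation is computed once and reused by every head.
    h = relu(dense(x, *params["shared"]))
    return {
        "dish": softmax(dense(h, *params["dish"])),
        "method": softmax(dense(h, *params["method"])),
        # Multi-label head: each ingredient gets its own probability.
        "ingredient": [sigmoid(a) for a in dense(h, *params["ingredient"])],
    }


def multitask_loss(out, dish, method, ingredients, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of two cross-entropy terms and a mean binary cross-entropy.

    The equal weights are an assumption; the abstract does not state them.
    """
    eps = 1e-12
    l_dish = -math.log(out["dish"][dish] + eps)
    l_method = -math.log(out["method"][method] + eps)
    l_ingredient = -sum(
        t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
        for p, t in zip(out["ingredient"], ingredients)
    ) / len(ingredients)
    return weights[0] * l_dish + weights[1] * l_method + weights[2] * l_ingredient


if __name__ == "__main__":
    rng = random.Random(0)
    # 360 dish classes is from the abstract; the other sizes are made up.
    params = build(rng, n_in=8, n_hidden=16, n_dish=360, n_method=5, n_ingredient=20)
    x = [rng.random() for _ in range(8)]
    labels = [1 if i in (2, 7) else 0 for i in range(20)]
    out = forward(params, x)
    print(multitask_loss(out, dish=42, method=3, ingredients=labels))
```

Because the ingredient and cooking-method heads read only the shared representation, they can still produce outputs for an image whose dish class was never seen in training, which is the property the abstract highlights.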

Copyright © Editorial Office of Journal of Computer Science and Technology