Journal of Computer Science and Technology, 2015, Vol. 30, Issue (3): 453-466. doi: 10.1007/s11390-015-1536-z

Special Issue: Artificial Intelligence and Pattern Recognition

• Special Section on Computational Visual Media

PhotoPrev: Unifying Context and Content Cues to Enhance Personal Photo Revisitation

Li Jin(金力), Gang-Li Liu(刘钢利), Liang Zhao(赵靓), Ling Feng(冯铃), Senior Member, IEEE   

  1. Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
  • Received: 2014-12-01 Revised: 2015-03-16 Online: 2015-05-05 Published: 2015-05-05
  • About author:Li Jin received his Bachelor's degree in computer science and technology from Xidian University, Xi'an, in 2012. He is currently a Ph.D. candidate in the Department of Computer Science and Technology, Tsinghua University, Beijing. His research interests include context-aware data management and context-based information refinding.
  • Supported by:

    The work was supported by the National Natural Science Foundation of China under Grant Nos. 61373022, 61073004, and the National Basic Research 973 Program of China under Grant No. 2011CB302203-2.

Re-finding personal photos on smartphones is a common yet difficult task for users because of the large volume of photos taken in daily life. Inspired by human memory and its natural recall characteristics, we build a personal photo re-visitation tool, PhotoPrev, to help users re-find previous photos through associated memory cues. To mimic users' episodic memory recall, we present a way to automatically generate an abundance of related contextual metadata (e.g., weather, temperature) and organize them as context lattices for each photo in a life cycle. Meanwhile, photo content (e.g., object, text) is extracted and managed in a weighted term list, which corresponds to semantic memory. A TA-based top-k photo re-visitation algorithm for context- and content-based keyword search on a personal photo collection, together with a user feedback mechanism, is also given. We evaluate scalability on a large synthetic dataset, and a 12-week user study demonstrates the feasibility and effectiveness of our photo re-visitation strategies.
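
The top-k search mentioned in the abstract can be illustrated with a minimal sketch, assuming "TA" refers to the threshold algorithm of Fagin et al. and that each photo carries one context score and one content score combined by a weighted sum. The function name `ta_top_k`, the score dictionaries, and the weights are hypothetical and are not taken from the paper; the actual PhotoPrev scoring over context lattices and weighted term lists may differ.

```python
import heapq


def ta_top_k(score_lists, weights, k):
    """Generic threshold algorithm (TA) under a weighted-sum aggregate.

    score_lists: one dict per cue (e.g. context, content), photo id -> score.
    weights:     one non-negative weight per cue (hypothetical values).
    Photos missing from a list are treated as scoring 0.0 in that list.
    """
    # Sorted access: each cue list ordered by descending score.
    sorted_lists = [sorted(s.items(), key=lambda kv: -kv[1]) for s in score_lists]
    max_depth = max((len(lst) for lst in sorted_lists), default=0)

    seen = set()
    top = []  # min-heap of (aggregate score, photo id), at most k entries

    for depth in range(max_depth):
        last_seen = []
        for lst in sorted_lists:
            if depth >= len(lst):
                last_seen.append(0.0)  # list exhausted
                continue
            photo, score = lst[depth]
            last_seen.append(score)
            if photo in seen:
                continue
            seen.add(photo)
            # Random access: fetch this photo's score in every cue list.
            total = sum(w * s.get(photo, 0.0) for w, s in zip(weights, score_lists))
            if len(top) < k:
                heapq.heappush(top, (total, photo))
            elif total > top[0][0]:
                heapq.heapreplace(top, (total, photo))

        # No unseen photo can score above this threshold.
        threshold = sum(w * s for w, s in zip(weights, last_seen))
        if len(top) == k and top[0][0] >= threshold:
            break  # early termination: the current top-k is final

    return [(photo, total) for total, photo in sorted(top, key=lambda t: (-t[0], t[1]))]


# Hypothetical scores for four photos against one keyword query.
context_scores = {"p1": 0.9, "p2": 0.4, "p3": 0.7, "p4": 0.1}
content_scores = {"p1": 0.5, "p2": 0.8, "p3": 0.6, "p4": 0.2}
results = ta_top_k([context_scores, content_scores], [0.5, 0.5], k=2)
print([photo for photo, _ in results])
```

In this example the scan stops after reading two entries from each list, so the lowest-ranked photo is never examined; that early termination is the reason a threshold-style algorithm suits keyword search over a large personal collection.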


ISSN 1000-9000(Print)

CN 11-2296/TP

Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
E-mail: jcst@ict.ac.cn
  Copyright ©2015 JCST, All Rights Reserved