Journal of Computer Science and Technology ›› 2022, Vol. 37 ›› Issue (3): 641-651. DOI: 10.1007/s11390-022-2146-1

Special Issue: Artificial Intelligence and Pattern Recognition; Computer Graphics and Multimedia

• Special Section of CVM 2022 •

Learn Robust Pedestrian Representation Within Minimal Modality Discrepancy for Visible-Infrared Person Re-Identification

Yu-Jie Liu (刘玉杰), Member, CCF, Wen-Bin Shao* (邵文斌), and Xiao-Rui Sun (孙晓瑞)        

  1. College of Computer Science and Technology, China University of Petroleum (East China), Qingdao 266580, China
  • Received: 2022-01-06; Revised: 2022-04-14; Accepted: 2022-04-14; Online: 2022-05-30; Published: 2022-05-30
  • Contact: Wen-Bin Shao, E-mail: wbShao@s.upc.edu.cn
  • About author: Wen-Bin Shao received his B.S. degree in computer science and technology from Shandong University of Technology, Zibo, in 2020. He is currently a Master's student at the College of Computer Science and Technology, China University of Petroleum (East China), Qingdao. His research interests include person re-identification, computer vision, and multimedia.
  • Supported by:
    This work was supported by the National Key Research and Development Program of China under Grant No. 2019YFF0301800, the National Natural Science Foundation of China under Grant No. 61379106, and the Shandong Provincial Natural Science Foundation under Grant Nos. ZR2013FM036 and ZR2015FM011.

Visible-infrared person re-identification has attracted extensive attention from the community due to its great application prospects in video surveillance. There are huge modality discrepancies between visible and infrared images caused by their different imaging mechanisms. Existing studies alleviate modality discrepancies by aligning the modality distributions or extracting modality-shared features from the original images. However, they ignore a key solution: directly converting visible images to gray images, which is an efficient and effective way to reduce modality discrepancies. In this paper, we transform the cross-modality person re-identification task from visible-infrared images to gray-infrared images, which we name the minimal modality discrepancy. In addition, we propose a pyramid feature integration network (PFINet), which mines the discriminative refined features of pedestrian images and fuses high-level, semantically strong features to build a robust pedestrian representation. Specifically, PFINet first performs concrete-to-abstract feature extraction and top-down semantic transfer to obtain multi-scale feature maps. Second, the multi-scale feature maps are fed into the discriminative-region response module, which emphasizes identity-discriminative regions through a spatial attention mechanism. Finally, the pedestrian representation is obtained by feature integration. Extensive experiments demonstrate the effectiveness of PFINet, which achieves a rank-1 accuracy of 81.95% and an mAP of 74.49% under the multi-all evaluation mode of the SYSU-MM01 dataset.

Key words: person re-identification; modality discrepancy; discriminative feature
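The abstract outlines two concrete ideas: converting visible images to gray images to obtain the minimal modality discrepancy, and re-weighting multi-scale feature maps with spatial attention in the discriminative-region response module. The sketch below is only an illustration of those two ideas under stated assumptions, not the authors' PFINet implementation: it assumes PyTorch, uses a 3-channel grayscale conversion so that a standard ImageNet-pretrained backbone (e.g., ResNet-50) can process both gray and infrared inputs, and substitutes a generic CBAM-style spatial attention block for the discriminative-region response module, whose exact design is not given here.

# Minimal, hypothetical sketch (PyTorch assumed); not the authors' PFINet code.
import torch
import torch.nn as nn

def visible_to_gray(rgb: torch.Tensor) -> torch.Tensor:
    """Map a batch of visible RGB images (B, 3, H, W) to 3-channel gray images.

    Repeating the single gray channel three times keeps the tensor shape
    compatible with an ImageNet-pretrained backbone such as ResNet-50.
    """
    weights = rgb.new_tensor([0.299, 0.587, 0.114]).view(1, 3, 1, 1)  # BT.601 luma weights
    gray = (rgb * weights).sum(dim=1, keepdim=True)   # (B, 1, H, W)
    return gray.repeat(1, 3, 1, 1)                    # (B, 3, H, W)

class SpatialAttention(nn.Module):
    """CBAM-style spatial attention: an illustrative stand-in for the
    discriminative-region response module that emphasizes
    identity-discriminative regions of a feature map."""

    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg_map = x.mean(dim=1, keepdim=True)          # channel-wise average, (B, 1, H, W)
        max_map = x.amax(dim=1, keepdim=True)          # channel-wise maximum, (B, 1, H, W)
        attn = torch.sigmoid(self.conv(torch.cat([avg_map, max_map], dim=1)))
        return x * attn                                # spatially re-weighted feature map

if __name__ == "__main__":
    rgb_batch = torch.rand(4, 3, 288, 144)             # visible pedestrian crops
    gray_batch = visible_to_gray(rgb_batch)            # gray input for the gray-infrared setting
    feat_map = torch.rand(4, 256, 36, 18)              # one multi-scale feature map
    print(SpatialAttention()(feat_map).shape)          # torch.Size([4, 256, 36, 18])

In the gray-infrared setting described in the abstract, both the converted gray images and the infrared images would then be processed by the same backbone before the pyramid feature extraction, attention, and integration steps.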

