Hua-Peng Wei, Ying-Ying Deng, Fan Tang, Xing-Jia Pan, Wei-Ming Dong. A Comparative Study of CNN- and Transformer-Based Visual Style Transfer[J]. Journal of Computer Science and Technology, 2022, 37(3): 601-614. DOI: 10.1007/s11390-022-2140-7

A Comparative Study of CNN- and Transformer-Based Visual Style Transfer

  • Vision Transformers have shown impressive performance on image classification tasks. Since most existing visual style transfer (VST) algorithms are based on texture-biased convolutional neural networks (CNNs), this raises the question of whether the shape-biased Vision Transformer can perform style transfer as CNNs do. In this work, we compare and analyze the shape bias of CNN- and transformer-based models from the perspective of VST tasks. For a comprehensive comparison, we propose three transformer-based visual style transfer (Tr-VST) methods: Tr-NST for optimization-based VST, Tr-WCT for reconstruction-based VST, and Tr-AdaIN for perceptual-based VST. By placing three mainstream VST methods in the transformer pipeline, we show that transformer-based models pre-trained on ImageNet are not well suited to style transfer. Because of the strong shape bias of the transformer-based models, these Tr-VST methods cannot render style patterns. We further analyze the shape bias by considering the influence of the learned parameters and of the structural design. The results show that, with proper style supervision, the transformer can learn texture-biased features similar to those of a CNN. With the shape bias of the transformer encoder reduced, Tr-VST methods can generate higher-quality results than state-of-the-art VST methods.
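For readers unfamiliar with the AdaIN operation that Tr-AdaIN is named after, the following is a minimal sketch of standard adaptive instance normalization applied to token features. It is not the authors' implementation: the abstract does not describe how Tr-AdaIN handles transformer tokens, so the choice to compute per-channel statistics across a list of token vectors, and the helper names `channel_stats` and `adain`, are assumptions made purely for illustration.

```python
import math


def channel_stats(tokens, eps=1e-5):
    """Per-channel mean and standard deviation over N token vectors of size C.

    Assumption for illustration: features are a plain list of N lists,
    each holding C floats (tokens x channels).
    """
    n = len(tokens)
    c = len(tokens[0])
    means = [sum(t[j] for t in tokens) / n for j in range(c)]
    stds = [
        math.sqrt(sum((t[j] - means[j]) ** 2 for t in tokens) / n + eps)
        for j in range(c)
    ]
    return means, stds


def adain(content, style, eps=1e-5):
    """Standard AdaIN: normalize content features per channel, then
    rescale and shift them with the style features' statistics."""
    c_mean, c_std = channel_stats(content, eps)
    s_mean, s_std = channel_stats(style, eps)
    return [
        [
            s_std[j] * (t[j] - c_mean[j]) / c_std[j] + s_mean[j]
            for j in range(len(t))
        ]
        for t in content
    ]


# Toy features: 2 content tokens and 3 style tokens, 2 channels each.
content_tokens = [[0.0, 1.0], [2.0, 3.0]]
style_tokens = [[10.0, -1.0], [14.0, 1.0], [12.0, 0.0]]
stylized = adain(content_tokens, style_tokens)
```

After the call, each channel of `stylized` has approximately the mean and standard deviation of the corresponding style channel, while the relative ordering of the content tokens within a channel is preserved. Matching only these first- and second-order statistics is what makes the operation sensitive to how texture-like or shape-like the encoder's features are.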
