Journal of Computer Science and Technology ›› 2022, Vol. 37 ›› Issue (3): 601-614.doi: 10.1007/s11390-022-2140-7

Special Issue: Artificial Intelligence and Pattern Recognition; Computer Graphics and Multimedia

• Special Section of CVM 2022 •

A Comparative Study of CNN- and Transformer-Based Visual Style Transfer

Hua-Peng Wei1 (魏华鹏), Ying-Ying Deng2 (邓盈盈), Fan Tang1,* (唐帆), Member, CCF, Xing-Jia Pan3 (潘兴甲), and Wei-Ming Dong2 (董未名), Member, CCF, ACM, IEEE        

    1School of Artificial Intelligence, Jilin University, Changchun 130012, China
    2National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
    3Youtu Laboratory, Tencent Incorporated, Shanghai 200233, China
  • Received: 2022-01-05 Revised: 2022-04-12 Accepted: 2022-04-24 Online: 2022-05-30 Published: 2022-05-30
  • Contact: Fan Tang E-mail: tangfan@jlu.edu.cn
  • About author:Fan Tang is an assistant professor in the School of Artificial Intelligence, Jilin University, Changchun. He received his Ph.D. degree from the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, in 2019. His research interests include computer graphics, computer vision, and machine learning.
  • Supported by:
    The work was supported by the National Key Research and Development Program of China under Grant No. 2020AAA0106200, the National Natural Science Foundation of China under Grant Nos. 62102162, 61832016, U20B2070, and 6210070958, the CASIA-Tencent Youtu Joint Research Project, and the Open Projects Program of the National Laboratory of Pattern Recognition.

Vision Transformers have shown impressive performance on image classification tasks. Observing that most existing visual style transfer (VST) algorithms are based on texture-biased convolutional neural networks (CNNs), we raise the question of whether shape-biased vision transformers can perform style transfer as well as CNNs. In this work, we focus on comparing and analyzing the shape bias of CNN- and transformer-based models from the perspective of VST tasks. For a comprehensive comparison, we propose three kinds of transformer-based visual style transfer (Tr-VST) methods: Tr-NST for optimization-based VST, Tr-WCT for reconstruction-based VST, and Tr-AdaIN for perceptual-based VST. By adapting three mainstream VST methods to the transformer pipeline, we show that transformer-based models pre-trained on ImageNet are not well suited to style transfer: owing to their strong shape bias, these Tr-VST methods cannot render style patterns. We further analyze the shape bias by considering the influence of the learned parameters and the structure design. The results show that, with proper style supervision, a transformer can learn texture-biased features similar to those of a CNN. With reduced shape bias in the transformer encoder, Tr-VST methods generate higher-quality results than state-of-the-art VST methods.
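The perceptual-based branch (Tr-AdaIN) builds on adaptive instance normalization [3], which re-normalizes each channel of the content features to match the per-channel statistics of the style features. A minimal NumPy sketch of that operation, not the authors' implementation; the toy (channels, height, width) feature maps stand in for CNN or transformer encoder outputs:

```python
import numpy as np

def adain(content_feat, style_feat, eps=1e-5):
    """Adaptive instance normalization: shift and scale each content
    channel so its mean and standard deviation match the style's.
    Feature maps are shaped (channels, height, width)."""
    c_mean = content_feat.mean(axis=(1, 2), keepdims=True)
    c_std = content_feat.std(axis=(1, 2), keepdims=True)
    s_mean = style_feat.mean(axis=(1, 2), keepdims=True)
    s_std = style_feat.std(axis=(1, 2), keepdims=True)
    return s_std * (content_feat - c_mean) / (c_std + eps) + s_mean

# Toy features with deliberately different statistics.
rng = np.random.default_rng(0)
content = rng.normal(0.0, 1.0, size=(4, 8, 8))
style = rng.normal(3.0, 2.0, size=(4, 8, 8))
stylized = adain(content, style)
```

After the transform, each channel of `stylized` carries the style's first- and second-order statistics while retaining the spatial layout of the content, which is exactly the property the paper probes when asking whether transformer features can encode style this way.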

Key words: transformer; convolutional neural network; visual style transfer; comparative study
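The reconstruction-based branch (Tr-WCT) builds on the whitening-coloring transform [4], which decorrelates the content feature channels and then re-correlates them with the style covariance. A minimal NumPy sketch under the assumption of features flattened to (channels, positions); names and shapes are illustrative, not the paper's code:

```python
import numpy as np

def whiten_color(content_feat, style_feat, eps=1e-5):
    """Whitening-coloring transform: map the content covariance to the
    identity, then impose the style covariance and mean.
    Features are shaped (channels, positions)."""
    def center(f):
        mu = f.mean(axis=1, keepdims=True)
        return f - mu, mu

    fc, _ = center(content_feat)
    fs, mu_s = center(style_feat)

    # Whitening: eigendecompose the content covariance, scale by D^(-1/2).
    cov_c = fc @ fc.T / (fc.shape[1] - 1)
    w_c, v_c = np.linalg.eigh(cov_c)
    whitened = v_c @ np.diag((w_c + eps) ** -0.5) @ v_c.T @ fc

    # Coloring: scale by the style eigenvalues D^(1/2), restore the mean.
    cov_s = fs @ fs.T / (fs.shape[1] - 1)
    w_s, v_s = np.linalg.eigh(cov_s)
    return v_s @ np.diag((w_s + eps) ** 0.5) @ v_s.T @ whitened + mu_s

# Toy flattened features: 4 channels, 500 spatial positions.
rng = np.random.default_rng(1)
content = rng.normal(size=(4, 500))
mix = rng.normal(size=(4, 4))
style = mix @ rng.normal(size=(4, 500)) + 1.0
out = whiten_color(content, style)
```

Unlike AdaIN, WCT matches the full channel covariance rather than per-channel statistics, which is why the paper treats it as a separate, reconstruction-based family.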

[1] Gatys L A, Ecker A S, Bethge M. Image style transfer using convolutional neural networks. In Proc. the 2016 IEEE Conference on Computer Vision and Pattern Recognition, June 2016, pp.2414-2423. DOI: 10.1109/CVPR.2016.265.

[2] Kolkin N, Salavon J, Shakhnarovich G. Style transfer by relaxed optimal transport and self-similarity. In Proc. the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 2019, pp.10051-10060. DOI: 10.1109/CVPR.2019.01029.

[3] Huang X, Belongie S. Arbitrary style transfer in real-time with adaptive instance normalization. In Proc. the 2017 IEEE International Conference on Computer Vision, Oct. 2017, pp.1501-1510. DOI: 10.1109/ICCV.2017.167.

[4] Li Y, Fang C, Yang J, Wang Z, Lu X, Yang M H. Universal style transfer via feature transforms. In Proc. the 31st International Conference on Neural Information Processing Systems, December 2017, pp.385-395.

[5] Deng Y, Tang F, Dong W, Sun W, Huang F, Xu C. Arbitrary style transfer via multi-adaptation network. In Proc. the 28th ACM International Conference on Multimedia, Oct. 2020, pp.2719-2727. DOI: 10.1145/3394171.3414015.

[6] Deng Y, Tang F, Dong W, Huang H, Ma C, Xu C. Arbitrary video style transfer via multi-channel correlation. In Proc. the 35th AAAI Conference on Artificial Intelligence, February 2021, pp.1210-1217.

[7] Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez A N, Kaiser L U, Polosukhin I. Attention is all you need. In Proc. the 31st International Conference on Neural Information Processing Systems, December 2017, pp.6000-6010.

[8] Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S, Uszkoreit J, Houlsby N. An image is worth 16x16 words: Transformers for image recognition at scale. In Proc. the 9th International Conference on Learning Representations, May 2021.

[9] Carion N, Massa F, Synnaeve G, Usunier N, Kirillov A, Zagoruyko S. End-to-end object detection with transformers. In Proc. the 16th European Conference on Computer Vision, August 2020, pp.213-229. DOI: 10.1007/978-3-030-58452-8.

[10] Yang F, Yang H, Fu J, Lu H, Guo B. Learning texture transformer network for image super-resolution. In Proc. the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 2020, pp.5790-5799. DOI: 10.1109/CVPR42600.2020.00583.

[11] Lee K, Chang H, Jiang L, Zhang H, Tu Z, Liu C. ViTGAN: Training GANs with vision transformers. arXiv:2107.04589, 2021. https://arxiv.org/abs/2107.04589, Jan. 2022.

[12] Guo M H, Cai J X, Liu Z N, Mu T J, Martin R R, Hu S M. PCT: Point cloud transformer. Computational Visual Media, June 2021, 7(2): 187-199. DOI: 10.1007/s41095-021-0229-5.

[13] Tuli S, Dasgupta I, Grant E, Griffiths T L. Are convolutional neural networks or transformers more like human vision? arXiv:2105.07197, 2021. https://arxiv.org/abs/2105.07197, Jan. 2022.

[14] Naseer M, Ranasinghe K, Khan S, Hayat M, Khan F, Yang M H. Intriguing properties of vision transformers. In Proc. the 35th Conference on Neural Information Processing Systems, December 2021.

[15] Zhou B, Khosla A, Lapedriza A, Oliva A, Torralba A. Learning deep features for discriminative localization. In Proc. the 2016 IEEE Conference on Computer Vision and Pattern Recognition, June 2016, pp.2921-2929. DOI: 10.1109/CVPR.2016.319.

[16] Jing Y, Yang Y, Feng Z, Ye J, Yu Y, Song M. Neural style transfer: A review. IEEE Trans. Visualization and Computer Graphics, 2020, 26(11): 3365-3385. DOI: 10.1109/TVCG.2019.2921336.

[17] Johnson J, Alahi A, Li F F. Perceptual losses for real-time style transfer and super-resolution. In Proc. the 14th European Conference on Computer Vision, Oct. 2016, pp.694-711. DOI: 10.1007/978-3-319-46475-6.

[18] Ulyanov D, Vedaldi A, Lempitsky V. Improved texture networks: Maximizing quality and diversity in feed-forward stylization and texture synthesis. In Proc. the 2017 IEEE Conference on Computer Vision and Pattern Recognition, July 2017, pp.4105-4113. DOI: 10.1109/CVPR.2017.437.

[19] An J, Huang S, Song Y, Dou D, Liu W, Luo J. ArtFlow: Unbiased image style transfer via reversible neural flows. In Proc. the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 2021, pp.862-871. DOI: 10.1109/CVPR46437.2021.00092.

[20] Park D Y, Lee K H. Arbitrary style transfer with style-attentional networks. In Proc. the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 2019, pp.5880-5888. DOI: 10.1109/CVPR.2019.00603.

[21] Li X, Liu S, Kautz J, Yang M H. Learning linear transformations for fast image and video style transfer. In Proc. the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 2019, pp.3809-3817. DOI: 10.1109/CVPR.2019.00393.

[22] Wang Z, Zhao L, Chen H, Qiu L, Mo Q, Lin S, Xing W, Lu D. Diversified arbitrary style transfer via deep feature perturbation. In Proc. the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 2020, pp.7786-7795. DOI: 10.1109/CVPR42600.2020.00781.

[23] Wu X, Hu Z, Sheng L, Xu D. StyleFormer: Real-time arbitrary style transfer via parametric style composition. In Proc. the 2021 IEEE/CVF International Conference on Computer Vision, October 2021, pp.14618-14627. DOI: 10.1109/ICCV48922.2021.01435.

[24] Chen M, Radford A, Child R, Wu J, Jun H, Luan D, Sutskever I. Generative pretraining from pixels. In Proc. the 37th International Conference on Machine Learning, July 2020, pp.1691-1703.

[25] Xu Y, Wei H, Lin M, Deng Y, Sheng K, Zhang M, Tang F, Dong W, Huang F, Xu C. Transformers in computational visual media: A survey. Computational Visual Media, 2022, 8(1): 33-62. DOI: 10.1007/s41095-021-0247-3.

[26] Wang Y, Xu Z, Wang X, Shen C, Cheng B, Shen H, Xia H. End-to-end video instance segmentation with transformers. In Proc. the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 2021, pp.8741-8750. DOI: 10.1109/CVPR46437.2021.00863.

[27] Chen H, Wang Y, Guo T, Xu C, Deng Y, Liu Z, Ma S, Xu C, Xu C, Gao W. Pre-trained image processing transformer. In Proc. the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 2021, pp.12299-12310. DOI: 10.1109/CVPR46437.2021.01212.

[28] Kumar M, Weissenborn D, Kalchbrenner N. Colorization transformer. In Proc. the 9th International Conference on Learning Representations, May 2021.

[29] Liu S, Lin T, He D, Li F, Deng R, Li X, Ding E, Wang H. Paint transformer: Feed forward neural painting with stroke prediction. In Proc. the 2021 IEEE/CVF International Conference on Computer Vision, October 2021, pp.6598-6607. DOI: 10.1109/ICCV48922.2021.00653.

[30] Jiang Y, Chang S, Wang Z. TransGAN: Two pure transformers can make one strong GAN, and that can scale up. In Proc. the 35th Conference on Neural Information Processing Systems, Dec. 2021.

[31] Cordonnier J B, Loukas A, Jaggi M. On the relationship between self-attention and convolutional layers. In Proc. the 8th International Conference on Learning Representations, April 2020.

[32] Xiong R, Yang Y, He D, Zheng K, Zheng S, Xing C, Zhang H, Lan Y, Wang L, Liu T. On layer normalization in the transformer architecture. In Proc. the 37th International Conference on Machine Learning, July 2020, pp.10524-10533.

[33] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. In Proc. the 3rd International Conference on Learning Representations, May 2015.

[34] Dosovitskiy A, Brox T. Generating images with perceptual similarity metrics based on deep networks. In Proc. the 30th International Conference on Neural Information Processing Systems, December 2016, pp.658-666.

[35] Lin T Y, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Zitnick C L. Microsoft COCO: Common objects in context. In Proc. the 13th European Conference on Computer Vision, September 2014, pp.740-755. DOI: 10.1007/978-3-319-10602-1.

[36] Phillips F, Mackintosh B. Wiki Art Gallery, Inc.: A case for critical thinking. Issues in Accounting Education, 2011, 26(3): 593-608. DOI: 10.2308/iace-50038.

[37] Kingma D P, Ba J. Adam: A method for stochastic optimization. In Proc. the 3rd International Conference on Learning Representations, May 2015.

[38] Baker N, Lu H, Erlikhman G, Kellman P J. Deep convolutional networks do not classify based on global object shape. PLoS Computational Biology, 2018, 14(12): Article No. e1006613. DOI: 10.1371/journal.pcbi.1006613.

[39] He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In Proc. the 2016 IEEE Conference on Computer Vision and Pattern Recognition, June 2016, pp.770-778. DOI: 10.1109/CVPR.2016.90.

[40] Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M, Berg A C, Li F F. ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 2015, 115(3): 211-252. DOI: 10.1007/s11263-015-0816-y.

[41] Geirhos R, Rubisch P, Michaelis C, Bethge M, Wichmann F A, Brendel W. ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. In Proc. the 7th International Conference on Learning Representations, May 2019.

[42] Touvron H, Cord M, Douze M, Massa F, Sablayrolles A, Jégou H. Training data-efficient image transformers & distillation through attention. In Proc. the 38th International Conference on Machine Learning, July 2021, pp.10347-10357.
