Given a few style examples of Amedeo portraiture (top left), our deformable few-shot face cartoonization model can translate input images with high stylistic quality and high identity fidelity, and, more importantly, with the desired shape deformation (e.g., face elongation). Existing few-shot learning approaches, such as [1] (denoted as psp+RSSA), MTG [2], and JoJoGAN [3], either fall into overfitting or fail to reproduce the characteristic deformation present in the examples. Additional target styles we experimented with are shown on the right; each was trained with only 10 style examples.
Cartoonizing portrait images is a stylish and eye-catching application in both computer vision and graphics. We aim to train a face cartoonization model from very few (e.g., 5~10) style examples. The main difficulty in this challenging task lies in producing stylizations of high quality while preserving the identity of the input, particularly when the style examples contain strong exaggerations. To address this, we propose a novel cross-domain center loss for few-shot generative adversarial network (GAN) adaptation, which forces the distribution of the target domain to stay close to that of the source. We then combine it with a two-stage strategy to solve this few-shot problem. Stage I generates an intermediate cartoonization of the input: we first stylize the individual facial components locally and then deform them to mimic the desired exaggeration under the guidance of landmarks. Stage II focuses on global refinement of the intermediate image. We first adapt a pretrained StyleGAN model to the target domain defined by the few examples using the proposed cross-domain center loss; the intermediate cartoonization from Stage I is then holistically refined through GAN inversion. The generative power of StyleGAN guarantees high image quality, while the local translation and landmark-guided deformation applied to facial components provide high identity fidelity. Experiments show that the proposed method outperforms state-of-the-art few-shot stylization approaches both qualitatively and quantitatively.
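The abstract only names the cross-domain center loss, describing it as forcing the target-domain distribution to stay close to the source. The sketch below is a minimal PyTorch-style illustration of one way such a loss could be written; it is not the authors' code. All names (`feat_extractor`, `g_source`, `g_target`) are hypothetical placeholders, and the batch-center formulation is an assumption about what a "center" loss of this kind might look like.

```python
# Minimal sketch (assumptions, not the paper's implementation) of a
# cross-domain center loss for few-shot GAN adaptation.
import torch
import torch.nn.functional as F

def cross_domain_center_loss(feat_extractor, g_source, g_target, latents):
    """Pull the feature center of adapted (target-domain) samples toward the
    center of the corresponding source-domain samples, so the adapted
    generator does not drift far from the source distribution even though
    only a few style examples define the target domain.

    feat_extractor: frozen feature network returning (B, D) features
    g_source:       frozen pretrained StyleGAN generator (source domain)
    g_target:       copy of the generator being adapted to the style examples
    latents:        shared latent codes fed to both generators
    """
    with torch.no_grad():
        src_feats = feat_extractor(g_source(latents))  # source-domain features
    tgt_feats = feat_extractor(g_target(latents))      # adapted-domain features

    # Compare batch-wise feature centers rather than per-image features;
    # matching centers is what makes this a "center" loss.
    src_center = src_feats.mean(dim=0)
    tgt_center = tgt_feats.mean(dim=0)
    return F.mse_loss(tgt_center, src_center)
```

In such a setup the loss would typically be added, with some weight, to the usual adversarial adaptation objective on the few style examples; the exact weighting and feature space used in the paper are not specified here.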
Qualitative comparison with state-of-the-art few-shot stylization methods on realistic styles (Sty#1-#4). FewAda and RSSA seemingly fall into overfitting, showing limited diversity across the styles. MTG and JoJoGAN faithfully preserve the input identity but lose some style features exhibited in the examples. Our results are visually similar to those of MTG and JoJoGAN, with finer stylistic details visible on close inspection. Best viewed zoomed-in.
Qualitative comparison with state-of-the-art few-shot stylization methods on exaggerated styles (Sty#5-#8). Prominent characteristic features are shown in the examples, e.g., elongation of faces in Amedeo (Sty#5), widening of eyes in Toonify (Sty#6), rounding of facial components in Female Cartoon (Sty#7), and over-heavy contours in Fernand Leger (Sty#8). Again, FewAda and RSSA lose the identity of the input. MTG and JoJoGAN faithfully preserve the input identity but miss the characteristic style features mentioned above. Our method, in contrast, reproduces these stylistic features plausibly while preserving the identity as much as possible. Best viewed zoomed-in.
Yang Zhou, Shengshu Li, Hui Huang. Deformable Few-shot Face Cartoonization via Local to Global Translation. Computational Visual Media (Springer), 2023.
Acknowledgements: This template was originally made by Phillip Isola and Richard Zhang for a colorful ECCV project; the code can be found here.