WAN2.2 VACE Native Workflow with ComfyUI 　「WAN2.2 VACEをNativeで使用する」

Date: 2025.9.18

in English

A VACE model for WAN2.2 has been released, but I could only find Kijai wrapper workflows and no native version, so I decided to put one together.

Since I simply adapted the WAN2.1 VACE workflow to WAN2.2, I'm not sure this is the ideal setup, but it does work.

The workflow is distributed and explained below.

1. Basic Form of the WAN2.2 VACE Native Workflow

[Download Link] WAN22_VACE_Basic_Native_workflow.json

The image above shows the basic form of the WAN2.2 VACE Native workflow. Because the model is loaded as GGUF, the text encoder is a GGUF version as well. If you want to use the regular (non-GGUF) versions, switch the loaders.

The structure is almost the same as the WAN2.2 Native workflow explained in a previous article. If you are unfamiliar with WAN2.2 Native, please refer to the following article.

WAN2.2 Native workflow, 2 or 3 Ksampler with ComfyUI "WAN2.2 Native workflow explanation" | BLOG MITTIMI

There are three major differences between the normal WAN2.2 workflow and the VACE workflow.

1. Model

The model loaded is a dedicated WAN2.2 VACE model. The one used here is QuantStack's Q8_0 GGUF model.

QuantStack/Wan2.2-VACE-Fun-A14B-GGUF at main

In the Kijai workflow, VACE is used alongside the WAN2.2 base model, much like a LoRA. In the Native version, the base model and VACE are integrated into a single model.

2. WanVaceToVideo node

Adding the VACE-specific WanVaceToVideo node lets you control generation with images and videos.

3. TrimVideoLatent node

When you generate a video with VACE, noise appears in the first few frames. This node removes those leading frames.

Beyond that, feel free to remove the lightweight LoRA, NAG, sampler, and so on, or to change their values as you like.
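To make the TrimVideoLatent step above more concrete, here is a minimal Python sketch of what "removing the first frames" means. It is not the node's source code: the function names are hypothetical, the latent is modeled as a plain list of frames, and the frame-count formula assumes WAN's VAE compresses time by a factor of 4. How many frames to trim depends on your workflow.

```python
def latent_frame_count(num_frames: int) -> int:
    """Latent frames for a clip, ASSUMING 4x temporal compression."""
    return (num_frames - 1) // 4 + 1


def trim_leading_frames(latent_frames: list, trim_amount: int) -> list:
    """Return a copy of the latent without its first `trim_amount` frames."""
    if trim_amount < 0:
        raise ValueError("trim_amount must be non-negative")
    return latent_frames[trim_amount:]


# Hypothetical example: an 81-frame clip, with one leading latent frame trimmed.
frames = [f"latent_{i}" for i in range(latent_frame_count(81))]
trimmed = trim_leading_frames(frames, 1)
print(len(frames), len(trimmed), trimmed[0])
```

The point is only that trimming happens on the latent, before decoding, so the noisy leading frames never reach the final video.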

2. Usage examples

Now let's actually try it out.

We'll have the image on the left perform the same movements as the video on the right. The workflow link is also provided below.

[Download Link] WAN22_VACE_ImageReplaceCanny_workflow.json

In this workflow, Canny roughly extracts the movement in the video as line art, which is connected to the control_video input of the WanVaceToVideo node. As a result, the image connected to reference_image moves in the same way.
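As a rough picture of what this preprocessing produces, the sketch below turns each grayscale frame into a black-and-white edge frame. It is a simplified stand-in, not real Canny: it only thresholds the gradient magnitude and skips Gaussian smoothing, non-maximum suppression, and hysteresis. The names edge_map and control_frames and the threshold value are hypothetical and not part of ComfyUI.

```python
def edge_map(gray, threshold):
    """Mark pixels whose gradient magnitude exceeds `threshold`.

    gray: 2D list of ints in 0..255. Returns a 2D list of 0 or 255.
    Border pixels are left at 0 because they lack a full neighborhood.
    """
    height, width = len(gray), len(gray[0])
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gray[y][x + 1] - gray[y][x - 1]  # horizontal difference
            gy = gray[y + 1][x] - gray[y - 1][x]  # vertical difference
            if (gx * gx + gy * gy) ** 0.5 > threshold:
                out[y][x] = 255
    return out


def control_frames(video, threshold=50):
    """Apply the edge map to every frame, giving a line-art control video."""
    return [edge_map(frame, threshold) for frame in video]


# Hypothetical 5x5 frame: dark on the left, bright on the right.
frame = [[0, 0, 200, 200, 200] for _ in range(5)]
edges = control_frames([frame])[0]
print(edges[2])
```

Only the outline of the motion survives, which is why the appearance can come from the reference image instead.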

Here are the results.

This workflow is based on the official ComfyUI workflow for WAN2.1 VACE. The structure is almost the same, so you can easily build a WAN2.2 workflow by comparing the two and swapping in the corresponding parts. Doing so also helps you understand what VACE can do.

ComfyUI Wan2.1 VACE Video Examples - ComfyUI

Next is an example that creates a video from two inputs: the face image from earlier and a background image.

[Download Link] WAN22_VACE_Combine_workflow.json

As you can see from the workflow, multiple images are input and combined into a single image, and that combined image is used as the reference_image to generate the video.
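The combining step can be sketched as follows. This is only an illustration under assumptions: the layout is not specified here, so the side-by-side arrangement, the white padding, and the function name concat_side_by_side are all hypothetical, and images are modeled as 2D lists of pixel values.

```python
def concat_side_by_side(left, right, fill=255):
    """Join two images horizontally, padding the shorter one at the bottom."""
    height = max(len(left), len(right))

    def pad(img):
        width = len(img[0])
        padding = [[fill] * width for _ in range(height - len(img))]
        return [row[:] for row in img] + padding

    return [l_row + r_row for l_row, r_row in zip(pad(left), pad(right))]


# Hypothetical tiny "images": a 2x2 face and a 1x3 background.
face = [[1, 1], [1, 1]]
background = [[2, 2, 2]]
combined = concat_side_by_side(face, background)
print(combined)
```

Whatever the layout, the model receives one reference image and has to work out by itself which part is the subject and which is the background.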

Here is the result, though. The video did not turn out very well.

Another approach is to first combine two images into one using Qwen Image Edit, then convert it into a video using WAN2.2's I2V (not VACE), as shown below.

As you can see, just because VACE can do almost anything doesn't mean you have to do everything with it. I think it's best to choose the tool on a case-by-case basis.

日本語解説（in Japanese）

WAN2.2用のVACEモデルが登場しましたが、Kijai wrapperのワークフローばかりでNative版のフローを見つけられなかったため組んでみました。

WAN2.1用VACEのワークフローをWAN2.2に置き換えた形なので、これが完ぺきな正解なのかは分かりません。とりあえず動いてはいます。

以下でワークフローの配布と解説を行っています。

１．WAN2.2 VACE Native Workflow の基本型

[Download Link] WAN22_VACE_Basic_Native_workflow.json

上の画像がWAN2.2 VACE Nativeワークフローの基本的な形になります。モデルはGGUFを使用しているのでテキストエンコーダーもGGUFのものを使っています。通常版を使用する場合はローダーを切り替えてください。

構造は以前記事で解説したWAN2.2 Nativeのワークフローとほとんど変わりません。WAN2.2 Nativeについて分からない場合は以下の記事を参照ください。

WAN2.2 Native workflow, 2 or 3 Ksampler with ComfyUI 　「WAN2.2 Nativeワークフローの解説」｜BLOG MITTIMI

通常のWAN2.2ワークフローとVACEワークフローの大きな相違点は３つ。

１．モデル

読み込むモデルがWAN2.2 VACE専用のモデルとなっています。今回使用したのはQuantStack版のQ8_0_GGUFモデルです。

Kijai版ワークフローでVACEを扱う場合は、LoRAのようにWAN2.2ベースモデルと併用する形なのですが、Native版ではベースモデルとVACEの一体型モデルになります。

２．WanVaceToVideoノード

VACE特有のWanVaceToVideoノードを導入することにより、画像や動画をコントロールできるようになります。

３．TrimVideoLatentノード

VACEで動画を生成すると開始数フレームにノイズが乗ります。このノードは開始フレームを削除する機能を持っています。

その他、軽量化LoRAやNAG、Sampler等はお好みで外したり数値を変えたりしてみてください。

２．使用例

それでは実際に動かしてみます。

右の動画と同じ動きを、左の画像にしてもらいます。ワークフローのリンクも下に貼っています。

[Download Link] WAN22_VACE_ImageReplaceCanny_workflow.json

このワークフローでは動画の動きをCannyでざっくり線画抽出し、WanVaceToVideoノードのcontrol_videoへ繋いでいます。こうすることで、reference_imageに繋いだ画像が、同様の動きをする仕組みとなっています。

結果はこちらです。

このワークフローはComfyUI公式のWAN2.1 VACE用のものを参考にしています。構造はほとんど同じなので、フローを見比べながら差し替えていけば簡単にWAN2.2用のワークフローが作れます。VACEで何ができるかも理解することができます。

次は、先ほどの顔画像と背景画像の２つを入力して動画を作成する例です。

[Download Link] WAN22_VACE_Combine_workflow.json

ワークフローをみるとわかる通り、複数の画像を入力しそれを１枚の画像として結合。そのデータをreference_imageとして動画生成を行います。

しかし結果はこちら。あまり良い感じの動画にはなりませんでした。

別のアプローチとして、まず２枚の画像をQwen Image Editで１枚の画像にし、WAN2.2のI2V（VACEではない）で動画化したものが以下です。

このように、VACEでなんでも出来るからと言ってすべてを行う必要はなく、ケースバイケースで使い分けていけば良いと思います。
