
Ever dreamed of creating your own AI talking avatar that actually speaks, sings, and reacts? With this easy-to-follow ComfyUI workflow, you’ll learn how to go from a static image to a fully lip-synced, voice-cloned avatar. No need for paid cloud tools—we’re going local and open-source!
Whether you're making YouTube Shorts, building virtual influencers, or telling creative stories—this guide will walk you through every step.

- Use FramePack to animate avatars frame by frame
- Integrate the Flux ACE++ LoRA for expressive face styles
- Use LatentSync 1.5 for accurate lip synchronization
- Apply F5TTS for realistic voice cloning
- Enhance your output with upscaling and frame blending
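
The order these tools run in matters: the still image must exist before animation, and lip sync needs both the animated frames and the audio. As a quick orientation, here is the data flow sketched in Python; the stage names are descriptive labels of my own, not actual ComfyUI node identifiers.

```python
# Illustrative pipeline overview. Stage names are descriptive labels,
# not real ComfyUI node names.
PIPELINE = [
    ("generate_image", "Flux + ACE++ LoRA: still portrait"),
    ("animate", "FramePack F1: frame-by-frame motion"),
    ("voice", "F5TTS: cloned speech audio (optional)"),
    ("lip_sync", "LatentSync 1.5: align mouth to audio"),
    ("export", "VideoCombine / FlowFrames: upscale + 60 FPS"),
]

def describe(pipeline):
    """Return a one-line summary of the stage order."""
    return " -> ".join(name for name, _ in pipeline)
```

Calling `describe(PIPELINE)` gives `generate_image -> animate -> voice -> lip_sync -> export`, which is the same order the steps below follow.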



1. Generate the Base Character Image: Use your preferred model in ComfyUI (e.g., Flux ACE++) to create a clean, frontal face with a neutral expression.
2. Load the FramePack F1 Node: Install FramePack F1 to animate the static image using motion vectors. You can adjust "expression frames" for subtle emotion control.
3. Add Lip Sync Using LatentSync 1.5: Feed in your audio (TTS or recorded). LatentSync aligns mouth shapes with the waveform for realistic speech matching.
4. Use F5TTS for Voice Cloning (Optional): Clone voices using text-to-speech models like Bark or F5TTS. Paste your script, then export the audio to sync with LatentSync.
5. Export Video & Upscale: Once you're happy with the animation, use VideoCombine or FlowFrames to upscale and interpolate the final result to 60 FPS.
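
If you'd rather script the last step than use FlowFrames' GUI, plain ffmpeg can handle both jobs at once: muxing the cloned voice onto the silent animation and motion-interpolating to 60 FPS via its `minterpolate` filter. This is a sketch, not part of the ComfyUI workflow itself, and the filenames (`avatar.mp4`, `voice.wav`) are placeholders; it assumes ffmpeg is installed and on your PATH.

```python
import subprocess

def build_export_cmd(video="avatar.mp4", audio="voice.wav",
                     out="avatar_60fps.mp4", fps=60):
    """Build an ffmpeg command that muxes audio onto the rendered
    animation and interpolates the video to the target frame rate.
    Filenames are placeholders for your own outputs."""
    return [
        "ffmpeg",
        "-i", video,                              # silent animation from VideoCombine
        "-i", audio,                              # F5TTS / recorded audio track
        "-filter:v", f"minterpolate=fps={fps}",   # motion interpolation to 60 FPS
        "-c:a", "aac",                            # encode the audio stream
        "-shortest",                              # stop at the shorter stream
        out,
    ]

# Uncomment to actually run the export:
# subprocess.run(build_export_cmd(), check=True)
```

Note that `minterpolate` is slow on long clips; for a 10+ second video, expect the interpolation pass to take noticeably longer than the mux alone.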


- Content Creators: wanting to build custom AI avatars for video content.
- ComfyUI Users: curious about multi-model animation pipelines.
- Developers: exploring local voice cloning and syncing tools.
- AI Experimenters: looking for local, cost-effective solutions.

This tutorial bridges the gap between static image generation and dynamic, realistic AI avatars. You can now create 10+ second talking or singing videos—entirely offline, with stunning results. No API keys. No hidden fees.
Perfect for prototyping before using paid services—or going full local for creative independence.

Reply below or join the discussion to share your avatar results and ask for help. Let’s keep pushing the boundaries of local AI together!