Extending video clips with AI on a 12 GB GPU — six models compared
You have a short machine/action clip and want it longer. "Extending" a clip is really image-to-video: take the last frame and generate a plausible continuation from it. I fed the same shot to six models, including Wan2.2-TI2V-5B, Wan2.2-I2V-A14B (GGUF), Stable Video Diffusion, LTX-Video and CogVideoX-5B, all on a single 12 GB RTX 4070 Ti. The interesting part isn't which one looks best; it's the memory-offload tricks that make a 14B video model (and 720p output) fit in 12 GB at all.
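To make the "take the last frame and continue" idea concrete, here is a minimal sketch in plain Python. It is not tied to any of the models above: `generate` is a hypothetical callable standing in for whichever image-to-video pipeline you use, and the sketch assumes that pipeline returns the conditioning image as its first output frame (many do, but check yours).

```python
from typing import Callable, List, TypeVar

Frame = TypeVar("Frame")

def extend_clip(
    frames: List[Frame],
    generate: Callable[[Frame, int], List[Frame]],
    new_frames: int,
) -> List[Frame]:
    """Return `frames` followed by `new_frames` generated frames.

    `generate(image, n)` is a hypothetical image-to-video call that
    returns n frames. Assumption: its first frame reproduces `image`.
    """
    if not frames:
        raise ValueError("need at least one frame to condition on")
    last = frames[-1]
    # Ask for one extra frame, because the first one duplicates `last`.
    generated = generate(last, new_frames + 1)
    # Drop the duplicated conditioning frame before appending.
    return list(frames) + list(generated[1:])

def fake_generate(image: int, n: int) -> List[int]:
    """Stand-in generator: 'frames' are integers counting up from `image`."""
    return [image + i for i in range(n)]

clip = [0, 1, 2]
extended = extend_clip(clip, fake_generate, 3)
```

With a real model you would pass a function that wraps the pipeline call and returns its frames. Only the last frame carries over, so any motion information in the earlier frames is lost to the generator.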