I think it would be great if we had a text + video-to-video model that could apply VFX while preserving the human element, like the actor’s performance.