Chinese AI startup Shengshu launches image-to-video tool, rivaling Sora

BEIJING — Beijing-based Shengshu Technology on Wednesday said that its artificial intelligence-powered text-to-video tool Vidu will now be able to generate videos by combining multiple images.

Vidu already allows users worldwide to create 8-second clips based on written prompts. While OpenAI, the maker of ChatGPT, revealed in February that its AI model Sora could generate one-minute videos from text, it has yet to release the model publicly.

Vidu’s new AI feature can combine three pictures — such as a shirt, person and moped — into a video of the person wearing the shirt and driving the moped through a scene, Shengshu said.

Other platforms claim they can turn text or images into videos using AI, but the quality of output varies. Shengshu's claimed breakthrough is the ability to take three distinct images and integrate them with visual consistency into a single AI-generated video.

“Very early on we pinpointed [visual consistency] as the problem, and wanted to solve it well,” Fan Bao, chief …