Stability.AI has expanded its family of foundation models with the recent unveiling of its new AI video generator, Stable Video Diffusion.
This is a logical step in the firm’s evolution, as video generation is the next immediate target for much of AI development and its applications in multimedia content creation.
The new tool uses Stable Diffusion’s technology to turn images into short videos with very good results, and it is just the beginning of multiple upcoming applications in AI video. However, it seems to carry some of the same risks as its older sister, Stable Diffusion, in terms of potential harmful use and copyright infringement.
Stability.AI's latest development, Stable Video Diffusion, was announced recently, although it is currently in research preview and not yet available for commercial use.
It is a foundation model for generative AI video, based on the firm’s flagship text-to-image model, Stable Diffusion, and it converts still images into short video clips.
Stable Video Diffusion consists of two image-to-video models, SVD and SVD-XT, that generate 14 and 25 frames, respectively, at customizable frame rates between 3 and 30 frames per second. They enable creating videos of roughly 2 to 5 seconds from a reference still picture.
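To see how frame count and frame rate translate into clip length, here is a minimal Python sketch. It assumes only that a clip plays back at a constant frame rate, so its duration is the number of frames divided by the frames per second. The function names are hypothetical and are not part of any Stability.AI tooling. The example uses the 25-frame output of SVD-XT.

```python
def clip_duration_seconds(num_frames: int, fps: float) -> float:
    """Return the playback length of a clip shown at a constant frame rate."""
    if num_frames <= 0 or fps <= 0:
        raise ValueError("num_frames and fps must be positive")
    return num_frames / fps


def fps_for_duration(num_frames: int, seconds: float) -> float:
    """Return the frame rate needed for a clip to last the given time."""
    if num_frames <= 0 or seconds <= 0:
        raise ValueError("num_frames and seconds must be positive")
    return num_frames / seconds


# Illustration: a 25-frame clip at several frame rates in the 3-30 fps range.
for fps in (3, 5, 12.5, 30):
    duration = clip_duration_seconds(25, fps)
    print(f"25 frames at {fps} fps lasts {duration:.2f} s")
```

Lower frame rates stretch the same frames over a longer clip at the cost of smoothness, which is why a fixed frame count can cover a range of durations.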
The models are said to be trained initially with datasets of millions of images and videos and in a second stage with a curated selection of up to one million files.
You can access the new AI video generative models on the company’s GitHub repository.
The firm states that its latest video generative model has multiple applications in media, entertainment, education, and marketing.
One capability you can already explore is generating multi-view videos from a single reference image (think of 360° videos).
The company also has several applications in the pipeline, intending to build an ecosystem as it did with its image generation model. One upcoming feature is the much-desired text-to-video generation, which already has a waitlist open.
As for current uses, Stable Video Diffusion delivers high-quality results, comparable to or even better than those of earlier-released tools. At this time, however, the model cannot correctly synthesize human faces or render legible text, among other limitations.
As the model is in research preview, to use it you must agree to the company’s terms of use, which establish that the technology is intended for educational and artistic purposes and grant you a non-commercial license.
Their agreement also mentions disapproved uses, such as “factual or true representations of people or events.” But, as pointed out by The Verge, the software doesn’t seem to have built-in safeguards to prevent that and other potentially malicious uses.
Furthermore, Stability.AI did not disclose the origin of the data used in the model’s training. There is reason to suspect that the datasets are at least partially made up of content scraped from the web, as was the case with the company’s image model. So, there is no certainty about the copyright status of the models themselves or of any content generated through them.
It’s worth noting that both of these concerns stem from experience. Stable Diffusion circulated widely on the dark web and was used for more than one questionable, if not openly malicious, purpose, such as misinformation through deepfake imagery. And companies as large as Getty Images are currently suing Stability.AI for using their intellectual property in its datasets without authorization.
Without taking away from the new model’s potential for creative workflows and commercial endeavors, these are certainly aspects to consider.
Do you plan on trying the new Stable Video Diffusion? How would you use it in your workflow in the future? Share your thoughts!
THE AUTHOR
Ivanna Attie
I am an experienced author with expertise in digital communication, stock media, design, and creative tools. I have closely followed and reported on AI developments in this field since its early days. I have gained valuable industry insight through my work with leading digital media professionals since 2014.