October 14, 2022

Make-A-Video: Text-to-Video Generation’s Next… Generation?

Adrian Pennington
The text prompt for this Make-A-Video was "a confused grizzly bear in calculus class" (and that looks about right).

READ MORE: Introducing Make-A-Video: An AI system that generates videos from text (Meta)

The inevitable has happened, albeit a little sooner than expected. After all the hoopla surrounding text-to-image AI generators in recent months, Meta is first out of the gate with a text-to-video version.

Perhaps Meta wanted to stake an early claim to headlines in this space, because the results aren't ready for primetime.

But as developments in text-to-image generation have shown, by the time you read this the technology will already have advanced.

Meta is only giving the public a glimpse of the tech it calls Make-A-Video. It's still in research, with no hint of a commercial release.



ALSO ON NAB AMPLIFY:


What Will DALL-E Mean for the Future of Creativity?



“Generative AI research is pushing creative expression forward by giving people tools to quickly and easily create new content,” Meta stated in a blog post announcing the new AI tool. “With just a few words or lines of text, Make-A-Video can bring imagination to life and create one-of-a-kind videos full of vivid colors and landscapes.”

In a Facebook post, Meta CEO Mark Zuckerberg described the work as “amazing progress,” adding, “It’s much harder to generate video than photos because beyond correctly generating each pixel, the system also has to predict how they’ll change over time.”

Examples on Make-A-Video’s announcement page include “a young couple walking in heavy rain” and “a teddy bear painting a portrait.” It also showcases Make-A-Video’s ability to take a static source image and animate it. For example, a still photo of a sea turtle, once processed through the AI model, can appear to be swimming.

The Make-A-Video prompt here was: "A teddy bear painting a portrait." Video courtesy of Meta

The key technology behind Make-A-Video, and the reason it has arrived sooner than some experts anticipated, is that it builds on existing text-to-image synthesis work of the kind used in image generators like OpenAI's DALL-E. Meta announced its own text-to-image AI model in July.



ALSO ON NAB AMPLIFY:

AI Can Produce Visuals We Can’t Even Imagine, So Maybe We Should Just Enjoy It



According to Benj Edwards at Ars Technica, instead of training the Make-A-Video model on labeled video data (for example, captioned descriptions of the actions depicted), Meta took image synthesis data (still images trained with captions) and applied unlabeled video training data, so the model learns a sense of where a text or image prompt might exist in time and space. It can then predict what comes after the image and display the scene in motion for a short period.

READ MORE: Meta announces Make-A-Video, which generates video from text (Ars Technica)
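
To make that division of labor concrete, here is a minimal PyTorch sketch of the idea Ars Technica describes, not Meta's actual code: a caption-trained image model supplies per-frame appearance, while a temporal module that could be trained on unlabeled video alone supplies motion. The names TextToImageBackbone, TemporalLayer and denoise_step are hypothetical.

```python
# Minimal sketch (not Meta's code): spatial appearance from a caption-trained
# image model, temporal dynamics from a module trainable on unlabeled video.
import torch
import torch.nn as nn

class TextToImageBackbone(nn.Module):
    """Stands in for a pretrained text-to-image model; operates per frame."""
    def __init__(self):
        super().__init__()
        self.spatial = nn.Conv2d(3, 3, kernel_size=3, padding=1)

    def forward(self, frames):  # frames: (B*T, 3, H, W)
        return self.spatial(frames)

class TemporalLayer(nn.Module):
    """Convolution across the time axis: the part learnable from unlabeled video."""
    def __init__(self):
        super().__init__()
        self.temporal = nn.Conv3d(3, 3, kernel_size=(3, 1, 1), padding=(1, 0, 0))

    def forward(self, video):  # video: (B, 3, T, H, W)
        return self.temporal(video)

def denoise_step(video, backbone, temporal):
    """One pass: per-frame spatial processing, then temporal mixing."""
    b, c, t, h, w = video.shape
    # Fold time into the batch so each frame is treated as an independent image.
    x = backbone(video.permute(0, 2, 1, 3, 4).reshape(b * t, c, h, w))
    # Unfold and mix across frames so the model can represent change over time.
    x = x.reshape(b, t, c, h, w).permute(0, 2, 1, 3, 4)
    return temporal(x)

video = torch.randn(1, 3, 16, 64, 64)  # 16 frames at 64x64, as reported
out = denoise_step(video, TextToImageBackbone(), TemporalLayer())
print(out.shape)  # torch.Size([1, 3, 16, 64, 64])
```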

In Meta’s white paper, “Make-A-Video: Text-To-Video Generation Without Text-Video Data,” the researchers note that Make-A-Video was trained on pairs of images and captions as well as unlabeled video footage. Training content was sourced from two datasets which, together, contain millions of videos spanning hundreds of thousands of hours of footage. That includes stock footage from sites like Shutterstock scraped from the web.

The Verge’s James Vincent shares other examples, but notes that they were all provided by Meta. “That means the clips could have been cherry-picked to show the system in its best light,” he says. “The videos are clearly artificial, with blurred subjects and distorted animation, but still represent a significant development in the field of AI content generation.”

The clips are no longer than five seconds (16 frames of video) at a resolution of 64 by 64 pixels, which a separate AI model then upscales to 768 by 768. They contain no audio but span a huge range of prompts.

READ MORE: Meta’s new text-to-video AI generator is like DALL-E for video (The Verge)
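
Those numbers imply a simple two-stage output pipeline. The sketch below, assuming PyTorch and using plain bicubic interpolation as a stand-in for the separate upscaling model the reports describe, shows the shapes involved; upscale_clip is a hypothetical helper, not part of any released API.

```python
# Hedged sketch of the reported pipeline: a base model emits 16 frames at
# 64x64, then a second model boosts them to 768x768 (interpolation stands in
# for that learned upscaler here).
import torch
import torch.nn.functional as F

base_clip = torch.rand(16, 3, 64, 64)  # 16 RGB frames at 64x64

def upscale_clip(frames, size=768):
    # A real system would run a learned super-resolution network per frame.
    return F.interpolate(frames, size=(size, size), mode="bicubic",
                         align_corners=False)

hires_clip = upscale_clip(base_clip)
print(hires_clip.shape)  # torch.Size([16, 3, 768, 768])
```

Sixteen frames spread over five seconds also works out to barely more than three frames per second, well below the 24 fps of conventional video.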

The researchers note that the model has many technical limitations beyond blurry footage and disjointed animation. For example, their training methods are unable to learn information that might only be inferred by a human watching a video — e.g., whether a video of a waving hand is going left to right or right to left. Other problems include generating videos longer than five seconds, videos with multiple scenes and events, and higher resolution.

The researchers are also aware that they are walking into a minefield of controversy. They acknowledge that Make-A-Video has “learnt and likely exaggerated social biases, including harmful ones,” and every video the system generates carries a watermark to “help ensure viewers know the video was generated with AI and is not a captured video.”

READ MORE: Make-A-Video: Text-To-Video Generation Without Text-Video Data (Meta)
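
The paper doesn’t say how that watermark is applied. As a purely illustrative sketch of the disclosure idea, here is a minimal NumPy example that stamps a visible patch on every generated frame; watermark_frames is a hypothetical helper, not Meta’s implementation.

```python
# Illustrative only: Meta's paper does not describe its watermarking method.
import numpy as np

def watermark_frames(frames, size=8, value=1.0):
    """frames: (T, H, W, 3) float array; stamps a visible corner patch per frame."""
    marked = frames.copy()
    marked[:, -size:, -size:, :] = value  # solid patch in the bottom-right corner
    return marked

clip = np.random.rand(16, 64, 64, 3).astype(np.float32)
marked = watermark_frames(clip)  # every frame now carries the marker
```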



ALSO ON NAB AMPLIFY:


Recognizing Ourselves in AI-Generated Art



Cracking the code to create photorealistic video on demand — and then drive it with a narrative — is exercising other minds too.

Chinese researchers are behind another text-to-video model, CogVideo; OpenAI is also thought to be working on one; and no doubt numerous other initiatives are in the works.


EXPLORING ARTIFICIAL INTELLIGENCE:

With nearly half of all media and media tech companies incorporating Artificial Intelligence into their operations or product lines, AI and machine learning tools are rapidly transforming content creation, delivery and consumption. Find out what you need to know with these essential insights curated from the NAB Amplify archives:
  • This Will Be Your 2032: Quantum Sensors, AI With Feeling, and Life Beyond Glass
  • Learn How Data, AI and Automation Will Shape Your Future
  • Where Are We With AI and ML in M&E?
  • How Creativity and Data Are a Match Made in Hollywood/Heaven
  • How to Process the Difference Between AI and Machine Learning

