Unlocking the Power of Visual Storytelling: Transforming Videos through Text Editing
The Future of Video Editing: Rewrite Videos By Editing Text
Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér. The last few years have been an amazing ride when it comes to research works for creating facial reenactments for real characters. Beyond just transferring our gestures to video footage of an existing talking head, controlling their gestures like video game characters and full-body movement transfer are also a possibility. With WaveNet and its many variants, we can also learn someone’s way of speaking, write a piece of text and make an audio waveform where we can impersonate them using their own voice.
So, what else is there to do in this domain? Are we done? No-no, not at all! Hold on to your papers, because with this amazing new technique, what we can do is look at the transcript of a talking head video and remove parts of it or add to it, just as we would edit any piece of text, and this technique produces both the audio and a matching video of this person uttering these words. Check this out.
How It Works
It works by looking through the video and collecting small sounds that can be used to piece together the new word that we’ve added to the transcript. The authors demonstrate this by adding the word “fox” to the transcript. This can be pieced together from the “v” which appears in the word “viper”, and by taking “ox” as a part of another word found in the footage. As a result, one can make the character say “fox” even though she never utters this word in the footage. Then, we can look not only for the audio occurrences of these sounds, but also for the video footage of how they are being said, and in the paper, a technique is proposed to blend these video assets together.
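To make the "piecing together" step a bit more concrete, here is a minimal sketch of how such a search could look. It is an illustration under stated assumptions, not the method from the paper: the function `find_snippets`, the `Snippet` record, the tiny `VISEME` table (treating "f" and "v" as interchangeable because they look alike on the lips) and the source word "box" are all hypothetical, and the phoneme spellings are ARPAbet-style symbols chosen for this example. The sketch greedily covers the new word with the longest runs of sounds it can find in the existing footage.

```python
from dataclasses import dataclass

# Assumption for this sketch: sounds that look alike on the lips may stand in
# for each other. This two-entry table is hypothetical, not the paper's.
VISEME = {"F": "FV", "V": "FV"}


def viseme(phoneme: str) -> str:
    """Map a phoneme to its (hypothetical) look-alike group."""
    return VISEME.get(phoneme, phoneme)


@dataclass(frozen=True)
class Snippet:
    word: str    # word in the footage that the snippet is taken from
    start: int   # index of the first phoneme used within that word
    length: int  # number of consecutive phonemes used


def find_snippets(target, footage):
    """Greedily cover `target` (a list of phonemes) with the longest
    matching runs found in `footage` (a dict: word -> list of phonemes)."""
    result = []
    pos = 0
    while pos < len(target):
        best = None
        for word, phones in footage.items():
            for start in range(len(phones)):
                # Count how many consecutive sounds match from this start.
                n = 0
                while (pos + n < len(target)
                       and start + n < len(phones)
                       and viseme(target[pos + n]) == viseme(phones[start + n])):
                    n += 1
                if n > 0 and (best is None or n > best.length):
                    best = Snippet(word, start, n)
        if best is None:
            raise ValueError(f"no footage for phoneme {target[pos]!r}")
        result.append(best)
        pos += best.length
    return result


# Words assumed to occur in the footage; "box" is a made-up stand-in for
# the unnamed word that contains "ox".
footage = {
    "viper": ["V", "AY", "P", "ER"],
    "box": ["B", "AA", "K", "S"],
}

# "fox" is not in the footage, but it can be assembled from pieces of it.
snippets = find_snippets(["F", "AA", "K", "S"], footage)
for s in snippets:
    print(s)
```

With this toy footage, the search takes the first sound of "viper" and the last three sounds of "box". A real system would also carry timestamps for each sound so the matching audio and video frames can be cut out and blended, which this sketch leaves out.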
Finally, we can provide all this information to a neural renderer that synthesizes a smooth video of this talking head.
User Study Results
A user study showed that the edited videos were often confused with the real ones, indicating the effectiveness of this new technique. The ability to edit video transcripts opens up new possibilities for digital storytelling.
The bar is getting lower: these kinds of videos are becoming easier to produce and harder to distinguish from real footage. Ethical considerations also matter when using these techniques.
Thank you for watching and stay tuned for more exciting updates in the world of AI-powered video editing. The future of video manipulation is here!
0:16 lol. Talking head is also slang for a non-technical person, like a politician.
FAKE NEWS is real!
If you're super worried about this, it just shows that you don't realize how much you're already lied to.
Very dangerous.
Game of Thrones fans need to get this and redo seasons 6-8.
This is some Uchiha level Genjutsu
On a lighter note, this could be useful for modders. With this it'd be a lot easier to write new lines for vanilla voices.
0:31 His legs…Jesus Christ his legs
every catfisher's dream
0:24 wild
Nice, humanity created the internet, video recordings and voice recordings to make information transmission easier.
In a few years information transmission won't be reliable thanks to something humanity invented 😂
The nation that weaponizes this kind of tech will rule the world
think of all the memes we can make with this
ytp sentence mixing anyone
does something like this exist but for audio only
So are live videos safe for now? And when we get to that point, will the only possible solution be a public ledger?
2nd generation YTP creator
Do you know how many girls will be contacting me now just needing a little cash to come over and see me? Awww i'm really F###ed now…………..
Pelevin is getting closer and closer
Wow this is amazing but scary at the same time
Uu, scary..
So NN-assisted sentence mixing.
Soo, can we stop making it now, it's not necessary for this to exist.
I can't wait to get my hands on something like this for podcast editing – there are times when there is signal dropout or blatant errors that I'd like to correct.
Turning academic abstracts into videos may be the coolest thing I've ever seen on YouTube. Thank you.
Does this mean that we're going to stop getting youtube videos where it randomly cuts to the dude wearing different clothing in a different room going "hey guys so while I was editing the video…"