LivePortrait: A fast, controllable portrait animation model

github.com

203 points by cleardusk 3 months ago

We are excited to announce the release of our video-driven portrait animation model! This model can vividly animate a single portrait, achieving a generation speed of 12.8ms on an RTX 4090 GPU with `torch.compile` from PyTorch. We are also actively updating and improving this repo!
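
As a rough illustration (not the actual LivePortrait code), this is roughly what wrapping an inference module with `torch.compile` looks like; the `generator` module below is just a stand-in:

```python
import torch

# Stand-in for an image-generation module; not the actual LivePortrait network.
generator = torch.nn.Sequential(
    torch.nn.Conv2d(3, 64, 3, padding=1),
    torch.nn.ReLU(),
    torch.nn.Conv2d(64, 3, 3, padding=1),
).eval().cuda()

# torch.compile traces the module and fuses kernels; the first call pays the
# compilation cost, later calls run the optimized code path.
generator = torch.compile(generator)

with torch.inference_mode():
    frame = torch.randn(1, 3, 256, 256, device="cuda")
    generator(frame)        # warm-up call (triggers compilation)
    out = generator(frame)  # subsequent calls take the fast path
```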

Related Resources:

- Homepage: https://liveportrait.github.io

- Paper: https://arxiv.org/abs/2407.03168

- Code: https://github.com/KwaiVGI/LivePortrait

- Jupyter: https://github.com/camenduru/LivePortrait-jupyter

- ComfyUI: https://github.com/kijai/ComfyUI-LivePortraitKJ and https://github.com/shadowcz007/comfyui-liveportrait

We hope you give it a try and enjoy!

throwaway0665 3 months ago

The videos on your homepage are encoded with HEVC, which can't be viewed in Firefox. Please consider using an open codec like AV1.

  • cleardusk 3 months ago

    Thanks for the reminder. The homepage is a GitHub Pages site and does not support Git LFS, so I have compressed the files as much as possible to reduce their size. We will consider re-encoding the mp4 files to x264 and providing a packed zip of the homepage.
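
    For reference, one way this could be done with ffmpeg (invoked here from Python; the file names are placeholders):

    ```python
    import subprocess

    # Re-encode an HEVC mp4 to H.264 so it plays in browsers without HEVC support.
    # File names are placeholders; requires ffmpeg on the PATH.
    subprocess.run([
        "ffmpeg", "-i", "input_hevc.mp4",
        "-c:v", "libx264",          # H.264 video
        "-crf", "23",               # quality/size trade-off (lower = better quality)
        "-pix_fmt", "yuv420p",      # broadest browser/player compatibility
        "-movflags", "+faststart",  # allow playback to start before full download
        "-c:a", "copy",             # keep the audio stream as-is, if present
        "output_h264.mp4",
    ], check=True)
    ```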

    • cleardusk 3 months ago

      I scanned the open codec page: https://en.wikipedia.org/wiki/List_of_open-source_codecs
      I'm a little confused, is H.265 not OPEN? :-(

      • npteljes 3 months ago

        It's a bit of a mess. The implementation of a codec, that is, an encoder or a decoder, can be open source despite the format itself not being open. H.265 does have open implementations, but the format itself is not open. The opposite can be true as well: there are, for example, proprietary encoders for open formats. Actual list of open video formats: https://en.wikipedia.org/wiki/List_of_open_file_formats#Vide...

        What OP meant is that they would like an open format on the website, which can then be viewed in any modern browser. I think caniuse is a good resource in this regard.

        https://caniuse.com/av1

        https://caniuse.com/hevc

        WebM with VP9 video is a good general browser target I think:

        https://caniuse.com/webm

        But funnily enough, even though H.264 is not open, it's a widely decoded video format as well:

        https://caniuse.com/mpeg4

        • adzm 3 months ago

          This is exactly why I am not convinced that VVC is going to be useful; it seems to have little advantage over AV1, and it's late to the party in the first place.

          • npteljes 3 months ago

            Well yeah, they want to collect rent, so they have to develop these things. It also depends on what business deals they make in the background. If the format gets secured in some applications, that might cement it as a quasi-standard, which they can then leverage for further popularity.

            I hope open standards keep winning. Overall, everyone wins with the infrastructure being openly accessible, especially the common folk.

smusamashah 3 months ago

This is amazing. I can immediately see it being used by the Stable Diffusion and other generative image communities. It gives life to those lifeless faces and it doesn't look outstandingly odd. Not to my eyes at least.

Edit: it's definitely being used already https://www.reddit.com/r/StableDiffusion/comments/1dvepjx/li...

  • vergessenmir 3 months ago

    It will allow for more realistic emotions in current SD model merges and fine-tunes by generating frames correctly labelled with their associated emotions.

    Most SD1.x/SDXL model outputs depict humans with the same expression, so the frames generated by LivePortrait will help with training datasets.

    I believe the Pixar animators on Toy Story 1 used a facial expression/emotion database called F.A.C.S. (the Facial Action Coding System) to make the characters more humanly relatable.

    It's not clear if the "expressions" will generalise to new faces.

vessenes 3 months ago

This is .. remarkably fast. Fast as in a quick response to Microsoft’s announcement earlier this year, and as in low latency. I love it.

I’d love to see a database of facial expression videos that could be used for some sort of standardized expression testing... are you guys aware of one?

jokethrowaway 3 months ago

- Fast!
- Getting some unstable results: the head keeps moving up and down by just a few pixels; maybe it needs some stabilization
- Single-frame renderings are quite good, a bit cartoony though
- No lip syncing
- Head rotation is bad; it deforms the head completely

column 3 months ago

There's a typo right at the beginning of your paper's page: exsiting

42lux 3 months ago

For everyone wanting to use this commercially: be wary of the InsightFace models' licensing...

  • cleardusk 3 months ago

    https://github.com/deepinsight/insightface?tab=readme-ov-fil...

    "The code of InsightFace is released under the MIT License. There is no limitation for both academic and commercial usage."

    • homarp 3 months ago

      That is the code. The weights are non-commercial:

      "Both manual-downloading models from our github repo and auto-downloading models with our python-library follow the above license policy (which is for non-commercial research purposes only)."

      • cleardusk 3 months ago

        Understood. The core dependency on InsightFace in LivePortrait is the face detection algo. The face detection can easily be replaced with a self-developed or MIT-licensed model.
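
        For illustration, a minimal sketch of what a permissively licensed drop-in could look like, assuming the pipeline only needs a face bounding box; the `detect_face` wrapper below uses OpenCV's bundled Haar cascade and is a hypothetical replacement, not the actual LivePortrait code:

        ```python
        import cv2

        # Hypothetical face-detection wrapper using OpenCV's bundled Haar cascade
        # (permissively licensed), as one possible stand-in for the InsightFace detector.
        _cascade = cv2.CascadeClassifier(
            cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
        )

        def detect_face(image_bgr):
            """Return the largest face as (x, y, w, h), or None if no face is found."""
            gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
            faces = _cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
            if len(faces) == 0:
                return None
            return max(faces, key=lambda box: box[2] * box[3])
        ```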