LivePortrait: A fast, controllable portrait animation model

github.com

203 points by cleardusk 3 months ago

We are excited to announce the release of our video-driven portrait animation model! This model can vividly animate a single portrait, achieving a generation speed of 12.8ms on an RTX 4090 GPU with `torch.compile` from PyTorch. We are also actively updating and improving this repo!
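
As a rough illustration (not the actual LivePortrait code), this is roughly what wrapping an inference module with `torch.compile` looks like; the `generator` module below is just a stand-in:

```python
import torch

# Stand-in for an image-generation module; not the actual LivePortrait network.
generator = torch.nn.Sequential(
    torch.nn.Conv2d(3, 64, 3, padding=1),
    torch.nn.ReLU(),
    torch.nn.Conv2d(64, 3, 3, padding=1),
).eval().cuda()

# torch.compile traces the module and fuses kernels; the first call pays the
# compilation cost, later calls run the optimized code path.
generator = torch.compile(generator)

with torch.inference_mode():
    frame = torch.randn(1, 3, 256, 256, device="cuda")
    generator(frame)        # warm-up call (triggers compilation)
    out = generator(frame)  # subsequent calls take the fast path
```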

Related Resources:

- Homepage: https://liveportrait.github.io

- Paper: https://arxiv.org/abs/2407.03168

- Code: https://github.com/KwaiVGI/LivePortrait

- Jupyter: https://github.com/camenduru/LivePortrait-jupyter

- ComfyUI: https://github.com/kijai/ComfyUI-LivePortraitKJ and https://github.com/shadowcz007/comfyui-liveportrait

We hope you give it a try and enjoy!

throwaway0665 3 months ago

The videos on your homepage are encoded with HEVC, which can't be viewed in Firefox. Please consider using an open codec like AV1.

  • cleardusk 3 months ago

    Thanks for the reminder. The homepage is a GitHub Pages site and does not support Git LFS, so I have compressed the files as much as possible to reduce their size. We will consider re-encoding the mp4 files to x264 and providing a packed zip of the homepage.
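
    For reference, one way this could be done with ffmpeg (invoked here from Python; the file names are placeholders):

    ```python
    import subprocess

    # Re-encode an HEVC mp4 to H.264 so it plays in browsers without HEVC support.
    # File names are placeholders; requires ffmpeg on the PATH.
    subprocess.run([
        "ffmpeg", "-i", "input_hevc.mp4",
        "-c:v", "libx264",          # H.264 video
        "-crf", "23",               # quality/size trade-off (lower = better quality)
        "-pix_fmt", "yuv420p",      # broadest browser/player compatibility
        "-movflags", "+faststart",  # allow playback to start before full download
        "-c:a", "copy",             # keep the audio stream as-is, if present
        "output_h264.mp4",
    ], check=True)
    ```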

    • cleardusk 3 months ago

      I scanned the open codec page: https://en.wikipedia.org/wiki/List_of_open-source_codecs
      I'm a little confused, is H.265 not OPEN? :-(

      • npteljes 3 months ago

        It's a bit of a mess. The implementation of a codec, that is, an encoder or a decoder, can be open source despite the format itself not being open. H.265 does have open implementations, but the format itself is not open. The opposite can be true as well: there are, for example, proprietary encoders for open formats. Actual list of open video formats: https://en.wikipedia.org/wiki/List_of_open_file_formats#Vide...

        What OP meant is that they would like an open format on the website, which can then be viewed in any modern browser. I think caniuse is a good resource in this regard.

        https://caniuse.com/av1

        https://caniuse.com/hevc

        WebM with VP9 video is a good general browser target I think:

        https://caniuse.com/webm

        But funnily enough, even though H.264 is not open, it's a widely decoded video format as well:

        https://caniuse.com/mpeg4

        • adzm 3 months ago

          This is exactly why I am not convinced that VVC is going to be useful; it seems to have little advantage over AV1, and it's late to the party in the first place.

          • npteljes 3 months ago

            Well yeah, they want to collect rent, so they have to develop these things. It also depends on what business deals they make in the background. If the format gets secured in some applications, that might cement it as a quasi-standard, which they can then leverage for further popularity.

            I hope open standards keep winning. Overall, everyone wins with the infrastructure being openly accessible, especially the common folk.

smusamashah 3 months ago

This is amazing. I can immediately see it being used by the Stable Diffusion and other generative image communities. It gives life to those lifeless faces and it doesn't look outstandingly odd. Not to my eyes at least.

Edit: it's definitely being used already https://www.reddit.com/r/StableDiffusion/comments/1dvepjx/li...

  • vergessenmir 3 months ago

    It will allow for more realistic emotions in current SD model merges and fine-tunes by generating frames correctly labelled with their associated emotions.

    Most SD1.x/SDXL model outputs depict humans with the same expression, so the frames generated by LivePortrait will help with training datasets.

    I believe the Pixar animators on Toy Story 1 used a facial expression/emotion database called F.A.C.S. (the Facial Action Coding System) to make the characters more humanly relatable.

    It's not clear if the "expressions" will generalise to new faces.

vessenes 3 months ago

This is .. remarkably fast. Fast as in a quick response to Microsoft’s announcement earlier this year, and as in low latency. I love it.

I’d love to see a database of facial expression videos that could be used for some sort of standardized expression testing... are you guys aware of one?

jokethrowaway 3 months ago

- Fast!
- Getting some unstable results: the head keeps moving up and down by just a few pixels; maybe it needs some stabilization
- Single-frame renderings are quite good, a bit cartoony though
- No lip syncing
- Head rotation is bad; it deforms the head completely

column 3 months ago

There's a typo right at the beginning of your paper's page: exsiting

42lux 3 months ago

For everyone wanting to use this commercially: be wary of the InsightFace models' licensing...

  • cleardusk 3 months ago

    https://github.com/deepinsight/insightface?tab=readme-ov-fil...

    "The code of InsightFace is released under the MIT License. There is no limitation for both academic and commercial usage."

    • homarp 3 months ago

      That is the code. The weights are non-commercial:

      "Both manual-downloading models from our github repo and auto-downloading models with our python-library follow the above license policy (which is for non-commercial research purposes only)."

      • cleardusk 3 months ago

        Understood. The core dependency on InsightFace in LivePortrait is the face detection algo. The face detection can easily be replaced with a self-developed or MIT-licensed model.
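
        For illustration, a minimal sketch of what a permissively licensed drop-in could look like, assuming the pipeline only needs a face bounding box; the `detect_face` wrapper below uses OpenCV's bundled Haar cascade and is a hypothetical replacement, not the actual LivePortrait code:

        ```python
        import cv2

        # Hypothetical face-detection wrapper using OpenCV's bundled Haar cascade
        # (permissively licensed), as one possible stand-in for the InsightFace detector.
        _cascade = cv2.CascadeClassifier(
            cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
        )

        def detect_face(image_bgr):
            """Return the largest face as (x, y, w, h), or None if no face is found."""
            gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
            faces = _cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
            if len(faces) == 0:
                return None
            return max(faces, key=lambda box: box[2] * box[3])
        ```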