cverse-avatar-inference-adapter v0.1.0
adapter
capsule://quake0day/[email protected]
Provides adapters for real-time avatar generation models, specifically FlashHead and SoulX-LiveAct. This capsule integrates these complex model implementations into the core inference service.
Owns
- Base avatar plugin interface
- FlashHead model implementation and utilities
- SoulX-LiveAct model implementation and utilities
Does not own
- The core inference service
- Other AI model types (ASR, LLM, TTS, etc.)
AI orientation
An AI agent working on this capsule would focus on improving avatar generation quality, optimizing model performance (e.g., latency, VRAM usage), or integrating new avatar models. It requires deep knowledge of real-time video generation, computer vision, and machine learning frameworks (e.g., PyTorch).
Avoid
- Modifying the core inference service or other AI model types.
Provides
library:avatar-plugin - Avatar model implementations conforming to the inference core's plugin interface.
Dependencies
Capsules
cverse-inference-core>=0.1.0
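As a rough illustration of what the `>=0.1.0` constraint means, the sketch below checks an installed version string against a requirement of this shape. The helper names `parse_version` and `satisfies` are hypothetical, and only plain dotted-integer versions are handled; how the capsule tooling actually resolves dependencies is not stated here.

```python
import re


def parse_version(text: str) -> tuple:
    """Turn '0.1.0' into (0, 1, 0). Only plain dotted integers are handled."""
    return tuple(int(part) for part in text.split("."))


def satisfies(installed: str, requirement: str) -> bool:
    """Check an installed version against a 'name>=X.Y.Z' requirement string."""
    match = re.fullmatch(r"([\w-]+)>=(\d+(?:\.\d+)*)", requirement.strip())
    if match is None:
        raise ValueError(f"unsupported requirement: {requirement!r}")
    # Tuples compare element by element, so 0.10.0 correctly sorts after 0.9.0.
    return parse_version(installed) >= parse_version(match.group(2))


REQUIREMENT = "cverse-inference-core>=0.1.0"
for candidate in ("0.0.9", "0.1.0", "0.10.0"):
    print(candidate, satisfies(candidate, REQUIREMENT))
```

Comparing integer tuples rather than raw strings avoids the common mistake of ordering `0.10.0` before `0.9.0`.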
Invariants (must always hold)
- Avatar plugins must generate video streams from audio/visual inputs.
- Plugins must adhere to the defined avatar plugin interface.
- Output video must be synchronized with audio.
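The invariants above can be pictured with a minimal sketch. Everything named here is hypothetical: `AvatarPlugin`, `generate`, `SilentAvatar`, `check_sync`, and the 25 fps / 16 kHz figures are assumptions for illustration only. The real interface is defined in plugins/avatar/base.py and may differ substantially.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence

Frame = bytes  # placeholder for one video frame (hypothetical type)


class AvatarPlugin(ABC):
    """Hypothetical stand-in for the base avatar plugin interface."""

    fps: int = 25              # assumed output frame rate
    sample_rate: int = 16000   # assumed input audio sample rate

    @abstractmethod
    def generate(self, audio: Sequence[float]) -> List[Frame]:
        """Return video frames covering the given audio samples."""


class SilentAvatar(AvatarPlugin):
    """Toy plugin that emits one blank frame per frame interval."""

    def generate(self, audio: Sequence[float]) -> List[Frame]:
        n_frames = round(len(audio) * self.fps / self.sample_rate)
        return [b"\x00"] * n_frames


def check_sync(plugin: AvatarPlugin, audio: Sequence[float],
               frames: List[Frame], tolerance_frames: int = 1) -> bool:
    """True if the frame count matches the audio duration within the tolerance."""
    expected = len(audio) * plugin.fps / plugin.sample_rate
    return abs(len(frames) - expected) <= tolerance_frames


plugin = SilentAvatar()
audio = [0.0] * 32000              # 2 seconds of audio at 16 kHz
frames = plugin.generate(audio)
print(len(frames), check_sync(plugin, audio, frames))
```

A frame-count check like this covers only duration agreement; per-frame lip-sync quality would need a separate measurement.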
Source files (114)
Each entry maps a source path to the path where the file lands when this capsule is installed.
src/inference/plugins/avatar/__init__.py → plugins/avatar/__init__.py
src/inference/plugins/avatar/base.py → plugins/avatar/base.py
src/inference/plugins/avatar/flash_head_plugin.py → plugins/avatar/flash_head_plugin.py
src/inference/plugins/avatar/live_act_plugin.py → plugins/avatar/live_act_plugin.py
src/inference/plugins/avatar/warmup.py → plugins/avatar/warmup.py
src/models/flash_head/audio_analysis/torch_utils.py → models/flash_head/audio_analysis/torch_utils.py
src/models/flash_head/audio_analysis/wav2vec2.py → models/flash_head/audio_analysis/wav2vec2.py
src/models/flash_head/demos/generate_video.py → models/flash_head/demos/generate_video.py
src/models/flash_head/demos/gradio_app.py → models/flash_head/demos/gradio_app.py
src/models/flash_head/demos/gradio_app_streaming.py → models/flash_head/demos/gradio_app_streaming.py
src/models/flash_head/demos/scripts/inference_script_multi_gpu_pro.sh → models/flash_head/demos/scripts/inference_script_multi_gpu_pro.sh
src/models/flash_head/demos/scripts/inference_script_single_gpu_lite.sh → models/flash_head/demos/scripts/inference_script_single_gpu_lite.sh
src/models/flash_head/demos/scripts/inference_script_single_gpu_pro.sh → models/flash_head/demos/scripts/inference_script_single_gpu_pro.sh
src/models/flash_head/generate_video.py → models/flash_head/generate_video.py
src/models/flash_head/gradio_app.py → models/flash_head/gradio_app.py
src/models/flash_head/gradio_app_streaming.py → models/flash_head/gradio_app_streaming.py
src/models/flash_head/inference.py → models/flash_head/inference.py
src/models/flash_head/inference_script_multi_gpu_pro.sh → models/flash_head/inference_script_multi_gpu_pro.sh
src/models/flash_head/inference_script_single_gpu_lite.sh → models/flash_head/inference_script_single_gpu_lite.sh
src/models/flash_head/inference_script_single_gpu_pro.sh → models/flash_head/inference_script_single_gpu_pro.sh
src/models/flash_head/ltx_video/__init__.py → models/flash_head/ltx_video/__init__.py
src/models/flash_head/ltx_video/ltx_vae.py → models/flash_head/ltx_video/ltx_vae.py
src/models/flash_head/ltx_video/models/__init__.py → models/flash_head/ltx_video/models/__init__.py
src/models/flash_head/ltx_video/models/autoencoders/__init__.py → models/flash_head/ltx_video/models/autoencoders/__init__.py
src/models/flash_head/ltx_video/models/autoencoders/causal_conv3d.py → models/flash_head/ltx_video/models/autoencoders/causal_conv3d.py
src/models/flash_head/ltx_video/models/autoencoders/causal_video_autoencoder.py → models/flash_head/ltx_video/models/autoencoders/causal_video_autoencoder.py
src/models/flash_head/ltx_video/models/autoencoders/conv_nd_factory.py → models/flash_head/ltx_video/models/autoencoders/conv_nd_factory.py
src/models/flash_head/ltx_video/models/autoencoders/dual_conv3d.py → models/flash_head/ltx_video/models/autoencoders/dual_conv3d.py
src/models/flash_head/ltx_video/models/autoencoders/pixel_norm.py → models/flash_head/ltx_video/models/autoencoders/pixel_norm.py
src/models/flash_head/ltx_video/models/autoencoders/vae.py → models/flash_head/ltx_video/models/autoencoders/vae.py
src/models/flash_head/ltx_video/models/autoencoders/vae_encode.py → models/flash_head/ltx_video/models/autoencoders/vae_encode.py
src/models/flash_head/ltx_video/models/autoencoders/video_autoencoder.py → models/flash_head/ltx_video/models/autoencoders/video_autoencoder.py
src/models/flash_head/ltx_video/models/transformers/__init__.py → models/flash_head/ltx_video/models/transformers/__init__.py
src/models/flash_head/ltx_video/models/transformers/attention.py → models/flash_head/ltx_video/models/transformers/attention.py
src/models/flash_head/ltx_video/models/transformers/embeddings.py → models/flash_head/ltx_video/models/transformers/embeddings.py
src/models/flash_head/ltx_video/models/transformers/symmetric_patchifier.py → models/flash_head/ltx_video/models/transformers/symmetric_patchifier.py
src/models/flash_head/ltx_video/models/transformers/transformer3d.py → models/flash_head/ltx_video/models/transformers/transformer3d.py
src/models/flash_head/ltx_video/utils/__init__.py → models/flash_head/ltx_video/utils/__init__.py
src/models/flash_head/ltx_video/utils/diffusers_config_mapping.py → models/flash_head/ltx_video/utils/diffusers_config_mapping.py
src/models/flash_head/ltx_video/utils/prompt_enhance_utils.py → models/flash_head/ltx_video/utils/prompt_enhance_utils.py
src/models/flash_head/ltx_video/utils/skip_layer_strategy.py → models/flash_head/ltx_video/utils/skip_layer_strategy.py
src/models/flash_head/ltx_video/utils/torch_utils.py → models/flash_head/ltx_video/utils/torch_utils.py
src/models/flash_head/README.md → models/flash_head/README.md
src/models/flash_head/src/distributed/usp_device.py → models/flash_head/src/distributed/usp_device.py
src/models/flash_head/src/modules/flash_head_model.py → models/flash_head/src/modules/flash_head_model.py
src/models/flash_head/src/pipeline/flash_head_pipeline.py → models/flash_head/src/pipeline/flash_head_pipeline.py
src/models/flash_head/utils/cpu_face_handler.py → models/flash_head/utils/cpu_face_handler.py
src/models/flash_head/utils/facecrop.py → models/flash_head/utils/facecrop.py
src/models/flash_head/utils/utils.py → models/flash_head/utils/utils.py
src/models/flash_head/wan/modules/__init__.py → models/flash_head/wan/modules/__init__.py
src/models/flash_head/wan/modules/vae.py → models/flash_head/wan/modules/vae.py
src/models/SoulX-LiveAct/demo.py → models/SoulX-LiveAct/demo.py
src/models/SoulX-LiveAct/fp8_gemm.py → models/SoulX-LiveAct/fp8_gemm.py
src/models/SoulX-LiveAct/generate.py → models/SoulX-LiveAct/generate.py
src/models/SoulX-LiveAct/kokoro/__init__.py → models/SoulX-LiveAct/kokoro/__init__.py
src/models/SoulX-LiveAct/kokoro/__main__.py → models/SoulX-LiveAct/kokoro/__main__.py
src/models/SoulX-LiveAct/kokoro/custom_stft.py → models/SoulX-LiveAct/kokoro/custom_stft.py
src/models/SoulX-LiveAct/kokoro/istftnet.py → models/SoulX-LiveAct/kokoro/istftnet.py
src/models/SoulX-LiveAct/kokoro/model.py → models/SoulX-LiveAct/kokoro/model.py
src/models/SoulX-LiveAct/kokoro/modules.py → models/SoulX-LiveAct/kokoro/modules.py
src/models/SoulX-LiveAct/kokoro/pipeline.py → models/SoulX-LiveAct/kokoro/pipeline.py
src/models/SoulX-LiveAct/lightx2v/__init__.py → models/SoulX-LiveAct/lightx2v/__init__.py
src/models/SoulX-LiveAct/lightx2v/models/__init__.py → models/SoulX-LiveAct/lightx2v/models/__init__.py
src/models/SoulX-LiveAct/lightx2v/models/video_encoders/__init__.py → models/SoulX-LiveAct/lightx2v/models/video_encoders/__init__.py
src/models/SoulX-LiveAct/lightx2v/models/video_encoders/hf/__init__.py → models/SoulX-LiveAct/lightx2v/models/video_encoders/hf/__init__.py
src/models/SoulX-LiveAct/lightx2v/models/video_encoders/hf/wan/__init__.py → models/SoulX-LiveAct/lightx2v/models/video_encoders/hf/wan/__init__.py
src/models/SoulX-LiveAct/lightx2v/models/video_encoders/hf/wan/vae.py → models/SoulX-LiveAct/lightx2v/models/video_encoders/hf/wan/vae.py
src/models/SoulX-LiveAct/lightx2v/utils/__init__.py → models/SoulX-LiveAct/lightx2v/utils/__init__.py
src/models/SoulX-LiveAct/lightx2v/utils/envs.py → models/SoulX-LiveAct/lightx2v/utils/envs.py
src/models/SoulX-LiveAct/lightx2v/utils/utils.py → models/SoulX-LiveAct/lightx2v/utils/utils.py
src/models/SoulX-LiveAct/lightx2v_platform/__init__.py → models/SoulX-LiveAct/lightx2v_platform/__init__.py
src/models/SoulX-LiveAct/lightx2v_platform/base/__init__.py → models/SoulX-LiveAct/lightx2v_platform/base/__init__.py
src/models/SoulX-LiveAct/lightx2v_platform/base/global_var.py → models/SoulX-LiveAct/lightx2v_platform/base/global_var.py
src/models/SoulX-LiveAct/model_liveact/attention.py → models/SoulX-LiveAct/model_liveact/attention.py
src/models/SoulX-LiveAct/model_liveact/model_memory.py → models/SoulX-LiveAct/model_liveact/model_memory.py
src/models/SoulX-LiveAct/model_liveact/model_memory_sp.py → models/SoulX-LiveAct/model_liveact/model_memory_sp.py
src/models/SoulX-LiveAct/README.md → models/SoulX-LiveAct/README.md
src/models/SoulX-LiveAct/requirements.txt → models/SoulX-LiveAct/requirements.txt
src/models/SoulX-LiveAct/src/audio_analysis/torch_utils.py → models/SoulX-LiveAct/src/audio_analysis/torch_utils.py
src/models/SoulX-LiveAct/src/audio_analysis/wav2vec2.py → models/SoulX-LiveAct/src/audio_analysis/wav2vec2.py
src/models/SoulX-LiveAct/src/utils.py → models/SoulX-LiveAct/src/utils.py
src/models/SoulX-LiveAct/src/vram_management/__init__.py → models/SoulX-LiveAct/src/vram_management/__init__.py
src/models/SoulX-LiveAct/src/vram_management/layers.py → models/SoulX-LiveAct/src/vram_management/layers.py
src/models/SoulX-LiveAct/templates/index.html → models/SoulX-LiveAct/templates/index.html
src/models/SoulX-LiveAct/util_liveact.py → models/SoulX-LiveAct/util_liveact.py
src/models/SoulX-LiveAct/wan/__init__.py → models/SoulX-LiveAct/wan/__init__.py
src/models/SoulX-LiveAct/wan/configs/__init__.py → models/SoulX-LiveAct/wan/configs/__init__.py
src/models/SoulX-LiveAct/wan/configs/shared_config.py → models/SoulX-LiveAct/wan/configs/shared_config.py
src/models/SoulX-LiveAct/wan/configs/wan_i2v_14B.py → models/SoulX-LiveAct/wan/configs/wan_i2v_14B.py
src/models/SoulX-LiveAct/wan/configs/wan_t2v_14B.py → models/SoulX-LiveAct/wan/configs/wan_t2v_14B.py
src/models/SoulX-LiveAct/wan/configs/wan_t2v_1_3B.py → models/SoulX-LiveAct/wan/configs/wan_t2v_1_3B.py
src/models/SoulX-LiveAct/wan/distributed/__init__.py → models/SoulX-LiveAct/wan/distributed/__init__.py
src/models/SoulX-LiveAct/wan/distributed/fsdp.py → models/SoulX-LiveAct/wan/distributed/fsdp.py
src/models/SoulX-LiveAct/wan/distributed/xdit_context_parallel.py → models/SoulX-LiveAct/wan/distributed/xdit_context_parallel.py
src/models/SoulX-LiveAct/wan/first_last_frame2video.py → models/SoulX-LiveAct/wan/first_last_frame2video.py
src/models/SoulX-LiveAct/wan/image2video.py → models/SoulX-LiveAct/wan/image2video.py
src/models/SoulX-LiveAct/wan/modules/__init__.py → models/SoulX-LiveAct/wan/modules/__init__.py
src/models/SoulX-LiveAct/wan/modules/attention.py → models/SoulX-LiveAct/wan/modules/attention.py
src/models/SoulX-LiveAct/wan/modules/clip.py → models/SoulX-LiveAct/wan/modules/clip.py
src/models/SoulX-LiveAct/wan/modules/model.py → models/SoulX-LiveAct/wan/modules/model.py
src/models/SoulX-LiveAct/wan/modules/t5.py → models/SoulX-LiveAct/wan/modules/t5.py
src/models/SoulX-LiveAct/wan/modules/tokenizers.py → models/SoulX-LiveAct/wan/modules/tokenizers.py
src/models/SoulX-LiveAct/wan/modules/vace_model.py → models/SoulX-LiveAct/wan/modules/vace_model.py
src/models/SoulX-LiveAct/wan/modules/vae.py → models/SoulX-LiveAct/wan/modules/vae.py
src/models/SoulX-LiveAct/wan/modules/xlm_roberta.py → models/SoulX-LiveAct/wan/modules/xlm_roberta.py
src/models/SoulX-LiveAct/wan/text2video.py → models/SoulX-LiveAct/wan/text2video.py
src/models/SoulX-LiveAct/wan/utils/__init__.py → models/SoulX-LiveAct/wan/utils/__init__.py
src/models/SoulX-LiveAct/wan/utils/fm_solvers.py → models/SoulX-LiveAct/wan/utils/fm_solvers.py
src/models/SoulX-LiveAct/wan/utils/fm_solvers_unipc.py → models/SoulX-LiveAct/wan/utils/fm_solvers_unipc.py
src/models/SoulX-LiveAct/wan/utils/prompt_extend.py → models/SoulX-LiveAct/wan/utils/prompt_extend.py
src/models/SoulX-LiveAct/wan/utils/qwen_vl_utils.py → models/SoulX-LiveAct/wan/utils/qwen_vl_utils.py
src/models/SoulX-LiveAct/wan/utils/utils.py → models/SoulX-LiveAct/wan/utils/utils.py
src/models/SoulX-LiveAct/wan/utils/vace_processor.py → models/SoulX-LiveAct/wan/utils/vace_processor.py
src/models/SoulX-LiveAct/wan/vace.py → models/SoulX-LiveAct/wan/vace.py
Plus capsule.yaml and install.json.
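The source-to-install mapping in the file list appears to follow a simple prefix rule: entries under src/inference/ lose that prefix, and everything else loses only the leading src/. The sketch below encodes that observed pattern. The function `install_path` and the `PREFIXES` tuple are hypothetical, inferred from the listed entries; presumably install.json is the authoritative record of where files land.

```python
# Prefixes stripped on install, inferred from the listed entries (assumption).
# The longer prefix is tried first so src/inference/ is removed as a whole.
PREFIXES = ("src/inference/", "src/")


def install_path(source: str) -> str:
    """Derive the install path by stripping the first matching leading prefix."""
    for prefix in PREFIXES:
        if source.startswith(prefix):
            return source[len(prefix):]
    return source  # paths outside src/ are left unchanged


examples = [
    "src/inference/plugins/avatar/base.py",
    "src/models/flash_head/inference.py",
    "src/models/flash_head/src/distributed/usp_device.py",
]
for path in examples:
    print(path, "->", install_path(path))
```

Only the leading prefix is removed, so a nested `src/` directory inside a model tree (as in the third example) is preserved.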