graph TD
    A[Input: Custom Avatar PNG] -->|Avatar Image| D
    B[Input: Voice-over MP3] -->|Audio File| C
    C[Silo 1: Audio Preparation] -->|Run Skill #198| E[Get Transcription with Timings]
    E -->|Timestamp Data| F[Silo 2: Base Video Generation]
    B -->|Audio Feed| F
    F -->|Run Skill #168| G[Generate Base Talking Head MP4]
    G -->|Base Video| H[Silo 3: Avatar Integration]
    D[Avatar Image Storage] -->|PNG Asset| H
    H -->|Run Skill #199| I[Overlay Custom Avatar]
    I -->|Final Output| J[Rendered MP4 with Custom Avatar]
    style A fill:#f9f,stroke:#333
    style B fill:#f9f,stroke:#333
    style J fill:#9f9,stroke:#333