graph TD
A[Input: Custom Avatar PNG] -->|Avatar Image| D
B[Input: Voice-over MP3] -->|Audio File| C
C["Silo 1: Audio Preparation"] -->|"Run Skill #198"| E[Get Transcription with Timings]
E -->|Timestamp Data| F[Silo 2: Base Video Generation]
B -->|Audio Feed| F
F -->|"Run Skill #168"| G[Generate Base Talking Head MP4]
G -->|Base Video| H[Silo 3: Avatar Integration]
D[Avatar Image Storage] -->|PNG Asset| H
H -->|"Run Skill #199"| I[Overlay Custom Avatar]
I -->|Final Output| J[Rendered MP4 with Custom Avatar]
style A fill:#f9f,stroke:#333
style B fill:#f9f,stroke:#333
style J fill:#9f9,stroke:#333