graph TD
START[User Input: Avatar PNG + MP3 + Transcription]
subgraph SILO_1[SILO 1: Audio Preparation]
A1[Receive MP3 URL] --> A2[Run Skill #198: Get Transcription]
A2 --> A3[Generate Timing Data]
end
subgraph SILO_2[SILO 2: Video Generation]
B1[Receive Avatar PNG] --> B2["Combine with MP3 & Transcription"]
B2 --> B3[Run Skill #168: Generate Video]
B3 --> B4[Create Base MP4]
end
subgraph SILO_3[SILO 3: Quality Verification]
C1[Input Base MP4] --> C2[Run Skill #202: Extract Thumbnails]
C2 --> C3[Verify Lip-sync Quality]
C3 --> C4[Generate Final MP4]
end
START --> SILO_1
START --> B1
A3 --> B2
B4 --> C1
C4 --> END[Final Output: Talking Head MP4]
style START fill:#f9f,stroke:#333
style END fill:#9f9,stroke:#333
style SILO_1 fill:#e6f3ff
style SILO_2 fill:#e6ffe6
style SILO_3 fill:#ffe6e6