graph TD
    A[MP3 URL Input] --> B[Convert to WAV]
    B --> C[Generate Waveform]
    C --> D[Process Audio]

    subgraph SILO1[Audio Preparation]
        B
        C
    end

    D --> E[Get Transcription]
    D --> F[Extract Timing Data]

    subgraph SILO2[Timing Analysis]
        E
        F
    end

    E --> G[Combine Audio Data]
    F --> G
    H[Avatar Reference Image] --> I[Generate Viseme Map]
    G --> I

    subgraph SILO3[Viseme Mapping]
        I
    end

    I --> J[Create JSON Animation Data]
    J --> K[Final Lipsync Output]

    style SILO1 fill:#e6f3ff,stroke:#333,stroke-width:2px
    style SILO2 fill:#f0fff0,stroke:#333,stroke-width:2px
    style SILO3 fill:#fff0f0,stroke:#333,stroke-width:2px