graph TD
    subgraph LipSyncWizard
        A[MP3 Input] --> B[Convert to WAV]
        B --> C[Generate Transcription]
        B --> D[Create Waveform]

        subgraph Silo1[Audio Analysis]
            B
            C
            D
        end

        C --> E[Analyze Speech Patterns]
        D --> E

        subgraph Silo2[Phoneme Extraction]
            E --> F[Map Transcription to Patterns]
            F --> G[Convert to Phoneme Sequence]
        end

        G --> H[Convert to Viseme Positions]

        subgraph Silo3[Viseme Mapping]
            H --> I[Generate Timing Data]
            I --> J[Create JSON Structure]
        end
    end

    A -->|Input| LipSyncWizard
    J --> K[Final JSON Output]

    style Silo1 fill:#f9f,stroke:#333
    style Silo2 fill:#bbf,stroke:#333
    style Silo3 fill:#bfb,stroke:#333