graph TD
    subgraph LipSyncWizard
        A[MP3 Input] --> B[Convert to WAV]
        B --> C[Generate Transcription]
        B --> D[Create Waveform]
        subgraph Silo1[Audio Analysis]
            B
            C
            D
        end
        C --> E[Analyze Speech Patterns]
        D --> E
        subgraph Silo2[Phoneme Extraction]
            E --> F[Map Transcription to Patterns]
            F --> G[Convert to Phoneme Sequence]
        end
        G --> H[Convert to Viseme Positions]
        subgraph Silo3[Viseme Mapping]
            H --> I[Generate Timing Data]
            I --> J[Create JSON Structure]
        end
    end
J --> K[Final JSON Output]
style Silo1 fill:#f9f,stroke:#333
style Silo2 fill:#bbf,stroke:#333
style Silo3 fill:#bfb,stroke:#333