graph TD
    A[Input: MP3 URL] --> B[SILO 1: AUDIO PREPARATION]
    B --> C[Convert MP3 to WAV]
    C --> D[WAV File URL]

    D --> E[SILO 2: SPEECH ANALYSIS]
    E --> F[Generate Transcription]
    E --> G[Create Waveform]
    F --> H[Transcription with Timings]
    G --> I[Waveform Image]

    H --> J[SILO 3: WAVEFORM ANALYSIS]
    I --> J
    J --> K[Analyze with GPT Vision]
    J --> L[Extract Beatpoints]
    K --> M[Amplitude Data]
    L --> N[Rhythm Markers]

    M --> O[SILO 4: DATA SYNTHESIS]
    N --> O
    H --> O
    O --> P[LLM Processing]
    P --> Q[Final JSON Output]

    style B fill:#f9f,stroke:#333
    style E fill:#f9f,stroke:#333
    style J fill:#f9f,stroke:#333
    style O fill:#f9f,stroke:#333