graph TD
A[Receive Video Input]
B[Extract MP3 from Video]
C[Initialize Speech-to-Text]
D[Process Audio Stream]
E[Generate Initial Transcript]
F[Add Timestamp Markers]
G[Format as JSON]
H[Validate JSON Structure]
I[Output Final Transcript]
A --> B
B --> C
C --> D
D --> E
E --> F
F --> G
G --> H
H -->|Valid| I
H -->|Invalid| G
subgraph TP [Transcription Processor]
    B
    C
    D
    E
    F
    G
    H
end