Step 1:
Topical Avatar Video Creator
1️⃣ Perfect output - scan ALL
2️⃣ Add output numbers, then...
3️⃣ Add Subagent Numbers (work backwards from output number!)
4️⃣ Add ACTUAL Skills to subagent
✅ DONE..Copy x4 to Step 3...
Let me break down the ScriptMaster subagent in detail:

A) SUBAGENT SUMMARY:
ScriptMaster generates a professionally written, conversational video script optimized for voice-over delivery, based on the user's topic description and intended content.

B) FINAL TASK OUTPUT:
A text file containing a 100-300 word voice-over script, formatted with proper punctuation and natural speech patterns, optimized for AI voice synthesis, including clear paragraph breaks and any necessary pronunciation guides.

C) SUBAGENT INPUT:
- User's topic/description
- Desired length (if specified)
- Any specific tone/style requirements
- Target audience information (if provided)

D) SUBAGENT TASK SUMMARY:
The workflow follows this sequence:
1. Research Phase: #216 (Research Topic Deeply) > Returns comprehensive research summary
↓
2. Keyword Enhancement: #218 (Brainstorm Related Keywords) > Returns relevant keywords and context
↓
3. Initial Script Generation: #171 (Write Voice Over Script Based On Instructions) > Returns draft script
↓
4. Script Optimization: #190 (Write or rewrite text based on instructions) > Returns polished script
[This final step specifically optimizes for voice synthesis compatibility]

E) SILOS:
SILO 1: RESEARCH & CONTEXT
- Input: User's topic description
- Skill #216: Deep research of topic
- Skill #218: Keyword analysis
- Output: Research summary + keyword context
SILO 2: SCRIPT CREATION
- Input: Research summary + keywords
- Skill #171: Initial script generation
- Output: Draft script
SILO 3: OPTIMIZATION
- Input: Draft script
- Skill #190: Voice-over optimization
- Output: Final polished script

This structure ensures thorough research, proper context gathering, and multiple refinement stages to create a script that's both content-rich and optimized for voice synthesis and delivery. The separation into silos allows for quality control at each stage and ensures that the final script is both well-researched and technically optimized for the next stages in the larger workflow (voice synthesis and avatar animation).
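The three ScriptMaster silos can be read as a simple chain of calls. The sketch below is only an illustration under assumptions: the `skills` dictionary, the way research and keywords are joined, and the word-count check are all hypothetical, and the skill callables stand in for the numbered skills rather than reflecting any real platform API.

```python
from typing import Callable, Dict

# Hypothetical type: each skill takes text and returns text.
Skill = Callable[[str], str]


def run_scriptmaster(topic: str, skills: Dict[int, Skill]) -> str:
    """Chain the three silos; `skills` maps a skill number to a stand-in callable."""
    # SILO 1: research and keyword context (both fed from the user's topic).
    research = skills[216](topic)
    keywords = skills[218](topic)

    # SILO 2: draft script from the combined context (joining is an assumption).
    draft = skills[171](f"{research}\n{keywords}")

    # SILO 3: voice-over optimization of the draft.
    final_script = skills[190](draft)

    # Check the 100-300 word target stated in the final task output.
    word_count = len(final_script.split())
    if not 100 <= word_count <= 300:
        raise ValueError(f"script has {word_count} words, expected 100-300")
    return final_script
```

Passing the skills in as plain callables keeps each silo replaceable, which matches the quality-control-per-stage idea described above.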
SubAgent #1 - Diagram
I'll analyze SUBAGENT 2: "VoiceForge" and break it down according to the guidelines:

A) SUBAGENT SUMMARY:
VoiceForge converts a text script into a high-quality voice-over MP3 file, ensuring proper audio formatting and quality for lip-sync compatibility in the final talking head video.

B) FINAL TASK OUTPUT:
MP3 audio file (URL) containing clear voice-over narration of the script, with consistent volume levels, minimal background noise, and proper timing for lip-sync integration (typically 44.1kHz, 16-bit, stereo format).

C) SUBAGENT INPUT:
- Text script (100-300 words) from ScriptMaster
- Voice style preference (if any, defaulting to neutral professional voice)

E) SUBAGENT TASK SUMMARY:
The workflow follows these steps:
1. Initial Conversion:
text script > #170 (Turn Script Into Voice Over MP3) > initial MP3 URL
2. Audio Analysis & Quality Check:
MP3 URL > #198 (Get Transcription Of MP3 With Timings) > transcription with timing
MP3 URL > #179 (Create Visual Waveform Of 60 second Wav/mp3 File) > waveform image
waveform image > #176 (Analyze An Image With GPT Vision & Return Text) > audio quality analysis
3. Audio Processing (if needed based on analysis):
MP3 URL > #178 (Convert 1-20 MP3s to wav) > WAV file
WAV file > #219 (Cut Wav/mp3 Audio into Multiple Pieces/Samples) > processed audio segments
[Note: This step only executes if the quality analysis indicates issues]

F) SILOS:
SILO 1: INITIAL AUDIO GENERATION
- Input: Text script
- Skill: #170 (Turn Script Into Voice Over MP3)
- Output: Initial MP3 URL
SILO 2: QUALITY VERIFICATION
- Input: Initial MP3 URL
- Skills: #198, #179, #176
- Output: Quality analysis results
SILO 3: AUDIO REFINEMENT (Conditional)
- Input: MP3 URL (if quality check fails)
- Skills: #178, #219
- Output: Final processed audio file

The subagent concludes when either:
a) The initial MP3 passes quality verification
OR
b) The audio refinement process completes successfully

The final output is always a single MP3 file URL that meets the quality standards required for lip-sync integration in the final talking head video.
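The conditional third silo is the part of VoiceForge most easily misread, so here is a minimal sketch of the branching. The `VoiceForgeSkills` container and its three field names are hypothetical stand-ins that bundle the numbered skills; nothing here is the platform's actual interface.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceForgeSkills:
    """Hypothetical bundle of callables standing in for the numbered skills."""
    text_to_mp3: Callable[[str], str]        # stands in for #170
    passes_quality: Callable[[str], bool]    # stands in for #198 + #179 + #176
    refine_audio: Callable[[str], str]       # stands in for #178 + #219


def run_voiceforge(script: str, skills: VoiceForgeSkills) -> str:
    """Return a single MP3 URL, refining only when the quality check fails."""
    # SILO 1: initial audio generation.
    mp3_url = skills.text_to_mp3(script)

    # SILO 2: quality verification; a pass ends the subagent here.
    if skills.passes_quality(mp3_url):
        return mp3_url

    # SILO 3: conditional refinement, reached only on a failed check.
    return skills.refine_audio(mp3_url)
```

Either exit path returns one URL, which mirrors the two concluding conditions listed above.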
SubAgent #2 - Diagram
Let me break down the AvatarVision subagent flow in detail:

A) SUBAGENT SUMMARY:
AvatarVision generates a high-quality, themed AI avatar image that matches the video's topic and style, ensuring it's optimized for talking head animation with clear facial features and appropriate framing.

B) FINAL TASK OUTPUT:
A 1024x1024 PNG file of a front-facing avatar with clear facial features, neutral expression, good lighting, and appropriate themed styling, saved with a transparent background for animation compatibility.

C) SUBAGENT INPUT:
- Topic/theme of the video
- Style preferences for avatar (e.g., professional, casual, specific profession/character type)
- Any specific visual requirements (gender, age range, ethnicity if specified)

D) SUBAGENT TASK SUMMARY:
Input > #223 (Generate optimized prompt) > #222 (Initial avatar generation) > #176 (Quality check) > #221 (Refinement if needed) > #191 (Final size optimization) > Output

Detailed Flow:
1. Use #223 (Powerful LLM) to convert user requirements into an optimized image generation prompt that emphasizes:
- Front-facing position
- Clear facial features
- Neutral expression
- Professional lighting
- Themed styling elements
- Background removal compatibility
2. Use #222 (Make Image) to generate initial avatar using the optimized prompt
3. Use #176 (Analyze Image) to verify:
- Facial clarity
- Proper positioning
- Theme appropriateness
- Animation suitability
4. IF quality check fails, use #221 (Recreate Image) to refine based on specific feedback
5. Use #191 (Resize Image) to ensure final 1024x1024 dimension

F) SILOS:
SILO 1: PROMPT ENGINEERING
- Input: Raw user requirements
- Skill: #223
- Output: Optimized generation prompt
SILO 2: IMAGE GENERATION
- Input: Optimized prompt
- Skill: #222
- Output: Initial avatar image
SILO 3: QUALITY ASSURANCE
- Input: Initial avatar image
- Skills: #176 > #221 (if needed)
- Output: Verified/refined avatar
SILO 4: OPTIMIZATION
- Input: Verified avatar
- Skill: #191
- Output: Final sized avatar PNG

This workflow ensures consistent, high-quality avatar generation with appropriate checks and refinements for talking head animation compatibility.
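The generate, check, refine, resize flow above could be wired together roughly as follows. This is a sketch under assumptions: every callable is a hypothetical stand-in for the numbered skill noted beside it, the quality check is assumed to return a pass flag plus feedback text, and `max_refinements` is an added safety bound that the breakdown itself does not specify.

```python
from typing import Callable, Tuple


def run_avatarvision(
    requirements: str,
    make_prompt: Callable[[str], str],               # stands in for #223
    make_image: Callable[[str], str],                # stands in for #222
    check_image: Callable[[str], Tuple[bool, str]],  # stands in for #176
    recreate_image: Callable[[str, str], str],       # stands in for #221
    resize_image: Callable[[str, int, int], str],    # stands in for #191
    max_refinements: int = 1,                        # assumed bound, not in the source
) -> str:
    """Return a reference to the final 1024x1024 avatar image."""
    # SILO 1 and SILO 2: build the prompt, then generate the first image.
    prompt = make_prompt(requirements)
    image = make_image(prompt)

    # SILO 3: verify, refining with the checker's feedback only on failure.
    for _ in range(max_refinements):
        ok, feedback = check_image(image)
        if ok:
            break
        image = recreate_image(image, feedback)

    # SILO 4: enforce the final dimensions regardless of the path taken.
    return resize_image(image, 1024, 1024)
```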
SubAgent #3 - Diagram
Let me break down the LipSyncWizard subagent:

A) SUBAGENT SUMMARY:
A specialized component that analyzes voice-over audio to generate precise phoneme timing data and maps it to corresponding viseme (mouth shape) positions for synchronized avatar animation.

B) FINAL TASK OUTPUT:
A JSON data structure containing:
- Timestamp-mapped phoneme data (millisecond precision)
- Corresponding viseme positions (mouth shapes)
- Head movement coordinates (x,y,z rotation values)
- Facial expression markers
Format: {timing: ms, phoneme: string, viseme: int, head_pos: [x,y,z], expression: float}

C) SUBAGENT INPUT:
- Voice-over MP3 URL
- Transcription with word-level timing
- Avatar reference image (to understand mouth/face structure)

E) SUBAGENT TASK SUMMARY:
1. Convert MP3 to WAV for analysis > #178 (Convert MP3 to WAV)
2. Generate detailed audio waveform > #179 (Create Visual Waveform)
3. Extract precise audio timing data > #198 (Get Transcription with Timings)
4. Analyze audio characteristics > #180 (Extract Beatpoints & Tempo)
5. Generate viseme mapping data > #223 (LLM to process audio analysis and generate viseme mappings)

F) SILOS:
SILO 1: AUDIO PREPARATION
- Input: MP3 URL
- Skill #178: Convert to WAV
- Skill #179: Generate waveform
- Output: WAV file + waveform visualization
SILO 2: TIMING ANALYSIS
- Input: WAV from Silo 1
- Skill #198: Get precise transcription
- Skill #180: Extract timing data
- Output: Detailed timing data
SILO 3: VISEME MAPPING
- Input: Timing data + Avatar reference
- Skill #223: Process data and generate viseme mappings
- Output: Final JSON animation data

Note: The limitation here is that while we can generate the precise timing data, creating accurate viseme mappings requires a custom solution beyond the current skill set. The LLM (#223) can help structure the data, but additional development would be needed for precise mouth shape calculations.
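To make the output format concrete, the sketch below models one animation record with the five fields named in the Format line and serializes a list of them to JSON. The phoneme-to-viseme table is a made-up toy lookup used purely for illustration; as the note above says, accurate viseme mapping would need a custom solution, and the neutral head position and expression values are placeholders.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Tuple

# Toy lookup for illustration only; these viseme numbers are invented.
PHONEME_TO_VISEME = {"sil": 0, "AA": 1, "M": 2, "F": 3}


@dataclass
class AnimationFrame:
    timing: int            # milliseconds from the start of the audio
    phoneme: str
    viseme: int            # index of the mouth shape
    head_pos: List[float]  # [x, y, z] rotation values
    expression: float


def build_frames(phonemes: List[Tuple[int, str]]) -> List[AnimationFrame]:
    """Turn (timing_ms, phoneme) pairs into frames with placeholder head/expression."""
    return [
        AnimationFrame(ms, ph, PHONEME_TO_VISEME.get(ph, 0), [0.0, 0.0, 0.0], 0.0)
        for ms, ph in phonemes
    ]


def frames_to_json(frames: List[AnimationFrame]) -> str:
    """Serialize frames to the JSON structure described in the final task output."""
    return json.dumps([asdict(frame) for frame in frames])
```

Unknown phonemes fall back to viseme 0 here, which is one simple choice; a real mapping would likely need per-avatar tuning.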
Let me break down the VideoAssemblerPro subagent following your guidelines:

A) SUBAGENT SUMMARY:
A specialized video assembly agent that combines an AI avatar image with voice-over audio and lip-sync data to create a synchronized talking head video output.

B) FINAL TASK OUTPUT:
An MP4 video file (1920x1080) featuring a speaking AI avatar with synchronized lip movements and natural head movements, duration matching the input audio file length, with clear audio quality at 48kHz.

C) SUBAGENT INPUT:
- PNG file URL of the AI-generated avatar image
- MP3 file URL of the voice-over audio
- Text transcription with timing data (for lip-sync alignment)

D) SUBAGENT TASK SUMMARY:
Input > #198 Get Transcription Of MP3 (With Timings) > #168 Generate Talking Head Video From MP3 & transcription > #199 Add Images & Videos On Top Of Existing MP4 > Final MP4 Output

The flow works like this:
1. First, get precise timing data from the audio using #198
2. Use #168 to generate the base talking head video
3. Use #199 to enhance the video with any additional overlay elements

F) SILOS:
SILO 1: AUDIO PREPARATION
- Input: MP3 voice-over file
- Skill: #198 (Get Transcription Of MP3 With Timings)
- Output: Precise transcription with timing data
SILO 2: BASE VIDEO GENERATION
- Input: MP3 + transcription
- Skill: #168 (Generate Talking Head Video)
- Output: Base MP4 with synchronized lip movements
SILO 3: VIDEO ENHANCEMENT
- Input: Base MP4 + avatar PNG
- Skill: #199 (Add Images & Videos On Top Of Existing MP4)
- Output: Final enhanced MP4 video

Note: This workflow assumes skill #168 has built-in lip-sync capabilities. If it doesn't, we would need an additional specialized lip-sync processing step before video generation, but this isn't currently available in the skill list.

The main advantage of this approach is that it uses proven skills (#168, #198, #199) rather than requiring new unproven ones, while still achieving the core objective of creating a synchronized talking head video.
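One way to confirm the final MP4 meets the stated output spec is a small check on its metadata. The sketch below is hypothetical throughout: `VideoInfo` is an invented container for values that would have to come from whatever tool inspects the file, and the 0.1 second duration tolerance is an assumption, since the breakdown only says the duration should match the audio.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VideoInfo:
    """Hypothetical metadata for the rendered MP4."""
    width: int
    height: int
    duration_s: float
    audio_rate_hz: int


def check_final_video(
    video: VideoInfo, audio_duration_s: float, tolerance_s: float = 0.1
) -> List[str]:
    """Return a list of spec violations; an empty list means the video passes."""
    problems: List[str] = []
    if (video.width, video.height) != (1920, 1080):
        problems.append(f"resolution is {video.width}x{video.height}, expected 1920x1080")
    if video.audio_rate_hz != 48000:
        problems.append(f"audio rate is {video.audio_rate_hz} Hz, expected 48000 Hz")
    if abs(video.duration_s - audio_duration_s) > tolerance_s:
        problems.append("video duration does not match the input audio length")
    return problems
```

Returning all violations at once, rather than stopping at the first, makes it easier to see whether a failure comes from the base video step or the overlay step.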
I notice that you've asked me to analyze "subagent 6" but in the provided list of subagents, there were only 5 subagents listed (ScriptMaster, VoiceForge, AvatarVision, LipSyncWizard, and VideoAssemblerPro). There is no subagent 6 to analyze. Would you like me to: 1. Analyze one of the existing 5 subagents in detail 2. Help design a new subagent 6 if there's an additional component needed 3. Focus on a different aspect of the workflow Please let me know which direction you'd like to pursue and I'll provide a detailed analysis following your requested format.