Topical Avatar Video Creator
Let me break down the ScriptMaster subagent in detail:

A) SUBAGENT SUMMARY:
ScriptMaster generates a concise, well-structured video script optimized for voice-over delivery by first researching the topic thoroughly and then crafting a conversational script that fits within the 100-300 word target length.

B) FINAL TASK OUTPUT:
A single text file containing a professionally formatted voice-over script of 100-300 words, structured with clear paragraphs, proper punctuation, and natural speaking patterns suitable for voice-over recording.

C) SUBAGENT INPUT:
- User's topic/description
- Any specific tone/style preferences
- Target length preference (within 100-300 word range)

E) SUBAGENT TASK SUMMARY:
Here's the detailed flow using specific skills:
1. Research Phase: Input > #216 (Research Topic Deeply) > Comprehensive research data
2. Keyword Enhancement: Research data > #218 (Brainstorm Related Keywords) > Enhanced topic understanding
3. Script Generation: Enhanced topic understanding > #171 (Write Voice Over Script Based On Instructions) > Draft script
4. Script Refinement: Draft script > #190 (Write or rewrite text based on instructions) [with specific instruction to optimize for voice-over delivery] > Final polished script

F) SILOS:

SILO 1: RESEARCH & PREPARATION
- Skill #216: Deep research of topic
- Skill #218: Keyword analysis for topic coverage
Output: Comprehensive research document

SILO 2: SCRIPT CREATION
- Skill #171: Initial script generation
- Skill #190: Script optimization
Output: Final voice-over ready script

This workflow ensures that the script is:
1. Well-researched and accurate
2. Properly structured for voice-over
3. Within the target length
4. Natural and conversational in tone
5. Ready for the next subagent (VoiceForge) to process

The double-pass approach (using both #171 and #190) ensures that the script is both content-rich and optimized for voice-over delivery, which is crucial for the final talking head video quality.
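The ScriptMaster flow can be sketched as a plain sequential chain. This is a minimal illustration, not the platform's implementation: the skill IDs (#216, #218, #171, #190) come from the breakdown above, but `run_skill`, `Step`, and `run_scriptmaster` are hypothetical names, and the sketch assumes each skill takes text in and returns text out.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical skill runner: takes a skill id and input text, returns output text.
# The real platform call is not described in the source.
SkillRunner = Callable[[int, str], str]


@dataclass(frozen=True)
class Step:
    name: str
    skill_id: int


# Order follows the task summary: research, keywords, draft, refine.
SCRIPTMASTER_STEPS: List[Step] = [
    Step("research", 216),
    Step("keywords", 218),
    Step("draft", 171),
    Step("refine", 190),
]


def within_target_length(script: str, low: int = 100, high: int = 300) -> bool:
    """Check the 100-300 word target using a simple whitespace word count."""
    return low <= len(script.split()) <= high


def run_scriptmaster(topic: str, run_skill: SkillRunner) -> str:
    """Feed each step's output into the next step, then validate the length."""
    data = topic
    for step in SCRIPTMASTER_STEPS:
        data = run_skill(step.skill_id, data)
    if not within_target_length(data):
        raise ValueError("final script is outside the 100-300 word target")
    return data
```

Passing the runner in as a parameter keeps the chain testable with a stub before any real skill is wired in.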
Let me break down the VoiceForge subagent flow in detail:

A) SUBAGENT SUMMARY:
VoiceForge converts a text script into a high-quality voice-over MP3 file, ensuring proper audio formatting for subsequent lip-sync processing.

B) FINAL TASK OUTPUT:
An MP3 file URL containing the voice-over audio, specifically formatted for lip-sync processing, with clear pronunciation and appropriate pacing (typically 130-150 words per minute), saved with consistent audio levels (-14 LUFS) and proper encoding (44.1kHz, 320kbps).

C) SUBAGENT INPUT:
- Primary Input: Text script (100-300 words) from ScriptMaster subagent
- Secondary Input: Voice style preferences (if any specified in original user prompt)

E) SUBAGENT TASK SUMMARY:
The flow proceeds as:
1. Initial Input (script)
2. #170 (Turn Script Into Voice Over MP3)
3. #198 (Get Transcription Of MP3 With Timings) [to verify audio quality and timing]
4. #179 (Create Visual Waveform Of 60 second Wav/mp3 File) [to verify audio levels]
5. #176 (Analyze An Image With GPT Vision & Return Text) [to analyze waveform for quality control]
Final Output: Verified high-quality MP3 URL

F) SILOS:

SILO 1: AUDIO GENERATION
- Input: Text script
- Skill: #170 Turn Script Into Voice Over MP3
- Output: Initial MP3 URL

SILO 2: QUALITY VERIFICATION
- Input: MP3 from Silo 1
- Skills:
  * #198 Get Transcription With Timings
  * #179 Create Visual Waveform
  * #176 Analyze Waveform
- Output: Quality verification data

SILO 3: FINAL OUTPUT PREPARATION
- Input: Verified MP3 from Silo 2
- Action: If quality checks pass, proceed with original MP3
- Output: Final MP3 URL with verified quality for lip-sync processing

This structure ensures not just voice generation but also quality control through multiple verification steps, which is crucial for successful lip-sync processing in later stages.
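The pacing and length figures above imply a duration range that is easy to check before generating audio. The sketch below is only arithmetic on the numbers quoted in the breakdown (100-300 words, 130-150 words per minute); the function names are hypothetical. A 100-word script at 150 wpm runs about 40 seconds, while a 300-word script runs roughly 120 to 138 seconds. Skill #179 is named for a 60 second file; the source does not say whether that is a hard limit, but if it is, scripts longer than about 150 words would exceed it.

```python
from typing import Tuple


def estimated_duration_seconds(word_count: int, words_per_minute: float) -> float:
    """Estimate spoken duration from a word count and a speaking rate."""
    if words_per_minute <= 0:
        raise ValueError("words_per_minute must be positive")
    return word_count * 60.0 / words_per_minute


def duration_range(
    word_count: int, slow_wpm: float = 130.0, fast_wpm: float = 150.0
) -> Tuple[float, float]:
    """Return (shortest, longest) duration in seconds for the 130-150 wpm range."""
    shortest = estimated_duration_seconds(word_count, fast_wpm)
    longest = estimated_duration_seconds(word_count, slow_wpm)
    return shortest, longest
```

A check like this could run between ScriptMaster and VoiceForge to flag scripts whose audio would not fit a downstream limit.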
Here's my complete analysis and workflow for the AvatarVision subagent:

A) SUBAGENT SUMMARY:
AvatarVision generates a high-quality, themed AI avatar image specifically designed for talking head videos, ensuring the avatar matches the video's topic and maintains professional quality suitable for lip-syncing.

B) FINAL TASK OUTPUT:
A square (1024x1024) PNG file of a professional-looking avatar head/shoulders shot, with clear facial features (especially around the mouth area), suitable for lip-sync animation, saved with a transparent background.

C) SUBAGENT INPUT:
- Topic/theme of the video
- Style preferences for avatar (gender, age, profession, etc.)
- Any specific visual requirements (clothing, accessories, background elements)

E) SUBAGENT TASK SUMMARY:
The workflow requires multiple stages to ensure the highest quality avatar:
1. Initial Avatar Generation:
   input > #223 (LLM prompt crafting) > refined prompt for avatar
   refined prompt > #222 (Make Image With Text) > initial avatar image
2. Avatar Analysis & Refinement:
   initial avatar > #176 (Analyze Image With GPT Vision) > analysis of facial features
   analysis > #223 (LLM prompt improvement) > refined prompt
   refined prompt > #222 (Make Image With Text) > improved avatar
3. Final Processing:
   improved avatar > #191 (Resize Image to 1024x1024) > resized avatar
   resized avatar > #182 (Create Transparent Background) > final avatar PNG

F) SILOS:

SILO 1: PROMPT ENGINEERING
- Input: Raw topic/theme/requirements
- Skill #223: Craft optimal prompt for avatar generation
- Output: Refined prompt

SILO 2: INITIAL GENERATION
- Input: Refined prompt
- Skill #222: Generate initial avatar
- Output: First avatar attempt

SILO 3: QUALITY CONTROL
- Input: Initial avatar
- Skill #176: Analyze facial features
- Skill #223: Improve prompt based on analysis
- Skill #222: Generate improved avatar
- Output: Better avatar

SILO 4: FINAL PROCESSING
- Input: Improved avatar
- Skill #191: Resize to correct dimensions
- Skill #182: Create transparent background
- Output: Final avatar PNG

This workflow ensures we get a high-quality avatar that:
1. Is properly themed to the video content
2. Has clear facial features for animation
3. Is correctly sized and formatted
4. Has a transparent background for video processing
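The final output spec (square 1024x1024 PNG with transparency) can be verified without an imaging library by reading the PNG header. This is a rough sketch under stated assumptions: the function names are hypothetical, the check is not part of the workflow above, and it only accepts PNG colour types that carry an alpha channel (4 and 6). A palette PNG that gets transparency from a tRNS chunk would be rejected, so treat it as a conservative check.

```python
import struct
from typing import NamedTuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# PNG colour types with an alpha channel: 4 = greyscale + alpha, 6 = RGBA.
ALPHA_COLOR_TYPES = (4, 6)


class PngInfo(NamedTuple):
    width: int
    height: int
    color_type: int


def read_png_info(data: bytes) -> PngInfo:
    """Parse width, height and colour type from the IHDR chunk of a PNG."""
    # Signature (8) + chunk length (4) + chunk type (4) + IHDR data (13) + CRC (4).
    if len(data) < 33 or data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    if data[12:16] != b"IHDR":
        raise ValueError("first chunk is not IHDR")
    width, height, _bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return PngInfo(width, height, color_type)


def is_valid_avatar(data: bytes, size: int = 1024) -> bool:
    """True if the PNG is size x size and has an alpha channel."""
    info = read_png_info(data)
    return (
        info.width == size
        and info.height == size
        and info.color_type in ALPHA_COLOR_TYPES
    )
```

In use, the bytes of the file produced by the last step would be passed to `is_valid_avatar` before handing the avatar to the next subagent.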
Let me break down the LipSyncWizard subagent following the requested format:

A) SUBAGENT SUMMARY:
A specialized subagent that analyzes an audio file to generate precise phoneme timing data and maps it to corresponding viseme (mouth shape) configurations for realistic lip-sync animation.

B) FINAL TASK OUTPUT:
A structured JSON data file containing:
- Timestamp markers (in milliseconds)
- Corresponding phoneme identifications
- Matching viseme configurations
- Head position/movement data
- Emotional markers/facial expressions
- Amplitude/energy levels for emphasis

C) SUBAGENT INPUT:
- Voice-over MP3 file URL
- Transcription with word-level timing
- Avatar image reference (for mouth shape mapping)

E) SUBAGENT TASK SUMMARY:
The workflow proceeds as:
1. Convert MP3 to WAV for analysis
   #178 - Convert 1-20 MP3s to wav
2. Generate detailed audio analysis
   #179 - Create Visual Waveform Of 60 second Wav/mp3 File
   #180 - Extract Beatpoints & Tempo of MP3
   #198 - Get Transcription Of MP3 (With Timings)
3. Process transcription data
   #223 - Powerful LLM Prompt-to-Text Response (Used to convert transcription into phoneme sequences)
4. Analyze audio patterns
   #188 - Extract 10 audio stems from mp3 (To separate speech components for better analysis)
5. Generate timing data
   #219 - Cut Wav/mp3 Audio into Multiple Pieces/Samples (To create precise phoneme segments)

F) SILOS:

SILO 1: AUDIO PREPARATION
- Input: Original MP3
- Skills: #178, #179
- Output: WAV file + waveform analysis

SILO 2: SPEECH ANALYSIS
- Input: WAV file
- Skills: #180, #198, #188
- Output: Detailed speech pattern data

SILO 3: PHONEME MAPPING
- Input: Speech pattern data
- Skills: #223 (for phoneme identification), #219
- Output: JSON timing/phoneme data

This subagent structure ensures comprehensive audio analysis and precise phoneme-to-viseme mapping, which is crucial for realistic lip-sync animation in the final video output.

Note: This workflow makes the best use of available skills, though it's worth noting that some specialized lip-sync functionality might benefit from additional custom skills for more precise viseme mapping and facial animation data generation.
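To make the JSON output more concrete, here is a minimal sketch of the phoneme-to-viseme step. Everything in it is illustrative: the phoneme labels, the viseme names, the field names, and `build_timeline` are assumptions, not a schema defined by the source. It covers only the timestamp, phoneme, and viseme fields; head movement, emotion, and amplitude data are left out.

```python
import json
from typing import Dict, List, Tuple

# Illustrative mapping only: these phoneme labels and viseme names are assumptions.
PHONEME_TO_VISEME: Dict[str, str] = {
    "P": "lips_closed",
    "B": "lips_closed",
    "M": "lips_closed",
    "F": "teeth_on_lip",
    "V": "teeth_on_lip",
    "AA": "jaw_open",
    "IY": "lips_spread",
    "UW": "lips_rounded",
}
DEFAULT_VISEME = "neutral"


def build_timeline(segments: List[Tuple[str, int, int]]) -> List[Dict[str, object]]:
    """Turn (phoneme, start_ms, end_ms) segments into JSON-ready viseme entries."""
    timeline: List[Dict[str, object]] = []
    for phoneme, start_ms, end_ms in segments:
        if end_ms < start_ms:
            raise ValueError(f"segment for {phoneme!r} ends before it starts")
        timeline.append(
            {
                "start_ms": start_ms,
                "end_ms": end_ms,
                "phoneme": phoneme,
                # Unknown phonemes fall back to a neutral mouth shape.
                "viseme": PHONEME_TO_VISEME.get(phoneme, DEFAULT_VISEME),
            }
        )
    return timeline


def timeline_to_json(segments: List[Tuple[str, int, int]]) -> str:
    """Serialize the timeline as the JSON text a downstream step could consume."""
    return json.dumps(build_timeline(segments), indent=2)
```

For example, `timeline_to_json([("M", 0, 80), ("AA", 80, 220)])` yields a two-entry JSON array, one entry per phoneme segment.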
I'll analyze SUBAGENT 5: "VideoAssemblerPro" and break it down according to your requirements.

A) SUBAGENT SUMMARY:
A video assembly system that combines an AI avatar image, voice-over audio, and lip-sync data to create a synchronized talking head MP4 video with natural-looking mouth movements.

B) FINAL TASK OUTPUT:
An MP4 video file (1080x1080 square format) featuring a single talking head avatar with synchronized lip movements matching the audio track, duration matching the input audio file length, with clear audio quality at 30fps.

C) SUBAGENT INPUT:
- PNG URL of the AI-generated avatar image
- MP3 URL of the voice-over audio
- Text transcription with timestamps (for lip-sync mapping)

E) SUBAGENT TASK SUMMARY:
1. Input MP3 > #198 (Get Transcription Of MP3 With Timings) > Detailed timestamp data
2. MP3 URL + Transcription > #168 (Generate Talking Head Video From MP3 & transcription) > Initial MP4
3. Initial MP4 > #199 (Add Images & Videos On Top Of Existing MP4) [overlaying the custom avatar PNG] > Final MP4

F) SILOS:

SILO 1: AUDIO PREPARATION
- Input: MP3 voice-over file
- Skill: #198 Get Transcription Of MP3 (With Timings)
- Output: Precise transcription with timestamps

SILO 2: BASE VIDEO GENERATION
- Input: MP3 + Transcription from Silo 1
- Skill: #168 Generate Talking Head Video From MP3 & transcription
- Output: Base talking head MP4

SILO 3: AVATAR INTEGRATION
- Input: Base MP4 + Custom Avatar PNG
- Skill: #199 Add Images & Videos On Top Of Existing MP4
- Output: Final rendered MP4 with custom avatar

Note: While the original subagent description mentioned needing new skills for lip-sync data and video assembly, I've reconfigured the flow to use existing skills that can accomplish similar results. Skill #168 already handles basic lip-sync, and #199 can overlay the custom avatar. This approach may not provide as sophisticated animation as a dedicated lip-sync system, but it creates a functional talking head video using available skills.
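Two numbers in this spec are worth checking before assembly: the frame count implied by 30fps, and the size mismatch between the 1024x1024 avatar from AvatarVision and the 1080x1080 output. The sketch below is arithmetic only, with hypothetical function names. It assumes the avatar is scaled uniformly to fill the full frame, which the source does not state.

```python
import math


def frame_count(duration_seconds: float, fps: int = 30) -> int:
    """Number of frames needed to cover the audio, rounding up the last partial frame."""
    if duration_seconds < 0 or fps <= 0:
        raise ValueError("duration must be non-negative and fps positive")
    return math.ceil(duration_seconds * fps)


def overlay_scale(source_px: int = 1024, target_px: int = 1080) -> float:
    """Uniform scale factor to fit a square avatar to a square output frame."""
    if source_px <= 0 or target_px <= 0:
        raise ValueError("pixel sizes must be positive")
    return target_px / source_px
```

A 40-second voice-over needs 1200 frames at 30fps, and a 1024-pixel avatar has to be upscaled by a factor of about 1.055 to span a 1080-pixel frame.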
I notice that you've asked me to analyze "subagent 6" but no sixth subagent was defined in the original list. The original workflow only contained 5 subagents:
1. ScriptMaster
2. VoiceForge
3. AvatarVision
4. LipSyncWizard
5. VideoAssemblerPro

Without a specific subagent to analyze, I cannot provide the detailed workflow breakdown you've requested. Would you like me to:
1. Analyze one of the existing five subagents instead?
2. Help define a new sixth subagent if one is needed?
3. Review the entire workflow to confirm if a sixth subagent might be beneficial?

Please let me know which direction you'd like to take, and I'll be happy to provide the detailed analysis following your specified format.