Custom Avatar Video Generator - Confirm & Copy Final Agent Code

Review Final Agent Copy

[
  {
    "taskID": 171,
    "semanticTitleOfTask": "Generate a concise 100–300 word voice-over script",
    "taskDescription": "This task uses skill #171 to read the user's topic and style instructions, then it writes a concise 100–300 word script suitable for voice-over delivery with a natural, conversational tone.",
    "inputDescription": "This task requires the user's topic and style instructions in text form (variable1), which guide the content, tone, and flow of the final script.",
    "inputRequired": [
      "variable1"
    ],
    "outputDescription": "A 100–300 word script that addresses the user's topic and specified style, structured for voice-over with clear paragraphs and a natural flow.",
    "outputName": "script_text",
    "promptInstruction": "You are an expert script writer for voice-over videos. Your task is to create a concise, engaging script (100-300 words) based on: {variable1}. Use a natural tone, a clear introduction, body, and conclusion, short sentences, and subtle transitions. Aim for a rhythm that sounds natural aloud. End with a conclusive statement."
  },
  {
    "taskID": 170,
    "semanticTitleOfTask": "Convert Text Script to Voiceover MP3 Output",
    "taskDescription": "This task uses text-to-speech conversion to transform a provided text script (script_text) into a synthesized MP3 audio file. It enables the Subagent to output a finalized voiceover track based on the user’s provided script.",
    "inputDescription": "This task requires a single input: script_text, containing the exact spoken content for which the audio voiceover will be generated. The text should be between 100 and 300 words in length.",
    "inputRequired": [
      "script_text"
    ],
    "outputDescription": "A single MP3 audio file containing a synthesized voiceover of the provided text.",
    "outputName": "voiceover_mp3",
    "promptInstruction": "no instruction"
  },
  {
    "taskID": 182,
    "semanticTitleOfTask": "Generate 1024×1024 Transparent Avatar Image",
    "taskDescription": "This task uses skill #182 to generate a 1024×1024 PNG image with a transparent background, based on an input description of the desired avatar. The final output is returned as avatar_image.",
    "inputDescription": "One input: variable2, a concise text describing the avatar (e.g., 'woman wearing a Christmas hat in a department store').",
    "inputRequired": [
      "variable2"
    ],
    "outputDescription": "A single 1024×1024 PNG image URL with the avatar, rendered in a professional headshot style suitable for talking head use.",
    "outputName": "avatar_image",
    "promptInstruction": "Please create a high-quality, professional headshot of the subject as described in 'variable2', with a transparent background, showing face and shoulders. The image should be photorealistic, well-lit, and suitable for a talking head avatar. Include requested clothing or accessories. Subject should face the camera with natural features, even lighting, and realistic detail suitable for lip-sync usage."
  },
  {
    "taskID": 198,
    "semanticTitleOfTask": "Generate Timed Audio Transcription",
    "taskDescription": "Processes an MP3 audio file and generates a detailed transcription with precise timestamp markers for each word or phrase spoken in the audio.",
    "inputDescription": "Takes in one input, the MP3 URL of the voiceover audio (voiceover_mp3). The system then applies speech-to-text AI to produce a text transcript with timestamps for each spoken segment.",
    "inputRequired": [
      "voiceover_mp3"
    ],
    "outputDescription": "A text-based transcription that includes timestamps for each word or phrase, suitable for lip-sync alignment.",
    "outputName": "voiceover_transcription",
    "promptInstruction": "no instruction"
  },
  {
    "taskID": 168,
    "semanticTitleOfTask": "Create Lip-Synced Talking Head Video",
    "taskDescription": "Combines an avatar image with the voiceover MP3 and the transcription to produce an MP4 video where the avatar mouth movements are synchronized with the spoken words.",
    "inputDescription": "Requires voiceover_mp3, voiceover_transcription, and avatar_image. These inputs are used by the lip-sync engine to generate the final talking head video.",
    "inputRequired": [
      "voiceover_mp3",
      "voiceover_transcription",
      "avatar_image"
    ],
    "outputDescription": "An MP4 video file featuring the avatar speaking the voiceover in sync with the audio.",
    "outputName": "talking_head_mp4",
    "promptInstruction": "no instruction"
  }
]
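
The task list above is not in execution order: each task names the variables it consumes in inputRequired and the one it produces in outputName, so a valid order has to be derived from those fields. The sketch below is one possible way to do that in Python and is not the platform's own scheduler. The TASKS list is a condensed copy of the JSON that keeps only the data-flow fields, and plan_execution is a hypothetical helper name. Any input that no task produces (here variable1 and variable2) is assumed to come from the user.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Condensed copy of the task list: only the fields that define data flow.
TASKS = [
    {"taskID": 171, "inputRequired": ["variable1"], "outputName": "script_text"},
    {"taskID": 170, "inputRequired": ["script_text"], "outputName": "voiceover_mp3"},
    {"taskID": 182, "inputRequired": ["variable2"], "outputName": "avatar_image"},
    {"taskID": 198, "inputRequired": ["voiceover_mp3"],
     "outputName": "voiceover_transcription"},
    {"taskID": 168,
     "inputRequired": ["voiceover_mp3", "voiceover_transcription", "avatar_image"],
     "outputName": "talking_head_mp4"},
]


def plan_execution(tasks):
    """Return (task IDs in a runnable order, inputs the user must supply)."""
    # Map each produced variable to the task that produces it.
    producers = {t["outputName"]: t["taskID"] for t in tasks}
    graph = {}
    user_inputs = set()
    for t in tasks:
        deps = set()
        for name in t["inputRequired"]:
            if name in producers:
                deps.add(producers[name])  # depends on another task's output
            else:
                user_inputs.add(name)      # assumed to be supplied by the user
        graph[t["taskID"]] = deps
    # static_order() raises graphlib.CycleError if the tasks depend on each other.
    order = list(TopologicalSorter(graph).static_order())
    return order, sorted(user_inputs)


if __name__ == "__main__":
    order, user_inputs = plan_execution(TASKS)
    print("run order:", order)
    print("user must supply:", user_inputs)
```

The order is not unique: task 182 (avatar image) does not depend on the script or the audio, so it can run at any point before task 168. The only fixed chain is 171, then 170, then 198, then 168.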
