I want to build an AI agent that generates talking-head avatar videos. I will enter a rough description of what the video should be about, along with the type of avatar image I want. First, the agent will write a script for me (approx. 100-300 words). Second, it will turn that script into voice-over audio. Next, it will generate an image of my AI avatar, which should be topical and related to the theme of my video. Finally, it will take this custom AI avatar image and the voice-over file, generate the talking-head video, and return the MP4 as the final output. In this way I can create custom AI avatar talking-head videos on any topic, with the avatar image themed to the topic.
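
To make the four steps concrete, here is a minimal sketch of how the agent's pipeline could be wired together in Python. It is only an illustration under stated assumptions: `VideoRequest`, `Stages`, `run_pipeline`, `word_count_ok`, `stub_stages`, and the intermediate file names are all hypothetical names chosen for this example, and no particular script, text-to-speech, image, or talking-head model is implied. Each stage is passed in as a plain callable so that real model or API calls can be swapped in later; the stubs only write placeholder files so the wiring can be exercised.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable


@dataclass
class VideoRequest:
    topic: str         # rough description of what the video should be about
    avatar_style: str  # the type of avatar image wanted


@dataclass
class Stages:
    """Hypothetical stage callables; each would wrap a real model or API."""
    write_script: Callable[[VideoRequest], str]
    synthesize_voice: Callable[[str, Path], Path]
    generate_avatar: Callable[[VideoRequest, Path], Path]
    animate_avatar: Callable[[Path, Path, Path], Path]


def word_count_ok(script: str, lo: int = 100, hi: int = 300) -> bool:
    """Check the script against the approx. 100-300 word target."""
    return lo <= len(script.split()) <= hi


def run_pipeline(request: VideoRequest, stages: Stages, workdir: Path) -> Path:
    """Run script -> voice-over -> avatar image -> talking-head MP4."""
    workdir.mkdir(parents=True, exist_ok=True)

    # Step 1: write the script and enforce the length target.
    script = stages.write_script(request)
    if not word_count_ok(script):
        raise ValueError(f"script has {len(script.split())} words, expected 100-300")

    # Step 2: turn the script into voice-over audio.
    audio = stages.synthesize_voice(script, workdir / "voiceover.wav")

    # Step 3: generate an avatar image themed to the topic.
    image = stages.generate_avatar(request, workdir / "avatar.png")

    # Step 4: combine image and audio into the final video file.
    return stages.animate_avatar(image, audio, workdir / "talking_head.mp4")


def stub_stages() -> Stages:
    """Placeholder stages that write dummy bytes instead of calling models."""
    def touch(path: Path, data: bytes) -> Path:
        path.write_bytes(data)
        return path

    return Stages(
        write_script=lambda req: " ".join(["placeholder"] * 150),
        synthesize_voice=lambda script, out: touch(out, b"audio"),
        generate_avatar=lambda req, out: touch(out, b"image"),
        animate_avatar=lambda image, audio, out: touch(
            out, image.read_bytes() + audio.read_bytes()
        ),
    )


def demo() -> None:
    """Run the pipeline once with the stubs in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        request = VideoRequest(topic="volcanoes", avatar_style="field geologist")
        video = run_pipeline(request, stub_stages(), Path(tmp))
        print(video.name)
```

In this sketch the voice-over and avatar steps depend only on the script and the request, not on each other, so they could run in either order or in parallel; only the final animation step needs both outputs.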