TheAITraveler Improved
🤖


Tags: Research, Software Development, Projects, Creator Economy
Published: August 9, 2023
Author: Randall Hand
I posted a few weeks ago about my AI Traveler project, and how I built some scripts and tools to completely, 100% automate a basic YouTube channel.
It's been running automatically for about 2 weeks now and I've made lots of little changes and tweaks, and I wanted to share my findings for anyone else playing in this space.
It's an interesting collection of AI and technical quirks that sometimes disappoint, sometimes entertain.

Prompt Engineering

If you've played at all in the new LLM space, you've heard the term "Prompt Engineering". What is it? Wikipedia says:
Prompt engineering or prompting is the process of structuring sentences so that they can be interpreted and understood by a generative AI model in such a way that its output is in accord with the user's intentions.
So that doesn't really help, does it? Let me give you a concrete example. In the first version of the script I used a prompt like:
Write about the following topic: {prompt}. Write in short sentences separated by . Write about {nLen} complete sentences.
Generally that worked. But there were a few main problems:
  1. Using a period as a separator only works if there isn't one in your title, like "St. Peter's Basilica".
  2. Even when told to write short sentences, it can generate some really long run-on compound sentences, which don't work well in this use case.
  3. Sometimes it can be too literal and literally return "sentences". Like, the single word. 🤦
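The first problem is easy to reproduce. Here is a minimal sketch: the template is the prompt quoted above, while `naive_split` and the sample response are made up for illustration and are not the project's actual code.

```python
# The prompt template quoted in the post; {prompt} and {nLen} are its placeholders.
PROMPT_TEMPLATE = (
    "Write about the following topic: {prompt}. "
    "Write in short sentences separated by . "
    "Write about {nLen} complete sentences."
)


def naive_split(response: str) -> list[str]:
    """Split a model response on periods, dropping empty pieces."""
    return [piece.strip() for piece in response.split(".") if piece.strip()]


# Hypothetical model response, invented for this example.
response = "St. Peter's Basilica is in Vatican City. It is very large."

print(PROMPT_TEMPLATE.format(prompt="St. Peter's Basilica", nLen=5))
print(naive_split(response))
# The two sentences come back as three pieces, because "St." is split too.
```

Any topic with an abbreviation, a decimal number, or a URL in it will break the same way.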
It took some tinkering, but I eventually rewrote the prompt to return the result as a JSON array of proper sentences. That made things much more structured and easier to work with. However, the structure of the JSON would vary a bit from run to run. Sometimes you get a basic array. Sometimes you get a key-value pair with the array as the value. Sometimes you get just the array contents without the bounding braces. It took a combination of prompt work and Python code to build something robust, but it's been a good week now without any failed executions.
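For anyone fighting the same problem, here is a rough sketch of how the Python side of that cleanup could look. It is not the project's actual code: `parse_sentences` is a hypothetical helper, and it only assumes the three response shapes described above.

```python
import json
import re


def parse_sentences(raw: str) -> list[str]:
    """Coerce a model response into a flat list of sentence strings."""
    text = raw.strip()
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        try:
            # Bare, comma-separated strings with no enclosing brackets.
            data = json.loads("[" + text + "]")
        except json.JSONDecodeError:
            # Last resort: grab every double-quoted string (escapes left as-is).
            data = re.findall(r'"((?:[^"\\]|\\.)*)"', text)

    if isinstance(data, dict):
        # An object wrapping the array, e.g. {"sentences": [...]}.
        lists = [value for value in data.values() if isinstance(value, list)]
        data = lists[0] if lists else list(data.values())
    if isinstance(data, str):
        data = [data]
    if not isinstance(data, list):
        raise ValueError("response is not a list of sentences")

    sentences = [s.strip() for s in data if isinstance(s, str) and s.strip()]
    if not sentences:
        raise ValueError("no sentences found in response")
    return sentences


print(parse_sentences('["One.", "Two."]'))
print(parse_sentences('{"sentences": ["One.", "Two."]}'))
print(parse_sentences('"One.", "Two."'))
```

Raising `ValueError` when nothing usable comes back gives the caller a clear signal to retry the request instead of rendering an empty video.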

Resolution

I was still running this on a Raspberry Pi 3B+, which capped it at 720p resolution. It's my own fault: I grabbed one that was handy without looking too closely, and it only had 1 GB of RAM. I switched to a 4 GB unit, and now it can generate 1080p videos.
  • A 55s short takes 45 minutes to encode
  • A 3 minute video takes 2 hours to encode
Which leads into the next topic:

API Limits

Google approved my request for a Quota increase, so I reconfigured the tools:
  • Generate a 55s short every 2 hours
  • Generate a 3 minute video 3x/day
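As a sanity check, here is a back-of-the-envelope sketch of what that schedule costs per day. The encode times and schedule are the figures from this post. Two things are assumptions: that encodes run one at a time, and that the quota in question is the YouTube Data API's daily unit quota, with an upload charged at 1600 units (check the current documentation before relying on that number).

```python
# Figures taken from the post.
SHORT_ENCODE_MIN = 45      # one 55s short
VIDEO_ENCODE_MIN = 120     # one 3 minute video
SHORTS_PER_DAY = 24 // 2   # one short every 2 hours
VIDEOS_PER_DAY = 3

# Assumption, not from the post: quota units charged per upload.
ASSUMED_UNITS_PER_UPLOAD = 1600


def daily_load():
    """Return (uploads, encode_hours, quota_units) for one day of the schedule."""
    uploads = SHORTS_PER_DAY + VIDEOS_PER_DAY
    encode_minutes = (SHORTS_PER_DAY * SHORT_ENCODE_MIN
                      + VIDEOS_PER_DAY * VIDEO_ENCODE_MIN)
    quota_units = uploads * ASSUMED_UNITS_PER_UPLOAD
    return uploads, encode_minutes / 60, quota_units


uploads, encode_hours, quota_units = daily_load()
print(f"{uploads} uploads/day, {encode_hours} h of encoding, {quota_units} quota units")
```

With these numbers that is 15 uploads and 15 hours of encoding per day, so the Pi can keep up as long as jobs do not overlap.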
I also added a yake keyword-extraction pass over the video description to generate hashtags for the video.
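A minimal sketch of what such a pass could look like, assuming the `yake` package is installed. `to_hashtags` and `hashtags_for` are hypothetical helpers and the extractor settings are illustrative guesses, not the values the project uses.

```python
import re


def to_hashtags(keywords, limit=5):
    """Turn (keyword, score) pairs into unique CamelCase hashtags.

    yake scores are lower-is-better, so the pairs are sorted ascending.
    """
    tags = []
    for phrase, _score in sorted(keywords, key=lambda kw: kw[1]):
        words = re.findall(r"[A-Za-z0-9]+", phrase)
        tag = "#" + "".join(w[:1].upper() + w[1:] for w in words)
        if len(tag) > 1 and tag not in tags:
            tags.append(tag)
        if len(tags) == limit:
            break
    return tags


def hashtags_for(description, limit=5):
    """Extract keywords from a description with yake and format them."""
    import yake  # third-party: pip install yake

    # Settings below are illustrative assumptions.
    extractor = yake.KeywordExtractor(lan="en", n=2, top=limit * 2)
    return to_hashtags(extractor.extract_keywords(description), limit)


# Formatting step only, with made-up keyword scores:
print(to_hashtags([("golden gate bridge", 0.02), ("san francisco", 0.01)]))
```

Asking yake for more keywords than needed leaves room to drop duplicates that collapse into the same tag.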

Results

All of these together have made a noticeable improvement to the quality of the videos. However, it's still far from perfect. Some astute viewers have even noticed comical items like this video:
A video of plants that only shows Bridges and buildings.
I'm still far from anything profitable on the channel. Even with 1000+ subscribers, I'm under 5% of the viewer metrics required to even enable monetization. However, it's been a fun project and I've learned a lot about the capabilities of these systems.
With all of this running, my only real cost is the ChatGPT API usage, which comes to about $0.12/day, or less than $4/month.
If I keep working on it, I may try to replace some of the yake elements with ChatGPT. I would hope that it can generate more relevant keywords than the current context-free system, but I would have to do some work to integrate that with my image search algorithms to handle query failures.