[00:00] So I brainstormed a plan with the AI, and I'm going to take my brainstorm with all of my opinions, copy it as Markdown, and then jump over to Stream Realtime. And this is just the UI saying, Microphone's active. So in the system instructions, I'm going to mark off my plan here, paste this in, and add an end marker. That's just a really common technique for when you want to section off something. And I'll say, your job is to challenge me and all of my opinions about this plan.
[00:22] I want to make sure I'm making the right decisions, so make me defend my stances and think of edge cases and scenarios that I may not have covered. I'll paste that in. And by default, audio will be turned on. You can pick whatever voice you want. I'll leave it on the default.
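The setup described so far (plan pasted between markers, followed by the "challenge me" instruction) could be assembled programmatically. This is a minimal sketch, not something shown in the video: the marker strings and the function name `build_system_instructions` are assumptions chosen for illustration, and any clearly distinct delimiter would do.

```python
# Hypothetical marker strings; the transcript only says the plan is
# "marked off" with an end marker, not what the markers look like.
PLAN_START = "=== PLAN START ==="
PLAN_END = "=== PLAN END ==="


def build_system_instructions(plan_markdown: str) -> str:
    """Wrap the plan in start/end markers and append the challenge request."""
    task = (
        "Your job is to challenge me and all of my opinions about this plan. "
        "I want to make sure I'm making the right decisions, so make me defend "
        "my stances and think of edge cases and scenarios that I may not have covered."
    )
    # The markers section off the plan so the model can tell where the
    # pasted content ends and the actual instruction begins.
    return f"{PLAN_START}\n{plan_markdown.strip()}\n{PLAN_END}\n\n{task}"
```

The returned string is what would be pasted into the system instructions field before turning on the microphone.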
[00:35] And here's where it gets really uncomfortable: you click the microphone and you start arguing with an AI. Let's do this. Okay, let's dive in. Why are you so set on a text-based interface when visual tools might offer a more intuitive way to understand complex automation workflows, especially for those who aren't comfortable with coding? Well, my target audience is developers who are familiar with scripting, and also I believe that text is more portable; it's easier to save and share and edit.
[01:04] So that's essentially why. Okay, that makes sense for developers. But even within that group, aren't there different levels of coding experience? How will you ensure that your text-based system is accessible to a wider range of developers, including those less familiar with complex scripting syntax? Well, that's the point of this: we're trying to have it text-based, but at each level of the text have little dropdowns where you can select something from the text, like the API you want to use, or the thing you want to trigger.
[01:33] So it's both text-based, but while you're writing out the text, it becomes this prompting thing where it asks you to make selections, but the end result is just a big paragraph of text. Okay, so you're combining text with guided selections. That's interesting. But how do you balance providing enough options to be useful without overwhelming users with too many choices in those dropdowns? How do you determine what are the right options to present?
[01:59] I'll actually be borrowing a lot of features from, like, Zapier and the API services that they recommend, and other services that are listed in those sorts of automations: things that you could ping from the desktop, in Node scripts, and so on. So essentially it'll be a large list of services, a large list of APIs in Node where you can write to the file system and modify things in the file system. Yeah, go ahead. So you'll be relying on existing services and APIs. That's a good starting point.
[02:35] But how will you handle cases where a user wants to automate something very specific or use a less common service that's not already included in your lists? Will you allow for custom inputs? And how would you handle them? I will allow for custom inputs. I was thinking that once you get to a point of the prompt, like it'll ask, the trigger I want is, and then a dropdown comes up.
[02:59] You'll have a predefined set of triggers, but then once you get to an API or something, at any point in the dropdown, you could hit escape and manually type in the API you wanted. Okay, so you're planning to allow users to escape the dropdowns and manually enter custom information at any point. That's a good way to provide flexibility. But how will you ensure that the AI can understand and work with these custom inputs, especially if they are less common or don't follow a standard format? That will definitely be a challenge.
[03:30] And I've been thinking about ways that, once someone puts in something we're not familiar with, we could go and scrape the docs and bring it in so we could better understand the APIs for this. Ah, so you're thinking of dynamically scraping documentation for unfamiliar APIs. That's an ambitious but very helpful idea. How would you handle... All right, I'm gonna go ahead and stop this.
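The dropdown-with-escape idea from the conversation above could be modeled roughly like this. It is only a sketch of the concept as described, not the actual design: `Selection`, `resolve_selection`, and `render_sentence` are hypothetical names, and the sentence template is an assumption loosely based on the "the trigger I want is" prompt mentioned earlier.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Selection:
    value: str
    is_custom: bool  # True when the user escaped the dropdown and typed freely


def resolve_selection(
    options: Sequence[str],
    picked_index: Optional[int],
    typed_text: str = "",
) -> Selection:
    """Return the predefined option, or the typed text if the user escaped."""
    if picked_index is not None:
        return Selection(options[picked_index], is_custom=False)
    custom = typed_text.strip()
    if not custom:
        raise ValueError("escaped the dropdown without typing a custom value")
    return Selection(custom, is_custom=True)


def render_sentence(trigger: Selection, api: Selection) -> str:
    """The end result stays plain text, whichever way each slot was filled."""
    return (
        f"The trigger I want is {trigger.value}, "
        f"and the API I want to call is {api.value}."
    )
```

Keeping an `is_custom` flag on each selection is one way to know which entries are unfamiliar and might need the documentation lookup discussed above.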
[03:52] I believe you get the idea from here. Essentially, you have a back and forth where you're forced to continuously defend yourself and rethink your problem from different angles. And what I love about the voice conversation is it applies a pressure on you to respond. A pressure that doesn't quite come from text. An expectation that you can express your solution without really having to stop and think about it.
[04:17] You can just defend it as if someone is questioning you. So the next time you're planning out a feature, the next time you're working on something, or even if you just want to rubber duck something, drop the plan in the system instructions, ask it to challenge you, and you'll surface a lot of things you probably weren't thinking of.