Planning AI Agent Sessions with Reasoning Models

The most difficult tasks require the strongest AI models, and those models simply aren't available as AI agents inside of your tooling. So the best workflow is to drag and drop the files you need changed and analyzed into a reasoning model such as OpenAI's o1 pro, then let it create a giant plan for your dumber AI agents to work on. The key is that instead of copying and pasting the entire plan into Cursor's Composer or your AI agent, you create a file inside of your project and ask your agent to tackle the plan one step at a time. This gives you ample opportunity between steps to test and verify the micro steps along the way. Otherwise, as I've experienced many times, a session can go way off track, and you burn a lot of time and a lot of AI requests as the agent churns through the plan and makes lots of unwanted changes. The goal is to guide an AI agent and hold its hand along the way, especially through the most complex tasks, which require a step-by-step plan.

Transcript

[00:00] When you run into the most complex tasks, it's best to turn to the most powerful reasoning model you have available. In this instance, I had something that impacted my state machine, which included streaming text and pulling random data from the database. Now there's no perfect prompt for this, but you can essentially paste in the code you need it to work on, then ask it to write out a step-by-step solution. Essentially, I ask it to list the full path of each file, break the work down into individual steps, and give the reasoning behind how it's going to solve each one. Then I can grab the entire result and copy and paste it.
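The prompt described above can be sketched in code. This is only an illustration, not the exact prompt used in the video: the function name `build_plan_prompt`, the wording of the instructions, and the `--- path ---` separator are all assumptions.

```python
from pathlib import Path


def build_plan_prompt(task: str, file_paths: list[str]) -> str:
    """Assemble one prompt: the task, a request for a stepwise plan, then each file.

    Hypothetical helper; the instruction wording below is an assumption.
    """
    parts = [
        f"Task: {task}",
        "Write a step-by-step plan to solve this.",
        "For each step, list the full path of every file it touches "
        "and explain the reasoning behind the change.",
    ]
    for raw_path in file_paths:
        path = Path(raw_path)
        # Label each file with its path so the plan can refer to it by name.
        contents = path.read_text(encoding="utf-8")
        parts.append(f"--- {path.as_posix()} ---\n{contents}")
    return "\n\n".join(parts)
```

The returned string is what you would paste into the reasoning model, and its answer becomes the plan.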

[00:34] And the temptation would be to paste the entire thing into Composer, using Composer's agent, and just let it work through all of the tasks. I've had terrible luck with this, because as these lesser models churn through tasks, the more context they build up, the dumber they essentially get. So I'll delete that, and instead I'm going to use Advanced New File, and in my root directory I'll create a folder named plans and a plan named luckyrefactor.md. Then I'm going to paste the content in here. Then I'll turn to Composer, and it's currently referencing the open editor.

[01:08] If it's not, you can type @luckyrefactor. And then I can simply say "work on step 1." This allows the agent to stay focused on the first step and work it through to completion. Once it gets past that step, you can run your tests, click around your site, and verify that everything works, and if something isn't working, you can continue working within this Composer session. Otherwise you can press Command-N, then say "start work on step 2," hit submit, and then you have these different sessions in Composer that focus on the different steps inside of your plan.

[01:41] This allows you to better isolate what's going on so that when things go wrong you have a specific session where you can focus and test the various changes.
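To make the one-step-per-session idea concrete, here is a minimal sketch that pulls a single step out of a plan file so it can be reviewed or handed to a fresh session on its own. It assumes the plan marks each step with a Markdown heading such as `## Step 1: ...`; the format a reasoning model actually produces may differ, and `extract_step` is a hypothetical helper, not part of Cursor.

```python
import re

# Assumed plan format: each step begins with a Markdown heading like "## Step 2: ...".
STEP_HEADING = re.compile(r"^#{1,6}\s*Step\s+(\d+)\b.*$", re.MULTILINE)


def extract_step(plan_text: str, step_number: int) -> str:
    """Return the heading and body of one step from the plan text."""
    matches = list(STEP_HEADING.finditer(plan_text))
    for index, match in enumerate(matches):
        if int(match.group(1)) == step_number:
            # A step runs until the next step heading, or to the end of the plan.
            if index + 1 < len(matches):
                end = matches[index + 1].start()
            else:
                end = len(plan_text)
            return plan_text[match.start():end].strip()
    raise ValueError(f"Step {step_number} not found in plan")
```

For example, reading plans/luckyrefactor.md and calling `extract_step(text, 1)` would give only the first step's text to work from, under the heading assumption above.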