Generate Mermaid Flow Chart Diagrams with Gemini 2.0 from Rough Screen Recordings

Any app of a decent size has event flows and messages being passed around. At a certain point it becomes too complicated to explain through prose, and you want to turn to diagrams. But creating diagrams that accurately describe what's happening in your codebase can be quite time-consuming. Luckily, with today's AI tools, we can record a rough walkthrough of the codebase, dictating what's going on along the way, upload that recording to a tool like AI Studio, and ask it to generate the diagram in a text-based format such as Mermaid, which then renders the flowchart for us. This video walks through that process: recording a rough video, uploading it to AI Studio, and asking AI Studio to generate the proper Mermaid diagram syntax.

Transcript

[00:00] I recorded a four-minute video of me bumbling through describing how input events and input messages travel through ScriptKit. So this is an unedited rough cut of just fumbling through files and talking about what I see. From there I dragged and dropped it onto AI Studio, we're on the 2.0 Flash model, and that clocked in at about 70,000 tokens. Now the goal is to create a flow chart using Mermaid to diagram what I was talking about. So analyze this video with the various input events and messages and create the necessary Mermaid syntax to create a flow chart to diagram how input events travel all the way from a user, all the way down to a script.

[00:38] So we have a visual representation of how input events travel throughout ScriptKit. And I'll let this paste in. I'll hit run here. Then after roughly four minutes it will output some Mermaid code that I can copy and paste into the flowchart. It looks like we did get a syntax error.

[00:54] So let's go ahead and just copy and paste this error back. We'll let this run again, and two minutes later, let's give it another shot. I'll copy this and paste that again and now we have a diagram. So let's check it out.
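The retry step above (paste the syntax error back, then copy the corrected code out of the reply) can be sketched in a few lines of Python. This is a minimal sketch, not something shown in the video: `extract_mermaid` and `build_fix_prompt` are hypothetical helper names, the prompt wording is made up, and it assumes the model wraps its answer in a fenced block, which may not always be the case.

```python
import re

# Matches a fenced block (three backticks), optionally tagged "mermaid",
# and captures everything up to the closing fence.
_FENCE_RE = re.compile(r"`{3}(?:mermaid)?[ \t]*\n(.*?)`{3}", re.DOTALL)


def extract_mermaid(reply: str) -> str:
    """Return the Mermaid source from a model reply.

    Falls back to the whole reply when no fenced block is found.
    """
    match = _FENCE_RE.search(reply)
    body = match.group(1) if match else reply
    return body.strip()


def build_fix_prompt(mermaid_source: str, error_message: str) -> str:
    """Build a follow-up message that hands the render error back to the model.

    The wording is an assumption; in the video the error is simply pasted
    back into the same conversation.
    """
    return (
        "The Mermaid code below failed to render.\n\n"
        f"Error:\n{error_message.strip()}\n\n"
        f"Code:\n{mermaid_source.strip()}\n\n"
        "Return only the corrected Mermaid code."
    )
```

Keeping the error and the failing code in one message gives the model everything it needs for the second pass, which mirrors what happens when the error is pasted into the existing chat.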

[01:05] I'm going to swap over to light mode since the dark was very hard to read, and we can see we have user input on the keyboard, enter input file, on key down handler, modifiers, yes, check for shortcuts, shortcut match, prevent default returned, not match, is there shift, yes, add shift, send the channel input through IPC, are there cache choices, if so clear the keyword, if not invoke the search. And this is all looking pretty solid to me, where it ends all the way down into the script with an onInputHandler if you overrode that, and this receives the message in the onInputHandler. So from about four minutes of effort recording this video, which again is an extremely rough cut with tons of mistakes and zero edits, then waiting a few minutes for the first pass, then copying and pasting an error for the second pass, we were able to get a workable diagram that shows the flow of an event through our application. And it also gives us a solid conversation to have with the AI if we want to think about how these events flow through the application in the future.
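To give a feel for the kind of Mermaid text involved, here is a small Python sketch that emits a flowchart loosely following the narration above. It is a reconstruction under assumptions, not the diagram from the video: the node ids, the labels, and any branch the narration does not spell out (for example what happens when no modifier is held, or how the search results reach the script) are guesses, and `render_flowchart` is a hypothetical helper.

```python
# Node id -> (label, is_decision). Labels paraphrase the narration;
# ids are made up for this sketch.
NODES = {
    "user": ("User types on keyboard", False),
    "field": ("Input field", False),
    "keydown": ("onKeyDown handler", False),
    "mods": ("Modifier keys held?", True),
    "shortcut": ("Shortcut match?", True),
    "prevent": ("preventDefault and return", False),
    "shift": ("Shift held?", True),
    "addshift": ("Add shift", False),
    "ipc": ("Send input channel message over IPC", False),
    "cached": ("Cached choices?", True),
    "clear": ("Clear keyword", False),
    "search": ("Invoke search", False),
    "script": ("Script onInput handler", False),
}

# (source, target, edge label or None). Edges marked "guess" are not
# stated in the narration.
EDGES = [
    ("user", "field", None),
    ("field", "keydown", None),
    ("keydown", "mods", None),
    ("mods", "shortcut", "Yes"),
    ("mods", "ipc", "No"),  # guess
    ("shortcut", "prevent", "Yes"),
    ("shortcut", "shift", "No"),
    ("shift", "addshift", "Yes"),
    ("shift", "ipc", "No"),  # guess
    ("addshift", "ipc", None),
    ("ipc", "cached", None),
    ("cached", "clear", "Yes"),
    ("cached", "search", "No"),
    ("clear", "script", None),  # guess
    ("search", "script", None),  # guess
]


def render_flowchart(nodes, edges, direction="TD"):
    """Render nodes and edges as Mermaid flowchart source text."""
    lines = [f"flowchart {direction}"]
    for node_id, (label, is_decision) in nodes.items():
        # Decisions use the diamond shape id{"..."}, steps use id["..."].
        shape = f'{{"{label}"}}' if is_decision else f'["{label}"]'
        lines.append(f"    {node_id}{shape}")
    for src, dst, label in edges:
        arrow = f"-->|{label}|" if label else "-->"
        lines.append(f"    {src} {arrow} {dst}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_flowchart(NODES, EDGES))
```

The printed text can be pasted into any Mermaid renderer. Keeping the diagram as plain text is what makes the round trip with the model practical: a syntax error is just a string to paste back.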