YouTube Deep Summary


Claude Code + GitHub WORKFLOW for Complex Apps

Greg Baugues • 2025-06-26 • 18:40 • YouTube

📚 Chapter Summaries (10)

🤖 AI-Generated Summary:

Unlocking New Superpowers: AI-Assisted Coding Workflow with Claude Code and GitHub

In recent weeks, I've been experimenting with a powerful AI-assisted coding workflow using Claude Code and GitHub to build a new web application. This workflow has truly unlocked new superpowers for me as a developer, streamlining how I plan, create, test, and deploy software. In this post, I'll walk you through the workflow, explain why it's effective, and share practical tips on how you can implement it in your own projects.


The High-Level Workflow: Plan, Create, Test, Deploy

The workflow is elegantly simple yet powerful, revolving around the four classic phases of the software development life cycle:

  1. Plan: Create and refine GitHub issues to clearly define atomic, manageable tasks.
  2. Create: Use Claude Code's custom slash commands to generate code that addresses the issues.
  3. Test: Run automated tests, including UI tests powered by Puppeteer, to ensure quality.
  4. Deploy: Commit and push changes to GitHub, open pull requests (PRs), and merge after review to deploy via platforms like Render.

By leveraging GitHub flow, a tried-and-true workflow designed for small teams, and integrating AI-powered coding assistants, this process makes it feasible for a "team" of one human and one AI to build complex applications efficiently.


Creating and Refining GitHub Issues: Your Project's Backbone

The first step is to capture all work as GitHub issues. I started by dictating initial requirements and then worked with Claude Code to translate them into issues. However, I quickly learned the importance of granularity and specificity in these issues. The more atomic and well-defined the issues, the better Claude Code could handle them.
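
For example, a tightly scoped, atomic issue can be created straight from the GitHub CLI; the title and body here are purely illustrative:

```bash
# Create one small, atomic issue (content is illustrative)
gh issue create \
  --title "Add 'Forgot password?' link to login form" \
  --body "Clicking the link emails the user a reset token. Out of scope: rate limiting and email styling."
```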

This phase reminded me of my managerial days, as I found myself writing detailed specs, reviewing code, and leaving feedback for improvements, essentially playing the role of an engineering manager. This approach ensures that the AI-generated code is aligned with your vision and standards.


Setting Up a Solid Foundation: Testing and Continuous Integration

Before diving into rapid development, it's crucial to establish:

  • A robust test suite to verify that new changes don’t break existing functionality.
  • Continuous Integration (CI) using GitHub Actions to run tests and linters automatically on every commit (a minimal workflow sketch follows this list).
  • Puppeteer integration to simulate user interactions and test UI changes in a real browser environment.
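
As a concrete starting point, here is a minimal sketch of such a CI workflow; it assumes a Rails app tested with `bin/rails test` and linted with RuboCop (both assumptions; adapt to your stack):

```yaml
# .github/workflows/ci.yml (minimal sketch; assumes Rails + RuboCop)
name: CI
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: ruby/setup-ruby@v1
        with:
          bundler-cache: true  # install gems and cache them
      - run: bin/rails db:test:prepare
      - run: bin/rails test          # the test suite
      - run: bundle exec rubocop     # the linter
```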

Using frameworks like Rails (with its MVC architecture and integrated testing) makes it easier for AI coding agents to work on modular code sections rather than sprawling, monolithic files.


Custom Slash Commands: Automating the Plan-Create-Test-Deploy Cycle

Claude Code slash commands are prompt templates with command-line arguments that instruct the AI on how to handle each issue. My main /process-issue command breaks down into the four phases below (a sketch of the command file follows the list):

  • Plan: The AI reviews the GitHub issue, searches previous related work and pull requests, and creates a detailed plan with atomic tasks using "scratchpads" (dedicated planning files).
  • Create: The AI writes code addressing the plan.
  • Test: The AI runs tests to verify its work.
  • Deploy: The AI commits changes, opens a PR, and optionally requests or performs code reviews.
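
To make that concrete, here is a minimal sketch of what such a command file could look like. The `.claude/commands/` location and the `$ARGUMENTS` placeholder are how Claude Code custom slash commands work; the filename and wording are illustrative, not my exact command:

```markdown
<!-- .claude/commands/process-issue.md (illustrative sketch) -->
Please work on GitHub issue #$ARGUMENTS.

## Plan
- Run `gh issue view $ARGUMENTS` to read the issue.
- Search the scratchpads directory and previous PRs for related work.
- Think harder, break the work into small atomic tasks, and write the
  plan to a new scratchpad that links back to the issue.

## Create
- Write the code that implements the plan.

## Test
- Run the test suite; if the UI changed, verify it in the browser via Puppeteer.

## Deploy
- Commit the work, push a branch, and open a PR with `gh pr create`.
```

Invoked as `/process-issue 42`, the issue number is substituted for `$ARGUMENTS`.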

This structured approach ensures clarity and accountability throughout the development cycle.


The Human-AI Partnership: Code Review and Responsibility

One common concern about AI-assisted coding is trust. How do you know what the AI wrote is correct? The answer remains the same as with any developer: you must review the code.

I’ve found it helpful to:

  • Read through pull requests carefully.
  • Optionally have Claude Code perform a PR review using a separate slash command, emulating expert styles like Sandi Metz's principles for maintainable code (sketched below).
  • Rely heavily on tests to catch regressions and unexpected issues.

While I sometimes let Claude commit code directly, I make sure tests pass and the changes look good before merging.
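
The review command follows the same slash-command pattern; a minimal, illustrative sketch (filename and wording are assumptions):

```markdown
<!-- .claude/commands/review-pr.md (illustrative sketch) -->
Review pull request #$ARGUMENTS in the style of Sandi Metz.

- Run `gh pr view $ARGUMENTS` and `gh pr diff $ARGUMENTS` to read the changes.
- Point out anything that could be made more readable or maintainable.
- Post the feedback as a review comment with `gh pr review`.
```

One tip from experience: run this from a fresh Claude Code session in a new shell, so the review isn't polluted by the context of the session that wrote the code.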


Managing Context: The Importance of /clear

After completing and merging an issue, I always run /clear in Claude Code to wipe the AI's context window. This forces Claude to start fresh on the next issue, relying solely on the issue description, scratchpads, and repository history, with no leftover "working memory" (the full per-issue loop is sketched after the list below).

This practice helps:

  • Maintain focus on the current issue.
  • Reduce token usage.
  • Improve AI performance and accuracy.
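
Put together, the per-issue loop in the console looks roughly like this (`/process-issue` is my own custom command; `/clear` is built in):

```
> /process-issue 42   # plan, create, test, and open a PR for issue 42
  ...review and merge the PR on GitHub...
> /clear              # wipe the context window
> /process-issue 43   # start the next issue from a cold start
```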

Using Claude in GitHub Actions vs. the Claude Code Console

Anthropic recently launched Claude integration via GitHub Actions, allowing you to tag Claude directly on GitHub. While this is convenient for small tweaks and copy changes, I prefer using Claude Code in the console for more significant development work because:

  • GitHub Actions usage incurs metered billing, even on premium plans.
  • The console provides better insight and control.
  • For large code changes, the console-based approach is more efficient and manageable.

Running Parallel Agents with Work Trees

Work trees let you run multiple instances of Claude on different branches simultaneously, similar to multitabling in poker. However, I encountered some challenges:

  • Permission approvals need to be repeated for each new Claude session.
  • Managing multiple work trees can feel clunky and increase babysitting overhead.
  • For my project, sequential work on a single instance sufficed.

Still, as projects grow or teams scale, work trees offer a way to increase parallelism in AI-assisted development.
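
If you want to experiment, the mechanism is Git's built-in worktree support; a minimal sketch (directory and branch names are illustrative):

```bash
# Create a second working copy on its own branch
git worktree add ../myapp-issue-42 -b issue-42

# Run a separate Claude Code session inside it
cd ../myapp-issue-42 && claude

# After the branch is merged, clean up
git worktree remove ../myapp-issue-42
```

Each new session asks for its own permission approvals, which is the babysitting overhead mentioned above.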


Final Thoughts

This AI-assisted workflow combining Claude Code, GitHub, and Puppeteer has revolutionized how I build software. It marries the power of classic software development principles with cutting-edge AI coding assistance to create a cycle of continuous, manageable progress.

If you want to get started, focus on:

  • Writing clear, atomic GitHub issues.
  • Setting up a solid test suite and continuous integration.
  • Creating custom slash commands to automate planning, coding, testing, and deployment.
  • Embracing your role as the reviewer and planner to guide the AI effectively.

For more insights, I recommend checking out my related video on Claude Code pro tips and reading Thomas Ptacek's excellent post responding to AI-assisted coding skepticism.



Harness the power of AI in your development process; it might just unlock new superpowers for you too!


📝 Transcript Chapters (10 chapters):

📝 Transcript (503 entries):

## Overview of the AI Coding workflow [00:00] In this video, I want to talk to you about the workflow that I've been using with Claude Code and GitHub to build a new web app over the last couple weeks. And I feel like it's sort of unlocked new superpowers for me. So, first let me just go through the workflow at a high level to give you a taste in case you don't have much time. And then we're going to circle back. We're going to talk about why you might need a workflow like this. And then we'll dive into each of the steps in a little bit more detail. So, here's how it works. I create GitHub issues for all the work I want to have done on the app. In Claude Code, I have a detailed slash command with instructions on how to process issues. At a high level, I want it to first plan its work using scratchpads to break down the big issue into small atomic tasks. Then once it's planned its work, it can create the code. After it's created the code, it can then test its work. It can do this in two different ways. One, running the test suite. And second, it can use Puppeteer to click around in a browser if it's made any UI changes. Then once it has tested its work, it will commit its work to GitHub and open up a pull request, which is then reviewed. Sometimes I review that PR, sometimes I have Claude Code review the PR with a different slash command that I've written. Also, I have continuous integration set up on GitHub via GitHub Actions, so that anytime a commit is made, we run the test suite and we run a linter and we check to make sure that it is safe to merge the commits into the main branch. And then in Claude Code, I use /clear to wipe away the context window. And then I have it tackle the next issue and repeat the cycle. ## Software Development Life Cycle [01:39] Now, I don't want to pretend like I've reinvented the wheel here, because what I've just described to you could be summed up as a cycle of plan, create, test, deploy, which are generally considered to be the four phases of the software development life cycle. So, why do you need a cycle like this if you have such powerful software coding agents? Well, the software industry has known for a long time that writing code is just one phase of what's required to ship and maintain complex software. Turns out that some of the processes and systems that we built to manage the creation of software work really well with these AI coding assistants, and in particular Claude Code. Now, to be even more specific, the workflow I've just described is based heavily upon GitHub flow, which is a workflow first published by Scott Chacon, one of the co-founders of GitHub. He published this about 13 or 14 years ago, when GitHub had just about 35 employees. So this is a well-known workflow that works really great for small teams. Say, if your team was, I don't know, approximately the size of one human and one AI coding assistant. Let's go back through and talk about each of those four phases in a little bit more detail: plan, create, test, and deploy. Let's start off with creating the issues. ## Creating and Refining GitHub Issues [03:01] When I very first started working on this app, I started with a dictation session via Superwhisper. And then I just worked with Claude to turn that into a requirements document. And then once I had those steps, I told Claude Code to create GitHub issues from there. Now, you also need a way for Claude Code to interact with GitHub. And Anthropic's recommended way of doing so is to install the GitHub CLI.
And this allows Claude Code to run gh via Bash to interact with GitHub. If for some reason you can't install that CLI, you could use the MCP server, but the CLI is the recommended way of doing so. Now, I would say the first mistake I really made here was that I had it create those issues, probably about 30 or 40 issues, and then I just had it start working on them. It was overly optimistic of me to assume that we could go straight from the GitHub issues that it created to writing software. In reality, my job perhaps got a little bit less fun, because instead of writing code now, I really needed to go and make sure that I was being very specific in those issues and really refining them. And I'd say the more granular, the more specific, the more atomic those issues got, the better results I had. And I had a couple false starts where I kind of had to throw the whole project away and really go back and spend time in GitHub and say, "Okay, what do we do first? What do we do second? And how do we break this down and keep it really tightly scoped so that we're setting ourselves up for success?" In fact, it's kind of funny. I was at Twilio for 9 years. I was a manager for a lot of that time. And I feel like I got a little burned out on being a manager, and I have really been enjoying writing code over the last couple years. And these last couple weeks, I feel like I had to put my manager hat back on. I've written very little code myself, and instead I've spent most of my time writing really detailed specs, reviewing code that was written by someone else, leaving comments and saying things like, "Hmm, this is not quite good enough. Please try again." Or, "Actually, I thought I wanted that, but now that I see it, that's not quite what I want." Or, "Throw away all your work; I don't actually want this at all." And so if you want to roleplay as an engineering manager, this process is actually a pretty good way to do that. The first couple issues that we worked on were setting up the test suite and continuous integration. Most of my work over the last 10 years has been in Python, but anytime I'm building a more complex web app and need a users table, I find myself reaching back for Rails. I also think there's something about the MVC framework, which is not unique to Rails. Django has this too, and lots of frameworks use the model-view-controller pattern. But I think there's something about modularizing your codebase that makes it easier for coding agents to work with, because they can focus on code that's related to one idea, as opposed to, say, a main.py or an index.js that's a thousand lines long. ## Setting Up Your Foundation [05:54] Rails has a really nicely integrated testing framework, and it was really important to me from the beginning to get my test suite up and running so that I could set up continuous integration on GitHub, so that I could have my tests run automatically every time Claude Code was pushing commits. Now, along the same lines, I also set up the local Puppeteer MCP server, and Puppeteer allows Claude Code to use a web browser to test the local changes to your app. I've actually found this to be really useful as I've started in on redesigning the app. It's also good for testing to see if buttons work or forms work. It's actually very surprising and very satisfying to watch Claude Code click around in a browser to test the work that it's already done.
So, I'd say before you can really get moving with rapid iterative feature development, you need some really well-defined issues. You need your app set up on a GitHub repository, and you need continuous integration set up with a really good test suite, and Puppeteer helps a lot as well. But once you have that foundation in place, now you're ready to go. All right, so I have some issues here. Let me talk through what happens when I have Claude Code work on an issue. The most important thing here is you're going to create a slash command. You can do this in the .claude/commands directory. ## Plan: Custom Slash Commands [07:10] A slash command is basically a prompt template, and you can add command-line arguments to that. So the argument that we're going to be passing into this one is the issue number. Now, for my slash command for processing issues, I started with the one that came from the Anthropic post on best practices for agentic coding. That was a post written by Boris, who is the original creator of Claude Code. And I started there, and then I just iterated over time. I added more to it. And you can see I've broken it up into four parts: plan, create code, test, and deploy. And plan is the biggest one. You know, it's perhaps the most important. I'm telling Claude Code to use the GitHub CLI to view the issue. I also then ask it to go dig up some prior art on this. So, I do have it use what are called scratchpads. It basically has a directory in the codebase where Claude Code can plan all of its work. And I ask it to search those scratchpads for previous work related to this issue. I ask it to look through previous PRs in GitHub to see if it can find other work that's been done on this issue, so it can figure out what's been done and why. I use the "think harder" prompt here to trigger thinking mode. Anthropic has several of these. So you can do think hard, think harder. I think you can do think hardest and ultrathink. I cannot tell you why I've settled in on think harder. It seems to be working well. Maybe I need to bump this up to ultrathink in the future, I don't know. But the key here is that I want it to break the issue down into small, manageable tasks. Then I ask it to write that plan on a new scratchpad and to include a link to the issue there. Now, Claude Code's going to write the code, and after it's written some code, it's going to commit the code. Or is it? ## Create, Test, Deploy [08:59] I think one of the biggest questions that's going to come out of this workflow is: do you have Claude Code write the commit for you, or is it your responsibility to do that? I have been convicted by Thomas Ptacek. He wrote this post a few weeks ago called "My AI Skeptic Friends Are All Nuts." It was super popular. It's probably the best piece of writing that I've read on AI-assisted coding. The link's in the description here. I encourage you to just read it. It's an amazing piece of writing. And there's a section where he's going through all of the criticisms or objections from his AI-skeptic friends about why you shouldn't use AI-assisted coding. So the objection here is: but you have no idea what the code is. And Thomas replies, "Are you a vibe coding YouTuber?" Maybe. "Can you not read code? If so, astute point. Otherwise, what the [ __ ] is wrong with you? You've always been responsible for what you merge to main. You were 5 years ago and you are tomorrow, whether or not you use an LLM. If you build something with an LLM that people will depend on, read the code."
"In fact, you'll probably do more than that. You'll spend 5 to 10 minutes knocking it back into your own style." And in fact, as I talk to engineer friends who are working at large companies using Claude Code there, they will actually not even let Claude do the commit, even though it's really great at writing commit messages; instead, they will open up all of its changes in an IDE such as Cursor and review them all. I've not really been doing either of those things on this project. I really did start there, and I was being very diligent, opening up all the code in Cursor. At some point, I have to admit, I started getting lazy. So maybe I've fallen back into the vibe coding YouTuber genre, I guess. But I have been letting Claude do all of the commits, and then I do try to read the PRs. Although I will say, and we'll get to this in a second, sometimes I just have Claude read the PR. But let me tell you what makes me feel a little bit better about having Claude do that, and that's tests. So when I started this project, I wanted to be really sure that I had a good test suite, because I do feel like in other projects, such as the games I built for my daughters, I often run into issues where things are working pretty well and then Claude makes a change, sometimes a seemingly simple or benign change, and it breaks all the stuff. I'm not looking for necessarily 100% code coverage, but I do want to have high confidence that Claude can work on one feature without breaking the stuff it's done before. All right. Finally, we have planned, we have created code, we have tested the code. Now, it is time to deploy. I personally deploy to Render. I like it for a lot of the stuff I've been building lately, both the Python and Rails apps. Render will look for pushes to the main branch of your GitHub repo and then automatically deploy your new app. So in this workflow, merging a branch into the main branch in GitHub is approximately the same as deploying to production. And the way that we set up a branch to merge it into main is by opening up a pull request. So you as the human here, working with the AI, let's assume that you have had Claude make the commits, and then let's assume that you have had Claude open the PR. This is the place where you really can get in and review the changes that it's made, and you can leave comments on the changes that Claude has made, and then you can go back into the console and ask Claude to view those comments and to make changes based on them. You can also set up a separate slash command to ask Claude to do a PR review for you. Now, if you do have a slash command for doing a PR review, what I would encourage you to do is to open up Claude Code in a completely new shell and run it fresh, so that it doesn't have the context pollution of the work that it's already done. I have a slash command for doing PR reviews where I ask it to review in the style of Sandi Metz. Sandi Metz is one of my heroes from the Rails world. She has some great principles for writing beautiful, maintainable code. When I have Claude review the code in the style of Sandi Metz, it reveals places where we can make things more maintainable or more readable that I would have missed, and certainly that Claude missed on its first pass. Now, I will admit there's been more than a few times over the last couple weeks when I've had Claude write the code and I've had Claude do the PR review.
I've ensured that the tests pass, and I'm like, "Looks good to me," and I click the button to merge the pull request. ## Your job vs. AI's job [13:29] So again, this video is not intended to be prescriptive about the workflow, but I think the high-level bits here make a lot of sense. And then you've got to figure out where in those individual steps of plan, create, test, and deploy you are going to get hyper-involved as the human. And for me personally, I have been hyper-involved in the planning phase. I've found it really difficult to delegate anything other than just cleaning up my ideas or my prose to Claude. I think the planning is where I've been spending a whole lot of time, and then I personally, for this app and the size of the app and the size of the codebase, have been able to delegate a lot of the creating, testing, and deploying, or the reviewing of the coding, etc., to Claude. All right. So finally, now that I have merged my PR, here's what I do. I go back to Claude and I run /clear. This completely wipes away the context window. I am not compacting the window. I am clearing the window. ## Context Management with /clear [14:32] The idea here is that each issue should contain all of the information that Claude needs to perform that work. It should be able to work on the issue from a cold start. And thanks to the scratchpads, and thanks to its ability to review PRs and all the previous work that's been done on the codebase, that issue should be descriptive enough for it to tackle with no working memory. And this also frees up your context window. It will help you get better results while using fewer tokens. Now, let me address a quick question, because you probably saw that Anthropic launched Claude via GitHub Actions, and this is a really cool feature that lets you tag Claude directly from GitHub and have it work on some stuff. ## Claude via GitHub Actions [15:07] So I have been playing with that a little bit. The primary reason why I'm not using that is because, as of today, that usage via GitHub Actions is billed against your API with metered billing, even if you're on a Claude Max plan. So, I have upgraded now to the $200 a month Claude Max plan. I am finding it is totally worth it to get the Claude 4 Opus usage. I've just been thrilled with the value I'm getting there, but then I was kind of bummed to get a $50 API bill from Anthropic after I had been tagging Claude in GitHub. And so, I was like, man, if I'm already getting unlimited access, I might as well just do it in the console. And candidly, I think I'm getting much better insight and results from using Claude Code in the console. And so I actually talked to a friend, Martin, who works at Anthropic, and his suggestion was: use Claude in GitHub Actions when you're, say, doing a PR review and there's a small change, perhaps a copy change or just something tiny that needs to be tweaked, but you don't necessarily want to go into the codebase and do it yourself. It's really good for those smaller fixes, but you probably don't want to be using GitHub Actions for really large, meaningful changes to your codebase. Finally, let me just talk about work trees, because Anthropic talks about this quite a bit. ## Work Trees: Run Parallel Agents [16:28] The best analogy that I have for work trees would be multitabling in poker.
You know, you start playing online poker on a single table, and then you realize you're just kind of clicking buttons every once in a while; you could probably play two tables at a time. And then at some point you've bought a bigger monitor and you're playing four or eight tables at a time. That's sort of what running Claude in work trees feels like. Instead of having different poker tables up, you're just tabbing between different tabs in the terminal. And generally, I think that the industry as a whole is excited about running coding agents in parallel or in the background. And work trees are the method that you can use with Git to run multiple instances of Claude working on multiple issues at the same time. I personally ran into two issues with it. The first is, because I'm just getting started building this app, there's so much work that simply needs to be done iteratively. There aren't a lot of features that can be developed in parallel where the codebases don't touch each other. And second, I found the interface for working with work trees to be a little bit clunky. The general idea behind a work tree is that you create copies of your git repo in separate subdirectories, and then you have one version of Claude running in, you know, subdirectory A on, let's just call it, branch A, and then you have another one running on branch B, and they're running in parallel in two different directories on your computer. The issue that I had was that when I spun up a new version of Claude, like a new Claude session, I didn't have the same permissions that I had already approved in that first session of Claude. And so, every time I created a new branch, I was having to approve all the permissions again. And I just felt like I was having to babysit it a lot more. And then what happens is, after you have finished work on that issue or that branch, you're supposed to delete that directory and then create a new work tree again. And so every time you're creating a new work tree, you're reapproving those permissions. And it just felt like I was doing more babysitting and more cleaning up of merge conflicts than it was really worth. I found that just working with a single Claude instance is sufficient for me. Now, if you made it this far, you'd probably also enjoy the video I did on Claude Code pro tips. So check that one out.