YouTube Deep Summary

Claude Code + GitHub WORKFLOW for Complex Apps

Greg Baugues • 2025-06-26 • 18:40 minutes • YouTube

AI-Generated Summary:

Key Takeaways & Insights

  • The video presents a practical workflow combining Claude Code with GitHub to develop web apps, centered on the classic software development life cycle: plan, create, test, and deploy.
  • Leveraging AI coding assistants like Claude Code can significantly enhance productivity, especially when integrated with issue tracking, CLI tools, and continuous integration.
  • The importance of granular, well-defined GitHub issues is emphasized to enable effective AI-driven development and reduce rework.
  • Testing is critical, both automated test suites and UI testing with Puppeteer, to maintain confidence in AI-generated code and prevent regressions.
  • Human involvement is essential mainly in the planning and reviewing phases, reinforcing that AI assists but does not replace the developer’s responsibility for quality.
  • The workflow is heavily inspired by GitHub Flow, a well-known, proven methodology adaptable for a single developer plus AI assistant.
  • Using scratchpads as working memory for Claude Code helps with organizing work, referencing previous work, and breaking down complex issues.
  • Deployments are automated via GitHub merges triggering platforms like Render, simplifying continuous deployment.
  • The speaker prefers running Claude Code locally through console slash commands over GitHub Actions due to cost and context quality considerations.
  • Git work trees for running multiple Claude sessions in parallel are conceptually useful but practically cumbersome because of repeated permission approvals and added complexity, so the speaker currently prefers a single-instance workflow.
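
The scratchpad lookup mentioned in the list above can be sketched in a few lines. This is only an illustration, not the speaker's setup: the Markdown file layout, the way issues are referenced ("#12" or ".../issues/12"), and the `find_scratchpads` helper are all assumptions. The video only says that Claude Code keeps its plans in a directory in the codebase and searches them for earlier work on an issue.

```python
import re
from pathlib import Path


def find_scratchpads(root: Path, issue_number: int) -> list[Path]:
    """Return scratchpad files under `root` that mention the given issue.

    Assumption (hypothetical layout): scratchpads are Markdown files that
    reference issues as "#<number>" or as a URL ending in "/issues/<number>".
    """
    # \b stops "#12" from also matching "#123".
    pattern = re.compile(rf"(?:#|/issues/){issue_number}\b")
    matches = []
    for path in sorted(root.rglob("*.md")):
        if pattern.search(path.read_text(encoding="utf-8")):
            matches.append(path)
    return matches
```

A planning step could call `find_scratchpads(Path("scratchpads"), 12)` (directory name assumed) and feed the matching files back into the prompt as prior art.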

Actionable Strategies

  • Start by creating detailed, atomic GitHub issues representing discrete tasks; refine these issues iteratively to improve clarity and scope.
  • Use dictation tools and AI (Claude) to convert raw requirements into a structured requirements document and then into GitHub issues.
  • Install the GitHub CLI so that Claude Code can interact with GitHub repositories from the command line.
  • Establish a robust test suite and continuous integration (GitHub Actions) early in the project to automatically validate commits and enforce code quality.
  • Set up Puppeteer integrated with a local MCP server to enable AI-driven automated UI testing by simulating browser interactions.
  • Create a Claude Code slash command that accepts an issue number and orchestrates these phases:
     1. Plan: Use scratchpads and GitHub CLI to research the issue, review prior PRs, and break the issue into smaller tasks.
     2. Create: Generate code for the atomic tasks defined in the plan.
     3. Test: Run the test suite and Puppeteer UI tests to verify code correctness.
     4. Deploy: Commit code, open a pull request, review, and merge to trigger deployment.
  • Perform PR reviews either manually or via a dedicated slash command that instructs Claude Code to review code in the style of a respected engineer (e.g., Sandi Metz) to identify maintainability improvements.
  • After merging, clear Claude Code’s context window with the /clear command to ensure fresh context for the next issue and optimize token usage.
  • Delegate heavily in the create, test, and deploy phases while maintaining close human involvement in planning and requirements refinement.
  • Use Claude Code’s ability to browse previous PRs and scratchpads to maintain continuity and avoid redundant work.
  • Prefer running Claude Code in the console on the Claude Max plan to manage costs and maintain better control over context and interactions.
  • Consider using GitHub Actions with Claude for small fixes or copy edits but avoid it for large, complex code changes due to metered billing and limited context.
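
The four-phase slash command described in the list above can be sketched as a prompt template written to disk. Claude Code reads project slash commands from Markdown files in `.claude/commands/` and substitutes `$ARGUMENTS` with whatever follows the command name. Everything else here is an assumption for illustration: the command name `issue`, the wording of the prompt, and the `write_issue_command` helper. The speaker's real command is longer and is not reproduced in the video summary.

```python
from pathlib import Path

# Hypothetical prompt text, loosely following the plan/create/test/deploy
# phases described in the video. Not the speaker's actual command.
ISSUE_COMMAND = """\
Please analyze and fix GitHub issue $ARGUMENTS.

## Plan
1. Run `gh issue view $ARGUMENTS` to read the issue.
2. Search the scratchpads and earlier PRs for related work.
3. Think harder about how to break the issue into small, atomic tasks.
4. Write the plan to a new scratchpad and link the issue in it.

## Create
- Implement the tasks from the plan one at a time.

## Test
- Run the full test suite.
- If the UI changed, exercise it in a browser via Puppeteer.

## Deploy
- Commit the work, push the branch, and open a pull request.
"""


def write_issue_command(repo_root: Path, name: str = "issue") -> Path:
    """Write the template to <repo_root>/.claude/commands/<name>.md."""
    commands_dir = repo_root / ".claude" / "commands"
    commands_dir.mkdir(parents=True, exist_ok=True)
    target = commands_dir / f"{name}.md"
    target.write_text(ISSUE_COMMAND, encoding="utf-8")
    return target
```

With the file in place, the command would be run from a Claude Code session with the issue number as its argument. Keeping the template in the repository means it can be iterated on and reviewed like any other file.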

Specific Details & Examples

  • The workflow is based on GitHub Flow, first published about 13 to 14 years ago by Scott Chacon, one of GitHub's co-founders.
  • Initial project setup involved 30-40 GitHub issues created via Claude Code but required significant issue refinement to be effective.
  • The speaker has worked mostly in Python over the last 10 years but reaches for Rails for more complex web apps because of its MVC structure and integrated testing framework.
  • Puppeteer is used to simulate browser clicks and test UI changes automatically.
  • Continuous integration is done via GitHub Actions running test suites and linters on every commit.
  • The speaker uses Render.com for automatic deployment triggered by merges to the main branch.
  • Referenced a popular post by Thomas Ptacek titled “My AI Skeptic Friends Are All Nuts,” which argues that developers remain responsible for reading and reviewing AI-assisted code.
  • PR reviews can be done by Claude Code in the style of Sandi Metz, a respected engineer from the Rails world known for maintainable code principles.
  • Challenges with Git work trees include repeated permission approvals and extra babysitting overhead, leading to preference for a single Claude instance workflow.
  • Mentioned tools/resources:
     – GitHub CLI for GitHub integration
     – Claude Code (Anthropic) with slash commands
     – Puppeteer for UI testing
     – Render.com for deployment
     – Superwhisper for dictation
     – Cursor IDE for code review
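
To make the GitHub CLI item in the list above concrete, here is a small sketch of the kinds of `gh` invocations the workflow relies on: reading an issue, searching earlier PRs, and opening a PR. The helper names are hypothetical. The `gh` subcommands and flags used are standard ones, but which exact calls Claude Code issues in the speaker's sessions is not shown in the video.

```python
import subprocess


def gh_issue_view(issue_number: int) -> list[str]:
    """Command that prints an issue's title and body as JSON."""
    return ["gh", "issue", "view", str(issue_number), "--json", "title,body"]


def gh_pr_search(query: str) -> list[str]:
    """Command that lists open and closed PRs matching a search query."""
    return ["gh", "pr", "list", "--state", "all", "--search", query]


def gh_pr_create(title: str, body: str) -> list[str]:
    """Command that opens a pull request from the current branch."""
    return ["gh", "pr", "create", "--title", title, "--body", body]


def run(command: list[str]) -> str:
    """Run a command and return its stdout; raises if it exits non-zero."""
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return result.stdout
```

Building the commands as argument lists, rather than shell strings, avoids quoting problems when a PR title or body contains spaces or quotes. Calling `run(gh_issue_view(42))` requires an installed and authenticated `gh` inside a repository.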

Warnings & Common Mistakes

  • Avoid assuming that AI-generated GitHub issues are immediately ready for coding; take time to refine and break down issues into very specific, atomic tasks.
  • Beware of delegating planning entirely to AI; human involvement in clarifying requirements and prioritization is crucial.
  • Don’t blindly trust AI-generated code without review; always examine PRs and test results before merging.
  • Vibe coding (blindly accepting AI commits without review) can lead to problems; maintain discipline in code review and testing.
  • Using GitHub Actions for Claude on large code changes can incur unexpected API billing costs, even with a Max plan.
  • Work trees can be cumbersome due to repeated permission requests and managing multiple repo copies, potentially slowing down development.
  • Don’t compact Claude Code’s context window; prefer clearing it to avoid context pollution and token inefficiency.
  • Avoid large monolithic files; modular codebases (e.g., MVC frameworks) facilitate better AI assistance.
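
For readers who still want to try the parallel setup that the work tree warning above describes, the underlying Git commands look roughly like this. The sketch assumes one directory per branch under a base directory of your choosing; the helper names and the directory naming scheme are hypothetical, and the functions only build the commands rather than run them.

```python
from pathlib import Path


def worktree_path(base_dir: Path, branch: str) -> Path:
    """Directory for a branch's work tree (slashes flattened to dashes)."""
    return base_dir / branch.replace("/", "-")


def worktree_add(base_dir: Path, branch: str) -> list[str]:
    """Command that creates a new branch checked out in its own directory."""
    return ["git", "worktree", "add", "-b", branch, str(worktree_path(base_dir, branch))]


def worktree_remove(base_dir: Path, branch: str) -> list[str]:
    """Command that deletes the work tree directory once the branch is merged."""
    return ["git", "worktree", "remove", str(worktree_path(base_dir, branch))]
```

Each work tree is a separate directory, so a Claude session started there is a new session. That is consistent with the speaker's complaint about having to approve permissions again every time a work tree is created.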

Resources & Next Steps

  • Read Thomas Ptacek’s article “My AI Skeptic Friends Are All Nuts” for perspectives on AI-assisted coding.
  • Explore GitHub Flow as a foundational workflow for collaborative and AI-assisted development.
  • Use GitHub CLI (https://cli.github.com/) for seamless GitHub integration.
  • Set up Puppeteer (https://pptr.dev/) for automated UI testing.
  • Use Render.com for easy continuous deployment.
  • Check out Claude Code Pro Tips video for deeper insights on using Claude effectively.
  • Consider setting up dedicated slash commands in Claude Code tailored to your workflow for planning, testing, and reviewing.
  • Keep refining issue granularity and ensure each issue is fully self-contained for AI to work effectively from a cold start.
  • Experiment with PR review commands modeled on expert engineers’ styles to improve code quality.
  • Follow up by watching related content on AI-assisted coding workflows and best practices.

Main Topics

  • AI-assisted software development workflow integrating Claude Code with GitHub
  • Planning and refining GitHub issues for AI coding agents
  • Using GitHub CLI for AI interaction with repositories
  • Automated testing: test suites and Puppeteer UI tests
  • Continuous integration with GitHub Actions
  • Code review strategies including AI-assisted PR reviews
  • Deployment automation with Render linked to GitHub merges
  • Managing Claude Code context and scratchpads for efficient AI work
  • Cost and practical considerations using Claude via console vs GitHub Actions
  • Challenges and usage of Git work trees for parallel AI coding sessions
  • Balancing human involvement and AI assistance in the software development process
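
The continuous-integration topic above amounts to a simple rule: a commit is safe to merge only if every check exits cleanly. A local approximation of that gate is sketched below. It is not the speaker's GitHub Actions configuration, which is not shown; the `safe_to_merge` helper and the check names are hypothetical, and the real test and lint commands depend on the project.

```python
import subprocess


def safe_to_merge(checks: dict[str, list[str]]) -> dict[str, bool]:
    """Run each named check and report whether it exited with status 0."""
    results = {}
    for name, command in checks.items():
        completed = subprocess.run(command, capture_output=True, text=True)
        results[name] = completed.returncode == 0
    return results
```

In a Rails project the checks might be the test runner and a linter, passed in as argument lists; the merge goes ahead only if `all(results.values())` is true.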

πŸ“ Transcript Chapters (10 chapters):

πŸ“ Transcript (503 entries):

In this video, I want to talk to you about the workflow that I've been using with Claude Code and GitHub to build a new web app over the last couple weeks. And I feel like it's sort of unlocked new superpowers for me. So, first let me just go through the workflow at a high level just to give you the taste in case you don't have much time. And then we're going to circle back. We're going to talk about why you might need a workflow like this. And then we'll dive into each of the steps in a little bit more detail.
So, here's how it works. I create GitHub issues for all the work I want to have done on the app. In Claude Code, I have a detailed slash command with instructions on how to process issues. At a high level, I want it to first plan its work using scratchpads to break down the big issue into small atomic tasks. Then once it's planned its work, it can create the code. After it's created the code, it can then test its work. It can do this in two different ways. One, running the test suite. And second, it can use Puppeteer to click in a browser if it's made any UI changes. Then once it has tested its work, it will commit its work to GitHub and open up a pull request which is then reviewed. Sometimes I review that PR, sometimes I have Claude Code review the PR with a different slash command that I've written. Also, I have continuous integration set up on GitHub via GitHub Actions so that anytime a commit is made, we run the test suite and we run a linter and we check to make sure that it is safe to merge the commits into the main branch. And then in Claude Code, I use /clear to wipe away the context window. And then I have it tackle the next issue and repeat the cycle.
Now, I don't want to pretend like I've created the wheel here because what I've just described to you could be summed up as a cycle of plan, create, test, deploy, which are generally considered to be the four phases of the software development life cycle.
So, why do you need a cycle like this if you have such powerful software coding agents? Well, the software industry has known for a long time that writing code is just one phase of what's required to ship and maintain complex software. Turns out that some of the processes and systems that we built to manage the creation of software work really well with these AI coding assistants and in particular Claude Code. Now to be even more specific, the workflow I've just described is based heavily upon GitHub Flow, which is a workflow first published by Scott Chacon, who is one of the co-founders of GitHub. He published this about 13 or 14 years ago when GitHub was just about 35 employees. So this is a workflow that's well known that works really great for small teams. Say if your team was, I don't know, approximately the size of one human and one AI coding assistant.
Let's go back through and talk about each of those four phases in a little bit more detail. Plan, create, test, and deploy. Uh, let's start off with creating the issues. When I very first started working on this app, I started that with a dictation session via Superwhisper. And then I just worked with Claude to turn that into a requirements document. And then once I had those steps, I told Claude Code to create GitHub issues from there. Now, you also need a way for Claude Code to interact with GitHub. And Anthropic's recommended way of doing so is to install the GitHub CLI. And this allows Claude Code to run gh via Bash to interact with GitHub. If for some reason you can't install that CLI, you could use the MCP server, but the CLI is the recommended way of doing so.
Now, I would say the first mistake I really made here was that I had it create those issues. It's probably about 30 or 40 issues, and then I just had it start working on them. It was overly optimistic of me to assume that we could go straight from the GitHub issues that it created to writing software.
In reality, my job perhaps got a little bit less fun because instead of writing code now, I really needed to go and make sure that I was being very specific in those issues and really refining them. And I'd say the more granular, the more specific, the more atomic those issues got, the better results I had. And I had a couple false starts where I kind of had to throw the whole project away and really go back and spend time in GitHub and say, "Okay, what do we do first? What do we do second? And how do we break this down and keep it really tightly scoped so that we're setting ourselves up for success?"
In fact, it's kind of funny. I was at Twilio for 9 years. I was a manager for a lot of that time. And I feel like I got a little burned out on being a manager and I have really been enjoying writing code over the last couple years. And these last couple weeks, I feel like I had to put my manager hat back on. I've written very little code myself and instead I've spent most of my time writing really detailed specs, reviewing code that was written by someone else, leaving comments and saying things like, "Hm, this is not quite good enough. Please try again." Or, "Actually, I thought I wanted that, but now that I see it, that's not quite what I want." Or, "Throw away all your work and uh I don't actually want this at all." Uh and so if you want to like roleplay as an engineering manager, uh this process is actually a pretty good way to do that.
The first couple issues that we worked on were setting up the test suite and continuous integration. Most of my work that I've done over the last 10 years has been in Python, but anytime I'm building a more complex web app and need a users table, I find myself starting to reach back for Rails. I also think there's something about the MVC framework which is not unique to Rails. Django has this too and lots of frameworks use the model-view-controller framework.
But I think there's something about modularizing your codebase that makes it easier for coding agents to work with because they can focus on code that's related to one idea as opposed to say a main.py or an index.js that's a thousand lines long. Rails has a really nicely integrated testing framework and it was really important to me from the beginning to get my test suite up and running so that I could set up GitHub's continuous integration so that I could have my tests run automatically every time Claude Code was pushing commits. Now along the same lines, I also set up the Puppeteer local MCP server and Puppeteer allows Claude Code to use a web browser to test the local changes to your app. I've actually found this to be really useful as I've started in on redesigning the app. It's also good for testing to see if buttons work or forms work. It's actually very surprising and very satisfying to watch Claude Code uh click around in a browser to test the work that it's already done.
So, I'd say before you can really get moving with rapid iterative feature development, you need some really well-defined issues. You need your app set up on a GitHub repository and you need continuous integration set up with a really good test suite and Puppeteer helps a lot as well. But once you have that foundation in place, now you're ready to go.
All right, so I have some issues here. Let me talk through what happens when I have Claude Code work on an issue. Most important thing here is you're going to create a slash command. You can do this in the .claude/commands directory. A slash command is basically a prompt template and you can add command line arguments to that. So the argument that we're going to be passing into this one is the issue number. Now for my slash command for processing issues, I started with the one that came from the Anthropic post on best practices for agentic coding. That was a post written by Boris who is the original creator of Claude Code.
And I started there and then I just iterated over time. I added more to it. And you can see I've broken it up into four parts: plan, create code, test, and deploy. And uh plan is the biggest one. You know, it's perhaps the most important. I'm telling Claude Code to use the GitHub CLI to view the issue. Uh I also then ask it to go dig up some prior art on this. So, uh I do have it use what's called scratchpads. So, it basically has a directory in the codebase where Claude Code can plan all of its work. And I ask it to search those scratchpads for uh previous work related to this issue. I ask it to look through PRs, previous PRs in GitHub to see if it can find other work that's been done on this issue so it can figure out what's been done and why. I use here the think harder uh prompt to trigger thinking mode. Uh Anthropic has several of these. So you can do think hard, think harder. I think you can do think hardest and ultrathink. I cannot tell you why I've settled in on think harder. It seems to be working well. Um maybe I need to bump this up to ultrathink in the future. I don't know. Uh, but the key here is that I want it to break the issue down into small, manageable tasks. Then I ask it to write that plan on a new scratchpad and to include a link to the issue there.
Now, Claude Code's going to write the code and after it's written some code, it's going to commit the code. Or is it? I think one of the biggest questions that's going to come out of this workflow is, do you have Claude Code write the commit for you or is it your responsibility to do that? I have been convicted by Thomas Ptacek. He wrote this post a few weeks ago called My AI Skeptic Friends Are All Nuts. It was super popular. It's probably the best piece of writing that I've read on AI-assisted coding. The link's in the description here. I encourage you just to like read it. It's an amazing piece of writing.
Uh and there's a section where he's going through all of the uh criticisms or objections from his AI skeptic friends about why you shouldn't use AI-assisted coding. So the objection here is, "But you have no idea what the code is." And Thomas replies, "Are you a vibe coding YouTuber?" Maybe. "Uh, can you not read code? If so, astute point. Otherwise, what the [ __ ] is wrong with you? You've always been responsible for what you merge to main. You were 5 years ago and you are tomorrow, whether or not you use an LLM. If you build something with an LLM that people will depend on, read the code. In fact, you'll probably do more than that. You'll spend 5 to 10 minutes knocking it back into your own style."
And in fact, as I talk to uh engineer friends who are working at large companies using Claude Code there, they will actually not even let Claude do the commit even though it's really great at writing commit messages, but instead they will open up all of its changes in an IDE such as Cursor and review them all. I've not really been doing either of those things on this project. I really did start there and I was like being very diligent opening up all the code in Cursor. Uh, at some point I have to admit I started getting lazy. So maybe I've fallen back into the vibe coding YouTuber genre, I guess. Uh, but uh, I have been letting Claude do all of the commits and then I do try to read the PRs. Although I will say, and we'll get to this in a second, sometimes I just have Claude read the PR.
Uh, but let me tell you what makes me feel a little bit better about having Claude do that, and that's tests. So when I started this project, I wanted to be really sure that I had a good test suite because I do feel like in other projects such as like the games I built for my daughters, I often run into issues where things are working pretty good and then Claude makes a change. Sometimes a seemingly simple or benign change and it breaks all the stuff.
I'm not looking for necessarily 100% code coverage, but I do want to have high confidence that Claude can work on one feature without breaking the stuff it's done before.
All right. Finally, we have planned. We have created code. We have tested the code. Now, it is time to deploy. I personally deploy to Render. I like it for a lot of the stuff I've been building lately, both in the Python and Rails apps. Uh, Render will look for pushes to your main branch of GitHub and then automatically deploy your new app. So in this workflow, merging a branch into the main branch in GitHub is the same approximately as deploying to production. And so the way that we set up a branch to merge it into main is by opening up a pull request. So you as the human here working with the AI, let's assume that you have had Claude make the commits and then let's assume that you have had Claude open the PR. This is the place where you really can get in and review the changes that it's made and you can leave comments on the changes that Claude has made and then you can go back into the console and ask Claude to view those comments and to make changes based on them.
You can also set up a separate slash command to ask Claude to do a uh PR review for you. Now, if you do have a slash command for doing a PR review, what I would encourage you to do is to open up Claude Code in a completely new shell and then to run it fresh so that it doesn't have the context pollution of the work that it's already done. I have a slash command for doing PR reviews uh where I ask it to review it in the style of Sandi Metz. Sandi Metz is one of my heroes from the Rails world. She has some great principles for writing beautiful maintainable code. When I have Claude review the code in the style of Sandi Metz, it reveals places where we can make things more maintainable or more readable that I would have missed and certainly that Claude missed on its first pass.
Now, I will admit there's been more than a few times over the last couple weeks when I've had Claude write the code. I've had Claude do the PR review. Uh I've ensured that the tests pass and I'm like, "Looks good to me." And I click the button to merge the pull request. So again, this video is not intended to be prescriptive about the workflow, but I think the high-level bits here make a lot of sense. And then you got to figure out where in those individual steps of the plan, the create, the test, and deploy are you going to get hyper involved as the human? And for me personally, I have been hyper involved in the planning phase. And I found it really difficult to delegate anything other than just like cleaning up my ideas or my prose to Claude. I think the planning is where I've been spending a whole lot of time and then I personally for this app and the size of the app and size of the codebase and all have been able to delegate a lot of the creating, testing, and deploying or the reviewing of the coding etc. to Claude.
All right. So finally now that I have merged my PR here's what I do. I go back to Claude and I run /clear. This completely wipes away the context window. I am not compacting the window. I am clearing the window. The idea here is that each issue should contain all of the information that Claude needs to perform that work. It should be able to work on the issue from a cold start. And thanks to the scratchpads and thanks to its ability to review PRs and all the previous work that's been done on the codebase, that issue should be descriptive enough for it to tackle it with no working memory. And this also frees up your context window. It will help you get better results while using fewer tokens.
Now, let me address a quick question because you probably saw that Anthropic launched uh Claude via GitHub Actions and this is a really cool feature that lets you just tag Claude directly from GitHub and have it work on some stuff.
Um, so I have been playing with that a little bit. The primary reason why I'm not using that is because as of today, um, that usage of the GitHub Actions is billed with metered billing against your API, even if you're on a Claude Max plan. So, I have upgraded now to the $200 a month Claude Max plan. I am finding it is totally worth it to get the Claude 4 Opus use. Um, I've just been thrilled with the value I'm getting there, but then I was kind of bummed to then get a $50 API bill from Anthropic after I had been uh tagging Claude in GitHub. And so, I was like, man, if I'm already getting unlimited access, uh, I might as well just do it in the console. And candidly, I think I'm getting much better uh insight and results from using Claude Code in the console. And so I actually talked to a friend Martin who works at Anthropic and his suggestion was use uh Claude in the GitHub Actions when you're say doing a PR review and there's a small change, perhaps a copy change or just like something tiny that needs to be tweaked but you don't necessarily want to go into the codebase and do it yourself. It's really good for those smaller fixes, but you probably don't want to be using GitHub Actions for really large meaningful changes to your codebase.
Uh, finally, let me just talk about work trees because, uh, Anthropic talks about this quite a bit. The best analogy that I have for work trees would be multi-tabling in poker. You know, you start playing online poker on a single table and then you realize you're just kind of clicking buttons every once in a while. You could probably play two tables at a time and then at some point you've bought a bigger monitor and you're like playing four or eight tables at a time. That's sort of what running Claude work trees feels like. Uh instead of different poker tables up, you're just tabbing between different tabs in the terminal.
And generally I think that the industry as a whole is excited about uh running coding agents in parallel or in the background. And work trees is the method that you can use with Git to run multiple instances of Claude working on multiple issues at the same time. I personally ran into two issues with it. The first is because I'm just getting started building this app. There's so much work that just simply needs to be done iteratively. There aren't a lot of features that can be developed in parallel where the codebases don't touch each other. Um, I found the interface for working with work trees to be a little bit clunky. The general idea behind a work tree is that you create copies of your git repo in separate subdirectories and then you have one version of Claude running in, you know, subdirectory A on let's just call it branch A and then you have another one running on branch B and they're running in parallel in two different directories on your computer.
Um, the issue that I had was that when I spun up a new version of Claude, like a new Claude session, I didn't have the same permissions that I had already approved on that first session of Claude. And so, every time I created a new branch, I was having to approve all the permissions again. And I just felt like I was having to babysit it a lot more. And then what happens is after you have finished work on that issue or that branch, you're supposed to delete that directory and then create a new work tree again. And so every time you're creating a new work tree, you're reapproving those permissions. And it just felt like I was doing more babysitting and more cleaning up merge conflicts than it was really worth. Uh I found that just working with a single Claude instance is sufficient for me.
Now, if you made it this far, you'd probably also enjoy the video I did on Claude Code pro tips. So check that one out.