Brought to you by:
WorkOS: Make your app enterprise-ready today
Metaview: The agentic recruiting platform for winning teams

Bryce Rattner Keithley spent her career in talent and recruiting and had never written a line of code. Then she used AI to build Daily Hundred, a fitness app with custom AI-generated videos of animals doing exercises, and shipped it to the App Store. In this episode, Bryce shares the exact workflow she used with Replit, Claude, Gemini, Higgsfield, and Kling; why being non-technical became an advantage; and what her journey reveals about how AI is changing who gets to build software.

You can build and ship a production iPhone app with zero technical background. Bryce spent her entire career in talent and recruiting, had never written code, and still managed to build Daily Hundred, a fitness app with custom AI-generated videos, and get it approved in the App Store. The entire process took a few months of weekend work.

The workflow that worked: Claude as architect, Claude Code as engineer, Terminal as executor. Bryce used regular Claude as her “friend in the cockpit” to plan what to do and how to approach problems. Claude would tell her when to use Claude Code to write actual code. She’d bring the code back to Claude for confirmation, then Claude would tell her what to paste into Terminal. This three-step dance (plan, execute, deploy) let her ship production code without having to know exactly how it all worked.

Screenshots and iteration are your best debugging tools. When AI wasn’t understanding what Bryce wanted, she’d either get more literal in her descriptions, completely restart the prompt (not just edit it), or send screenshots showing what she was seeing. Sometimes she’d even draw what she wanted or photograph her own starting position to give the AI a visual reference. The key was trying different approaches rather than getting stuck in one failed pattern.

The role of technical expertise is fundamentally changing.
Bryce observed that engineers who come into technical interviews focused only on finding a working solution as fast as possible are missing the point: “the robots can find a working solution faster than they can.” The human role has shifted to something broader: understanding the full suite of tools, knowing when to use AI versus when to step in personally, and bringing taste and judgment to the process. What got people here won’t get them there.

Hiring for adaptability and openness matters more than ever. In Bryce’s view, people who get territorial about what they used to do or what other people used to do will struggle with relevance. The winners will be those with “the humility and the curiosity to work with others in ways that you haven’t before” and who recognize that “people can contribute in ways that they haven’t before.” The best idea should win, regardless of where it comes from.

Brought to you by:
Mercury: Radically different banking loved by over 300K entrepreneurs

Claire Vo breaks down one of her favorite Codex features: /goal. In this solo episode, she shows how Goals turn AI from a tool you have to constantly babysit into an agent that can work for hours on multi-step tasks. She walks through real examples, including eliminating Sentry errors, cleaning nearly 4,000 emails, and organizing Linear tasks, and shares the six-part framework to write Goals that actually run.

Goals enable AI to work autonomously for hours without supervision. Claire ran a goal in Codex that worked for five hours and 45 minutes, the longest she’s ever had an AI agent run successfully. Unlike standard prompts that require turn-by-turn interaction, Goals create a loop where the AI works, verifies, checks, and continues until it hits the defined outcome.

The difference between a prompt and a Goal is fundamental. A prompt is an instruction of what to do (“Rewrite this code”).
A Goal is a description of what a good outcome looks like and how to get there (“Reduce P95 checkout latency below a defined threshold while keeping the correctness suite green”).

Claire eliminated hundreds of error logs by pointing Goals at her Sentry data. She gave Codex access to every trace of invalid operations, then set a goal: categorize each issue, fix it, then replay all historical examples until every error is solved. The result: zero errors remaining, and instead of band-aid fixes scattered throughout the code, she got a systematic, intelligent framework.

Goals work incredibly well for non-technical tasks. Claire cleaned 3,900 emails down to 68 in under four hours by setting a simple goal: categorize all emails, unsubscribe from unnecessary ones, and clean up the inbox. The AI read every email, created labels, clicked unsubscribe links, and left her with only the emails requiring judgment.

Strong Goals have six key components: outcome (what should be true when done), verification (how to test it), constraints (what can’t regress), boundaries (what tools and files to use), iteration policy (how to decide what to try next), and stopping conditions (when to ask for help). Product managers who’ve written good OKRs will recognize this framework immediately.

Working with Goals feels like managing a colleague, not babysitting a tool. You assign a task, the AI goes away for the time required (whether that’s 30 minutes or five hours), and comes back with completed work for you to review. Claire found herself “twiddling her thumbs” because so much of the work was now handled autonomously.

Goals aren’t token-cheap, but they’re worth it. Claire’s email cleanup used about 6 million tokens over those four hours.
But the alternative (manually categorizing thousands of emails or chasing down hundreds of error logs) would take far longer and be far more tedious.

Claire put Anthropic’s new Opus 4.8 model through real coding, design, and strategy tests across Claude Code and Claude Cowork. She shares where the model shines, where it breaks down, how it compares to Opus 4.7, and what builders should know before using it in production.

The voice and ergonomics are excellent. Opus 4.8 is easy to read, doesn’t have “slop tells,” is token-efficient, and feels conversational without being annoying. It talks enough but not too much, and with fast mode enabled, the experience is snappy. The writing quality is strong and the model follows instructions well.

Anthropic is shipping new features alongside Opus 4.8 that expand agentic capabilities. Claude Code now has dynamic workflows that let you spin off hundreds of parallel sub-agents. Both Claude.ai and Cowork now offer effort control from low to max, giving users more control over how deeply the model thinks through problems.

Use Opus 4.8 for greenfield prototypes and design work, but test carefully for production codebases. The model excels at one-shot features, has improved design aesthetics (no more italicized emphasis words), and is good at tool use. But for existing codebases, edge cases, and strategy work requiring numerical analysis, you’ll need careful prompting and should double-check anywhere the model expresses high confidence.

The model hallucinates when it gets stuck, which is a significant regression. Claire experienced straight-up hallucinations multiple times, something she hadn’t seen in a very long time with modern models. When debugging, Opus 4.8 would make up explanations based on hypotheses rather than actual data. It would confidently state things like “No, I didn’t search GitHub” or “No, I didn’t actually validate that bug” when asked to verify its work.

Opus 4.8 struggles to orient itself in existing codebases.
When Claire asked it to rebase branches and fix conflicts in her production codebase, it required cycle after cycle of fixes because it kept shipping edge-case bugs. The model couldn’t understand the level of abstraction at which it should be operating or how to properly insert itself into existing code.

The model isn’t ambitious enough for truly agentic work. Claire asked it to suggest fun things to build that would impress a 9-year-old, pushing it to explore the edges of agentic coding. It shipped working code, but the results were serviceable rather than impressive, not the 10x agentic coding experience she expected from a state-of-the-art model.

For business strategy work, Opus 4.7 significantly outperforms Opus 4.8. Claire tested both models on the same strategy prompt, giving them access to three months of business context. Opus 4.7 delivered numbers-anchored, structured analysis rooted in real data. Opus 4.8 was hand-wavy, over-rotated on small data points, and had a harder time discovering relevant information.

If you’re enjoying these episodes, reply and let me know what you’d love to learn more about: AI workflows, hiring, growth, product strategy, anything.

Catch you next week,
Lenny

P.S. Want every new episode delivered the moment it drops? Hit “Follow” on your favorite podcast app.
🎙️ How I AI: Building an iPhone app with zero technical skills, Codex Goals explained & xxx
Your weekly listens from How I AI, part of the Lenny's Podcast Network