37 Comments
Kamil Banc:

I’m about to launch version 2.0 of Right-Click Prompt. This is going to be a commercial release and I’m obviously a little worried because I completely vibe coded it and I don’t know what I’m doing. 😂 would love to get your take on this fiasco when it happens

Jenny Ouyang:

Sounds like a solid masterpiece 🤩 Is it open to the public yet?

Kamil Banc:

No not yet

Jenny Ouyang:

Let me know when it’s ready and I’d be very happy to roast it 😆

Giuseppe Santoro 🚢:

Thanks for the mention

Jenny Ouyang:

They were truly amazing!

Diana O.:

I'm glad I read this before launching my beta test next Monday 😅.

Jenny Ouyang:

This is awesome! Let me know how it goes, and if you need more visibility, feel free to list yourself as a vibe coding builder on vibecoding.builders :)

Diana O.:

Your answer means a lot to me. I feel supported. Thanks a lot!

David:

Pro tip:

Whenever you have personal workflows like this, you should endeavor to capture them into a Claude skill (sorry, I guess I'm Anthropic-centric, and I don't think OpenAI or Gemini have the equivalent, which might be an argument for others to use Claude instead. LATE BREAKING: OpenAI is now adding skills; see Simon Willison's Dec 17 Substack). To start, just prompt Claude with "Make the following into a skill," paste this whole blog post in, then take the skill file and put it into your ~/.claude/skills/ so you have a permanent and repeatable workflow. Then update that file over time so you accumulate your learnings.
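For reference, a minimal skill file might look something like this. The name, description, and checklist items below are illustrative, distilled from the post's themes rather than copied from it; the YAML-frontmatter-plus-Markdown layout follows Anthropic's documented SKILL.md format:

```markdown
---
name: pre-launch-smoke-test
description: Run a structured smoke-test checklist before shipping an app update
---

# Pre-Launch Smoke Test

1. Walk the happy path three times in a row (state bugs hide past the first run).
2. Test on a real phone, not just the DevTools mobile view.
3. Hand the app to someone with zero context and watch where they hesitate.
4. Append new failure modes to this file after every launch.
```

Dropping that file into `~/.claude/skills/pre-launch-smoke-test/SKILL.md` makes it discoverable as a repeatable workflow you can keep refining.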

Maybe this is a small startup idea: sell these skill files for various efforts (code reviews, deployments, different kinds of testing, ...). Anthropic/Claude actually has a "plugin marketplace" feature so users can easily import skills and such from URLs; see the /plugin slash command in CC.

Pro pro tip:

If you have Python (backend) code, look into using the Hypothesis package for testing. Hypothesis is a property-based testing library for Python. Instead of writing specific test cases, you describe properties your code should satisfy, and Hypothesis automatically generates hundreds of random inputs to find edge cases that break those properties.

https://hypothesis.readthedocs.io/
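To make that concrete, here's a minimal sketch of a property-based test. The `dedupe` function is a made-up example for illustration, not something from the post: instead of hand-picking test cases, you state invariants and let Hypothesis hunt for a counterexample.

```python
from hypothesis import given, strategies as st

def dedupe(items):
    """Toy function under test: drop duplicates, keeping first-seen order."""
    seen = set()
    return [x for x in items if not (x in seen or seen.add(x))]

# Hypothesis generates many random integer lists and, on failure,
# shrinks the input to a minimal counterexample.
@given(st.lists(st.integers()))
def test_dedupe_properties(xs):
    out = dedupe(xs)
    assert len(out) == len(set(out))  # property 1: no duplicates remain
    assert set(out) == set(xs)        # property 2: no elements lost or invented
    assert dedupe(out) == out         # property 3: idempotent, a second pass is a no-op

test_dedupe_properties()  # calling the decorated function runs the whole search
```

The payoff is that you never wrote a single concrete input, yet edge cases like the empty list and repeated values get exercised automatically.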

Furthermore, you can use Schemathesis. Schemathesis automatically generates API tests from your OpenAPI/Swagger spec. It uses Hypothesis under the hood to create randomized requests that probe edge cases, invalid inputs, and unexpected combinations your API should handle.

https://schemathesis.readthedocs.io/en/stable/

These are low-cost, high-yield efforts and probably things that should be added to that skills.md file.

Jenny Ouyang:

David, this is such a thorough way to think about automating workflows! I’m with you, Claude is my favorite too, whether it’s the model, the app, or the CLI 😄 I really should brush up my own workflows and publish more of them into Claude’s plugin marketplace.

Your pro pro tip intrigued me. Hypothesis sounds perfect for enterprise-scale codebases. Schemathesis seems even more powerful, though I wonder what kind of resource costs come with that level of automated fuzzing. Do you actively use either of them in your stack?

David:

As I understand it, Claude's marketplace, at this point, isn't really an "app store" kind of place, just a way to publish skills and such so that others can pull them in. I think it answers the question of how an enterprise could have a distribution mechanism so everyone can pull in, say, some skill.md file and the organization stays in sync on a workflow. But I think we might be getting the first glimpses of where Anthropic is going here, because they could establish their own "app store"-like server host that people publish to, with a web front end for discovery (or alternately, someone could bootstrap this effort and beat them to the punch).

I learned about Hypothesis from a PyBay video recently and was like, "oh wow, I should have been doing this all along." I've tried it out but haven't deployed it; it's one of these things that once you see and grok it, you put it in your back pocket so when you need it, it's there. And "fuzz" was a good description, but think of it as not just random fuzzing: it's fuzzing with appropriate data that stresses the system. For numbers, that means testing at upper and lower bounds and certain break points, NaN/infinity (I think).

Laura Ferraz Baick:

The "worked on my machine" trap is so real, especially when you're moving fast with AI. Our team shipped features that passed every logical check but completely fell apart when someone used an older phone or clicked twice instead of once.

Jenny Ouyang:

Auugh yes! It’s wild how many different ways there are to break a perfect app 😂

Suhrab Khan:

Your breakdown is gold. Smoke testing isn’t glamorous, but it’s the difference between a launch day nightmare and a confident rollout. Every AI-built app benefits from this structured, human-first approach.

Jenny Ouyang:

Thank you, Suhrab! Much appreciated!

Neural Foundry:

Brilliant breakdown of systematic testing without over-engineering. The "three times" rule for happy-path testing is spot on; state bugs always hide past the first run. DevTools mobile view vs. actual device testing is probably the most underrated gap in most people's workflows; I've seen too many "works on my machine" disasters from skipping real hardware. The fresh-eyes testing phase is gold for catching UX blind spots.

Jenny Ouyang:

Thank you so much!!

Elena Calvillo at Product:

I like that you mention the importance of smoke tests, which many overlook. Smoke tests are an essential part of the SDLC and the best way to ensure a successful launch (well, most of the time). This is especially important when managing many teams that are building different things; otherwise, it would be chaotic!

Thanks for mentioning it, Jenny! I can't believe we are halfway through the aiadventchallenge.com, and so much is going on! 🎄

Jenny Ouyang:

Thanks for reading Elena! I look forward to interviewing you about this AI advent challenge!

Elena Calvillo at Product:

Can’t wait for that interview to happen! 🫶🏻

Hodman Murad:

This checklist is going straight into my workflow!

Jenny Ouyang:

Thank you Hodman 🙌

Sam Illingworth:

Thanks Jennifer, this is such a great post. I hadn't heard the term smoke test before, but it makes complete sense that this is what we should be doing before we launch our products.

I wonder though, is there something we should be doing with the other testers rather than just ourselves, so that we can get beyond our own blinkered approach to how different users might be using the product?

For me, this is really reminiscent of when I design tabletop games and I invite different users, or rather players, to play-test the game until it breaks. They find ways to play the game that I would never have imagined. In a way, I guess this is an analogue version of the smoke test.

Jenny Ouyang:

Thanks Sam! Wow, your example is so interesting, it really is an analogue version of smoke testing, but it actually goes even further. It sounds like your play-testers are reshaping the product itself. That goes beyond surface testing into something more transformational.

And from that stance, I think there’s a blurry line between testing for breakage and listening for insight. In many cases, what starts as “let’s make sure it works” quickly turns into “wait, maybe it shouldn’t work that way at all.” I think that’s what’s implied in your story too, it’s not just validation, it’s creative redirection.

For me, that’s the most revealing part of handing a product to someone else with zero context. The way they pause, click the “wrong” thing, or hesitate, that confusion is a usability bug. Not in the technical sense, but in the behavioral sense: it’s where system assumptions and human expectations collide. And that’s where the real learning begins.

John Brewton:

Smoke testing is the fastest way to protect trust before you invite real users in.

Jenny Ouyang:

Yeah… I’m so grateful that my users are still with me after all those breaks :)

Richard:

Now that we’re in the vibe coding era, this is more important than ever.

Jenny Ouyang:

Definitely! Thanks for reading Richard!

Double ID:

What I appreciate here isn’t the tactic, but the restraint behind it.

This isn’t really about smoke testing. It’s about refusing to confuse momentum with signal.

Most people use “vibe” as permission to avoid reality. This reframes it as something that still has to answer to it.

That distinction alone can save months of misdirected effort.

🙌🤝🙌

Jenny Ouyang:

So true... thank you for pointing that out. Yes, most people use "vibe" as an excuse to ignore reality, but it doesn't have to be that way.

Dheeraj Sharma:

Extremely detailed as usual!! Full of useful tips and guidance. It comes right on time for me, too, to make sure I'm not missing anything before shipping my first app in the first week of 2026.

Jenny Ouyang:

This is awesome! I would love to learn more about your first app!

Karen Spinner:

I already do most of what you describe, but it’s great to see it organized and packaged like this! Especially love the checklist! ❤️🙏

Jenny Ouyang:

Haha I totally had your testing rituals in mind while writing this one 😄 glad the checklist landed!
