Why Your AI-Built App Breaks — Mistakes Mirror Playground

🧠 01

CATEGORY 01

Prompt Chaos

Mistakes that come from vague, oversized, or unscoped prompting.

⚠️ High +7 PTS

Vague prompts produce vague apps

"I wrote: "build me a dashboard.""

Do you do this?

⚡ Impact

The AI invents product decisions you never made — wrong users, wrong data, wrong success criteria.

🔥 Why it breaks

Without a target, the agent guesses, and every guess is a decision you have to either accept or rebuild later.

🛠️ Quick fix

1 Define the user, the data, the action, and what "done" looks like.
2 Replace the one-line prompt with a short feature spec.
3 Push back when the agent makes assumptions you did not approve.

🏆 Senior move

Treat every prompt as a spec. If the agent has to guess, the spec is too vague.

🚨 Critical +10 PTS

Asking for the whole SaaS in one prompt

"I asked AI to build the whole SaaS at once."

Do you do this?

⚡ Impact

You get a huge unreviewable diff that nobody — including you — can verify, debug, or trust.

🔥 Why it breaks

Large diffs hide bad decisions, mismatched assumptions, and broken pieces inside something that almost works.

🛠️ Quick fix

1 Break the app into small implementation units.
2 Build one feature, one spec, one diff.
3 Verify each unit before the next.

🏆 Senior move

One unit at a time, one verifiable result, one reviewable diff.

💡 Medium +4 PTS

No definition of done

"I told the AI to build it — but never said what "done" means."

Do you do this?

⚡ Impact

The feature looks finished, but behavior is wrong, missing, or different from what you actually wanted.

🔥 Why it breaks

Without a checklist, both you and the agent declare victory the moment something runs.

🛠️ Quick fix

1 Add a "Verify when done" checklist to every spec.
2 Test the listed behaviors before closing the unit.
3 Update the progress tracker with what was actually verified.

🏆 Senior move

Done means verified — not running, not compiled, verified.

⚠️ High +7 PTS

The agent forgets everything between sessions

"The AI forgot what we built yesterday."

Do you do this?

⚡ Impact

The agent rewrites or contradicts decisions, breaks patterns, and recreates files you already had.

🔥 Why it breaks

Without context files and a progress tracker, every session starts from zero.

🛠️ Quick fix

1 Create the six context files: project-overview, architecture, code-standards, ai-workflow-rules, ui-context, progress-tracker.
2 Update progress-tracker.md after every meaningful change.
3 Have the agent read those files before writing any code.

🏆 Senior move

Context is memory. The progress tracker is the memory of yesterday's decisions.

💡 Medium +4 PTS

Stuck in a fix-this regression loop

"Every fix breaks something else."

Do you do this?

⚡ Impact

The codebase enters a regression cycle where each patch creates a new bug somewhere unrelated.

🔥 Why it breaks

You are treating symptoms, not root causes — and the agent has no scope discipline to limit the blast radius.

🛠️ Quick fix

1 Stop and identify the actual root cause before changing code.
2 Define the smallest scoped fix and protect unrelated files.
3 Roll back if the diff keeps growing.

🏆 Senior move

Scope every fix. If you cannot describe the root cause in one sentence, do not patch yet.

💡 Medium +4 PTS

Letting AI return prose for machine data

"I let the AI return prose for something my code consumes."

Do you do this?

⚡ Impact

Automation breaks unpredictably the moment the model rewords its answer.

🔥 Why it breaks

Free-form text is a contract you cannot rely on; small phrasing changes cascade through your pipeline.

🛠️ Quick fix

1 Use a strict JSON or schema-validated output for any machine-consumed result.
2 Validate the schema and fail loudly when it does not match.
3 Add small evals so future model changes do not silently break it.

🏆 Senior move

Treat AI output as untrusted until validated against a schema.
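The quick fix above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library; the schema, the field names, and the function name `parse_model_output` are assumptions made up for this example, and a real project might prefer a dedicated schema library.

```python
import json

# Hypothetical schema for illustration: field name -> required Python type.
TICKET_SCHEMA = {"category": str, "priority": int, "summary": str}


def parse_model_output(raw: str, schema: dict) -> dict:
    """Parse model output as JSON and fail loudly if it does not match the schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("model output must be a JSON object")
    for field, expected_type in schema.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], expected_type):
            raise ValueError(f"field {field!r} must be {expected_type.__name__}")
    extra = set(data) - set(schema)
    if extra:
        raise ValueError(f"unexpected fields: {sorted(extra)}")
    return data
```

A reworded, chatty answer raises `ValueError` at the boundary instead of flowing into the pipeline, which is the "fail loudly" behavior the card asks for.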

🚀 02

CATEGORY 02

Localhost Lies

Mistakes that happen when a local demo turns into a public app.

🚨 Critical +10 PTS

Localhost is not a real URL

"I sent a localhost link to someone and expected it to work."

Do you do this?

⚡ Impact

Nobody else can open the app. You have not deployed anything yet.

🔥 Why it breaks

localhost always refers to the machine making the request. On someone else's laptop it means their laptop, not yours.

🛠️ Quick fix

1 Deploy to a real host: Vercel, Netlify, Railway, Render, or Fly.io.
2 Replace hardcoded localhost API URLs with environment variables.
3 Test the deployed URL from a different device or browser.

🏆 Senior move

Local proves possibility. Deployment proves it can leave your laptop.

⚠️ High +7 PTS

Deployed app still calls localhost

"My deployed frontend still calls localhost for the API."

Do you do this?

⚡ Impact

The page loads in production but every API call fails. Users see a blank or broken app.

🔥 Why it breaks

The browser tries to call the user's own machine on port 3000 — which has nothing on it.

🛠️ Quick fix

1 Move the API base URL into an environment variable.
2 Set the variable separately for local, preview, and production.
3 Search the codebase for localhost before every deploy.

🏆 Senior move

Environment-driven URLs always — no "it works on my machine" hardcoding.

⚠️ High +7 PTS

Missing env vars in production

"It works locally but the deploy crashes at boot."

Do you do this?

⚡ Impact

The app fails to start in production because a required key, URL, or secret is undefined.

🔥 Why it breaks

Local .env files are normally git-ignored, so they never reach the host. Production needs its own variables, set on the host.

🛠️ Quick fix

1 List every required variable in a single config file or schema.
2 Validate them at boot — fail loudly with the missing key name.
3 Set the production values on the host before deploying.

🏆 Senior move

Env validation at startup. No silent undefined.
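One way to validate at boot, sketched with the Python standard library. The variable names in `REQUIRED_VARS` and the function name `load_config` are placeholders for illustration, not names from any particular framework.

```python
import os

# Hypothetical variable names; list whatever your app actually requires.
REQUIRED_VARS = ("DATABASE_URL", "API_BASE_URL", "SESSION_SECRET")


def load_config(env=os.environ) -> dict:
    """Return the required settings, or raise naming every missing variable."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing required env vars: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling `load_config()` as the first step of startup means a missing key stops the deploy with its name in the error, rather than surfacing later as an undefined value on the first request.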

⚠️ High +7 PTS

Preview worked, production broke

"Preview was fine but production crashed."

Do you do this?

⚡ Impact

Users hit a broken release because preview and production were not actually equivalent.

🔥 Why it breaks

Preview often points at a different database, secrets, or domain than production — divergence hides bugs.

🛠️ Quick fix

1 Treat preview as a real environment with its own variables and database.
2 Test the same flows in preview and production before each release.
3 Promote a single build artifact from preview to production where possible.

🏆 Senior move

Three real environments: local, preview, production — each with its own config.

💡 Medium +4 PTS

Built successfully, crashes when used

"It built successfully but crashes the moment a user touches it."

Do you do this?

⚡ Impact

Deploys go green, but the first real request fails — usually on missing runtime config.

🔥 Why it breaks

The build only checks compilation. It does not exercise runtime config, database access, or external services.

🛠️ Quick fix

1 Validate environment variables on app boot.
2 Add a smoke test that hits one real endpoint after deploy.
3 Use health checks that exercise the database and key services.

🏆 Senior move

Green build is necessary, not sufficient. Smoke-test the real flow.

💡 Medium +4 PTS

Server runs but nobody can reach it

"Logs say the server is up, but no one can connect."

Do you do this?

⚡ Impact

The container or VM is technically running, but bound to the wrong interface or port.

🔥 Why it breaks

A server bound to 127.0.0.1 accepts connections only from inside the same container, so traffic routed in from the host never reaches it.

🛠️ Quick fix

1 Bind the server to 0.0.0.0 inside containerized or hosted environments.
2 Match the port the host expects (often $PORT).
3 Test connectivity from outside the container, not just inside.

🏆 Senior move

Read your host's deploy doc once. Bind correctly, expose correctly.
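A minimal sketch of the binding fix, using Python's built-in HTTP server. The fallback port 8000 and the names `bind_address`, `Hello`, and `main` are assumptions for this example; check which variable your host actually sets.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def bind_address(env=os.environ) -> tuple:
    """Listen on all interfaces, on the port the host provides (8000 is an assumed local default)."""
    return ("0.0.0.0", int(env.get("PORT", "8000")))


class Hello(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def main():
    # Blocks forever; call this from your entry point.
    HTTPServer(bind_address(), Hello).serve_forever()
```

Calling `main()` starts the server. Binding to 0.0.0.0 is what makes it reachable from outside the container; the port only has to match what the host routes to.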

💡 Medium +4 PTS

Static host for a dynamic app

"I deployed a backend to a static-only host."

Do you do this?

⚡ Impact

API routes never run. The frontend works; everything that needs server logic 404s.

🔥 Why it breaks

Static hosting only serves files. It does not run server functions, edge functions, or persistent processes.

🛠️ Quick fix

1 Pick a host that supports your runtime: Node, edge, serverless functions, containers.
2 Or split into static frontend + a separate API host.
3 Verify a server route works in production before adding more.

🏆 Senior move

Match hosting to runtime. Static, serverless, and persistent are different products.

⚠️ High +7 PTS

No rollback plan

"I deployed something bad and could not undo it."

Do you do this?

⚡ Impact

A broken release stays live until you manually rebuild and redeploy from a working commit.

🔥 Why it breaks

Without versioned deployments, "go back" requires guessing what was last good.

🛠️ Quick fix

1 Use a host with versioned deployments and one-click rollback.
2 Tag releases in Git so you can redeploy a known-good commit.
3 Test rollback once before you actually need it.

🏆 Senior move

Every deploy assumes a rollback exists. If it does not, the deploy is not safe.

🔐 03

CATEGORY 03

Secret Leaks

Mistakes that expose keys, credentials, or external systems.

🚨 Critical +10 PTS

API key shipped in the frontend bundle

"The AI put my API key directly in React code."

Do you do this?

⚡ Impact

Anyone viewing the page can pull the key out of the JavaScript bundle and use it.

🔥 Why it breaks

Frontend code runs in the user's browser, where anyone can read it. There are no real secrets there.

🛠️ Quick fix

1 Rotate the exposed key immediately.
2 Move the API call to a backend route that holds the key.
3 Inspect the built bundle to confirm no keys remain.

🏆 Senior move

Secrets stay on the server. The browser never holds the credentials.

🚨 Critical +10 PTS

Secret committed to Git

"I pushed a .env file or secret to GitHub."

Do you do this?

⚡ Impact

The credential is now in Git history. Even if you delete the file, it lives on in old commits.

🔥 Why it breaks

Public or scanned repos leak credentials within minutes — bots check constantly.

🛠️ Quick fix

1 Rotate the leaked key immediately.
2 Remove it from history with git filter-repo or BFG, accepting that this rewrites history and requires a force-push.
3 Add .env to .gitignore and enable secret scanning.

🏆 Senior move

Treat any committed secret as already compromised. Rotate first, clean history second.

🚨 Critical +10 PTS

Service-role key in the browser

"I used a service-role or admin key in frontend code."

Do you do this?

⚡ Impact

Users can bypass row-level security and access or mutate data they should not see.

🔥 Why it breaks

Service-role keys are designed to skip access control. They belong only in trusted server code.

🛠️ Quick fix

1 Rotate the service-role key.
2 Use the public/anon key in the frontend, with row-level security enabled.
3 Move privileged operations to a backend route that holds the service key.

🏆 Senior move

Two key tiers: anon for the browser, service for the backend. Never mix.

⚠️ High +7 PTS

Over-broad OAuth scopes

"I asked for full Slack, GitHub, or Google permissions just to make integration easy."

Do you do this?

⚡ Impact

If your app is compromised, the attacker inherits everything you asked for — far beyond the feature's actual need.

🔥 Why it breaks

OAuth scopes are a blast-radius decision. Broad scopes mean broad consequences.

🛠️ Quick fix

1 List the minimum scopes the feature actually requires.
2 Re-request consent with the narrowed scope set.
3 Audit existing tokens and revoke ones with excessive permissions.

🏆 Senior move

Least-privilege scopes. Ask for read-only first; escalate only when the feature demands it.

⚠️ High +7 PTS

Agent has no permission layer

"The AI agent can call any tool with no checks."

Do you do this?

⚡ Impact

The agent may send, charge, delete, or deploy without anyone reviewing the action.

🔥 Why it breaks

Tool calls are real side effects. Without a permission layer, the agent's mistakes become production incidents.

🛠️ Quick fix

1 Define which tools require human approval.
2 Add an approval step in the workflow before risky tools fire.
3 Log every tool call with inputs, outputs, and approver.

🏆 Senior move

Approvals on side effects. Read-only tools auto-fire; write tools wait for a human.
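A permission layer can be as small as a wrapper around tool dispatch. The sketch below is one possible shape, not a specific agent framework's API; the tool names, `NEEDS_APPROVAL`, `audit_log`, and `call_tool` are all hypothetical.

```python
# Hypothetical tool names; anything with side effects waits for a human.
NEEDS_APPROVAL = {"send_email", "charge_card", "delete_record", "deploy"}
audit_log = []


def call_tool(name, args, tools, approver=None):
    """Run a tool, but refuse side-effecting tools unless an approver is named."""
    if name in NEEDS_APPROVAL and approver is None:
        audit_log.append({"tool": name, "args": args, "status": "blocked"})
        raise PermissionError(f"{name} requires human approval")
    result = tools[name](**args)
    audit_log.append(
        {"tool": name, "args": args, "result": result, "approver": approver, "status": "ran"}
    )
    return result
```

Read-only tools pass straight through, write tools raise until a human is recorded as approver, and every attempt lands in the log with its inputs and outcome.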

⚠️ High +7 PTS

Treating external content as trusted instructions

"I let the agent treat scraped pages, emails, or files as instructions."

Do you do this?

⚡ Impact

Malicious content can hijack the agent — exfiltrate data, send messages, or call dangerous tools.

🔥 Why it breaks

Prompt injection works because models cannot reliably distinguish data from instructions.

🛠️ Quick fix

1 Treat all external content as untrusted data, not instructions.
2 Wrap and label external text clearly in the prompt.
3 Add input/output guardrails and review tool calls triggered by external sources.

🏆 Senior move

Untrusted in, validated out. The agent never acts on raw external text.

⚠️ High +7 PTS

Coding agent runs unsandboxed on your machine

"My coding agent has full shell, file, and network access on my laptop."

Do you do this?

⚡ Impact

A bad command, malicious dependency, or prompt injection can read secrets, delete files, or call out to attackers.

🔥 Why it breaks

Unsandboxed agents have the same blast radius as you do.

🛠️ Quick fix

1 Run the agent in an isolated workspace (container, VM, dev container).
2 Limit network access and protect secret files.
3 Approve risky commands instead of auto-running them.

🏆 Senior move

Agents work in a sandbox by default. Production access is granted, not assumed.

🛡️ 04

CATEGORY 04

Fake Auth

Mistakes where login exists but real access control does not.

🚨 Critical +10 PTS

A login page is not real auth

"The login UI works, so I assumed auth is done."

Do you do this?

⚡ Impact

The backend may still be wide open — anyone calling the API directly bypasses your login form entirely.

🔥 Why it breaks

Auth lives on the server, not on the screen. UI is decoration around it.

🛠️ Quick fix

1 Protect API routes with server-side session or token checks.
2 Verify protected routes return 401 when called without a valid session.
3 Test from a tool like curl, not just the browser.

🏆 Senior move

If a route is not protected on the server, it is not protected.

🚨 Critical +10 PTS

Permission checks only in the UI

"I just hide the button if the user is not allowed."

Do you do this?

⚡ Impact

Any user can call the API directly and perform the action you tried to hide in the UI.

🔥 Why it breaks

Frontend code is not a security boundary. The browser is fully under the user's control.

🛠️ Quick fix

1 Re-check permissions on the backend for every protected action.
2 Return 403 from the API when not allowed, regardless of UI state.
3 Test by calling the API directly without the UI.

🏆 Senior move

UI hides things for usability. The backend enforces them for safety.
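The server-side checks from the two cards above can be sketched as a single function. The in-memory `SESSIONS` store, the token values, and the name `authorize` are stand-ins for illustration; a real app would look the session up in its auth provider or database.

```python
# Hypothetical in-memory session store: token -> user record.
SESSIONS = {
    "tok-alice": {"id": 1, "role": "admin"},
    "tok-bob": {"id": 2, "role": "member"},
}


def authorize(headers: dict, required_role=None):
    """Return (status, user). 401 = no valid session, 403 = valid session but not allowed."""
    auth = headers.get("Authorization", "")
    token = auth[len("Bearer "):] if auth.startswith("Bearer ") else ""
    user = SESSIONS.get(token)
    if user is None:
        return 401, None
    if required_role is not None and user["role"] != required_role:
        return 403, None
    return 200, user
```

The point of the sketch is the order: identity first (401), permission second (403), and both decided on the server regardless of which buttons the UI showed.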

🚨 Critical +10 PTS

No ownership checks on records

"Users can access any record by changing the ID in the URL."

Do you do this?

⚡ Impact

Data leaks across users. Person A loads person B's invoice, profile, or chat by guessing IDs.

🔥 Why it breaks

Authentication checks who you are. Authorization checks what you can touch — those are different.

🛠️ Quick fix

1 On every read and write, check that the record belongs to the current user or team.
2 Use random, non-sequential IDs to make guessing harder, as an extra layer and not a replacement for the ownership check.
3 Test the API as user B with user A's IDs.

🏆 Senior move

Every query includes ownership. "Whose data is this" is part of the query, not a separate check.
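Putting ownership inside the query might look like this with SQLite. The `invoices` table, its columns, and `get_invoice` are hypothetical names chosen for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE invoices (id INTEGER PRIMARY KEY, owner_id INTEGER NOT NULL, total REAL)"
)
db.executemany("INSERT INTO invoices VALUES (?, ?, ?)", [(1, 10, 99.0), (2, 20, 45.5)])


def get_invoice(conn, invoice_id: int, current_user_id: int):
    """Fetch an invoice only if it belongs to the current user; otherwise return None."""
    return conn.execute(
        "SELECT id, owner_id, total FROM invoices WHERE id = ? AND owner_id = ?",
        (invoice_id, current_user_id),
    ).fetchone()
```

User 10 gets invoice 1; user 20 asking for the same ID gets `None`, which the route can turn into a 404. Changing the ID in the URL no longer changes whose data comes back.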

⚠️ High +7 PTS

Tested with only one user account

"I only ever tested with my own account."

Do you do this?

⚡ Impact

Cross-user bugs — leaks, permission gaps, role mistakes — survive into production unseen.

🔥 Why it breaks

Single-user testing cannot reveal what happens between users.

🛠️ Quick fix

1 Create at least two test users plus an anonymous case.
2 Test that user A cannot see or change user B's data.
3 Test what unauthenticated requests get.

🏆 Senior move

Two users plus anonymous, on every protected feature.

⚠️ High +7 PTS

Admin role hardcoded to your email

"I used `email === "me@example.com"` for admin access."

Do you do this?

⚡ Impact

When a teammate joins, leaves, or your email changes, admin access breaks or sticks around incorrectly.

🔥 Why it breaks

Hardcoded checks are not access control — they are a sticky note pretending to be a system.

🛠️ Quick fix

1 Add a roles or permissions table.
2 Assign roles to users in the database.
3 Replace email checks with role checks everywhere.

🏆 Senior move

Access control lives in data, not in code branches.

⚠️ High +7 PTS

Client sees internal data

"Every role can see fields meant for admins or staff only."

Do you do this?

⚡ Impact

Confidential pricing, internal notes, costs, or PII leak into the wrong UI.

🔥 Why it breaks

Returning the whole object to every role is the easy default — and the wrong one.

🛠️ Quick fix

1 Define what each role is allowed to read.
2 Filter response fields by role on the server.
3 Never trust the client to hide sensitive fields.

🏆 Senior move

Shape the API response per role, not per UI screen.
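Filtering fields on the server can be sketched as an allow-list per role. The roles, field names, and `shape_for_role` below are invented for illustration.

```python
# Hypothetical roles and field names for illustration.
READABLE_FIELDS = {
    "client": {"id", "title", "price"},
    "staff": {"id", "title", "price", "internal_notes"},
    "admin": {"id", "title", "price", "internal_notes", "cost"},
}


def shape_for_role(record: dict, role: str) -> dict:
    """Return only the fields the role may read; unknown roles get nothing."""
    allowed = READABLE_FIELDS.get(role, set())
    return {key: value for key, value in record.items() if key in allowed}
```

An allow-list is used here on purpose: a new column added to the record later stays hidden from every role until someone decides who may read it.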

🗄️ 05

CATEGORY 05

Database Regret

Mistakes in schema, access policies, migrations, and data safety.

⚠️ High +7 PTS

Accepted AI-generated schema without review

"The AI generated the database tables and I just accepted them."

Do you do this?

⚡ Impact

The data model breaks as the app grows — wrong relationships, missing fields, painful migrations.

🔥 Why it breaks

Schema decisions outlive features. A bad model cascades into bad APIs and bad UI for months.

🛠️ Quick fix

1 Sketch the entities, relationships, and ownership before code.
2 Review every AI-proposed schema and challenge it.
3 Document the data model in architecture.md.

🏆 Senior move

You own the data model. The agent helps you implement it, not invent it.

🚨 Critical +10 PTS

No row-level security or backend authorization

"The frontend reads and writes the database directly with no policies."

Do you do this?

⚡ Impact

Any user can read or modify any row. The app's privacy is whatever the client decides to ask for.

🔥 Why it breaks

Without RLS or backend authorization, every record is effectively public.

🛠️ Quick fix

1 Enable row-level security on every exposed table.
2 Or move database access behind a backend that authorizes per request.
3 Verify with a second user account that data is properly isolated.

🏆 Senior move

Default-deny on the database. Open up access intentionally, per table, per role.

⚠️ High +7 PTS

Schema changes made manually

"I changed the schema by hand in the database UI."

Do you do this?

⚡ Impact

Local and production drift apart. Every deploy is a guess about what columns exist where.

🔥 Why it breaks

Without migration files, schema changes are tribal knowledge — and tribal knowledge does not deploy.

🛠️ Quick fix

1 Use a migration tool (Prisma, Drizzle, Alembic, Flyway, etc.).
2 Commit migrations to Git.
3 Run them automatically on deploy.

🏆 Senior move

Schema lives in migrations. The database is just where they end up.

⚠️ High +7 PTS

Local SQLite file in a serverless app

"I shipped SQLite as my production database on a serverless host."

Do you do this?

⚡ Impact

Data does not persist between cold starts, or different instances see different data.

🔥 Why it breaks

Serverless containers are ephemeral. A local file is local to the container, which can vanish.

🛠️ Quick fix

1 Use a managed Postgres, MySQL, or hosted SQLite that supports your runtime.
2 Verify data persistence across deployments and cold starts.
3 Plan backups and connection pooling for the new database.

🏆 Senior move

Pick a database that matches your runtime's persistence model.

💡 Medium +4 PTS

No indexes on common queries

"It's fast on my test data, so I didn't add indexes."

Do you do this?

⚡ Impact

Real production data slows queries to seconds — or causes timeouts under load.

🔥 Why it breaks

Without indexes, the database scans the whole table for every filter or join.

🛠️ Quick fix

1 Add indexes on columns used in WHERE, JOIN, and ORDER BY.
2 Inspect query plans for slow endpoints.
3 Re-test performance with realistic data volume.

🏆 Senior move

Indexes are part of the schema, not a performance afterthought.
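To see the effect of an index on a query plan, here is a small SQLite experiment. The `orders` table, the index name, and the `query_plan` helper are assumptions for this sketch; other databases expose plans through their own EXPLAIN syntax.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL)")
db.executemany(
    "INSERT INTO orders (user_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.0) for i in range(1000)],
)


def query_plan(conn, sql: str, params=()) -> str:
    """Return SQLite's plan for a query as one string (the detail text is the last column)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


QUERY = "SELECT * FROM orders WHERE user_id = ?"
plan_before = query_plan(db, QUERY, (7,))  # no index yet: the table is scanned
db.execute("CREATE INDEX idx_orders_user ON orders (user_id)")
plan_after = query_plan(db, QUERY, (7,))   # now the plan names idx_orders_user
```

Printing `plan_before` and `plan_after` shows the plan switching from a scan of the table to a search that uses the new index. The exact wording of the plan text varies between SQLite versions.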

⚠️ High +7 PTS

No backups or untested backups

"The host probably backs it up, right?"

Do you do this?

⚡ Impact

A bad migration, accidental delete, or disk failure can wipe out user data with no recovery.

🔥 Why it breaks

"Probably" is not a backup strategy. Untested backups often fail to restore.

🛠️ Quick fix

1 Confirm and configure automated backups for your database.
2 Set the retention window your business actually needs.
3 Test a real restore at least once.

🏆 Senior move

A backup you have not restored is a guess. Test the restore.

💡 Medium +4 PTS

Validation only in the UI

"The AI validates fields only in the form, not in the database."

Do you do this?

⚡ Impact

Bad data slips in via API calls, scripts, or other clients — and stays forever.

🔥 Why it breaks

Forms are one of many ways to write data. Without DB constraints, every other path is unprotected.

🛠️ Quick fix

1 Add NOT NULL, UNIQUE, FK, and CHECK constraints to enforce invariants in the database.
2 Validate on the server before writes.
3 Treat the form as the friendly layer, not the boundary.

🏆 Senior move

The database is the last line of defense for data integrity.
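The constraints named in the quick fix can be tried out in SQLite. The `users` table and the `try_insert` helper are hypothetical; the constraint keywords themselves are standard SQL.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    """
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE,
        age   INTEGER CHECK (age >= 0)
    )
    """
)


def try_insert(conn, email, age) -> bool:
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO users (email, age) VALUES (?, ?)", (email, age))
        return True
    except sqlite3.IntegrityError:
        return False
```

A duplicate email, a missing email, and a negative age are all rejected by the database itself, so a script or a second client that bypasses the form still cannot write bad rows.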

💳 06

CATEGORY 06

Payment Traps

Mistakes around webhooks, idempotency, and subscription state.

🚨 Critical +10 PTS

Success URL unlocks premium

"After Stripe redirects to the success URL, I unlock premium features."

Do you do this?

⚡ Impact

Anyone who knows the success URL can grant themselves a paid plan without paying.

🔥 Why it breaks

Redirect URLs are not proof of payment. They can be opened directly, bookmarked, or guessed.

🛠️ Quick fix

1 Grant access only after a verified webhook event from your payment provider.
2 Check the event signature and the customer's actual subscription state.
3 Make the success page a UX confirmation, not a permission grant.

🏆 Senior move

Money confirms via webhook. The success page just says "thanks."

🚨 Critical +10 PTS

Webhook endpoint accepts anything

"My webhook endpoint accepts any POST as a real event."

Do you do this?

⚡ Impact

Anyone can fake a payment event and unlock features, refunds, or admin actions.

🔥 Why it breaks

Webhook URLs are reachable from the internet. Without signature verification, they trust the world.

🛠️ Quick fix

1 Verify the signature on every webhook using the provider's secret.
2 Reject unsigned or invalid requests.
3 Rotate webhook secrets if they have ever been logged.

🏆 Senior move

Treat webhook bodies as untrusted until the signature is verified.
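The general idea behind signature verification is an HMAC over the raw request body. The sketch below shows that idea only; it is not any provider's exact scheme. Real providers define their own header format and often include a timestamp, so prefer their official library where one exists. The names `sign` and `is_valid_webhook` are made up for this example.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def is_valid_webhook(secret: bytes, body: bytes, signature_header) -> bool:
    """Accept the request only if the header matches the expected signature."""
    if not signature_header:
        return False
    expected = sign(secret, body).encode()
    # compare_digest avoids leaking how many characters matched via timing.
    return hmac.compare_digest(expected, signature_header.encode())
```

Verify against the raw bytes as received, before any JSON parsing or re-serialization, since even a whitespace change produces a different signature.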

⚠️ High +7 PTS

No idempotency on payment actions

"When the webhook retries, my code creates duplicate records or charges."

Do you do this?

⚡ Impact

Users get double-charged, double-granted access, or double-emailed when retries happen.

🔥 Why it breaks

Webhooks retry by design. Without idempotency, every retry is a new side effect.

🛠️ Quick fix

1 Store processed event IDs and skip duplicates.
2 Use idempotency keys for outbound charges and writes.
3 Make handlers safe to run twice.

🏆 Senior move

Idempotent by default for any external trigger or retry.
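Storing processed event IDs can be sketched with a primary key doing the deduplication. The table names, the event shape, and `handle_event` are assumptions for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE processed_events (event_id TEXT PRIMARY KEY)")
db.execute("CREATE TABLE grants (user_id INTEGER, plan TEXT)")


def handle_event(conn, event: dict) -> bool:
    """Apply the event once. Return False if this event ID was already processed."""
    with conn:  # one transaction: record the ID and apply the side effect together
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_events (event_id) VALUES (?)",
            (event["id"],),
        )
        if cur.rowcount == 0:
            return False  # duplicate delivery, skip
        conn.execute(
            "INSERT INTO grants VALUES (?, ?)", (event["user_id"], event["plan"])
        )
    return True
```

Delivering the same event twice produces one grant. Recording the ID and applying the change in the same transaction matters: if the handler crashes midway, neither is saved and the retry can run cleanly.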

⚠️ High +7 PTS

Test and live keys mixed

"Stripe worked in test mode but live mode broke."

Do you do this?

⚡ Impact

Real charges fail, or test events leak into production state.

🔥 Why it breaks

Test and live are separate worlds with separate webhooks, secrets, and product IDs.

🛠️ Quick fix

1 Use entirely separate keys, secrets, and product IDs per environment.
2 Validate which mode the app is in at boot.
3 Run one real end-to-end transaction with live keys, such as a small charge you then refund, before launch.

🏆 Senior move

Test mode and live mode are different products. Treat them that way.

⚠️ High +7 PTS

Subscription state out of sync

"User paid, but the app still says they are on the free plan."

Do you do this?

⚡ Impact

Customers pay and lose access — or stop paying and keep access. Both are bad.

🔥 Why it breaks

Without webhook-driven state, your app's subscription field is just a stale copy.

🛠️ Quick fix

1 Listen to subscription created, updated, canceled, and renewed events.
2 Update your local subscription state from those events.
3 Reconcile on login as a safety net.

🏆 Senior move

Subscription state mirrors the payment provider, not the other way around.

🧪 07

CATEGORY 07

Happy-Path Illusions

Mistakes where the demo works but real users break it.

💡 Medium +4 PTS

Happy-path only testing

"I tested the demo once and called it done."

Do you do this?

⚡ Impact

Real users hit empty states, errors, slow networks, and edge cases — and the app falls over.

🔥 Why it breaks

The happy path is the smallest slice of real usage.

🛠️ Quick fix

1 Test empty, loading, error, retry, and "too much data" states.
2 Try the app with a slow connection.
3 Test with realistic data, not just clean demo data.

🏆 Senior move

Build the unhappy paths on purpose. They are most of production.

💡 Medium +4 PTS

Never tested on mobile

"It looks fine on my laptop, so I shipped."

Do you do this?

⚡ Impact

Mobile users see broken layouts, unreachable buttons, and forms that cannot be submitted.

🔥 Why it breaks

For many apps, most traffic comes from phones. "Looks fine on my laptop" is a sample of one.

🛠️ Quick fix

1 Open the app on your phone over real mobile data.
2 Test responsive breakpoints in DevTools.
3 Verify tappable areas, scrolling, and keyboard behavior.

🏆 Senior move

Mobile is the default viewport. Desktop is a wider variant.

💡 Medium +4 PTS

Never tested with bad input

"I only entered valid data into my forms."

Do you do this?

⚡ Impact

Real users paste, fat-finger, or attack inputs — and the app crashes or accepts garbage.

🔥 Why it breaks

Validation only against valid data tests nothing. Bad input is the test.

🛠️ Quick fix

1 Try empty, very long, special characters, and clearly invalid values.
2 Add server-side validation in addition to client-side.
3 Show clear errors instead of silent failures.

🏆 Senior move

Validate on the server. Treat client validation as UX, not a guarantee.

💡 Medium +4 PTS

Never tested an expired session

"I never signed out, never let a session expire."

Do you do this?

⚡ Impact

When sessions expire in production, users see broken pages, lost work, or confusing errors.

🔥 Why it breaks

Expired sessions are a common state your code rarely handles by default.

🛠️ Quick fix

1 Manually expire a session and walk through the app.
2 Handle 401 responses by redirecting to login.
3 Preserve in-progress work where possible across re-auth.

🏆 Senior move

Expired and refreshed sessions are part of the supported lifecycle.

💡 Medium +4 PTS

Never tested rapid double-click

"I clicked submit once and called it tested."

Do you do this?

⚡ Impact

Real users double-click, retry, and refresh — creating duplicate records, payments, or emails.

🔥 Why it breaks

Without idempotency or UI guards, every duplicate click is a duplicate action.

🛠️ Quick fix

1 Disable submit buttons after the first click until the response returns.
2 Use idempotency keys on backend writes that must not duplicate.
3 Test rapid clicks with DevTools network throttling turned on, so the request stays in flight long enough to click twice.

🏆 Senior move

Assume every action is triggered twice. Make it safe to be triggered twice.

⚠️ High +7 PTS

No evals for AI features

"The prompt looked good once, so I shipped it."

Do you do this?

⚡ Impact

Model updates, prompt edits, or tool changes silently regress behavior — and you find out from users.

🔥 Why it breaks

AI features look stable until inputs vary. Without evals, regressions are invisible.

🛠️ Quick fix

1 Pick 10–30 representative inputs with expected behaviors.
2 Run them automatically on every prompt or model change.
3 Watch trends, not single results.

🏆 Senior move

Eval suites are tests for AI features. Treat them like unit tests.
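An eval suite does not need a framework to start. The sketch below is one minimal shape; `fake_classifier` is a stand-in for a real model call, and the cases and the name `run_evals` are invented for the example.

```python
def run_evals(model_fn, cases) -> dict:
    """Run each case through model_fn and report pass count and failures."""
    failures = []
    for case in cases:
        output = model_fn(case["input"])
        if not case["check"](output):
            failures.append({"input": case["input"], "output": output})
    passed = len(cases) - len(failures)
    return {"passed": passed, "total": len(cases), "failures": failures}


# Stand-in for a real model call, for illustration only.
def fake_classifier(text: str) -> str:
    return "refund" if "money back" in text.lower() else "other"


CASES = [
    {"input": "I want my money back", "check": lambda out: out == "refund"},
    {"input": "How do I reset my password?", "check": lambda out: out == "other"},
    {"input": "Please refund order 1182", "check": lambda out: out == "refund"},
]
```

Running `run_evals(fake_classifier, CASES)` reports two of three passing and lists the third input as a failure, which is the kind of gap a single manual check of the prompt would miss.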

👁️ 08

CATEGORY 08

Silent Failures

Mistakes where you cannot see what your app or agent is doing.

⚠️ High +7 PTS

No logs in production

"When something fails, I have no idea why."

Do you do this?

⚡ Impact

Debugging becomes guessing. Real incidents take hours instead of minutes.

🔥 Why it breaks

Without logs, you only know what users tell you — which is usually "it broke."

🛠️ Quick fix

1 Add structured logs with timestamps and request IDs.
2 Log key events: incoming requests, errors, external calls.
3 Send logs to a service you can actually search.

🏆 Senior move

Logs are the first feature of production-readiness.
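Structured logging can start as one JSON object per line. This is a bare-bones sketch with the standard library; the field names and the helpers `log_event` and `new_request_id` are assumptions, and most teams would eventually route this through a logging library.

```python
import json
import time
import uuid


def new_request_id() -> str:
    """Generate an ID to attach to every log line for one request."""
    return uuid.uuid4().hex


def log_event(event: str, request_id: str, **fields) -> str:
    """Build one JSON log line with a timestamp and request ID, print it, and return it."""
    record = {"ts": time.time(), "event": event, "request_id": request_id, **fields}
    line = json.dumps(record, default=str)
    print(line)
    return line
```

Because every line carries the same `request_id`, searching for that ID reconstructs one request's path through incoming call, external calls, and error.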

⚠️ High +7 PTS

No traces for agent runs

"The agent did something, but I cannot see what."

Do you do this?

⚡ Impact

When an agent takes a wrong action, you cannot reconstruct why.

🔥 Why it breaks

Agents make many model calls, tool calls, and handoffs. Without traces, the inner loop is opaque.

🛠️ Quick fix

1 Trace model calls, tool calls, handoffs, and approvals.
2 Persist traces with run IDs.
3 Inspect failed runs end to end before patching.

🏆 Senior move

If you cannot trace the agent, you cannot improve it.

💡 Medium +4 PTS

No health checks

"Deploys say success while the app is actually broken."

Do you do this?

⚡ Impact

Bad releases stay live until users complain.

🔥 Why it breaks

"Build succeeded" only proves the build, not the app.

🛠️ Quick fix

1 Add a /health route that exercises the database and key dependencies.
2 Hook it into your host's health check.
3 Alert when it goes red.

🏆 Senior move

Health checks measure the app, not the build.
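The core of a /health route is a function that actually touches its dependencies. Here is a sketch against SQLite; the name `health` and the response shape are assumptions, and wiring it to a route depends on your framework.

```python
import sqlite3


def health(conn) -> tuple:
    """Return (status_code, body). 200 only if a real query against the database succeeds."""
    try:
        conn.execute("SELECT 1").fetchone()
    except sqlite3.Error as exc:
        return 503, {"status": "error", "database": str(exc)}
    return 200, {"status": "ok", "database": "ok"}
```

A handler that returns 200 without running a query only proves the process is alive. Returning 503 when the database is unreachable is what lets the host's health check catch a bad release.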

💡 Medium +4 PTS

No cost monitoring on AI calls

"My AI bill exploded out of nowhere."

Do you do this?

⚡ Impact

Runaway costs, surprise bills, and no idea which feature is responsible.

🔥 Why it breaks

AI calls compound: retries, tool loops, large contexts. Costs blow up quickly.

🛠️ Quick fix

1 Track tokens and calls per feature.
2 Cache where you can; cap retries and context size.
3 Alert on daily spend thresholds.

🏆 Senior move

Spend is a metric. Watch it like latency.

💡 Medium +4 PTS

No rate-limit handling

"It works for me, fails for many users."

Do you do this?

⚡ Impact

External APIs return 429s under load, breaking flows that worked for a single user.

🔥 Why it breaks

Rate limits are a feature, not an error. Without backoff and queues, they cascade.

🛠️ Quick fix

1 Add exponential backoff on retryable errors.
2 Queue work that exceeds per-second limits.
3 Surface partial progress instead of failing the whole job.

🏆 Senior move

Treat rate limits as part of the API contract, not an exception.
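Exponential backoff can be sketched in a few lines. `RateLimited` is a hypothetical exception standing in for whatever your client raises on a 429, and the delays and attempt count are illustrative defaults, not recommendations from any provider.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error raised by a client when the API answers 429."""


def call_with_backoff(fn, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call fn, retrying on RateLimited with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            delay = base_delay * (2 ** attempt)
            # Jitter spreads out clients that were all limited at the same moment.
            sleep(delay + random.uniform(0, delay / 2))
```

The `sleep` parameter is injectable so the retry logic can be tested without waiting. If the API sends a Retry-After header, honoring it is usually better than a computed delay.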

💡 Medium +4 PTS

No alerting

"Users tell me about outages before my system does."

Do you do this?

⚡ Impact

Slow incident response, lost trust, and a backlog of unseen errors.

🔥 Why it breaks

Without alerts, observability data sits unread.

🛠️ Quick fix

1 Set alerts on error rates, downtime, and key business metrics.
2 Route them to a channel you actually watch.
3 Tune thresholds so they signal, not spam.

🏆 Senior move

Alerts on outcomes you care about. Quiet otherwise.

🌿 09

CATEGORY 09

No Review, No Rollback

Mistakes in commits, reviews, branches, and dependencies.

⚠️ High +7 PTS

Coding without commits

"I keep changing files without committing."

Do you do this?

⚡ Impact

When something breaks, you cannot roll back to a known-good state.

🔥 Why it breaks

Without commits, every change is permanent and entangled.

🛠️ Quick fix

1 Commit after each working unit, not at end of day.
2 Use clear, scoped commit messages.
3 Push often so work is not stuck on your laptop.

🏆 Senior move

Every working unit is a commit. The agent helps you keep them small.

⚠️ High +7 PTS

Accepting all AI changes blindly

"I clicked accept on everything."

Do you do this?

⚡ Impact

Hidden bad changes, broken patterns, and unwanted edits ride along into the codebase.

🔥 Why it breaks

Accept-all reverses the architect-engineer relationship: the agent decides, you ratify.

🛠️ Quick fix

1 Review the diff before accepting.
2 Reject changes outside the spec scope.
3 When the diff is wrong, ask the agent to redo it rather than patching what it produced.

🏆 Senior move

Diff review is the moment you stay in charge.

💡 Medium +4 PTS

Massive PRs nobody can review

"The agent changed 40 files in one PR."

Do you do this?

⚡ Impact

Reviews become rubber stamps. Bugs hide in the noise.

🔥 Why it breaks

Beyond a certain size, humans stop reading and start trusting.

🛠️ Quick fix

1 Split work into small units before implementation.
2 Ask the agent to keep diffs scoped to one unit.
3 Reject PRs that grew beyond their stated scope.

🏆 Senior move

Small PRs you read line by line; big PRs you only pretend to.

⚠️ High +7 PTS

Direct push to main

"I push straight to the production branch."

Do you do this?

⚡ Impact

Bad code goes live with no second pair of eyes.

🔥 Why it breaks

Branch protection exists because humans (and agents) make mistakes.

🛠️ Quick fix

1 Protect the main branch.
2 Require PRs and basic checks (build, tests) before merge.
3 Use preview deploys to verify before merging.

🏆 Senior move

Main is the final state. It is reached, not edited.

💡 Medium +4 PTS

Letting AI install random packages

"The agent installed packages I have never heard of."

Do you do this?

⚡ Impact

Supply-chain risk: malicious or abandoned packages enter your build.

🔥 Why it breaks

Many packages have similar names. Some are typosquats. Some are abandoned. Some are malicious.

🛠️ Quick fix

1 Review every new dependency before installing.
2 Run audit tools and pin versions.
3 Prefer well-known, maintained packages.
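
The "pin versions" part of step 2 can be checked mechanically. The sketch below assumes a Python-style `requirements.txt`; the function name `unpinned_requirements` is hypothetical, and the pattern is a rough check that will not understand every requirement form (URLs, for example). It complements audit tools and does not replace them.

```python
import re

# Matches "name==version" with optional extras, e.g. "requests[socks]==2.32.3".
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[A-Za-z0-9,._-]+\])?==[A-Za-z0-9.!+_-]+$")


def unpinned_requirements(text: str) -> list[str]:
    """Return requirement lines that are not pinned to an exact version."""
    problems = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0]          # drop trailing comments
        line = line.split(";", 1)[0].strip() # drop environment markers
        if not line or line.startswith("-"): # skip blanks and pip options like "-r"
            continue
        if not PINNED.match(line):
            problems.append(line)
    return problems
```

Run in CI, a check like this turns "the agent added a package" into a visible diff line someone has to approve.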

🏆 Senior move

Dependencies are decisions. Review them like code changes.

📌 Low +2 PTS

No changelog or release notes

"I have no idea what changed between yesterday and today."

Do you do this?

⚡ Impact

Debugging regressions becomes archeology.

🔥 Why it breaks

Without a record, every "when did this break" turns into a git bisect.

🛠️ Quick fix

1 Tag releases.
2 Keep a short changelog or notes per release.
3 Update progress-tracker.md after each unit.

🏆 Senior move

The progress tracker is your changelog.

🤖 10

CATEGORY 10

Agent Autonomy Danger

Mistakes specific to agentic apps: approvals, guardrails, state.


⚠️ High +7 PTS

A chatbot pretending to be a workflow

"The agent decides what to do at every step."

Do you do this?

⚡ Impact

Behavior is unpredictable. Edge cases produce wildly different paths every run.

🔥 Why it breaks

Open-ended agency is hard to test, hard to constrain, and hard to debug.

🛠️ Quick fix

1 Replace open-ended agency with a state machine of explicit steps.
2 Use the model only where decisions are actually fuzzy.
3 Write tests for the workflow paths you support.
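
A state machine of explicit steps can be as small as an enum plus a transition table. The workflow below (a support-reply flow with steps like `DRAFTED` and `APPROVED`) is a made-up example, and `advance` is a hypothetical helper; the model would be called inside individual steps, not to choose the next step.

```python
from enum import Enum, auto


class Step(Enum):
    RECEIVED = auto()
    CLASSIFIED = auto()
    DRAFTED = auto()
    APPROVED = auto()
    SENT = auto()
    REJECTED = auto()


# Every allowed move is listed here; anything else is a bug, not a model decision.
TRANSITIONS = {
    Step.RECEIVED: {Step.CLASSIFIED},
    Step.CLASSIFIED: {Step.DRAFTED, Step.REJECTED},
    Step.DRAFTED: {Step.APPROVED, Step.REJECTED},
    Step.APPROVED: {Step.SENT},
    Step.SENT: set(),      # terminal
    Step.REJECTED: set(),  # terminal
}


def advance(current: Step, target: Step) -> Step:
    """Move to target if the workflow allows it; otherwise raise."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target
```

Because the table is plain data, the supported paths are easy to enumerate in tests, which covers step 3.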

🏆 Senior move

Workflows for the predictable parts. Agency only where it earns its keep.

🚨 Critical +10 PTS

No approval gates on risky actions

"The agent can send, charge, delete, or deploy without asking."

Do you do this?

⚡ Impact

A small mistake becomes a real-world incident with money or data on the line.

🔥 Why it breaks

Side effects are not reversible the way edits are. Agents need help knowing where the line is.

🛠️ Quick fix

1 Identify the risky tools: send, charge, delete, deploy, post.
2 Require human approval before they fire.
3 Show the action's payload before approval, not after.
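
The three steps above can be sketched as a wrapper around tool execution. Everything named here is hypothetical: `run_tool`, `ApprovalRequired`, the `tools` dictionary of callables, and the `approve` callback, which stands in for whatever UI or chat prompt asks a human.

```python
import json

# Tools with real-world side effects, taken from the list above.
RISKY_TOOLS = {"send", "charge", "delete", "deploy", "post"}


class ApprovalRequired(Exception):
    """Raised when a risky tool call has not been approved by a human."""


def run_tool(name, payload, tools, approve):
    """Run a tool; risky tools run only after approve(name, preview) returns True."""
    if name in RISKY_TOOLS:
        # Show the exact payload before anything fires, not after.
        preview = json.dumps(payload, indent=2, sort_keys=True)
        if not approve(name, preview):
            raise ApprovalRequired(f"{name} was not approved")
    return tools[name](**payload)
```

The gate sits in the wrapper, not in the prompt, so the model cannot talk its way past it.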

🏆 Senior move

Approvals on side effects, every time, until the workflow has earned trust.

⚠️ High +7 PTS

Tool calls trusted automatically

"The agent's tool inputs are passed through with no checks."

Do you do this?

⚡ Impact

Bad inputs mutate systems, leak data, or trigger expensive actions.

🔥 Why it breaks

Tool inputs come from model output, which can hallucinate or be steered by prompt injection.

🛠️ Quick fix

1 Validate inputs before running the tool.
2 Validate outputs before returning them to the model or user.
3 Add allowlists for high-risk parameters.
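
Input validation with an allowlist might look like the sketch below, for an imagined read-only lookup tool. The argument names (`table`, `customer_id`, `limit`), the allowed tables, and the limit cap are all assumptions made for illustration.

```python
ALLOWED_TABLES = {"orders", "invoices"}  # hypothetical allowlist for a high-risk parameter
MAX_LIMIT = 100


def _is_int(value) -> bool:
    """True for real integers only; bool is a subclass of int, so exclude it."""
    return isinstance(value, int) and not isinstance(value, bool)


def validate_query_args(args: dict) -> dict:
    """Check model-proposed arguments and return a clean copy, or raise ValueError."""
    unknown = set(args) - {"table", "customer_id", "limit"}
    if unknown:
        raise ValueError(f"unexpected arguments: {sorted(unknown)}")
    table = args.get("table")
    if table not in ALLOWED_TABLES:
        raise ValueError(f"table not allowed: {table!r}")
    customer_id = args.get("customer_id")
    if not _is_int(customer_id) or customer_id <= 0:
        raise ValueError("customer_id must be a positive integer")
    limit = args.get("limit", 20)
    if not _is_int(limit) or not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    return {"table": table, "customer_id": customer_id, "limit": limit}
```

Rejecting unknown keys matters as much as checking known ones: a hallucinated or injected extra argument never reaches the tool.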

🏆 Senior move

Tool wrappers enforce contracts. The model proposes; the wrapper validates.

💡 Medium +4 PTS

No durable task state

"An agent run got interrupted and the work disappeared."

Do you do this?

⚡ Impact

Long-running tasks fail mid-way and either repeat work or lose progress.

🔥 Why it breaks

Agents that hold state in memory die with their process.

🛠️ Quick fix

1 Persist task state at meaningful checkpoints.
2 Use a queue or workflow engine for durable execution.
3 Make tasks resumable from the last checkpoint.
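
Checkpointing and resuming can be sketched with a JSON file, assuming a single worker and a task that is a list of items processed in order. The function names and the state shape (`next_index`, `results`) are hypothetical; a queue or workflow engine would replace this in production.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return saved state, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    return {"next_index": 0, "results": []}


def save_checkpoint(path, state):
    """Write state atomically: write a temp file, then rename it over the old one."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(state, f)
    os.replace(tmp, path)  # a crash mid-write never leaves a half-written checkpoint


def run_task(items, process, path):
    """Process items in order, checkpointing after each so a restart resumes."""
    state = load_checkpoint(path)
    for i in range(state["next_index"], len(items)):
        state["results"].append(process(items[i]))
        state["next_index"] = i + 1
        save_checkpoint(path, state)
    return state["results"]
```

A crash between `process` and `save_checkpoint` means that one item runs again on restart, so each step should be safe to repeat.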

🏆 Senior move

Long tasks are workflows, not one-shot calls.

💡 Medium +4 PTS

No audit log for agent decisions

"We don't know who approved what or why."

Do you do this?

⚡ Impact

When something goes wrong, there is no record of the decision trail.

🔥 Why it breaks

Agentic systems make many small decisions. Without an audit trail, accountability is impossible.

🛠️ Quick fix

1 Log approvals, tool calls, and state transitions with run IDs.
2 Record the inputs and outputs of risky actions.
3 Keep logs long enough for incident review.
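
A decision trail can start as one JSON line per event, tied together by a run ID. In this sketch the `audit` helper, the event names, and the field names (`tool`, `approved_by`, and so on) are hypothetical, and an in-memory stream stands in for an append-only file or log service.

```python
import io
import json
import time
import uuid


def new_run_id() -> str:
    """Random identifier shared by every event in one agent run."""
    return uuid.uuid4().hex


def audit(stream, run_id, event, **details):
    """Append one JSON line describing an approval, tool call, or state change."""
    record = {"ts": time.time(), "run_id": run_id, "event": event, **details}
    stream.write(json.dumps(record, sort_keys=True) + "\n")
    return record


log = io.StringIO()  # stand-in for an append-only file or log service
run_id = new_run_id()
audit(log, run_id, "tool_call", tool="charge", input={"amount_cents": 1200})
audit(log, run_id, "approval", tool="charge", approved_by="alice@example.com", approved=True)
```

Filtering the log by `run_id` reconstructs who approved what, with which inputs, in order.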

🏆 Senior move

Audit logs make agentic systems reviewable, not just runnable.

💡 Medium +4 PTS

Too many agents too early

"I added 10 agents for what is really a simple app."

Do you do this?

⚡ Impact

More complexity, more failure modes, more cost — and no extra reliability.

🔥 Why it breaks

Multi-agent systems multiply latency and bugs without multiplying capability.

🛠️ Quick fix

1 Start with one workflow.
2 Introduce specialist agents only when a clear need appears.
3 Measure whether each agent earns its keep.

🏆 Senior move

One agent until proven necessary. Coordination is its own bug surface.

⚡ YOUR RESULT

Your Vibe Risk Score

You're not bad at building — AI just made the demo easy and hid the real engineering work.


Your AI-built app works on localhost. Does it actually work for real users?

