Architecture · System-Wide · Complex · 1-3 weeks
Architecture Review & Refactor
Get a structured review of a refactor plan with breaking-change and risk assessment.
When to use
Use this when refactoring a module to support a new requirement and you want an architectural sanity check before shipping.
I'm refactoring [module] to support [new requirement]. Can you review this architecture pattern:
1. Current structure: [describe]
2. Proposed changes: [describe]
3. Impact areas: [list]
Help me identify breaking changes, performance implications, and database migration strategy. Output: detailed refactor plan with risk assessment.
Attach when sending
- Current architecture diagram or summary
- Proposed changes
- List of impacted modules
Tip: Replace bracketed items [like_this] with your specific details before sending. The more context you paste alongside, the sharper the answer.
Integration · Single Module · Medium · 1-3 days
Third-Party API Integration
Plan a robust third-party API integration with retries, circuit breakers, and schema changes.
When to use
Use this when integrating Stripe, Razorpay, Twilio, OpenAI, Salesforce, or any external API that has rate limits and webhooks.
I need to integrate [API_NAME] into [feature]. Help me:
1. Design the API client wrapper
2. Handle error cases and rate limiting
3. Plan database schema changes
4. Add retry logic and circuit breaker
Provide: code structure, test cases, and deployment checklist.
Attach when sending
- API provider docs link
- Operations you need (create, refund, webhook handling)
- Existing HTTP client patterns in your codebase
Tip: Always reconcile state with the provider on a schedule. Webhooks will get lost. The reconciliation job is your truth.
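The retry and circuit-breaker items in the prompt above could look roughly like this. It is a minimal sketch, not a production client: `TransientError`, `CircuitOpenError`, `CircuitBreaker`, `call_with_retry`, and all thresholds are hypothetical names and values chosen for illustration.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, 429s, 5xx)."""


class CircuitOpenError(Exception):
    """Raised when the breaker refuses to call the provider."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, fn):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("provider calls are suspended")
            # Cool-down elapsed: allow one trial call (half-open state).
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = fn()
        except TransientError:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        return result


def call_with_retry(fn, breaker, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Retry transient failures with exponential backoff; an open circuit is not retried."""
    for attempt in range(attempts):
        try:
            return breaker.call(fn)
        except TransientError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Injecting `clock` and `sleep` keeps the sketch testable without real waiting. A real client would usually add jitter to the backoff and honour any retry hints the provider sends.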
Optimization · Cross-Module · Complex · 1-3 weeks
Performance Optimization Spec
Turn a latency or resource problem into a prioritised optimization plan.
When to use
Use this when you have measurements showing a real bottleneck (latency, memory, CPU) and need a roadmap of fixes ranked by impact.
We're hitting [N ms latency / high memory usage / Y% CPU] in [component/endpoint]. Current metrics:
- Baseline: [measurement]
- Target: [goal]
- Constraints: [browser/db limits]
Give me a performance optimization plan with: bottleneck analysis, solutions ranked by impact, implementation steps, and monitoring strategy.
Attach when sending
- Profiler output, APM screenshot, or flamegraph
- Current vs target metrics
- Stack details (framework, database, hosting)
Tip: Measure twice. A 50ms win that does not move a real user-perceived metric is not worth shipping.
Architecture · System-Wide · Complex · 1-3 weeks
Database Schema & Migration
Plan a zero-downtime database schema change with backfill and rollback.
When to use
Use this when you need to add, remove, or restructure columns or tables on a production database that cannot tolerate downtime.
I need to add [feature] which requires restructuring [table]. Current schema:
[paste schema]
Proposed changes:
[describe]
Help me: design zero-downtime migration, backfill strategy, rollback plan, and handle data consistency during transition. Output: SQL migration script with comments.
Attach when sending
- Current schema (DDL or ORM definition)
- Sample row
- Database engine and approximate row count
Tip: Use the expand-contract pattern. Add columns first, dual-write, backfill, switch reads, then remove old columns in a later release.
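The expand-contract sequence in the tip can be walked through end to end on a toy table. The sketch below uses an in-memory SQLite database purely as a stand-in; the `users` table, the `name` and `display_name` columns, and the helper functions are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("Ada",), ("Grace",)])

# 1. Expand: add the new column as nullable so existing writers keep working.
conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")


# 2. Dual-write: new rows populate both the old and the new column.
def create_user(name):
    conn.execute(
        "INSERT INTO users (name, display_name) VALUES (?, ?)", (name, name)
    )


create_user("Linus")


# 3. Backfill: copy old values in small batches to keep each transaction short.
def backfill(batch_size=1):
    while True:
        cur = conn.execute(
            "UPDATE users SET display_name = name WHERE id IN ("
            "SELECT id FROM users "
            "WHERE display_name IS NULL AND name IS NOT NULL LIMIT ?)",
            (batch_size,),
        )
        conn.commit()
        if cur.rowcount == 0:
            break


backfill()

# 4. Switch reads to the new column once no un-backfilled rows remain.
remaining = conn.execute(
    "SELECT COUNT(*) FROM users WHERE display_name IS NULL"
).fetchone()[0]
names = [row[0] for row in conn.execute("SELECT display_name FROM users ORDER BY id")]

# 5. Contract: drop the old column in a later release, after all readers have moved.
```

On a production engine the backfill batch size, locking behaviour, and the way the column is added would depend on the database and its version, so treat the statements here as shape rather than a script to run as-is.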
Architecture · Single Module · Medium · 1-3 days
State Management Design
Pick the right state pattern and migrate without introducing two sources of truth.
When to use
Use this when your component tree has too many context providers, prop-drilling is painful, or your Redux store has grown faster than the team can maintain.
Our [React/Vue/etc] component tree is getting complex. I need state management for [feature]:
- Current state flow: [describe]
- Pain points: [list]
- Scale: [N components, X data sources]
Recommend a pattern (Context/Redux/Zustand/etc) with code skeleton. Include: actions, reducers, and testing approach.
Attach when sending
- A representative component file
- Your current store or context
- The data flow that hurts the most
Tip: Mixing state libraries during the transition is acceptable. Mixing sources of truth for the same value is not.
Architecture · Cross-Module · Medium · 1-3 days
Error Handling Strategy
Unify error handling across services with a consistent logging, retry, and UX strategy.
When to use
Use this when error handling is scattered, when users see raw stack traces, or when the team cannot agree on what to log vs alert vs swallow.
Our error handling is fragmented across [N] modules. I need a unified strategy:
- Current errors: [list key error types]
- Desired behavior: [describe logging/alerting/retry]
- Frontend: [error display requirements]
Design: error class hierarchy, middleware pattern, and user-facing error messages.
Attach when sending
- A sample stack trace from each major module
- How errors are currently displayed to users
- Existing logging/monitoring setup
Tip: Distinguish three layers: developer errors (bugs, never reach users), domain errors (expected, user-facing), and infrastructure errors (transient, retry).
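One way to encode the three layers from the tip is a small class hierarchy plus a single mapping function that middleware could call. This is a sketch under assumptions: the class names, status codes, and response shape are illustrative, not a prescribed design.

```python
class AppError(Exception):
    """Base class for everything the application raises on purpose."""
    status = 500
    retryable = False
    user_message = "Something went wrong. Please try again later."


class DomainError(AppError):
    """Expected business failure; the message is safe to show to the user."""
    status = 422

    def __init__(self, user_message):
        super().__init__(user_message)
        self.user_message = user_message


class InfrastructureError(AppError):
    """Transient failure in a dependency; callers may retry."""
    status = 503
    retryable = True


def to_response(exc):
    """Map any exception to (status, body) without leaking internal details."""
    if isinstance(exc, AppError):
        return exc.status, {"error": exc.user_message, "retryable": exc.retryable}
    # Developer errors (bugs): log details elsewhere, return a generic message.
    return 500, {"error": AppError.user_message, "retryable": False}
```

Only `DomainError` carries its own text to the user; infrastructure and developer errors fall back to the generic message, so a stack trace or connection string never reaches the response body.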
Testing · Single Module · Medium · 1-3 days
Testing Strategy for Feature
Decide the test pyramid for a new feature: what to cover with unit, integration, or e2e tests.
When to use
Use this when starting a new feature and you want a thought-out testing plan before you write a single test.
I'm building [feature]. Help me plan testing:
- User flows: [describe main paths]
- Edge cases: [list known edge cases]
- Dependencies: [external APIs / databases / services]
- Current coverage: [percentage]
Output: test pyramid (unit/integration/e2e split), test cases, mocking strategy, and CI/CD hooks.
Attach when sending
- Feature spec or PRD
- Existing test setup
- Timeline and team size
Tip: The most valuable test is usually the one that catches the regression you are most afraid of. Write that one first.
Architecture · System-Wide · Medium · 1-3 days
Backward Compatibility & Deprecation
Plan a breaking change with deprecation warnings, versioning, and a removal timeline.
When to use
Use this when you need to change an API or interface that other teams or external clients depend on.
We're changing the [API endpoint / database table / function signature]. Impact:
- Consumers: [internal/external]
- Breaking: [yes/no, list breaking changes]
- Timeline: [when can we fully break?]
Design: deprecation warnings, versioning strategy, client migration path, and timeline for removal.
Attach when sending
- Old and new shape side by side
- List of consumers and how to reach them
- Telemetry on who is still using the old version
Tip: Add telemetry to the old version on day one of deprecation. You cannot remove what you cannot prove is unused.
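For a function-level change, deprecation warnings and usage telemetry can share one decorator. The sketch below is an assumption-laden illustration: `deprecated`, `deprecated_calls`, `get_user`, and `get_user_v2` are hypothetical, and the in-process counter stands in for whatever metrics system you actually use.

```python
import functools
import warnings
from collections import Counter

# Hypothetical in-process counter; a real system would emit a metric instead.
deprecated_calls = Counter()


def deprecated(replacement, remove_in):
    """Warn callers and count every use of the wrapped function."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            deprecated_calls[fn.__name__] += 1
            warnings.warn(
                f"{fn.__name__} is deprecated and will be removed in {remove_in}; "
                f"use {replacement} instead",
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller, not the wrapper
            )
            return fn(*args, **kwargs)
        return wrapper
    return decorator


def get_user_v2(user_id):
    return {"id": user_id, "name": "Ada"}


@deprecated(replacement="get_user_v2", remove_in="v3.0")
def get_user(user_id):
    # The old entry point delegates to the new one during the deprecation window.
    return get_user_v2(user_id)
```

When the counter for the old name stays at zero for the whole support window, you have the evidence the tip asks for before removing it.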
Architecture · Single Module · Quick Win · 1-3 hours
Feature Flag Implementation
Wire up feature flags for gradual rollout with monitoring and rollback triggers.
When to use
Use this when shipping a feature behind a kill switch, doing a gradual percentage rollout, or running an A/B test.
I need feature flags for [feature] to enable gradual rollout:
- Rollout: [10% -> 50% -> 100% over X days]
- Rollback: [conditions that trigger rollback]
- Analytics: [metrics to track]
Help with: flag naming, percentage targeting, config structure, and monitoring setup.
Attach when sending
- Feature flag provider (LaunchDarkly, Unleash, homegrown)
- Rollback trigger conditions
- Metrics that define success
Tip: Every flag should have a removal date. Long-lived flags accumulate into "permanent if statements" that nobody understands.
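Percentage targeting for a homegrown flag is commonly done by hashing the user into a stable bucket. The following is a minimal sketch of that idea; `bucket` and `is_enabled` are hypothetical helpers, and hosted providers have their own targeting rules.

```python
import hashlib


def bucket(flag_name, user_id):
    """Map a (flag, user) pair to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag_name, user_id, rollout_percent):
    """A user is in the rollout when their bucket falls below the percentage."""
    return bucket(flag_name, user_id) < rollout_percent
```

Because the bucket depends only on the flag name and user ID, a user who sees the feature at 10% keeps seeing it at 50% and 100%. Including the flag name in the hash keeps the same users from always landing in the early cohort of every rollout.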
Optimization · Single Module · Medium · 1-3 days
Cache Strategy Design
Design a multi-layer cache with TTLs, invalidation, and stale-while-revalidate.
When to use
Use this when reads dominate your load, when database queries are repeated unnecessarily, or when you want to add a CDN/edge layer.
I need caching for [endpoint/query]. Context:
- Data freshness: [must be < X seconds]
- Write frequency: [N writes per hour]
- Read volume: [N reads per second]
- Cache size constraint: [Y MB]
Design: cache layers (browser/CDN/redis), invalidation strategy, and stale-while-revalidate logic.
Attach when sending
- The hot endpoint or query
- Read/write ratio
- Available cache backend
Tip: Cache failures should be invisible. Wrap the cache read in try/catch and fall through to source-of-truth on any error.
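The fall-through behaviour in the tip might look like this in Python. It assumes a hypothetical `cache` object exposing `get(key)` and `set(key, value)`, and a `load_from_source` callable that reads the source of truth. Neither is a specific library's API.

```python
import logging

logger = logging.getLogger("cache")


def get_with_fallback(cache, key, load_from_source):
    """Return the cached value if possible; never let a cache error reach the caller."""
    try:
        value = cache.get(key)
        if value is not None:
            return value
    except Exception:
        logger.warning("cache read failed for %s; using source of truth", key, exc_info=True)
    value = load_from_source(key)
    try:
        cache.set(key, value)
    except Exception:
        # A failed write only costs a future cache miss, so it is logged and ignored.
        logger.warning("cache write failed for %s", key, exc_info=True)
    return value
```

The sketch treats `None` as a miss, so it cannot cache a legitimate `None` result. A sentinel value would be needed if that matters for your data.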
Architecture · System-Wide · Complex · 1-3 weeks
Event-Driven Architecture
Move a synchronous flow to events with a queue, idempotent consumers, and dead-letter handling.
When to use
Use this when synchronous chains have grown long, when you need to decouple producers from consumers, or when you need an audit log of business events.
We're moving [domain] to event-driven. Current flow:
[describe synchronous process]
Required events: [list]
Consumers: [list services that need these events]
Design: event schema, producer/consumer pattern, message queue choice (Kafka/RabbitMQ/SQS), failure handling, and dead-letter queue strategy.
Attach when sending
- The synchronous flow you want to break apart
- Database tables involved
- Existing infra (queues, brokers)
Tip: Domain events describe what already happened ("OrderPlaced"), not commands ("PlaceOrder"). Imperative names mean you are doing async messaging, not event-driven design.
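Idempotent consumption and dead-letter handling, both named in this card, can be sketched without any broker at all. Everything below is illustrative: the `Consumer` class, the event dict shape (`id`, `type`, `payload`), and the in-memory dedupe set are assumptions. A real consumer would persist processed IDs, ideally in the same transaction as the side effect.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Consumer:
    handler: Callable[[dict], None]
    max_attempts: int = 3
    processed_ids: set = field(default_factory=set)
    dead_letters: list = field(default_factory=list)

    def handle(self, event):
        """Process one event dict with 'id', 'type', and 'payload' keys."""
        if event["id"] in self.processed_ids:
            return "duplicate"  # redelivery: already applied, so skip it
        last_error = None
        for _ in range(self.max_attempts):
            try:
                self.handler(event)
            except Exception as exc:
                last_error = exc
            else:
                self.processed_ids.add(event["id"])
                return "processed"
        # Retries exhausted: park the event for inspection instead of blocking the queue.
        self.dead_letters.append({"event": event, "error": repr(last_error)})
        return "dead-lettered"
```

A past-tense event such as `{"id": "evt-1", "type": "OrderPlaced", "payload": {...}}` can be delivered twice and the handler still runs once, which is what at-least-once brokers require of their consumers.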
Architecture · System-Wide · Complex · 1-3 weeks
Multi-Tenant Implementation
Pick a multi-tenancy model (shared schema, schema-per-tenant, DB-per-tenant) and design the data layer.
When to use
Use this when adding multi-tenancy to a single-tenant product, or re-evaluating your model because of isolation, performance, or compliance pressure.
We're adding multi-tenancy to [app]. Current setup:
- Single database: [yes/no]
- Shared vs isolated data: [describe]
- Number of customers: [N]
- Scaling goals: [M customers in X years]
Design: data isolation strategy, query patterns, database selection, and row-level security.
Attach when sending
- Current schema
- Tenant size distribution
- Compliance requirements (HIPAA, SOC 2, DPDP, residency)
Tip: Pick the smallest model that meets your isolation requirements. Most SaaS products under 1,000 tenants do fine with shared schema + row-level security.
Architecture · System-Wide · Complex · 1-3 weeks
Microservice Decomposition
Extract a service from a monolith with bounded contexts, data ownership, and a strangler-fig plan.
When to use
Use this when a module has grown into its own bounded context, has different scaling needs, or team ownership boundaries no longer fit.
I'm extracting [domain] into a microservice. Current monolith:
- Database: [structure]
- Dependencies: [list services it talks to]
- Load: [X requests/sec]
- Deployment: [current frequency]
Plan: service boundaries, data ownership, API design, communication patterns (sync/async), and monitoring.
Attach when sending
- Module-to-module dependency map
- Database ERD or table list
- Deployment platform
Tip: The hardest part is usually shared transactions. List every cross-module transaction before you commit to a cut-over date.
Architecture · System-Wide · Complex · 1-3 weeks
Data Pipeline Design
Design ETL with the right tool, batch vs streaming, error handling, and idempotency.
When to use
Use this when moving data between systems on a schedule or in real-time, with transformations, deduplication, and at-least-once delivery.
I need to build a data pipeline for [use case]:
- Source: [data origin]
- Frequency: [real-time / batch timing]
- Volume: [N records per X time]
- Processing: [transformations needed]
- Destination: [where data goes]
Design: ETL architecture, tool selection (Airflow/Spark/etc), error handling, and idempotency.
Attach when sending
- Source data schema and sample
- Destination schema
- Volume and SLA
Tip: Make every step idempotent. Pipelines re-run. If a step is not idempotent, duplicate effects are guaranteed eventually.
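A load step becomes idempotent when it writes by a natural key instead of appending. The sketch below shows one way to do that with an in-memory SQLite table; the `daily_sales` table, its columns, and `load_batch` are hypothetical, and SQLite is only a stand-in for your real destination.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE daily_sales ("
    "day TEXT, store TEXT, total REAL, PRIMARY KEY (day, store))"
)


def load_batch(rows):
    """Write rows keyed by (day, store); a re-run overwrites instead of appending."""
    with conn:  # one transaction per batch: all rows land or none do
        conn.executemany(
            "INSERT OR REPLACE INTO daily_sales (day, store, total) VALUES (?, ?, ?)",
            rows,
        )


batch = [("2024-01-01", "north", 120.0), ("2024-01-01", "south", 80.0)]
load_batch(batch)
load_batch(batch)  # simulated re-run after a retry

row_count, grand_total = conn.execute(
    "SELECT COUNT(*), SUM(total) FROM daily_sales"
).fetchone()
```

After two runs the table still holds two rows and the same total. An append-only `INSERT` would have doubled both.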
Optimization · Single Module · Quick Win · 1-3 hours
Code Generation & Scaffolding
Write a generator for repeated boilerplate (API clients, types, scaffolded resources).
When to use
Use this when you copy the same file pattern to add new resources and the pattern is stable enough to encode.
I'm building code generation for [artifact type: API client / types / etc]. Inputs:
- Source: [OpenAPI spec / GraphQL schema / database]
- Output: [target language / framework]
- Constraints: [customization points]
Design: generator architecture, template strategy, and versioning approach.
Attach when sending
- Two or three existing examples of the pattern
- The files that get duplicated
- How often a new instance is added
Tip: Scaffolding makes repetition cheaper but does not remove duplication. If the same fix has to be applied to 20 generated files, you needed abstraction, not a generator.
Optimization · System-Wide · Medium · 1-3 days
Monitoring & Observability Setup
Wire up logs, metrics, traces, and alerts so production stops being a black box.
When to use
Use this when production incidents take hours to diagnose, or when you have console.log everywhere and nothing structured.
I need to instrument [feature/service] for observability:
- Current monitoring: [describe gaps]
- SLOs: [latency targets / uptime %]
- Alerting: [what matters to stakeholders]
- Tools: [prometheus/datadog/etc]
Design: metrics to emit, logging strategy, distributed tracing, and alert thresholds.
Attach when sending
- Current logging/monitoring setup
- Most recent incident and how you debugged it
- Monthly budget
Tip: If you measure everything, you measure nothing. Start with RED (rate, errors, duration) for services and USE (utilisation, saturation, errors) for infrastructure.
Architecture · Single Module · Medium · 1-3 days
Security Review & Hardening
Run OWASP-style threat modeling on a new feature before it ships.
When to use
Use this when shipping anything that touches auth, payments, file uploads, user content, third-party data, or a regulated context.
I'm adding [feature] that handles [sensitive data]. Security requirements:
- Compliance: [GDPR/HIPAA/SOC2/etc]
- Threat model: [describe attackers]
- Data: [what gets stored / transmitted]
- Users: [authentication method]
Conduct: threat modeling, recommend controls, and produce security checklist.
Attach when sending
- Feature PRD or design doc
- New endpoints, inputs, and data flows
- New dependencies if any
Tip: Authorization bugs (IDOR, missing ownership checks) are among the most common production breach patterns. Verify ownership on every fetch and every write.
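The ownership check from the tip is easiest to enforce when it lives inside the data-access function itself. Here is a minimal sketch; the `INVOICES` store, `get_invoice`, `update_amount`, and `NotFound` are all hypothetical names standing in for your own data layer.

```python
class NotFound(Exception):
    """Raised for missing AND for foreign resources, so IDs cannot be probed."""


# Hypothetical in-memory store standing in for a database table.
INVOICES = {
    101: {"id": 101, "owner_id": "alice", "amount": 250},
    102: {"id": 102, "owner_id": "bob", "amount": 990},
}


def get_invoice(current_user_id, invoice_id):
    invoice = INVOICES.get(invoice_id)
    # The owner check sits next to the fetch so no caller can skip it.
    if invoice is None or invoice["owner_id"] != current_user_id:
        raise NotFound(f"invoice {invoice_id}")
    return invoice


def update_amount(current_user_id, invoice_id, amount):
    invoice = get_invoice(current_user_id, invoice_id)  # same check on writes
    invoice["amount"] = amount
    return invoice
```

Returning the same error for "does not exist" and "belongs to someone else" is a design choice: it avoids confirming to an attacker which IDs are valid.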
Optimization · Single Module · Quick Win · 1-3 hours
Database Query Optimization
Read an EXPLAIN and produce an indexed, rewritten version of a slow query.
When to use
Use this when you have a specific query that is the bottleneck (or one of the top queries by total time in your APM).
This query is slow: [paste SQL / ORM query]. Context:
- Current latency: [N ms]
- Target latency: [M ms]
- Table size: [X rows]
- Index strategy: [describe current]
Analyze: execution plan, recommend indexes, query rewrites, and denormalization if needed.
Attach when sending
- Exact query
- EXPLAIN ANALYZE output
- Table size and current indexes
Tip: A composite index on (filter_column, sort_column) often beats two single-column indexes for queries that filter and sort together.
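You can watch a composite index serve both the filter and the sort by inspecting a query plan. The sketch below uses SQLite's `EXPLAIN QUERY PLAN` as a stand-in (the card itself refers to `EXPLAIN ANALYZE`, which other engines provide); the `orders` table and index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, created_at TEXT)"
)
# Filter column first, sort column second.
conn.execute(
    "CREATE INDEX idx_orders_customer_created ON orders (customer_id, created_at)"
)

plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT id FROM orders WHERE customer_id = ? ORDER BY created_at",
    (42,),
).fetchall()

# The last column of each plan row holds the human-readable step description.
plan_text = " | ".join(row[-1] for row in plan)
print(plan_text)
```

The plan should name the composite index and should not include a separate temporary sort step, because rows for one `customer_id` are already stored in `created_at` order inside the index. Exact plan wording varies between SQLite versions.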
Migration · System-Wide · Complex · 1-3 weeks
Database Migration (Major Version)
Plan a migration between database systems with parallel run, validation, and rollback.
When to use
Use this when moving between database vendors, between major versions, or from self-managed to a managed service.
We're migrating from [DB A] to [DB B]. Context:
- Data size: [N GB]
- Downtime tolerance: [X seconds]
- Current load: [N req/sec]
- Compliance: [data residency / encryption]
Plan: parallel run strategy, cutover timing, validation checks, and rollback procedure.
Attach when sending
- Source and target database systems
- Schema and approximate data volume
- Downtime tolerance
Tip: Dual-write during migration sounds simple and is not. CDC + shadow reads is usually safer for systems above modest volume.
Architecture · Cross-Module · Medium · 1-3 days
API Versioning & Evolution
Pick a versioning scheme and deprecation policy that survives real client diversity.
When to use
Use this when you need to make breaking API changes and you have clients you cannot force-update.
Our API needs versioning. Current state:
- Consumers: [internal/external count]
- Change frequency: [how often breaking changes happen]
- Support window: [how many versions to support]
- Protocol: [REST/GraphQL/gRPC]
Design: versioning scheme, deprecation timeline, backward compatibility matrix, and migration tools.
Attach when sending
- Current API surface
- Sample breaking change
- Client distribution
Tip: Date-based versioning works great for fast-evolving SaaS (Stripe-style). URL versioning is fine for stable APIs. Header versioning is usually a trap.
Testing · System-Wide · Complex · 1-3 weeks
Load Testing & Capacity Planning
Design realistic load tests and derive infrastructure capacity from results.
When to use
Use this before a launch, a marketing campaign, or when you have a real capacity question like "can we handle 10x?"
I need to capacity plan for [feature]. Expected load:
- Peak: [N req/sec]
- Concurrent users: [M]
- Data growth: [X GB per Y time]
- SLA: [latency / availability targets]
Design: load test scenarios, tools (k6/JMeter/etc), identify bottlenecks, and infrastructure needs.
Attach when sending
- Current production metrics (RPS, latency)
- Expected peak load
- Critical user paths
Tip: Load tests without realistic think-time and session distribution overestimate capacity by 2-5x. Model the user, not the synthetic stream.
Architecture · Single Module · Quick Win · 1-3 hours
Dependency Management Strategy
Plan a major dependency upgrade with breaking-change analysis and a staged rollout.
When to use
Use this when a major dependency is behind by one or more major versions, when EOL is approaching, or when you need a feature gated behind an upgrade.
Our dependencies are out of date and fragmented:
- Node/Python/etc version: [current]
- Breaking changes: [list major upgrades needed]
- Security: [known vulnerabilities]
Create: upgrade strategy with breaking change analysis, testing plan, and rollout timeline.
Attach when sending
- Current and target versions
- Changelog link
- Number of usages in codebase
Tip: If a big version jump fails, upgrade one version at a time. Going N to N+1 to N+2 reveals where the real breakage is.
Architecture · System-Wide · Complex · 1-3 weeks
Async Job Processing System
Build a job queue with retries, idempotency, and observability.
When to use
Use this when you need to run async work (emails, billing, exports, syncs) and your "fire-and-forget" pattern is starting to lose jobs.
I need async jobs for [use case]: [describe sync process]. Requirements:
- Latency: [when should jobs complete]
- Reliability: [retry behavior]
- Visibility: [user should see progress: yes/no]
- Concurrency: [N parallel jobs]
Design: job queue, worker architecture, error handling, and monitoring.
Attach when sending
- List of jobs you run (or want to)
- Volume and timing characteristics
- Deployment platform
Tip: On Vercel or Netlify, prefer a managed queue (Inngest, Trigger.dev, QStash) over rolling your own. Serverless functions and long-running workers do not mix well.
Architecture · Single Module · Medium · 1-3 days
Rate Limiting & Throttling
Choose an algorithm, design global vs per-user limits, and communicate them to clients.
When to use
Use this when adding API rate limits, protecting against abuse, or enforcing per-plan quotas.
I need rate limiting for [API/feature]. Context:
- Users: [authenticated / anonymous]
- Limits: [requests per X time]
- Burst: [allow temporary spikes: yes/no]
- Global vs per-user: [describe]
Design: algorithm (token bucket/leaky bucket), Redis strategy, and user communication.
Attach when sending
- Endpoint to protect
- Existing auth model (per-user vs IP)
- Available cache/store (Redis, Upstash)
Tip: Token bucket allows controlled bursts and is widely used by production APIs. Fixed-window counters look simple but let a client send up to twice the limit across a window boundary, and they invite a thundering herd when the window resets.
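A token bucket fits in a few lines when it runs in a single process. The sketch below is illustrative only: the `TokenBucket` class and its parameters are hypothetical, and a shared store such as Redis would be needed to enforce limits across multiple servers.

```python
import time


class TokenBucket:
    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity                  # maximum burst size
        self.refill_per_second = refill_per_second  # sustained rate
        self.clock = clock
        self.tokens = float(capacity)             # start full
        self.updated_at = clock()

    def allow(self, cost=1):
        """Refill based on elapsed time, then spend tokens if enough are available."""
        now = self.clock()
        elapsed = now - self.updated_at
        self.updated_at = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With `TokenBucket(capacity=3, refill_per_second=1)`, a client can burst three requests at once, then proceeds at one per second. Idle time never accumulates more than `capacity` tokens, which is what keeps bursts bounded.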
Optimization · System-Wide · Medium · 1-3 days
CI/CD Pipeline Enhancement
Speed up CI, harden deploys, and add canary or feature-flag rollouts.
When to use
Use this when CI takes longer than your patience, when deploys are scary, or when you want preview environments and parallel tests.
Our CI/CD is slow and flaky. Current state:
- Pipeline duration: [N minutes]
- Failure rate: [X%]
- Parallelization: [describe]
- Environments: [dev/staging/prod]
Optimize: test parallelization, caching strategy, failure analysis, and deployment automation.
Attach when sending
- Pipeline definition file
- Average CI duration and slow stages
- Team size and deploy frequency
Tip: The biggest CI win is usually caching dependencies properly. Profile your pipeline before adding more parallelism.
Architecture · System-Wide · Complex · 1-3 weeks
Containerization & Orchestration
Containerize a service with sane resource limits, health checks, and orchestrator manifests.
When to use
Use this when moving from VMs to containers, when adopting Kubernetes, or when current deploys are unrepeatable.
I'm containerizing [service]. Current deployment:
[describe monolith / vm setup]
Requirements:
- Orchestration: [Kubernetes / Docker Swarm / other]
- Scale: [min N, max M replicas]
- Updates: [blue-green / rolling / canary]
Design: Dockerfile optimization, resource limits, health checks, and deployment manifests.
Attach when sending
- Current deployment process
- Resource usage from production (CPU, memory)
- Target orchestrator
Tip: Multi-stage Dockerfiles cut image size dramatically. Build dependencies live in the first stage; only the runtime artifact ships in the final image.
Optimization · Single Module · Medium · 1-3 days
Instrumentation & Profiling
Capture profiles in production safely and turn them into ranked fixes.
When to use
Use this when you have a slow function or endpoint and you have not yet profiled it. Stop guessing.
I need to profile [component] to find bottlenecks. Context:
- Baseline metric: [latency / memory / CPU]
- Acceptable range: [target]
- Environment: [dev / staging / production]
- Tools available: [list]
Plan: profiling strategy, instrumentation points, analysis approach, and optimization roadmap.
Attach when sending
- The slow function or endpoint
- Captured profile if available
- Current vs target latency
Tip: Profile in production traffic or production-like load. Synthetic micro-benchmarks lie.
Integration · Cross-Module · Complex · 1-3 weeks
Real-Time Feature Implementation
Pick the right transport (WebSocket, SSE, long-poll, managed) and design reconnection + ordering.
When to use
Use this when adding live notifications, chat, presence, collaborative editing, or live dashboards.
I'm adding real-time [feature: notifications / updates / collaboration]. Requirements:
- Latency: [< X ms]
- Users: [M concurrent]
- Data: [what changes are broadcast]
- Protocol: [WebSocket / Server-Sent Events / MQTT]
Design: message broker, client architecture, reconnection handling, and conflict resolution.
Attach when sending
- Existing architecture
- Sample event payload
- Peak concurrency and message rate
Tip: A managed service (Pusher, Ably, Supabase Realtime) is almost always cheaper than self-hosting WebSocket infra under 10,000 concurrent connections.
Integration · Cross-Module · Complex · 1-3 weeks
Mobile Backend Design
Build a Backend-for-Frontend for mobile with offline sync, push, and forced-update flows.
When to use
Use this when launching a mobile app against a web-shaped API, or when offline-first requires server-driven UI.
We're launching a [iOS/Android/React Native] mobile app against our existing [REST/GraphQL] API. The mobile app needs:
- Specific differences in shape: [describe]
- Offline-first requirements: [yes/no, level]
- Push notifications: [yes/no]
- App size and battery sensitivity: [normal / aggressive]
Design: BFF protocol choice, payload aggregation, versioning + forced-update, caching/sync, offline queueing, and push notification delivery.
Attach when sending
- Current web API surface
- Mobile feature spec
- Target devices and network conditions
Tip: A BFF is worth it once mobile and web diverge by more than 30%. Below that, version the web API for both.
Migration · Cross-Module · Complex · 1-3 weeks
Legacy Code Refactoring
Plan a strangler-fig refactor of a fragile module with characterisation tests at each step.
When to use
Use this when you need to refactor a module that sits on the critical path, has thin test coverage, and that everyone is afraid to touch.
I need to refactor [module_name] which has these problems: [list_problems]. Constraints:
- Test coverage: [percentage]
- Production traffic: [yes/no]
- Cannot ship a multi-week stop-the-world rewrite
- The module is called from: [list_callers]
Plan: characterisation tests, seams to introduce, strangler-fig sequence of 5-10 shippable commits, rollback at each step, and feature-flag strategy.
Attach when sending
- The module to refactor
- List of callers and how they use it
- Current test coverage
Tip: Characterisation tests first. They pin the current behaviour even if it is buggy. You decide later whether to keep the bug.
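A characterisation test records what the code does today and fails when a refactor changes it. The sketch below shows the idea on a made-up function: `legacy_shipping_cost`, the `GOLDEN` table, and `check_characterisation` are hypothetical, and the golden values were obtained by running the legacy function, not by reading a spec.

```python
def legacy_shipping_cost(weight_kg, express):
    """Hypothetical legacy function whose behaviour we want to pin."""
    cost = 5 + weight_kg * 2
    if express:
        cost = cost * 1.5
    if weight_kg > 20:
        cost = cost - 3  # undocumented discount; possibly a bug
    return round(cost, 2)


# Golden values recorded by RUNNING the current code, not derived from a spec.
GOLDEN = {
    (1, False): 7,
    (1, True): 10.5,
    (25, False): 52,
    (25, True): 79.5,
}


def check_characterisation(fn):
    """Return the inputs whose output no longer matches the recorded behaviour."""
    return [args for args, expected in GOLDEN.items() if fn(*args) != expected]
```

`check_characterisation(legacy_shipping_cost)` returns an empty list. A refactored version that silently drops the heavy-parcel discount would be reported for both 25 kg inputs, which turns "keep the bug or fix it" into an explicit decision.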