Architecture That Actually Ships
- Use one primary model and one budget model.
- Keep prompts in versioned files.
- Add retry + timeout + fallback at API boundaries.
- Cache deterministic or low-variance prompts aggressively.
For founders, indie hackers, and teams under 20 engineers.
Start with a premium model for customer-facing outputs and only optimize cost after usage patterns stabilize for 2-4 weeks.