Skip to Content

Cost Optimization

How can I reduce text API cost?

Use smaller models for draft generation, shorten prompts, trim long conversation history and cache reusable answers when possible.

How can I reduce image or video cost?

Validate prompts on cheaper models or smaller settings first, then run the final high-quality generation once the workflow is stable.

How do I monitor cost effectively?

Split keys by project, review key-level usage regularly and compare model usage patterns before changing routing or defaults.

Practical Checklist

  1. Use a cheaper fast model as the default
  2. Switch to stronger models only for hard tasks
  3. Set max_tokens conservatively
  4. Cache repeated outputs
  5. Review project-level usage weekly

The easiest win is usually model routing: keep a small model for draft work and call stronger models only on the final step.