Use smaller models for draft generation, shorten prompts, trim long conversation history and cache reusable answers when possible.
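Caching reusable answers can be as simple as keying responses on the model and prompt. A minimal sketch, where `generate` is a stand-in for whatever API call you actually make:

```python
import hashlib
import json

_cache: dict[str, str] = {}

def cache_key(model: str, prompt: str) -> str:
    """Derive a stable cache key from the model name and prompt text."""
    payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def cached_generate(model: str, prompt: str, generate) -> str:
    """Return a cached answer when the same (model, prompt) pair repeats,
    so identical requests cost nothing after the first call."""
    key = cache_key(model, prompt)
    if key not in _cache:
        _cache[key] = generate(model, prompt)
    return _cache[key]
```

An in-memory dict works for a single process; swap in Redis or a database table if multiple workers need to share the cache.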
Validate prompts on cheaper models or smaller settings first, then run the final high-quality generation once the workflow is stable.
Split keys by project, review key-level usage regularly and compare model usage patterns before changing routing or defaults.
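Reviewing key-level usage is easier with a small aggregation step. A sketch, assuming usage records are dicts with hypothetical `key`, `model`, and `tokens` fields (adjust to whatever your provider's usage export actually contains):

```python
from collections import defaultdict

def usage_by_key(records):
    """Sum token counts per (API key, model) pair so per-project
    spending patterns can be compared before changing routing."""
    totals = defaultdict(int)
    for rec in records:
        totals[(rec["key"], rec["model"])] += rec["tokens"]
    return dict(totals)
```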
Set max_tokens conservatively. The easiest win is usually model routing: keep a small model for draft work and call stronger models only on the final step.
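The routing idea can be sketched as a single dispatch function; the model names here are placeholders, not real model identifiers:

```python
def route_model(stage: str,
                draft_model: str = "small-model",
                final_model: str = "large-model") -> str:
    """Pick the cheap model for draft iterations and reserve the
    strong (expensive) model for the final pass only."""
    return final_model if stage == "final" else draft_model
```

In practice the draft loop might run many times per task, so even a modest per-call price gap compounds into most of the savings.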