
Published January 24, 2026 | Version 2026.1.5

TensorZero

Description

[!CAUTION] Breaking Changes

  • TensorZero now normalizes the usage reported by different model providers. Moving forward, input_tokens and output_tokens include all token variations (provider prompt caching, reasoning, etc.), matching OpenAI's convention. Tokens cached by TensorZero itself remain excluded. You can still access the raw usage reported by providers with include_raw_usage.
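The accounting change can be illustrated with a small sketch. This is illustrative only, not TensorZero's implementation, and the raw provider field names below are hypothetical:

```python
def normalize_usage(raw: dict) -> dict:
    """Illustrative sketch of the normalized accounting: input_tokens and
    output_tokens include all token variations (cached prompt tokens,
    reasoning tokens, etc.), matching OpenAI's convention.
    The raw field names are hypothetical provider-reported values."""
    input_tokens = raw.get("uncached_input_tokens", 0) + raw.get("cached_input_tokens", 0)
    output_tokens = raw.get("text_output_tokens", 0) + raw.get("reasoning_tokens", 0)
    return {"input_tokens": input_tokens, "output_tokens": output_tokens}

usage = normalize_usage({
    "uncached_input_tokens": 800,  # regular prompt tokens
    "cached_input_tokens": 200,    # provider-cached prompt tokens, now counted
    "text_output_tokens": 150,     # visible completion tokens
    "reasoning_tokens": 350,       # reasoning tokens, now counted
})
print(usage)  # {'input_tokens': 1000, 'output_tokens': 500}
```

If you need the unmodified provider numbers, request them with include_raw_usage.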

[!WARNING] Planned Deprecations

  • Migrate include_original_response to include_raw_response. For advanced variant types, the former only returned the last model inference, whereas the latter returns every model inference with associated metadata.
  • Migrate allow_auto_detect_region = true to region = "sdk" when configuring AWS model providers. The behavior is identical.
  • Provide the proper API base rather than the full endpoint when configuring custom Anthropic providers. Example:
    • Before: api_base = "https://YOUR-RESOURCE-NAME.services.ai.azure.com/anthropic/v1/messages"
    • Now: api_base = "https://YOUR-RESOURCE-NAME.services.ai.azure.com/anthropic/v1/"
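Under the new conventions, the two provider configurations above might look roughly like this. This is a sketch: the model and provider names are placeholders, and the exact keys may differ from your setup, so check the configuration reference:

```toml
# AWS Bedrock provider: replaces allow_auto_detect_region = true
[models.my_model.providers.bedrock]
type = "aws_bedrock"
model_id = "MY-MODEL-ID"  # placeholder
region = "sdk"            # let the AWS SDK resolve the region

# Custom Anthropic provider: api_base is now the API base, not the full endpoint
[models.my_model.providers.anthropic]
type = "anthropic"
model_name = "MY-MODEL-NAME"  # placeholder
api_base = "https://YOUR-RESOURCE-NAME.services.ai.azure.com/anthropic/v1/"
```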

Bug Fixes

  • Fix a regression that triggered incorrect warnings about usage reporting for streaming inferences with Anthropic models.
  • Fix a bug in the TensorZero Python SDK that discarded some request fields in certain multi-turn inferences with tools.

New Features

  • Improve error handling across many areas: TensorZero UI, JSON deserialization, AWS providers, streaming inferences, timeouts, etc.
  • Support Valkey (a Redis-compatible store) to improve the performance of rate limiting checks (recommended at 100+ QPS).
  • Support reasoning_effort for Gemini 3 models (mapped to thinkingLevel).
  • Improve handling of Anthropic reasoning models in TensorZero JSON functions. Moving forward, json_mode = "strict" will use the beta structured outputs feature; json_mode = "on" still uses the legacy assistant message prefill.
  • Improve handling of reasoning content in the OpenRouter and xAI model providers.
  • Add extra_headers support for embedding models. (thanks @jonaylor89!)
  • Support dynamic credentials for AWS Bedrock and AWS SageMaker model providers.
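For JSON functions backed by Anthropic reasoning models, the structured-outputs path is selected via json_mode. A configuration sketch, in which the function, variant, model, and schema path are all placeholders:

```toml
[functions.extract_data]
type = "json"
output_schema = "functions/extract_data/output_schema.json"  # placeholder path

[functions.extract_data.variants.default]
type = "chat_completion"
model = "my_anthropic_model"  # placeholder
json_mode = "strict"          # uses Anthropic's beta structured outputs feature
# json_mode = "on" would keep the legacy assistant message prefill
```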

Plus multiple under-the-hood and UI improvements (thanks @ndoherty-xyz)!

Notes

If you use this software, please cite it as below.

Files

tensorzero/tensorzero-2026.1.5.zip (32.4 MB)
md5:195017efd87dbe1fda786ed50459dd07
