Published August 28, 2025 | Version v0.12.1
Software Open

Xilinx/brevitas: Release v0.12.1

  • 1. @Apple
  • 2. AMD
  • 3. DeepRender
  • 4. Zama.ai
  • 5. @AMD
  • 6. Paderborn University
  • 7. UC San Diego
  • 8. @zama-ai
  • 9. FlowZed
  • 10. National University of Science and Technology
  • 11. KRI @ Northeastern University
  • 12. @Point72
  • 13. @AMD Research Labs
  • 14. @Quansight

Description

Highlights

  • New / Updated PTQ Algorithms:
    • Qronos support https://github.com/Xilinx/brevitas/pull/1311
    • Fixes / improvements to rotation equalization https://github.com/Xilinx/brevitas/pull/1310, https://github.com/Xilinx/brevitas/pull/1312
    • DDP-like bias correction for SDXL (experimental) https://github.com/Xilinx/brevitas/pull/1342
  • Improved layer support:
    • Quantization of SDPA without FX https://github.com/Xilinx/brevitas/pull/1299
  • New export flows:
    • Initial SHARK Export support: https://github.com/Xilinx/brevitas/pull/1300
    • Initial GGUF Export: https://github.com/Xilinx/brevitas/pull/1291
  • Improved examples:
    • Qronos examples (paper) https://github.com/Xilinx/brevitas/pull/1311
    • "Benchmark" experiments for Stable Diffusion and ImageNet https://github.com/Xilinx/brevitas/pull/1281
    • Post-training model expansion examples (paper) https://github.com/Xilinx/brevitas/pull/1355
  • Allow signed scales https://github.com/Xilinx/brevitas/pull/1308
  • QONNX export with dynamo=True https://github.com/Xilinx/brevitas/pull/1234
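
To illustrate what "Allow signed scales" means numerically, here is a minimal pure-Python sketch of symmetric integer quantization where the scale may be negative. It is not Brevitas code: the function names `quantize` and `dequantize` and the 8-bit default are assumptions made for the example. The point is only that the quantize/dequantize round trip stays well defined when the scale changes sign, as long as nothing in the pipeline assumes scale > 0.

```python
def quantize(x, scale, bit_width=8):
    """Map a real value to a signed integer code; scale may be negative but not zero."""
    if scale == 0:
        raise ValueError("scale must be non-zero")
    qmin = -(2 ** (bit_width - 1))      # e.g. -128 for 8 bits
    qmax = 2 ** (bit_width - 1) - 1     # e.g.  127 for 8 bits
    q = round(x / scale)                # the sign of the code flips with the sign of the scale
    return max(qmin, min(qmax, q))      # clamp to the representable integer range


def dequantize(q, scale):
    """Map an integer code back to a real value."""
    return q * scale


if __name__ == "__main__":
    x = 0.37
    for scale in (0.05, -0.05):
        q = quantize(x, scale)
        print(f"scale={scale:+.2f}  code={q:+d}  dequantized={dequantize(q, scale):.4f}")
```

With a negative scale the integer code changes sign while the dequantized value is the same, so saturation happens at the opposite end of the integer range.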

What's Changed

  • Fix (setup): solve incompatibility between isort and yapf by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/1293
  • Feat (graph/hadamard): 152 had support by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/1295
  • feat (ex/llm): fully parametrise attention quantization by @nickfraser in https://github.com/Xilinx/brevitas/pull/1287
  • Feat (ex/llm): "auto" dtype by @pablomlago in https://github.com/Xilinx/brevitas/pull/1301
  • Setup: update transformers version by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/1304
  • Feat (brevitas_examples/llm): quant SDPA without FX by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/1299
  • Fix (brevitas_examples/llm): update README and yaml by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/1305
  • Setup: temporary pin pytest by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/1307
  • Feat (brevitas_examples/llm): configurable expansion step by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/1280
  • Versioning support in documentation by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/1298
  • Feat (graph/rotate): improve R2 region in SDPA by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/1310
  • Docs: improve docs build by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/1314
  • Fix (graph/rotation): rotation on subset of channels for SDPA by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/1312
  • Feat (brevitas_examples/llm): GGUF export by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/1291
  • Setup: bump torch version by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/1205
  • Fix (ex/imagenet): add forward pass in imagenet ptq example by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/1316
  • Feat (qronos): initial implementation of Qronos by @i-colbert in https://github.com/Xilinx/brevitas/pull/1311
  • Fix (graph/equalize): correct class check during rotation merging by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/1317
  • docs (core): Typo fix in docstring by @nickfraser in https://github.com/Xilinx/brevitas/pull/1318
  • Fix (llm): removing duplicate set_seed function by @i-colbert in https://github.com/Xilinx/brevitas/pull/1319
  • Fix (ex/common): save scales during optimization by @pablomlago in https://github.com/Xilinx/brevitas/pull/1313
  • Fix (llm): compatibility with non-uniform RMSNorm shapes by @i-colbert in https://github.com/Xilinx/brevitas/pull/1324
  • Fix (graph/gpxq): fix memory leak with weight_orig by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/1325
  • Shark LLM export by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/1300
  • Feat (scaling): rescaled min-max scaling and zero point by @i-colbert in https://github.com/Xilinx/brevitas/pull/1320
  • Fix (gguf): derived modify_tensors() returns generator by @i-colbert in https://github.com/Xilinx/brevitas/pull/1329
  • Fix (gpxq): device management of weight_orig for GPxQ by @i-colbert in https://github.com/Xilinx/brevitas/pull/1330
  • Fix (gguf): resolving zero point permutation issue with LlamaModel by @i-colbert in https://github.com/Xilinx/brevitas/pull/1332
  • Feat (examples): refactor imagenet and stable_diffusion entrypoints by @pablomlago in https://github.com/Xilinx/brevitas/pull/1281
  • Feat: skipping rotation optimization with load_checkpoint by @i-colbert in https://github.com/Xilinx/brevitas/pull/1331
  • Feat (core): Remove assumptions on positiveness of scales by @pablomlago in https://github.com/Xilinx/brevitas/pull/1308
  • Fix (export/qonnx): Add export support with dynamo=True by @nickfraser in https://github.com/Xilinx/brevitas/pull/1234
  • Fix (core/ops_ste): preserve dtype during clamp by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/1340
  • Fix (eq/rotation): find value if passed as kwarg by @nickfraser in https://github.com/Xilinx/brevitas/pull/1338
  • Feat (ex): tests Stable Diffusion and ImageNet by @pablomlago in https://github.com/Xilinx/brevitas/pull/1339
  • Feat (ex/sdxl): DDP-like bias correction for SDXL by @pablomlago in https://github.com/Xilinx/brevitas/pull/1342
  • Feat (graph): Minor refactoring layerwise_layer_handler by @pablomlago in https://github.com/Xilinx/brevitas/pull/1335
  • Fix (torch_utils): remove deprecated functions by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/1344
  • Fix (setup): test against latest 2.1 torch by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/1348
  • Rotation fix by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/1334
  • Docs (qronos): adding docs and configs by @i-colbert in https://github.com/Xilinx/brevitas/pull/1326
  • Feat: Added ONNX export to BNN-PYNQ example by @nickfraser in https://github.com/Xilinx/brevitas/pull/916
  • Fix (deps): set accelerate<1.10 by @nickfraser in https://github.com/Xilinx/brevitas/pull/1352
  • Fix (copyright): Fix some missing copyright headers by @nickfraser in https://github.com/Xilinx/brevitas/pull/1353
  • Fix (ex/llm/benchmark): import error by @nickfraser in https://github.com/Xilinx/brevitas/pull/1356
  • Setup: temporarily pin diffusers version by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/1360
  • Fix (core/scaling): handle edge cases with signed scale by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/1347
  • Fix (core/stochastic_round): adjust stochastic round device by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/1359
  • Feat (papers): expansion paper configs by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/1355
  • Fix (docs): correct link to docs in initial README by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/1361
  • Fix (setup.py): Update author and contact in setup.py by @nickfraser in https://github.com/Xilinx/brevitas/pull/1365
  • Docs (README): Update maximum PyTorch version by @nickfraser in https://github.com/Xilinx/brevitas/pull/1366
  • Docs (getting started): Typo fix in example by @nickfraser in https://github.com/Xilinx/brevitas/pull/1367
  • requirements: Updated PyTorch, python versions by @nickfraser in https://github.com/Xilinx/brevitas/pull/1370
  • Feat (core/scaling): Add option to restrict the output of (scale / threshold) by @nickfraser in https://github.com/Xilinx/brevitas/pull/1369
  • deps (ex) update accelerate version by @nickfraser in https://github.com/Xilinx/brevitas/pull/1371
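
Several entries above concern Hadamard matrices and rotation-based equalization. As background, the following pure-Python sketch shows the identity such rotations generally rely on: for an orthogonal matrix R, rotating both the activations and the weights leaves a linear layer's output unchanged, since (xR)(WR)^T = xW^T. It does not use or reproduce the Brevitas API. The helper names, the toy values and the power-of-two Sylvester construction are assumptions made for the example, and other Hadamard sizes are not covered.

```python
import math


def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n must be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1]]
    while len(h) < n:
        # Block layout [[H, H], [H, -H]] doubles the size at each step.
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def transpose(m):
    return [list(r) for r in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


n = 4
# Normalizing by sqrt(n) makes the matrix orthogonal: R @ R^T = I.
R = [[v / math.sqrt(n) for v in row] for row in hadamard(n)]

x = [[0.1, -8.0, 0.3, 0.2]]        # toy activation with one outlier channel
W = [[0.5, 0.1, -0.2, 0.3],
     [-0.4, 0.2, 0.1, 0.0]]        # toy weight, shape (out_features, in_features)

y_ref = matmul(x, transpose(W))    # original linear layer output
x_rot = matmul(x, R)               # rotated activations
W_rot = matmul(W, R)               # rotated weights
y_rot = matmul(x_rot, transpose(W_rot))

if __name__ == "__main__":
    print("max |x|     =", max(abs(v) for v in x[0]))
    print("max |x_rot| =", max(abs(v) for v in x_rot[0]))
    print("outputs match:", all(abs(a - b) < 1e-9 for a, b in zip(y_ref[0], y_rot[0])))
```

In this toy input the rotation spreads the outlier across channels, which shrinks the largest activation magnitude while the layer output is unchanged.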

Full Changelog: https://github.com/Xilinx/brevitas/compare/v0.12.0...v0.12.1

Files

Xilinx/brevitas-v0.12.1.zip (30.5 MB)
md5:e0908c5fd1600f681fdaea59987b7da0
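
The MD5 checksum listed for the archive can be used to check a download. The sketch below uses only Python's standard `hashlib`. The helper names `md5_of_file` and `verify` are made up for this example, and the local file name in the usage note is an assumption about where the archive was saved.

```python
import hashlib

# Checksum copied from the file listing of this record.
EXPECTED_MD5 = "e0908c5fd1600f681fdaea59987b7da0"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()
```

For example, `verify("brevitas-v0.12.1.zip")` would return `True` for an intact download saved under that (assumed) name. An MD5 match guards against corrupted transfers, not against deliberate tampering.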

Additional details

Related works

Is supplement to
Software: https://github.com/Xilinx/brevitas/tree/v0.12.1 (URL)
