Published October 3, 2022 | Version v0.6.11
Software
Open
rwightman/pytorch-image-models: v0.6.11 Release
Creators
- 1. Weights & Biases
- 2. independent
- 3. Technical University of Darmstadt
- 4. @huggingface
- 5. Stony Brook Medicine
- 6. MIT
- 7. Kaggle Competition Master
- 8. @toss
- 9. Neuro Event Labs Oy
Description
Changes Since 0.6.7 Sept 23, 2022
- CLIP LAION-2B pretrained B/32, L/14, H/14, and g/14 image tower weights as vit models (for fine-tune)
- Hugging Face timm docs home now exists, look for more here in the future
- Add BEiT-v2 weights for base and large 224x224 models from https://github.com/microsoft/unilm/tree/master/beit2
- Add more weights in maxxvit series incl a pico (7.5M params, 1.9 GMACs), two tiny variants:
  - maxvit_rmlp_pico_rw_256 - 80.5 @ 256, 81.3 @ 320 (T)
  - maxvit_tiny_rw_224 - 83.5 @ 224 (G)
  - maxvit_rmlp_tiny_rw_256 - 84.2 @ 256, 84.8 @ 320 (T)
- MaxVit window size scales with img_size by default. Add new RelPosMlp MaxViT weight that leverages this:
  - maxvit_rmlp_nano_rw_256 - 83.0 @ 256, 83.6 @ 320 (T)
- CoAtNet (https://arxiv.org/abs/2106.04803) and MaxVit (https://arxiv.org/abs/2204.01697) timm original models
  - both found in maxxvit.py model def, contains numerous experiments outside scope of original papers
  - an unfinished TensorFlow version from MaxVit authors can be found at https://github.com/google-research/maxvit
- Initial CoAtNet and MaxVit timm pretrained weights (working on more):
  - coatnet_nano_rw_224 - 81.7 @ 224 (T)
  - coatnet_rmlp_nano_rw_224 - 82.0 @ 224, 82.8 @ 320 (T)
  - coatnet_0_rw_224 - 82.4 (T) -- NOTE timm '0' coatnets have 2 more 3rd stage blocks
  - coatnet_bn_0_rw_224 - 82.4 (T)
  - maxvit_nano_rw_256 - 82.9 @ 256 (T)
  - coatnet_rmlp_1_rw_224 - 83.4 @ 224, 84 @ 320 (T)
  - coatnet_1_rw_224 - 83.6 @ 224 (G)
  - (T) = TPU trained with bits_and_tpu branch training code, (G) = GPU trained
- GCVit (weights adapted from https://github.com/NVlabs/GCVit, code 100% timm re-write for license purposes)
- MViT-V2 (multi-scale vit, adapted from https://github.com/facebookresearch/mvit)
- EfficientFormer (adapted from https://github.com/snap-research/EfficientFormer)
- PyramidVisionTransformer-V2 (adapted from https://github.com/whai362/PVT)
- 'Fast Norm' support for LayerNorm and GroupNorm that avoids float32 upcast w/ AMP (uses APEX LN if available for further boost)
- ConvNeXt atto weights added
  - convnext_atto - 75.7 @ 224, 77.0 @ 288
  - convnext_atto_ols - 75.9 @ 224, 77.2 @ 288
- More custom ConvNeXt smaller model defs with weights
  - convnext_femto - 77.5 @ 224, 78.7 @ 288
  - convnext_femto_ols - 77.9 @ 224, 78.9 @ 288
  - convnext_pico - 79.5 @ 224, 80.4 @ 288
  - convnext_pico_ols - 79.5 @ 224, 80.5 @ 288
  - convnext_nano_ols - 80.9 @ 224, 81.6 @ 288
- Updated EdgeNeXt to improve ONNX export, add new base variant and weights from original (https://github.com/mmaaz60/EdgeNeXt)
- Add freshly minted DeiT-III Medium (width=512, depth=12, num_heads=8) model weights. Thanks Hugo Touvron!
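The model names in the list above are the strings passed to timm's model factory. The following is a minimal sketch, assuming timm 0.6.11 and PyTorch are installed. The `RELEASE_MODELS` table and the helpers `best_resolution` and `build_release_model` are illustrative names, not part of timm, and the table copies only a few of the accuracy figures listed above.

```python
# A few entries copied from the release notes: name -> {eval resolution: accuracy %}.
RELEASE_MODELS = {
    "coatnet_nano_rw_224": {224: 81.7},
    "maxvit_rmlp_nano_rw_256": {256: 83.0, 320: 83.6},
    "maxvit_rmlp_tiny_rw_256": {256: 84.2, 320: 84.8},
    "convnext_atto": {224: 75.7, 288: 77.0},
}


def best_resolution(name: str) -> int:
    """Return the listed evaluation resolution with the highest accuracy."""
    scores = RELEASE_MODELS[name]
    return max(scores, key=scores.get)


def build_release_model(name: str, pretrained: bool = True, num_classes: int = 1000):
    """Create one of the listed models (hypothetical helper around timm.create_model)."""
    if name not in RELEASE_MODELS:
        raise KeyError(f"{name} is not in the illustrative table")
    # Imported lazily so the table can be inspected without timm installed.
    import timm
    return timm.create_model(name, pretrained=pretrained, num_classes=num_classes)


# Example (downloads weights on first use):
# model = build_release_model("maxvit_rmlp_nano_rw_256").eval()
```

Passing a different `num_classes` asks timm to build a fresh classifier head, which is the usual starting point for fine-tuning.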
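The note above that MaxVit window size scales with img_size can be illustrated with a little arithmetic. This is a rough sketch, assuming the window side is the image side divided by a fixed ratio of 32. That ratio and the helper name `default_window_size` are assumptions for illustration and are not stated in these release notes; maxxvit.py is the place to check the actual rule.

```python
from typing import Tuple, Union

# Assumed ratio between image side and attention window side.
# Hypothetical for this sketch; the release notes do not state the value.
PARTITION_RATIO = 32


def default_window_size(img_size: Union[int, Tuple[int, int]],
                        ratio: int = PARTITION_RATIO) -> Tuple[int, int]:
    """Return an (h, w) window size that grows with the input resolution."""
    if isinstance(img_size, int):
        img_size = (img_size, img_size)
    h, w = img_size
    return h // ratio, w // ratio


for side in (224, 256, 320):
    print(side, default_window_size(side))
```

Under that assumption a 256-pixel input would use 8x8 windows and a 320-pixel input 10x10, so the number of windows per side stays constant while each window covers more tokens.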
Files
Name | Size |
---|---|
rwightman/pytorch-image-models-v0.6.11.zip (md5:163cdca1e7e3a4e6c46ecd9f6d97f633) | 1.4 MB |
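The md5 value listed for the archive can be used to check a download. A minimal sketch using only the Python standard library; the helper names and the local file name in the final comment are assumptions for illustration.

```python
import hashlib

# Checksum copied from the Files listing above.
EXPECTED_MD5 = "163cdca1e7e3a4e6c46ecd9f6d97f633"


def md5_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so the archive does not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected: str = EXPECTED_MD5) -> bool:
    """Return True when the file's MD5 matches the published value."""
    return md5_of_file(path).lower() == expected.lower()


# Example, assuming the archive was saved under this (hypothetical) local name:
# verify_download("pytorch-image-models-v0.6.11.zip")
```

MD5 is adequate for detecting a corrupted download, though it is not a defense against deliberate tampering.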
Additional details
Related works
- Is supplement to
- https://github.com/rwightman/pytorch-image-models/tree/v0.6.11 (URL)