Ross Wightman
f5ca4141f7
Adjust arg order for recent vit model args, add a few comments
3 years ago
Ross Wightman
41dc49a337
Vision Transformer refactoring and Rel Pos impl
3 years ago
Ross Wightman
1618527098
Add layer scale and parallel blocks to vision_transformer
3 years ago
Ross Wightman
0862e6ebae
Fix correctness of some group matching regex (no impact on result), some formatting, missed forward_head for resnet
3 years ago
Ross Wightman
372ad5fa0d
Significant model refactor and additions:
...
* All models updated with revised foward_features / forward_head interface
* Vision transformer and MLP based models consistently output sequence from forward_features (pooling or token selection considered part of 'head')
* WIP param grouping interface to allow consistent grouping of parameters for layer-wise decay across all model types
* Add gradient checkpointing support to a significant % of models, especially popular architectures
* Formatting and interface consistency improvements across models
* layer-wise LR decay impl part of optimizer factory w/ scale support in scheduler
* Poolformer and Volo architectures added
3 years ago
Ross Wightman
5f81d4de23
Move DeiT to own file, vit getting crowded. Working towards fixing #1029 , make pooling interface for transformers and mlp closer to convnets. Still working through some details...
3 years ago
Ross Wightman
95cfc9b3e8
Merge remote-tracking branch 'origin/master' into norm_norm_norm
3 years ago
Ross Wightman
abc9ba2544
Transitioning default_cfg -> pretrained_cfg. Improving handling of pretrained_cfg source (HF-Hub, files, timm config, etc). Checkpoint handling tweaks.
3 years ago
Ross Wightman
07379c6d5d
Add vit_base2_patch32_256 for a model between base_patch16 and patch32 with a slightly larger img size and width
3 years ago
Ross Wightman
010b486590
Add Dino pretrained weights (no head) for vit models. Add support to tests and helpers for models w/ no classifier (num_classes=0 in pretrained cfg)
3 years ago
Ross Wightman
e967c72875
Update REAMDE.md. Sneak in g/G (giant / gigantic?) ViT defs from scaling paper
3 years ago
Ross Wightman
656757d26b
Fix MobileNetV2 head conv size for multiplier < 1.0. Add some missing modification copyrights, fix starting date of some old ones.
3 years ago
Martins Bruveris
5220711d87
Added B/8 models to ViT.
3 years ago
Thomas Viehmann
f805ba86d9
use .unbind instead of explicitly listing the indices
3 years ago
Ross Wightman
78933122c9
Fix silly typo
3 years ago
Ross Wightman
708d87a813
Fix ViT SAM weight compat as weights at URL changed to not use repr layer. Fix #825 . Tweak optim test.
3 years ago
Ying Jin
20b2d4b69d
Use bicubic interpolation in resize_pos_embed()
3 years ago
Ross Wightman
6d8272e92c
Add SAM pretrained model defs/weights for ViT B16 and B32 models.
3 years ago
Ross Wightman
85f894e03d
Fix ViT in21k representation (pre_logits) layer handling across old and new npz checkpoints
3 years ago
Ross Wightman
b41cffaa93
Fix a few issues loading pretrained vit/bit npz weights w/ num_classes=0 __init__ arg. Missed a few other small classifier handling detail on Mlp, GhostNet, Levit. Should fix #713
3 years ago
Ross Wightman
9c9755a808
AugReg release
3 years ago
Ross Wightman
b319eb5b5d
Update ViT weights, more details to be added before merge.
3 years ago
Ross Wightman
b9cfb64412
Support npz custom load for vision transformer hybrid models. Add posembed rescale for npz load.
3 years ago
Ross Wightman
8880f696b6
Refactoring, cleanup, improved test coverage.
...
* Add eca_nfnet_l2 weights, 84.7 @ 384x384
* All 'non-std' (ie transformer / mlp) models have classifier / default_cfg test added
* Fix #694 reset_classifer / num_features / forward_features / num_classes=0 consistency for transformer / mlp models
* Add direct loading of npz to vision transformer (pure transformer so far, hybrid to come)
* Rename vit_deit* to deit_*
* Remove some deprecated vit hybrid model defs
* Clean up classifier flatten for conv classifiers and unusual cases (mobilenetv3/ghostnet)
* Remove explicit model fns for levit conv, just pass in arg
3 years ago
Ross Wightman
bfc72f75d3
Expand scope of testing for non-std vision transformer / mlp models. Some related cleanup and create fn cleanup for all vision transformer and mlp models. More CoaT weights.
4 years ago
Ross Wightman
30b9880d06
Minor adjustment, mutable default arg, extra check of valid len...
4 years ago
Alexander Soare
8086943b6f
allow resize positional embeddings to non-square grid
4 years ago
Ross Wightman
b2c305c2aa
Move Mlp and PatchEmbed modules into layers. Being used in lots of models now...
4 years ago
Ross Wightman
a0492e3b48
A few miil weights naming tweaks to improve compat with model registry and filtering wildcards.
4 years ago
talrid
19e1b67a84
old spaces
4 years ago
talrid
a443865876
update naming and scores
4 years ago
talrid
cf0e371594
84_0
4 years ago
talrid
0968bdeca3
vit, tresnet and mobilenetV3 ImageNet-21K-P weights
4 years ago
Ross Wightman
f606c45c38
Add Swin Transformer models from https://github.com/microsoft/Swin-Transformer
4 years ago
Ross Wightman
bf2ca6bdf4
Merge jax and original weight init
4 years ago
Ross Wightman
acbd698c83
Update README.md with updates. Small tweak to head_dist handling.
4 years ago
Ross Wightman
288682796f
Update benchmark script to add precision arg. Fix some downstream (DeiT) compat issues with latest changes. Bump version to 0.4.7
4 years ago
Ross Wightman
ea9c9550b2
Fully move ViT hybrids to their own file, including embedding module. Remove some extra DeiT models that were for benchmarking only.
4 years ago
Ross Wightman
a5310a3451
Merge remote-tracking branch 'origin/benchmark-fixes-vit_hybrids' into pit_and_vit_update
4 years ago
Ross Wightman
7953e5d11a
Fix pos_embed scaling for ViT and num_classes != 1000 for pretrained distilled deit and pit models. Fix #426 and fix #433
4 years ago
Ross Wightman
a760a4c3f4
Some ViT cleanup, merge distilled model with main, fixup torchscript support for distilled models
4 years ago
Ross Wightman
cf5fec5047
Cleanup experimental vit weight init a bit
4 years ago
Ross Wightman
cbcb76d72c
Should have included Conv2d layers in original weight init. Lets see what the impact is...
4 years ago
Ross Wightman
4de57ccf01
Add weight init scheme that's closer to JAX impl
4 years ago
Ross Wightman
45c048ba13
A few minor fixes and bit more cleanup on the huggingface hub integration.
4 years ago
Ross Wightman
d584e7f617
Support for huggingface hub via create_model and default_cfgs.
...
* improve consistency of model creation helper fns
* add comments to some of the model helpers
* support passing external default_cfgs so they can be sourced from hub
4 years ago
Ross Wightman
17cdee7354
Fix C&P patch_size error, and order of op patch_size arg resolution bug. Remove a test vit model.
4 years ago
Ross Wightman
0706d05d52
Benchmark models listed in txt file. Add more hybrid vit variants for testing
4 years ago
Ross Wightman
de97be9146
Spell out diff between my small and deit small vit models.
4 years ago
Ross Wightman
f0ffdf89b3
Add numerous experimental ViT Hybrid models w/ ResNetV2 base. Update the ViT naming for hybrids. Fix #426 for pretrained vit resizing.
4 years ago