Ross Wightman
e967c72875
Update README.md. Sneak in g/G (giant / gigantic?) ViT defs from scaling paper
3 years ago
Ross Wightman
656757d26b
Fix MobileNetV2 head conv size for multiplier < 1.0. Add some missing modification copyrights, fix starting date of some old ones.
3 years ago
Martins Bruveris
5220711d87
Added B/8 models to ViT.
3 years ago
Thomas Viehmann
f805ba86d9
use .unbind instead of explicitly listing the indices
3 years ago
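The `.unbind` change above can be sketched as follows. This is an illustrative example, not the repo's exact `Attention.forward` code; the tensor shape is a typical stacked-qkv layout and is assumed here.

```python
import torch

# Hypothetical stacked qkv tensor: (3, batch, heads, tokens, head_dim)
qkv = torch.randn(3, 2, 8, 16, 64)

# Before: explicitly listing the indices along dim 0
q0, k0, v0 = qkv[0], qkv[1], qkv[2]

# After: .unbind(0) yields the same three tensors in one call
q, k, v = qkv.unbind(0)

assert torch.equal(q, q0) and torch.equal(k, k0) and torch.equal(v, v0)
```

Both forms produce identical tensors; `.unbind` just expresses the split along a dimension more directly.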
Ross Wightman
78933122c9
Fix silly typo
3 years ago
Ross Wightman
708d87a813
Fix ViT SAM weight compat as weights at URL changed to not use repr layer. Fix #825. Tweak optim test.
3 years ago
Ying Jin
20b2d4b69d
Use bicubic interpolation in resize_pos_embed()
3 years ago
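The bicubic resize mentioned above can be sketched with `F.interpolate`. This is a simplified, hypothetical helper assuming a square patch grid and no class token; names and shapes are illustrative, not timm's exact `resize_pos_embed()` signature.

```python
import torch
import torch.nn.functional as F

def resize_grid_pos_embed(pos_embed, old_size, new_size):
    # pos_embed: (1, old_size*old_size, dim) grid of patch position embeddings
    dim = pos_embed.shape[-1]
    # Reshape token sequence into a 2D grid, channels-first for interpolate
    grid = pos_embed.reshape(1, old_size, old_size, dim).permute(0, 3, 1, 2)
    grid = F.interpolate(grid, size=(new_size, new_size),
                         mode='bicubic', align_corners=False)
    # Back to a flat token sequence
    return grid.permute(0, 2, 3, 1).reshape(1, new_size * new_size, dim)

pe = torch.randn(1, 14 * 14, 768)          # e.g. a 224px/patch16 grid
pe_resized = resize_grid_pos_embed(pe, 14, 24)
print(pe_resized.shape)                    # torch.Size([1, 576, 768])
```

Bicubic interpolation gives a smoother resampling of the learned embedding grid than bilinear when fine-tuning at a new input resolution.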
Ross Wightman
6d8272e92c
Add SAM pretrained model defs/weights for ViT B16 and B32 models.
3 years ago
Ross Wightman
85f894e03d
Fix ViT in21k representation (pre_logits) layer handling across old and new npz checkpoints
3 years ago
Ross Wightman
b41cffaa93
Fix a few issues loading pretrained vit/bit npz weights w/ num_classes=0 __init__ arg. Missed a few other small classifier handling details on Mlp, GhostNet, Levit. Should fix #713
3 years ago
Ross Wightman
9c9755a808
AugReg release
3 years ago
Ross Wightman
b319eb5b5d
Update ViT weights, more details to be added before merge.
3 years ago
Ross Wightman
b9cfb64412
Support npz custom load for vision transformer hybrid models. Add posembed rescale for npz load.
3 years ago
Ross Wightman
8880f696b6
Refactoring, cleanup, improved test coverage.
...
* Add eca_nfnet_l2 weights, 84.7 @ 384x384
* All 'non-std' (i.e. transformer / mlp) models have classifier / default_cfg test added
* Fix #694 reset_classifier / num_features / forward_features / num_classes=0 consistency for transformer / mlp models
* Add direct loading of npz to vision transformer (pure transformer so far, hybrid to come)
* Rename vit_deit* to deit_*
* Remove some deprecated vit hybrid model defs
* Clean up classifier flatten for conv classifiers and unusual cases (mobilenetv3/ghostnet)
* Remove explicit model fns for levit conv, just pass in arg
3 years ago
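The `num_classes=0` / `reset_classifier` consistency point above follows a common timm convention: with zero classes, the head becomes an identity and the model returns pooled features. A minimal sketch, assuming a toy model rather than any actual timm class:

```python
import torch
import torch.nn as nn

class TinyModel(nn.Module):
    """Hypothetical model illustrating the num_classes=0 convention."""
    def __init__(self, num_features=32, num_classes=10):
        super().__init__()
        self.num_features = num_features
        self.backbone = nn.Linear(8, num_features)
        self.reset_classifier(num_classes)

    def reset_classifier(self, num_classes):
        # num_classes=0 swaps the head for Identity -> feature extractor
        self.num_classes = num_classes
        self.head = (nn.Linear(self.num_features, num_classes)
                     if num_classes > 0 else nn.Identity())

    def forward_features(self, x):
        return self.backbone(x)

    def forward(self, x):
        return self.head(self.forward_features(x))

m = TinyModel(num_classes=0)
out = m(torch.randn(4, 8))
print(out.shape)  # torch.Size([4, 32]) -- raw features, no classifier
```

Keeping `forward_features`, `reset_classifier`, and `num_features` consistent across model families is what lets downstream code treat any model as a feature extractor.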
Ross Wightman
bfc72f75d3
Expand scope of testing for non-std vision transformer / mlp models. Some related cleanup and create fn cleanup for all vision transformer and mlp models. More CoaT weights.
4 years ago
Ross Wightman
30b9880d06
Minor adjustment, mutable default arg, extra check of valid len...
4 years ago
Alexander Soare
8086943b6f
allow resize positional embeddings to non-square grid
4 years ago
Ross Wightman
b2c305c2aa
Move Mlp and PatchEmbed modules into layers. Being used in lots of models now...
4 years ago
Ross Wightman
a0492e3b48
A few miil weights naming tweaks to improve compat with model registry and filtering wildcards.
4 years ago
talrid
19e1b67a84
old spaces
4 years ago
talrid
a443865876
update naming and scores
4 years ago
talrid
cf0e371594
84_0
4 years ago
talrid
0968bdeca3
vit, tresnet and mobilenetV3 ImageNet-21K-P weights
4 years ago
Ross Wightman
f606c45c38
Add Swin Transformer models from https://github.com/microsoft/Swin-Transformer
4 years ago
Ross Wightman
bf2ca6bdf4
Merge jax and original weight init
4 years ago
Ross Wightman
acbd698c83
Update README.md with updates. Small tweak to head_dist handling.
4 years ago
Ross Wightman
288682796f
Update benchmark script to add precision arg. Fix some downstream (DeiT) compat issues with latest changes. Bump version to 0.4.7
4 years ago
Ross Wightman
ea9c9550b2
Fully move ViT hybrids to their own file, including embedding module. Remove some extra DeiT models that were for benchmarking only.
4 years ago
Ross Wightman
a5310a3451
Merge remote-tracking branch 'origin/benchmark-fixes-vit_hybrids' into pit_and_vit_update
4 years ago
Ross Wightman
7953e5d11a
Fix pos_embed scaling for ViT and num_classes != 1000 for pretrained distilled deit and pit models. Fix #426 and fix #433
4 years ago
Ross Wightman
a760a4c3f4
Some ViT cleanup, merge distilled model with main, fixup torchscript support for distilled models
4 years ago
Ross Wightman
cf5fec5047
Cleanup experimental vit weight init a bit
4 years ago
Ross Wightman
cbcb76d72c
Should have included Conv2d layers in original weight init. Let's see what the impact is...
4 years ago
Ross Wightman
4de57ccf01
Add weight init scheme that's closer to JAX impl
4 years ago
Ross Wightman
45c048ba13
A few minor fixes and a bit more cleanup on the huggingface hub integration.
4 years ago
Ross Wightman
d584e7f617
Support for huggingface hub via create_model and default_cfgs.
...
* improve consistency of model creation helper fns
* add comments to some of the model helpers
* support passing external default_cfgs so they can be sourced from hub
4 years ago
Ross Wightman
17cdee7354
Fix C&P patch_size error, and order of op patch_size arg resolution bug. Remove a test vit model.
4 years ago
Ross Wightman
0706d05d52
Benchmark models listed in txt file. Add more hybrid vit variants for testing
4 years ago
Ross Wightman
de97be9146
Spell out diff between my small and deit small vit models.
4 years ago
Ross Wightman
f0ffdf89b3
Add numerous experimental ViT Hybrid models w/ ResNetV2 base. Update the ViT naming for hybrids. Fix #426 for pretrained vit resizing.
4 years ago
Ross Wightman
5a8e1e643e
Initial Normalizer-Free Reg/ResNet impl. A bit of related layer refactoring.
4 years ago
Ross Wightman
bb50ac4708
Add DeiT distilled weights and distilled model def. Remove some redundant ViT model args.
4 years ago
Ross Wightman
c16e965037
Add some ViT comments and fix a few minor issues.
4 years ago
Ross Wightman
55f7dfa9ea
Refactor vision_transformer entry fns, add pos embedding resize support for fine-tuning, add some deit models for testing
4 years ago
Ross Wightman
855d6cc217
More dataset work including factories and a tensorflow datasets (TFDS) wrapper
...
* Add parser/dataset factory methods for more flexible dataset & parser creation
* Add dataset parser that wraps TFDS image classification datasets
* Tweak num_classes handling bug for 21k models
* Add initial deit models so they can be benchmarked in next csv results runs
4 years ago
Ross Wightman
ce69de70d3
Add 21k weight urls to vision_transformer. Cleanup feature_info for preact ResNetV2 (BiT) models
4 years ago
Ross Wightman
231d04e91a
ResNetV2 pre-act and non-preact model, w/ BiT pretrained weights and support for ViT R50 model. Tweaks for in21k num_classes passing. More to do... tests failing.
4 years ago
Ross Wightman
b401952caf
Add newly added vision transformer large/base 224x224 weights ported from JAX official repo
4 years ago
Ross Wightman
61200db0ab
in_chans=1 working w/ pretrained weights for vision_transformer
4 years ago
Ross Wightman
f591e90b0d
Make sure num_features attr is present in vit models as with others
4 years ago