Ross Wightman
7a9c6811c9
Add eps arg to LayerNorm2d, add 'tf' (tensorflow) variant of trunc_normal_ that applies scale/shift after sampling (instead of needing to move a/b)
2 years ago
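A minimal sketch of the 'tf' sampling order described above, assuming torch.nn.init.trunc_normal_ for the base draw (the helper name mirrors timm's, but details may differ):

    import torch
    from torch.nn.init import trunc_normal_

    def trunc_normal_tf_(tensor, mean=0., std=1., a=-2., b=2.):
        # sample from a truncated standard normal on [a, b] first...
        trunc_normal_(tensor, 0., 1., a, b)
        # ...then apply scale/shift, so a/b stay in sampling units
        # (TF/JAX convention) instead of needing to be moved to match mean/std
        with torch.no_grad():
            tensor.mul_(std).add_(mean)
        return tensor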
Ross Wightman
82c311d082
Add more experimental darknet and 'cs2' darknet variants (different cross stage setup, closer to newer YOLO backbones) for train trials.
2 years ago
Ross Wightman
07d0c4ae96
Improve repr for DropPath module
2 years ago
Ross Wightman
e27c16b8a0
Remove unnecessary code for sync-bn guard
2 years ago
Ross Wightman
0da3c9ebbf
Remove SiLU layer in default args that breaks import on very old PyTorch
2 years ago
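The failure mode here is a default argument evaluated at import time; a hedged sketch of the fix pattern (conv_act is a hypothetical stand-in):

    import torch.nn as nn

    # breaks at import on PyTorch < 1.7, where nn.SiLU does not exist:
    #   def conv_act(in_chs, out_chs, act_layer=nn.SiLU): ...

    # resolving the layer lazily keeps the module importable on old PyTorch:
    def conv_act(in_chs, out_chs, act_layer=None):
        act_layer = act_layer or getattr(nn, 'SiLU', nn.ReLU)
        return nn.Sequential(nn.Conv2d(in_chs, out_chs, 3, padding=1), act_layer())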
Ross Wightman
879df47c0a
Support BatchNormAct2d for sync-bn use. Fix #1254
2 years ago
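For context: torch's stock converter rewrites every BatchNorm2d subclass into a plain SyncBatchNorm, which would silently drop the fused activation in BatchNormAct2d. Usage is roughly as follows (the timm import path has moved between versions, so treat it as an assumption):

    import timm
    import torch

    model = timm.create_model('resnet50')

    # stock converter: loses the activation in norm+act layers like BatchNormAct2d
    # model = torch.nn.SyncBatchNorm.convert_sync_batchnorm(model)

    # timm's converter special-cases its norm+act layers
    from timm.layers import convert_sync_batchnorm
    model = convert_sync_batchnorm(model)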
Ross Wightman
4b30bae67b
Add updated vit_relpos weights, and impl w/ support for official swin-v2 differences for relpos. Add bias control support for MLP layers
3 years ago
jjsjann123
f88c606fcf
Fix channels_last on cond_conv2d; update nvfuser debug env variable
3 years ago
Ross Wightman
f670d98cb8
Make a few more layers symbolically traceable (remove from FX leaf modules)
* remove dtype kwarg from .to() calls in EvoNorm as it messed up script + trace combo
* BatchNormAct2d always uses custom forward (cut & paste from original) instead of super().forward. Fixes #1176
* BlurPool groups==channels, no need to use input.dim[1] (toy sketch below)
3 years ago
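A quick way to confirm a module no longer needs to be an FX leaf is to trace it directly; a toy stand-in for the BlurPool point above, fixing groups from construction-time channels instead of reading input.dim[1]:

    import torch
    import torch.nn.functional as F
    from torch import fx

    class BlurPoolLike(torch.nn.Module):
        def __init__(self, channels):
            super().__init__()
            self.channels = channels  # known at construction
            self.register_buffer('filt', torch.ones(channels, 1, 3, 3) / 9.)

        def forward(self, x):
            # groups comes from self.channels, not x.shape[1], so symbolic
            # tracing never has to resolve a concrete input shape
            return F.conv2d(x, self.filt, stride=2, padding=1, groups=self.channels)

    print(fx.symbolic_trace(BlurPoolLike(8)).graph)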
Ross Wightman
b049a5c5c6
Merge remote-tracking branch 'origin/master' into norm_norm_norm
3 years ago
Ross Wightman
9440a50c95
Merge branch 'mrT23-master'
3 years ago
Ross Wightman
372ad5fa0d
Significant model refactor and additions:
* All models updated with revised forward_features / forward_head interface (usage sketched below)
* Vision transformer and MLP based models consistently output sequence from forward_features (pooling or token selection considered part of 'head')
* WIP param grouping interface to allow consistent grouping of parameters for layer-wise decay across all model types
* Add gradient checkpointing support to a significant % of models, especially popular architectures
* Formatting and interface consistency improvements across models
* layer-wise LR decay impl part of optimizer factory w/ scale support in scheduler
* Poolformer and Volo architectures added
3 years ago
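The revised interface in practice, with vit_base_patch16_224 as the example (shapes assume a 224x224 input):

    import timm
    import torch

    model = timm.create_model('vit_base_patch16_224')
    x = torch.randn(2, 3, 224, 224)

    feats = model.forward_features(x)                    # token sequence: (2, 197, 768)
    logits = model.forward_head(feats)                   # pool/token-select + classifier: (2, 1000)
    pooled = model.forward_head(feats, pre_logits=True)  # head minus the final classifier

    model.set_grad_checkpointing(True)  # on models with checkpointing support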
Ross Wightman
95cfc9b3e8
Merge remote-tracking branch 'origin/master' into norm_norm_norm
3 years ago
Ross Wightman
656757d26b
Fix MobileNetV2 head conv size for multiplier < 1.0. Add some missing modification copyrights, fix starting date of some old ones.
3 years ago
Ross Wightman
b27c21b09a
Update drop_path and drop_block (fast impl) to be symbolically traceable, slightly faster
3 years ago
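The fast, traceable form avoids any data-dependent branching on tensor values; roughly (close to timm's implementation, reproduced from memory):

    import torch

    def drop_path(x, drop_prob: float = 0., training: bool = False, scale_by_keep: bool = True):
        if drop_prob == 0. or not training:
            return x  # plain Python branch, resolved at trace time; fine for FX
        keep_prob = 1 - drop_prob
        # one random keep/drop decision per sample, broadcast over remaining dims
        shape = (x.shape[0],) + (1,) * (x.ndim - 1)
        random_tensor = x.new_empty(shape).bernoulli_(keep_prob)
        if keep_prob > 0. and scale_by_keep:
            random_tensor.div_(keep_prob)  # rescale so expected output matches input
        return x * random_tensor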
Ross Wightman
214c84a235
Disable use of timm nn.Linear wrapper since AMP autocast + torchscript use appears fixed
3 years ago
Ross Wightman
a52a614475
Remove layer experiment which should not have been added
3 years ago
Ross Wightman
ab49d275de
Significant norm update
* ConvBnAct layer renamed -> ConvNormAct, plus ConvNormActAa for anti-aliased variants (minimal sketch below)
* Significant update to EfficientNet and MobileNetV3 arch to support NormAct layers and grouped conv (as alternative to depthwise)
* Update RegNet to add Z variant
* Add Pre variant of XceptionAligned that works with NormAct layers
* EvoNorm matches bits_and_tpu branch for merge
3 years ago
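A minimal sketch of the renamed layer (the real ConvNormAct also handles padding modes, apply_act flags, etc., and the ...Aa variant inserts anti-aliased downsampling):

    import torch.nn as nn

    class ConvNormAct(nn.Module):
        def __init__(self, in_chs, out_chs, kernel_size=3, stride=1, groups=1,
                     norm_layer=nn.BatchNorm2d, act_layer=nn.ReLU):
            super().__init__()
            # groups > 1 gives grouped conv; groups == in_chs gives depthwise
            self.conv = nn.Conv2d(in_chs, out_chs, kernel_size, stride=stride,
                                  padding=kernel_size // 2, groups=groups, bias=False)
            self.bn = norm_layer(out_chs)
            self.act = act_layer(inplace=True)

        def forward(self, x):
            return self.act(self.bn(self.conv(x)))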
Ross Wightman
d04f2f1377
Update drop_path and drop_block (fast impl) to be symbolically traceable, slightly faster
3 years ago
Ross Wightman
834a9ec721
Disable use of timm nn.Linear wrapper since AMP autocast + torchscript use appears fixed
3 years ago
Ross Wightman
78912b6375
Updated EvoNorm implementations with some experimentation. Add FilterResponseNorm. Updated RegnetZ and ResNetV2 model defs for trials.
3 years ago
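FilterResponseNorm, for reference, normalizes by the per-channel mean squared activation over spatial dims, with no batch statistics; a sketch after the paper (Singh & Krishnan, 2019), with timm's version differing in parameter shapes and options:

    import torch
    import torch.nn as nn

    class FilterResponseNorm2d(nn.Module):
        def __init__(self, num_features, eps=1e-6):
            super().__init__()
            self.weight = nn.Parameter(torch.ones(num_features))
            self.bias = nn.Parameter(torch.zeros(num_features))
            self.tau = nn.Parameter(torch.zeros(num_features))  # TLU threshold
            self.eps = eps

        def forward(self, x):  # x: (B, C, H, W)
            nu2 = x.square().mean(dim=(2, 3), keepdim=True)
            x = x * torch.rsqrt(nu2 + self.eps)
            x = self.weight.view(1, -1, 1, 1) * x + self.bias.view(1, -1, 1, 1)
            return torch.maximum(x, self.tau.view(1, -1, 1, 1))  # thresholded linear unit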
talrid
c11f4c3218
Support CNNs
3 years ago
mrT23
d6701d8a81
Merge branch 'rwightman:master' into master
3 years ago
Ross Wightman
480c676ffa
Fix FX breaking assert in evonorm
3 years ago
talrid
41559247e9
use_ml_decoder_head
3 years ago
Ross Wightman
93cc08fdc5
Make evonorm variables 1d to match other PyTorch norm layers, will break weight compat for any existing use (likely minimal, easy to fix).
3 years ago
Ross Wightman
af607b75cc
Prep a set of ResNetV2 models with GroupNorm, EvoNormB0, EvoNormS0 for BN free model experiments on TPU and IPU
3 years ago
Ross Wightman
c976a410d9
Add ResNet-50 w/ GN (resnet50_gn) and SEBotNet-33-TS (sebotnet33ts_256) model defs and weights. Update halonet50ts weights w/ slightly better variant in1k val, more robust to test sets.
3 years ago
Alexander Soare
b25ff96768
wip - pre-rebase
3 years ago
Alexander Soare
e051dce354
Make all models FX traceable
3 years ago
Alexander Soare
0149ec30d7
wip - attempting to rebase
3 years ago
Alexander Soare
bc3d4eb403
wip - rebase
3 years ago
Ross Wightman
2ddef942b9
Better fix for #954 that doesn't break torchscript, pull torch._assert into timm namespace when it exists
3 years ago
Ross Wightman
4f0f9cb348
Fix #954 by bringing traceable _assert into timm to allow compat w/ PyTorch < 1.8
3 years ago
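The compat pattern these two fixes converge on looks roughly like this (timm keeps it in a small utility module):

    import torch

    try:
        from torch import _assert  # PyTorch >= 1.8: FX-traceable assert
    except ImportError:
        def _assert(condition: bool, message: str):
            # plain-assert fallback for older PyTorch; same call signature,
            # so model code can use timm's _assert unconditionally
            assert condition, message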
Ross Wightman
b745d30a3e
Fix formatting of last commit
3 years ago
Ross Wightman
3478f1d7f1
Traceability fix for vit models for some experiments
3 years ago
Ross Wightman
f658a72e72
Clean up re-use of Dropout modules in Mlp after some twitter feedback :p
3 years ago
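The cleanup in question: re-using one nn.Dropout instance at two call sites makes hooks and the printed module structure misleading, and separate instances per site are cheap. A sketch close to timm's Mlp:

    import torch.nn as nn

    class Mlp(nn.Module):
        def __init__(self, in_features, hidden_features=None, out_features=None,
                     act_layer=nn.GELU, drop=0.):
            super().__init__()
            hidden_features = hidden_features or in_features
            out_features = out_features or in_features
            self.fc1 = nn.Linear(in_features, hidden_features)
            self.act = act_layer()
            self.drop1 = nn.Dropout(drop)  # previously a single self.drop applied twice
            self.fc2 = nn.Linear(hidden_features, out_features)
            self.drop2 = nn.Dropout(drop)  # distinct instance keeps hooks/repr honest

        def forward(self, x):
            return self.drop2(self.fc2(self.drop1(self.act(self.fc1(x)))))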
Ross Wightman
c02334d9fa
Add weights for regnetz_d and haloregnetz_c, update regnetz_c weights. Add commented-out PyTorch XLA code for halo attention
3 years ago
Ross Wightman
02daf2ab94
Add option to include relative pos embedding in the attention scaling as per references. See discussion #912
3 years ago
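The option boils down to where the 1/sqrt(d) factor lands relative to the position term; a sketch with hypothetical names:

    import torch

    def attn_logits(q, k, rel_pos_bias, scale, scaled_pos_embed=False):
        # q, k: (B, heads, N, dim_head); rel_pos_bias broadcastable to (heads, N, N)
        if scaled_pos_embed:
            # include the relative position term inside the scaling, per references
            return (q @ k.transpose(-2, -1) + rel_pos_bias) * scale
        # default: scale content logits only, add the position bias afterwards
        return q @ k.transpose(-2, -1) * scale + rel_pos_bias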
Ross Wightman
e2b8d44ff0
Halo, bottleneck attn, lambda layer additions and cleanup along w/ experimental model defs
* align interfaces of halo, bottleneck attn and lambda layer
* add qk_ratio to all of the above to control q/k dim relative to output dim (sketched below)
* add experimental haloregnetz, and trionet (lambda + halo + bottle) models
3 years ago
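A hypothetical sketch of the qk_ratio arithmetic shared by the halo, bottleneck attn, and lambda layers:

    # qk_ratio scales query/key width relative to the output dim;
    # value width stays tied to the output dim
    dim_out = 256
    qk_ratio = 0.5
    num_heads = 4
    dim_head_qk = int(dim_out * qk_ratio) // num_heads  # q/k: 32 per head
    dim_head_v = dim_out // num_heads                   # v: 64 per head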
Ross Wightman
007bc39323
Some halo and bottleneck attn code cleanup, add halonet50ts weights, use optimal crop ratios
3 years ago
Ross Wightman
b1c2e3eb92
Match rel_pos_indices attr rename in conv branch
3 years ago
Ross Wightman
b49630a138
Add relative pos embed option to LambdaLayer, fix last transpose/reshape.
3 years ago
Ross Wightman
b81e79aae9
Fix bottleneck attn transpose typo, hopefully these train better now...
3 years ago
Ross Wightman
515121cca1
Use reshape instead of view in std_conv; view was causing issues in recent PyTorch with channels_last
3 years ago
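For context, a sketch of the weight-standardized conv pattern (class name hypothetical; close to timm's F.batch_norm trick): reshape tolerates the non-contiguous weight layout that view chokes on under channels_last.

    import torch.nn as nn
    import torch.nn.functional as F

    class StdConv2dLike(nn.Conv2d):
        def forward(self, x):
            # standardize the weight per output channel via a stateless batch_norm;
            # reshape (not view) works even when the weight is non-contiguous
            w = F.batch_norm(
                self.weight.reshape(1, self.out_channels, -1), None, None,
                training=True, momentum=0., eps=1e-6).reshape_as(self.weight)
            return F.conv2d(x, w, self.bias, self.stride, self.padding,
                            self.dilation, self.groups)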
Ross Wightman
5bd04714e4
Cleanup weight init for byob/byoanet and related
3 years ago
Ross Wightman
8642401e88
Swap botnet 26/50 weights/models after realizing a mistake in arch def, now figuring out why they were so low...
3 years ago
Ross Wightman
5f12de4875
Add initial AttentionPool2d that's being trialed. Fix comment; still trying to improve reliability of the sgd test.
3 years ago
Ross Wightman
492c0a4e20
Update HaloAttn comment
3 years ago
Ross Wightman
3b9032ea48
Use Tensor.unfold().unfold() for HaloAttn; fast like as_strided but with more clarity
3 years ago
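The double-unfold idiom, for reference (shapes verified by hand; halo/block sizes are illustrative):

    import torch
    import torch.nn.functional as F

    B, C, H, W, bs = 2, 8, 16, 16, 4
    x = torch.randn(B, C, H, W)

    # non-overlapping bs x bs query blocks: (B, C, H//bs, W//bs, bs, bs)
    q_blocks = x.unfold(2, bs, bs).unfold(3, bs, bs)
    print(q_blocks.shape)  # torch.Size([2, 8, 4, 4, 4, 4])

    # HaloAttn-style k/v: pad by the halo, then unfold overlapping windows of
    # size bs + 2*halo with stride bs, one window per query block
    halo = 1
    xp = F.pad(x, (halo, halo, halo, halo))
    win = bs + 2 * halo
    kv_blocks = xp.unfold(2, win, bs).unfold(3, win, bs)
    print(kv_blocks.shape)  # torch.Size([2, 8, 4, 4, 6, 6])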